The Mysteries of Metadata
Workshop at Content World 2001, Burlingame, CA. May 15, 2001




                                         Amit Sheth
                                       amit@taalee.com
                     Founder/CEO, Taalee (www.taalee.com)
                   [Taalee is now Semagix: www.semagix.com ]
        Also, Director, Large Scale Distributed Information Systems (LSDIS) Lab, University Of Georgia
                                                (lsdis.cs.uga.edu)




                        Metadata Extraction is a patented technology of Taalee, Inc.
                        Semantic Engine and WorldModel are trademarks of Taalee, Inc.
                                                      Confidential                                       HP
Workshop Agenda

What is Metadata ?
Metadata Descriptions and Standards
Metadata Storage/Exchange/Infrastructure
(Automated) Metadata Creation/Extraction/Tagging
Metadata Usage/Applications




                                                   HP 2
What is Metadata?


  Data about data
      Statements, contexts
      Recursive – data about “data about data”
  Applications
      Content management
      Cataloguing
      Information retrieval, search
      …
  "A Web content repository without metadata is like a
  library without an index," - Jack Jia, IWOV
                                                         HP 3
Information Interoperability:
key metadata objective and benefit



System

Syntax

Structure

Semantics
                    Protocols Metadata Domain Modeling,
                                         Ontologies




                                                    HP 4
Semantics

Meaning, Understanding
Facts, Context, Reasoning
Related to: exchange, usage, application




                                           HP 5
A metadata classification

Levels, from top to bottom. Move toward the top to tackle information overload!!

  User
  Ontologies, Classifications, Domain Models
  Domain Specific Metadata
    (area, population (Census); land-cover, relief (GIS); metadata
    concept descriptions from ontologies)
  Domain Independent (structural) Metadata
    (C++ class-subclass relationships, HTML/SGML
    Document Type Definitions, C program structure...)
  Direct Content Based Metadata
    (inverted lists, document vectors, WAIS, Glimpse, LSI)
  Content Dependent Metadata (size, max colors, rows, columns...)
  Content Independent Metadata (creation-date, location, type-of-sensor...)
  Data (Heterogeneous Types/Media)
                                                                             HP 6
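The "direct content based" level above names inverted lists and document vectors. A minimal Python sketch of both follows, assuming plain whitespace tokenization and two made-up documents; the function names are illustrative and do not come from the workshop.

```python
from collections import Counter, defaultdict

def build_inverted_index(docs):
    """Map each lowercase term to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

def document_vector(text):
    """Term-frequency vector (term -> count) for one document."""
    return Counter(text.lower().split())

# Hypothetical sample documents, for illustration only.
docs = {
    "d1": "fire in galena district",
    "d2": "senate race in washington",
}
index = build_inverted_index(docs)
print(index["in"])                          # ['d1', 'd2']
print(document_vector(docs["d1"])["fire"])  # 1
```

Both structures are derived directly from the content itself, which is what separates this level from content-independent metadata such as creation date.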
Types of Metadata for digital media

Media type-specific metadata
  e.g., texture of images, font size…
Media processing-specific metadata
  e.g., search, retrieval, personalized filtering
Content-specific metadata
  e.g., rocket-related video and documents




                                                  HP 7
Metadata for Digital Data

Metadata                                                Data Type            Metadata Type
Q-Features [Jain and Hampapur]                          Image, Video         Domain Specific
R-Features [Jain and Hampapur]                          Image, Video         Domain Independent
Meta-Features [Jain and Hampapur]                       Image, Video         Content Independent
Impression Vector [Kiyoki et al.]                       Image                Content Descriptive
NDVI, Spatial Registration [Anderson and Stonebraker]   Image                Domain Specific
Speech Feature Index [Glavitsch et al.]                 Audio                Direct Content Based
Topic Change Indices [Chen et al.]                      Audio                Direct Content Based
Document Vectors [Deerwester et al.]                    Text                 Direct Content Based
Inverted Indices [Kahle and Medlar]                     Text                 Direct Content Based
Content Classification Metadata [Bohm and Rakow]        MultiMedia           Domain Specific
Document Composition Metadata [Bohm and Rakow]          MultiMedia           Domain Independent
Metadata Templates [Ordille and Miller]                 Media Independent    Domain Specific
Land Cover, Relief [Sheth and Kashyap]                  Media Independent    Domain Specific
Parent Child Relationships [Shklar et al.]              Text                 Domain Independent
Contexts [Sciore et al., Kashyap and Sheth]             Structured           Domain Specific
Concepts from Cyc [Collet et al.]                       Structured           Domain Specific
User’s Data Attributes [Shoens et al.]                  Text, Structured     Domain Specific
Domain Specific Ontologies [Mena et al.]                Media Independent    Domain Specific
                                                                                                   HP 8
Types of Specs and Standards
(or MetaModels)




Domain Independent: (MCF), RDF, MOF, DublinCore
Media Specific: MPEG4, MPEG7, VoiceXML
Domain/Industry Specific (metamodels): MARC (Library),
FGDC and UDK (Geographic), NewsML (News), PRISM
(Publishing)
Application Specific: ICE (Syndication)
Exchange/Sharing: XCM, XMI
Orthogonal/(Other): RDFS, namespaces, ontologies,
domain models, (DAML, OIL)
                                                     HP 9
What can RDF do for metadata?


Designed to impose structural constraints on syntax to
support consistent encoding, exchange and processing
of metadata.
A domain-independent metadata standard.




                                                        HP 10
RDF (Resource Description Framework)


                         Property
     Resource                                Value



•RDF data consists of nodes and attached attribute/value pairs
   •Nodes can be any web resources (pages, servers,
   basically anything for which you can give a URI), even
   other instances of metadata.
   •Attributes are named properties of the nodes, and their
   values are either atomic (text strings, numbers, etc.) or
   other resources or metadata instances.
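
The node and attribute/value model described above can be sketched as a small list of triples. This is only an illustration in Python: the `Triple` type and the helper functions are hypothetical names, and the sample URIs are the placeholders used in this workshop's RDF examples.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    resource: str  # the node being described (anything that has a URI)
    prop: str      # the named property (attribute) of the node
    value: str     # an atomic value, or the URI of another resource

triples = [
    Triple("URI:TALK", "dc:title", "Mysteries of Metadata"),
    Triple("URI:TALK", "dc:creator", "URI:AMIT"),
    Triple("URI:AMIT", "BIB:Name", "Amit Sheth"),
]

def properties_of(resource, store):
    """Return {property: value} for one resource."""
    return {t.prop: t.value for t in store if t.resource == resource}

def follow(resource, prop, store):
    """Follow a property whose value is itself a resource and describe that resource."""
    target = properties_of(resource, store).get(prop)
    return properties_of(target, store) if target is not None else {}

print(properties_of("URI:TALK", triples))
print(follow("URI:TALK", "dc:creator", triples))
```

Because a value may itself be a resource, descriptions chain together into a graph rather than a flat record.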
                                                               HP 11
RDF Example 1


                         dc:title
                                     Mysteries of Metadata
            URI:TALK

                       dc:creator
                                    URI:AMIT

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>
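
As a rough illustration of how such a description can be read programmatically, the sketch below parses an equivalent RDF/XML document with Python's standard `xml.etree.ElementTree` and lists (resource, property, value) triples. The namespace URIs in the constants are assumptions (the W3C RDF syntax namespace and a slash-terminated Dublin Core 1.0 namespace), and `extract_triples` is a hypothetical helper that handles only this simple pattern, not RDF/XML in general.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"  # assumed namespace URI
DC_NS = "http://purl.org/dc/elements/1.0/"              # assumed namespace URI

RDF_XML = f"""<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>"""

def extract_triples(xml_text):
    """Return (resource, property, value) tuples from rdf:Description elements."""
    root = ET.fromstring(xml_text)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for child in desc:
            # A property value is either another resource or literal text.
            value = child.get(f"{{{RDF_NS}}}resource")
            if value is None:
                value = (child.text or "").strip()
            triples.append((subject, child.tag, value))
    return triples

for triple in extract_triples(RDF_XML):
    print(triple)
```

ElementTree reports property names in `{namespace}local` form, so `dc:title` appears as `{http://purl.org/dc/elements/1.0/}title`.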
                                                              HP 12
RDF Example 2


 URI:TALK  --dc:title-->    Mysteries of Metadata
 URI:TALK  --dc:creator-->  URI:AMIT

 URI:AMIT  --BIB:Aff-->     URI:LIB
 URI:AMIT  --BIB:Name-->    Amit Sheth
 URI:AMIT  --BIB:Email-->   amit@taalee.com


                                                         HP 13
RDFS (RDF Schema)



Enables resource description communities to define
(and share) vocabularies (museum, library, e-
commerce…)
Vocabulary (in RDFS) = the meaning, characteristics,
and relationships of a set of properties.




                                                       HP 14
RDF Based Web




    RDF
    Schemas



    RDF/XML
    Descriptions



    Resources

                                        HTML




                   Source:http://www.w3c.rl.ac.uk   HP 15
Dublin Core Metadata Initiative

Simple element set designed for resource description
International, interdisciplinary, W3C community
consensus
“Semantic” interface among resource description
communities (very limited form of semantics)




                  Source:www.desire.org                HP 16
Dublin Core RDF



<xml>
<?namespace href = "http://w3.org/rdf-schema" as = "RDF"?>
<?namespace href = "http://metadata.net/DC" as = "DC"?>
<RDF:Abbreviated>
<RDF:Assertion RDF:HREF = "http://www.mysite.com/mydoc.html"
DC:Title = "I've Never Metadata I've Never Liked"
DC:Creator = "Mary Crystal"
DC:Subject = "Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
</xml>




                                                             HP 17
MOF (Meta Object Facility) and XMI


MOF models metadata using the subset of UML that is
relevant to modeling metadata (class models: classes,
associations and subtyping), together with a set of rules
for mapping the elements of the MOF Core to CORBA IDL.

XML Metadata Interchange (XMI) is an extension of the
MOF into the XML space.


                                                          HP 18
NewsML

NewsML is a packaging and metadata format for news
content.
NewsML is developed by the International Press
Telecommunications Council (IPTC), a consortium of
news providers, mostly in the print or wire-service
industries.
Since it deals only with packaging and metadata,
NewsML is complementary both to news content
formats like NITF and to syndication protocols like ICE.


                                                           HP 19
NewsML…

 It can be used by news providers to combine their
 pictures, video, text, graphics and audio files in news
 output available on web sites, mobile phones, high-end
 desktops, interactive television and any other device.
 It offers an accurate, objective set of description tools
 that help qualify the information and make searches more
 precise.
 NewsML allows a range of metadata to be attached to a
 multi-media story, including a detailed computer-
 readable description of what an item is about.


                                                       HP 20
Example of the end-to-end flow -
        NewsML




1. The content provider supplies NewsML packaged media content to the
   operator. The content is categorized as current events, finance,
   sport, etc. and updated hourly.

2. The operator receives NewsML data from the content provider. The
   content server automatically pushes updated news articles to all
   news service subscribers.

3. Consumers sign up for the news service directly on the device. When
   using the news service, the user browses through the categories and
   reads the news articles. The news articles are presented in a
   continuous flow (one after the other) without end-user interaction.

  Source:http://www.mediabricks.com                                                               HP 21
PRISM

Publishing Requirements for Industry Standard
Metadata
Version: 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta et al.)
Idea: “a standard for interoperable content
description, interchange, and reuse in both
traditional and electronic publishing contexts”
Web site: http://www.prismstandard.org




                                                     HP 22
PRISM Design

Built on existing standards like Dublin Core (DC),
RDF, XML
Designed to be used in a simple, straightforward way
over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:
  Basic: DC
  Extensions: “Controlled Vocabularies”, e.g., “North
  American Industry Classification System” (NAICS)



                                                        HP 23
PRISM Example

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:description>Photograph taken at 6:00 am on Corfu with two models
    </dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>




              (Source: PRISM spec v. 1; http://www.prismstandard.org/techdev/prismspec1.asp)
                                                                                               HP 24
VoiceXML


 A language for specifying voice dialogs.
   Voice dialogs use audio prompts and text-to-speech
  (TTS) for output; touch-tone keys (DTMF) and automatic
  speech recognition (ASR) for input.

 Goal is to bring the advantages of web-based
development and content delivery to interactive
voice response applications.
 High-level voice-specific language simplifies
application development.
                   Source: http://www.voicexml.org     HP 25
Voice Based Internet
Applications




             Source: http://www.voicexml.org   HP 26
Voice XML Metadata

Voice Specific metadata
Supports syntactic interoperability
  Text data to voice data
Voice XML = XML + Voice Metadata




                                     HP 27
VoiceXML – Possible Services

 Information retrieval – News, sports, traffic, stock quotes.
 e-Transactions (e-commerce, e-tailing, etc.)
    Financial: banking, stock trading.
    Catalog browsing (generally as an adjunct to paper).
 Telephone services
     Personal voice dialing, one-number find-me services.
 Intranet – Inventory, HR services, corporate portals.
 Unification – My Whatever: personal portals, personal
agents, unified messaging.

                      Source: http://www.voicexml.org       HP 28
MPEG7

A set of description schemes and descriptors to describe
the content of multimedia data.
Provides a language to specify description schemes
A scheme for coding the description




                                                        HP 29
Application Examples for MPEG7

A few application examples are:
  Digital libraries (image catalog, musical dictionary,...)
  Multimedia directory services (e.g. yellow pages)
  Broadcast media selection (radio channel, TV
  channel,...)




                                                              HP 30
Information and Content
Exchange (ICE)

Main Goal: efficient and extensible Content Syndication
protocol for the Internet, using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette et al.
Status: latest spec version 1.1, May 2000; submitted to
W3C for review
Implementations: Vignette Syndication Server, MS
BizTalk, Kinecta Interact, …
Web Site: http://www.icestandard.org



                                                          HP 31
What is the ICE Protocol?


Syndication Protocol for communication between
Syndicators and Subscribers

Metadata to define
  roles and responsibilities of involved parties: Subscriber vs.
  Syndicator, Requestor vs. Responder, Sender vs. Receiver
  format and method of content exchange (e.g., sequenced
  packages, pull vs. push model)


                                                                   HP 32
ICE Applications

ICE vocabulary + domain vocabulary = complete
application
ICE
  establishes and manages the syndication
  delivers data
  logs events
  => content-independent metadata
industry-specific vocabulary defines the content =>
domain-specific metadata


                  Source: http://www.icestandard.org   HP 33
ICE Explained

ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator
for the delivery of content according to the delivery policy and other
parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection such
as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol
information. Examples include requests for packages, catalogs of
subscription offers, usage logs and other management information

           Sources: InternetWeek; "ICE Cookbook, version 1.0"
           http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
                                                                         HP 34
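The relationship between subscription, collection and package defined above can be sketched in a few lines of Python. This is a simplified illustration, not the ICE wire protocol: the class and field names are hypothetical, and the integer sequence number is an assumption standing in for whatever sequencing the specification actually uses.

```python
from dataclasses import dataclass, field

@dataclass
class Package:
    sequence: int                               # hypothetical sequence number
    add: dict = field(default_factory=dict)     # item-id -> content to add
    remove: list = field(default_factory=list)  # item-ids to remove

@dataclass
class Subscription:
    collection: dict = field(default_factory=dict)  # current content of the subscription
    last_sequence: int = 0

    def apply(self, package):
        """Apply a package's commands only if it is the next one in sequence."""
        if package.sequence != self.last_sequence + 1:
            return False  # out of order: leave the collection untouched
        for item_id in package.remove:
            self.collection.pop(item_id, None)
        self.collection.update(package.add)
        self.last_sequence = package.sequence
        return True

sub = Subscription()
print(sub.apply(Package(sequence=1, add={"4321": "demo.gif"})))  # True
print(sub.apply(Package(sequence=3, add={"9999": "late.gif"})))  # False
print(sub.collection)                                            # {'4321': 'demo.gif'}
```

The same `apply` step works whether packages are pushed by the syndicator or pulled by the subscriber; only who initiates the delivery changes.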
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://.../ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01"
             ice.version="1.0">
   <ice-response response-id="irp-20010515181600">
      <ice-item-group group-id="grp-8610">
         <ice-item item-id="4321"
                   subscription-element="4321"
                   name="Cartoon" filename="demo.gif"
                   content-type="application/xml">
            <!-- Content (domain-specific metadata) -->
            <comic-strip title="Looney City"
                         author="Amito Pateru"
                         copyright="Taalee Makeups"
                         pubdate="20010515">
               PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC …
               (ASCII-encoded image)
            </comic-strip>
         </ice-item>
      </ice-item-group>
   </ice-response>
</ice-payload>
XCM (eXtended Content Management)

a framework that allows customers to classify content
management offerings according to the business problems
they address. The segments of XCM are
  Content Development - Developing static content and managing the
  process of its subsequent approval, versioning, storage, and retrieval.
  Application Content Management (Vignette) - Deploying content
  dynamically to a Web site and managing that content throughout its
  online lifecycle.
  Content Delivery - Delivering content through multiple channels to
  minimize customer waiting time and improve Web site stability and
  scalability.

    Source :http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html   HP 36
XCM



                      eXtended Content Management

Content Development Management
    Content Authoring
    Digital Asset Management
    Software Configuration Management
    Document Process Management

Application Content Management
    Metadata Management
    Recombination
    Personalization

Content Delivery
    Edge Network Delivery
    Streaming Media Delivery
    Caching




                           Source :http://www.vignette.com/                      HP 37
Multiple heterogeneous metadata models with different
tag names for the same data in the same GIS domain

Kansas State

FGDC Metadata Model                         UDK Metadata Model
Theme keywords: digital line graph,         Search terms: digital line graph,
  hydrography, transportation...              hydrography, transportation...
Title: Dakota Aquifer                       Topic: Dakota Aquifer
Online linkage:                             Adress Id:
  http://gisdasc.kgs.ukans.edu/dasc/          http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector     Measuring Techniques: Vector
Horizontal Coordinate System Definition:    Co-ordinate System:
  Universal Transverse Mercator               Universal Transverse Mercator
… … … ...                                   … … … ...
                                                                                               HP 38
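One simple way to reconcile two models like these is a per-model mapping from tag names to a shared vocabulary. The Python sketch below uses the FGDC and UDK tag names from the slide; the common field names (`keywords`, `title`, `url`, and so on) and the `normalize` helper are assumptions made for illustration.

```python
# Tag names come from the slide; the common vocabulary on the right is hypothetical.
FGDC_TO_COMMON = {
    "Theme keywords": "keywords",
    "Title": "title",
    "Online linkage": "url",
    "Direct Spatial Reference Method": "spatial_representation",
    "Horizontal Coordinate System Definition": "coordinate_system",
}
UDK_TO_COMMON = {
    "Search terms": "keywords",
    "Topic": "title",
    "Adress Id": "url",
    "Measuring Techniques": "spatial_representation",
    "Co-ordinate System": "coordinate_system",
}

def normalize(record, mapping):
    """Rename a record's tags to the common vocabulary, dropping unmapped tags."""
    return {mapping[tag]: value for tag, value in record.items() if tag in mapping}

fgdc_record = {
    "Title": "Dakota Aquifer",
    "Online linkage": "http://gisdasc.kgs.ukans.edu/dasc/",
    "Direct Spatial Reference Method": "Vector",
}
udk_record = {
    "Topic": "Dakota Aquifer",
    "Adress Id": "http://gisdasc.kgs.ukans.edu/dasc/",
    "Measuring Techniques": "Vector",
}

print(normalize(fgdc_record, FGDC_TO_COMMON) == normalize(udk_record, UDK_TO_COMMON))  # True
```

A renaming table like this only bridges differences in tag names; differences in meaning between the two models would need a shared domain model or ontology.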
Different views of Metadata

Metadata, as seen from each view:

  Domain Independent Specifications: RDF
  Frameworks/Infrastructures: XCM
  Application Specific: ICE
  Media Specific: MPEG7, VoiceXML
  Domain Specific: NewsML, FGDC/UDK

                                                              HP 39
Creating and Serving Metadata to
    Power the Life-cycle of Content


    Taalee Infrastructure Services: Produce/Aggregate, Catalog/Index, Integrate/Syndicate
    Taalee Content Applications: Personalize, Interactive Marketing

    Where is the content? Whose is it?
    What other content is it related to?
    What is the right content for this user?
    What is the best way to monetize this interaction?

    Taalee Semantic MetaBase

    Broadcast, Wireline, Wireless, Interactive TV

                                                                                                HP 40
Taalee’s Intelligent Content Process




                                       HP 41
Metadata Creation and
Semanticization



• Automatic Content
 Classification/Categorization
• Metadata Creation/Extraction:
 Types of metadata created



           Semantic Engine and WorldModel are trademarks of Taalee, Inc.
           Metadata Extraction is a patented technology of Taalee, Inc.
                                                                           HP 42
Forms/Types/Ingest of Content


Sources: Web Sites, Content Feeds and Private
Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text,
Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull),
Repository/Database (usually pull)



                                                  HP 43
Content Handling/Ingest

Infrastructure/Exchange
  Feed Handlers
  Crawlers/Screen Scrapers/Bots
  Software Agents

Centralized, Distributed, Mobile/Migratory




                                             HP 44
Information Extraction for Metadata Creation


Sources:
  Documents (Nexis, UPI, AP, ...)
  Digital Videos, Digital Images, Digital Audios, Digital Maps
  Data Stores
  Global/Enterprise Web Repositories

Sources --> EXTRACTORS --> METADATA
                                                                                    HP 45
Extracting a Text Document:
        Syntactic approach



Slide callouts: LAYOUT; Date => day month int ‘,’ int

                        INCIDENT MANAGEMENT SITUATION REPORT

                                 Friday August 1, 1997 - 0530 MDT

                                  NATIONAL PREPAREDNESS LEVEL II

CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been
staffed for structure protection.

SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr
The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The
fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is
35% contained, while protection of the historic cabin continues.

CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is
assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. The fire is
contained. Major areas of heat have been mopped-up. All crews and overhead will mop-up where the fire
burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend,
                                                                                                       HP 46
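The slide's syntactic rule, Date => day month int ',' int, can be approximated with a regular expression. The Python sketch below is one possible reading of that rule, assuming "day" means an English weekday name and "month" an English month name; the pattern and the `extract_dates` helper are illustrative, not the extractor used in the workshop.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# Date => day month int ',' int
DATE_RULE = re.compile(
    rf"\b(?P<day>{DAYS})\s+(?P<month>{MONTHS})\s+(?P<dom>\d{{1,2}}),\s*(?P<year>\d{{4}})\b"
)

def extract_dates(text):
    """Return one dict per date matching the rule, in order of appearance."""
    return [m.groupdict() for m in DATE_RULE.finditer(text)]

header = "Friday August 1, 1997 - 0530 MDT"
print(extract_dates(header))
# [{'day': 'Friday', 'month': 'August', 'dom': '1', 'year': '1997'}]
```

A purely syntactic rule like this finds the date but says nothing about what the date refers to, which is the limitation the later slides on semantic metadata address.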
Traditional Text Categorization

Customer Training Set  -->  Statistical/AI Techniques

Customer Article Feed (Article 4715)  --feed-->  Classify  -->  Place in a taxonomy  -->  Routing/Distribution

Classification of Article 4715
  Standard Metadata
    Feed Source: iSyndicate
    Posted Date: 11/20/2000
Taalee’s Categorization & Automatic Metadata Creation

Taalee Training Set, Customer Training Set  -->  Knowledge-base & Statistical/AI Techniques

Article Feed (Article 4715)  --feed-->  Classify  -->  Place in a taxonomy  -->  Catalog Metadata

Classification of Article 4715: Article 4715 Metadata
  Standard metadata
    Feed Source: iSyndicate
    Posted Date: 11/20/2000
  Semantic metadata
    Company Name: France Telecom, Equant
    Ticker Symbol: FTE, ENT
    Exchange: NYSE
    Topic: Company News

Automated Content Enrichment (ACE)
  FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
  ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
  NYSE: Member Companies, Market News, IPOs

Content Manager
Taalee Enterprise Customization Suite

Precise syndication/filtering
Routing/Distribution
Map to another taxonomy
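
To make the difference between standard and semantic metadata concrete, here is a minimal Python sketch of knowledge-base-driven tagging. The knowledge base entries mirror the values on the slide (FTE, ENT, NYSE); the sample article sentence, the `KNOWLEDGE_BASE` structure and the `create_metadata` function are hypothetical and are not Taalee's actual method.

```python
# Tiny stand-in knowledge base; values are taken from the slide's example.
KNOWLEDGE_BASE = {
    "France Telecom": {"ticker": "FTE", "exchange": "NYSE"},
    "Equant": {"ticker": "ENT", "exchange": "NYSE"},
}

def create_metadata(article_text, standard):
    """Combine feed-supplied standard metadata with semantic metadata
    looked up for every known company named in the text."""
    companies = [name for name in KNOWLEDGE_BASE if name in article_text]
    semantic = {
        "Company Name": companies,
        "Ticker Symbol": [KNOWLEDGE_BASE[c]["ticker"] for c in companies],
        "Exchange": sorted({KNOWLEDGE_BASE[c]["exchange"] for c in companies}),
    }
    return {"standard": standard, "semantic": semantic}

# Made-up sentence standing in for Article 4715.
article = "France Telecom said it will take a majority stake in Equant."
standard = {"Feed Source": "iSyndicate", "Posted Date": "11/20/2000"}

metadata = create_metadata(article, standard)
print(metadata["semantic"]["Ticker Symbol"])  # ['FTE', 'ENT']
print(metadata["semantic"]["Exchange"])       # ['NYSE']
```

The ticker symbols and exchange never appear in the article text; they come from the knowledge base, which is what a purely statistical classifier cannot supply.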
Automatic Categorization & Metadata
      Tagging (unstructured text/transcript of A/V)


Slide callouts: Video Segment with Associated Text; Auto Categorization;
Segment Description; Semantic Metadata

         ABSOLUTE CONTROL OF THE SENATE IS
         STILL IN QUESTION. AS OF TONIGHT, THE
         REPUBLICANS HAVE 50 SENATE SEATS AND
         THE DEMOCRATS 49. IN WASHINGTON STATE,
         THE SENATE RACE REMAINS TOO CLOSE TO
         CALL. IF THE DEMOCRATIC CHALLENGER
         UNSEATS THE REPUBLICAN INCUMBENT THE
         SENATE WILL BE EVENLY DIVIDED. IN
         MISSOURI, REPUBLICAN SENATOR JOHN
         ASHCROFT SAYS HE WILL NOT CHALLENGE
         HIS LOSS TO GOVERNOR MEL CARNAHAN
         WHO DIED IN A CRASH THREE WEEKS AGO.
         GOVERNOR CARNAHAN'S WIFE IS EXPECTED
         TO TAKE HIS PLACE. IN THE HIGHEST PROFILE
         SENATE EVENT OF THE NIGHT, HILLARY
         CLINTON WON THE NEW YORK SENATE SEAT.
         SHE IS THE FIRST FIRST LADY TO RUN MUCH
         LESS WIN.

                                                                                   HP 49
Automatic Categorization & Metadata
Tagging (Web page)




Video with Editorialized Text on the Web  -->  Auto Categorization  -->  Semantic Metadata

                                                       HP 50
Automatic Categorization & Metadata
      Tagging (Feed)




Text From Bloomberg  -->  Auto Categorization  -->  Semantic Metadata



                                                     HP 51
Taalee Extraction and Knowledgebase
   Enhancement
Web Page                Enhanced Metadata Asset




           Extraction
             Agent




                                                  HP 52
Basis for Semantics

A. Facts/Concepts/Terms/Entities
  Dictionary, Thesaurus, Reference Data,
  Vocabulary
B. Facts with Relationships
  Taxonomy/(Categories), Ontology
  Domain Modeling (e.g., Golf = golfer, tournament name, golf
  course, event)
  Knowledge Base


                                                           HP 53
Basis for Semantics

C. Reasoning/Inference
  (Statistical)
  (Information Retrieval)
  Statistical Learning/AI (Bayesian, Neural
  Networks, HMM,…)
  Logic Based (Description Logic)
  Natural Language/Grammar (part of speech,..)



                                                 HP 54
Alternatives for Metadata
  Extraction

                          Statistical methods/Cluster Analysis

                          Learning/AI and Collab. Filtering

Word or Phrase            Reference data/Concept-terms/
                          Dictionary/Thesaurus
                          By topic/industry/subject/domain
                          Ontologies/Domain Models

          deeper          KnowledgeBase
          understanding   By Entities and Relationships
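
The reference-data alternative (dictionary/thesaurus lookup) is the simplest of these to sketch. The example below assumes a tiny hand-made term table; the entries and the function name are hypothetical, and a real system would load its terms from a maintained vocabulary.

```python
import re

# Hypothetical reference data: surface term -> (canonical name, entity type).
REFERENCE_DATA = {
    "cisco": ("Cisco Systems", "Company"),
    "cisco systems": ("Cisco Systems", "Company"),
    "csco": ("Cisco Systems", "Company"),
    "hillary clinton": ("Hillary Clinton", "Person"),
    "new york": ("New York", "Location"),
}


def extract_entities(text: str) -> list:
    """Return (canonical name, type) pairs for reference terms found in text.

    Longer terms are tried first so that "cisco systems" wins over "cisco".
    """
    found = []
    remaining = text.lower()
    for term in sorted(REFERENCE_DATA, key=len, reverse=True):
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, remaining):
            entity = REFERENCE_DATA[term]
            if entity not in found:
                found.append(entity)
            # Blank out the match so shorter terms do not re-match inside it.
            remaining = re.sub(pattern, " ", remaining)
    return found


tags = extract_entities("HILLARY CLINTON WON THE NEW YORK SENATE SEAT.")
print(tags)
```

Lookup of this kind only recognizes terms that are literally present; the deeper alternatives on the slide (ontologies, knowledgebase) are what allow metadata to be added that the text never mentions.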

                                                              HP 55
Open Directory Project (ODP):
Classification/Taxonomy & Directory




                                      HP 56
Ontology


  Standardize meaning, description,
  representation of involved attributes
  Capture the semantics involved via domain
  characteristics
  Allow knowledge sharing and reuse
  (Ontological Commitment)



                                              HP 57
Ontology


   Description includes
     Attributes
     Domain Rules
     Functional Dependencies
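
The three parts of an ontology description can be sketched as a small data structure with a checker. Everything named here (the LandSite concept, its attributes, the rule and the dependency) is a hypothetical example chosen for illustration, not taken from a real ontology.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptDescription:
    name: str
    attributes: set
    # Domain rules: label -> predicate over one instance (a dict).
    domain_rules: dict = field(default_factory=dict)
    # Functional dependencies: (determinant attribute, dependent attribute).
    functional_dependencies: list = field(default_factory=list)

    def violations(self, instances: list) -> list:
        """List every way the given instances break this description."""
        problems = []
        for i, inst in enumerate(instances):
            unknown = set(inst) - self.attributes
            if unknown:
                problems.append(f"instance {i}: unknown attributes {sorted(unknown)}")
            for label, rule in self.domain_rules.items():
                if not rule(inst):
                    problems.append(f"instance {i}: breaks rule '{label}'")
        for lhs, rhs in self.functional_dependencies:
            seen = {}
            for i, inst in enumerate(instances):
                key, value = inst.get(lhs), inst.get(rhs)
                if key in seen and seen[key] != value:
                    problems.append(f"instance {i}: {lhs} -> {rhs} violated")
                seen.setdefault(key, value)
        return problems


land_site = ConceptDescription(
    name="LandSite",
    attributes={"site_id", "zoning", "area_hectares"},
    domain_rules={"area is positive": lambda inst: inst.get("area_hectares", 0) > 0},
    functional_dependencies=[("site_id", "zoning")],
)

sites = [
    {"site_id": 1, "zoning": "RESIDENTIAL", "area_hectares": 2.5},
    {"site_id": 1, "zoning": "INDUSTRIAL", "area_hectares": 4.0},
    {"site_id": 2, "zoning": "RURAL", "area_hectares": -1.0},
]
problems = land_site.violations(sites)
print(problems)
```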




                               HP 58
An Ontology




              HP 59
Example: Interrelated ontologies
                                                                                                 RECREATIONAL            MILITARY
                                                          LANDFILL
                                LAND                        SITE
                                 (SITE)
    CULTIVATED
      AREA
                                                                                                        LAND                  AGRICULTURAL
             GREENLAND                              ZONING                                               USE
               AREA                                                              COMMERCIAL
                                     LAND
                                     BANK



                                                                                          INDUSTRIAL                        RESIDENTIAL
                                                WASTE                                                    RURAL
                                               DISPOSAL


                                                                                                                STORM
                          SOLID                                      SEWAGE                  FLOOD
HAZARDOUS                                                                                                                           TSUNAMI

                                            RESOURCE REC.                        FIRE
            LANDFILL                                                                                                                      causes

                                                                                                       NATURAL
                           RECYCLING                                                                                                VOLCANO
                                                                                                       DISASTER
                                                                              AVALANCHE
                                                washing
            shredding                                                                                                                 causes
                                                                                                                causes
                         magnetic     screening
                        separation                                                                                         LANDSLIDE
                                                                                               EARTHQUAKE
                                                                                                            causes
Large Vocabularies/
Taxonomies/Ontologies
 WordNet
 The Medical Subject Headings (MeSH): NLM's
 controlled vocabulary used for indexing articles, for
 cataloging books and other holdings, and for searching
 MeSH-indexed databases, including MEDLINE. MeSH
 terminology provides a consistent way to retrieve
 information that may use different terminology for the
 same concepts. Year 2000 MeSH includes more than
 19,000 main headings, 110,000 Supplementary Concept
 Records (formerly Supplementary Chemical Records),
 and an entry vocabulary of over 300,000 terms.


                                                          HP 61
Metadata enabled
Applications




Metadata Usage:
Impact on Search & Query processing




     traditional queries based on keywords
     attribute based queries
     content-based queries




                                             HP 63
Oingo.com

Oingo Ontology – ODP based(?), the database of millions
of concepts and relationships that powers Oingo's
semantic technology
Oingo Seek - the database of millions of concepts and
relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that
uncovers the essential meaning of information by sensing
concepts and context
Oingo Lingua - the language of meaning used to state
intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages.


                                                       HP 64
Use of Categories for Search




     After 3 or 4 clicks




                               HP 65
Metadata is the basis of making
Content Intelligent


  Precisely what the user asked for
  Closely-related, high-value information beyond what
  was requested
  Ability to explore any dimension around the immediate
  point of interest
                 Intelligent content helps the user
 “think” about and fulfill their information needs with less effort.

                     Intelligent content can be
       more effectively managed, packaged and distributed
                                                                       HP 66
Metadata and Intelligent Content
Taalee makes content more “intelligent” through automatic analysis of every
individual asset to generate a catalog containing:
    • Context of the Content
    • Semantic Metadata describing entities (e.g., Company, Industry), and
    • Relationships (semantic associations) among all entities

Based on a “Semantic” or “domain” model describing how the user thinks
about the subject matter, supported by a knowledgebase.

“Normal” Content can only be “found” if the user enters a keyword that exists
within it
  + Adding related metadata and relationships dramatically increases the ability
    to automatically access needed content via multiple dimensions
  = Intelligent Content
                                                                                  HP 67
More than metadata


Taalee makes content more “intelligent” through automatic analysis of
   every individual content item to create:
       Context of the Content
       Semantic Metadata describing entities (e.g., Company, Industry), and
       Relationships (semantic associations) among all entities

Based on a “Semantic” or “domain” model describing how the user
  thinks about the subject matter, supported by a knowledgebase.




                                                                     HP 68
Metadata & Search

Metadata can improve search significantly, but
metadata enables much more than search
Alternatives for improving search: clustering, link
and other analysis (e.g., Google’s Link Flux
analysis), classification as context, ontologies,
metadata, knowledgebases …




                                                  HP 69
Metadata Usage: Keyword, Attribute
and Content Based Access




                                     HP 70
Keyword Search vs Attribute
       Search with Semantic metadata

Metadata from Typical Cataloging of Football Assets
(Virage Search on football touchdown)

           Brian Griese Interview Part Four
           Brian Griese talks about the
           first touchdown he ever threw.
           URL: http://cbs.sportsline...

           Jimmy Smith Interview Part Seven
           Jimmy Smith explains his
           philosophy on showboating.
           URL: http://cbs.sportsline...

Taalee Metadata on Football Assets
(Rich Media Reference Page: Baltimore 31, Pit 24, http://www.nfl.com)

           Qadry Ismail and Tony Banks hook up for their third long
           touchdown, this time on a 76-yarder to extend the Ravens’
           lead to 31-24 in the third quarter.
                League: Professional
                 Teams: Ravens, Steelers
                 Score: Bal 31, Pit 24
               Players: Qadry Ismail, Tony Banks
                 Event: Touchdown
           Produced by: NFL.com
           Posted date: 2/02/2000
                                                                                                         HP 71
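
The contrast on this slide can be sketched in a few lines. The asset records below reuse the titles and fields shown on the slide, with the clip description shortened; the two search functions are illustrative assumptions, not the Virage or Taalee implementations.

```python
ASSETS = [
    {
        "title": "Brian Griese Interview Part Four",
        "text": "Brian Griese talks about the first touchdown he ever threw.",
        "metadata": {},
    },
    {
        "title": "Jimmy Smith Interview Part Seven",
        "text": "Jimmy Smith explains his philosophy on showboating.",
        "metadata": {},
    },
    {
        "title": "Baltimore 31, Pit 24",
        "text": "Ismail and Banks hook up for their third long touchdown.",
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Event": "Touchdown",
        },
    },
]


def keyword_search(assets, keyword):
    """Syntactic search: match the keyword anywhere in title or text."""
    needle = keyword.lower()
    return [a["title"] for a in assets
            if needle in (a["title"] + " " + a["text"]).lower()]


def attribute_search(assets, **criteria):
    """Attribute search: every requested field must hold the wanted value."""
    results = []
    for asset in assets:
        metadata = asset["metadata"]
        for field_name, wanted in criteria.items():
            value = metadata.get(field_name)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                break
        else:
            results.append(asset["title"])
    return results


print(keyword_search(ASSETS, "touchdown"))
print(attribute_search(ASSETS, Event="Touchdown", Teams="Ravens"))
```

The keyword query also returns the interview, which merely talks about a touchdown; the attribute query returns only the asset whose event is a touchdown involving the Ravens.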
Taalee’s Semantic Search

       Highly customizable, precise and freshest A/V search




                                                                                Delightful, relevant information,
                                                                                exceptional targeting opportunity


Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
                                                                                                            HP 72
What can a context do?




          Creating a Web of
         related information
HP 73
Taalee Directory

                    Georgia Bulldogs




System recognizes ENTITY & CATEGORY
Taalee Directory

Careless whisper
Semantic Relationships




                         HP 76
Metadata Application Example




  Semantic Applications for highly relevant
  and fresh content:
  Personalization and
  Targeting/interactive marketing




                Please contact Taalee for live demonstrations



                                                                HP 77
Personalized Directory

                                                                Change
                                                                Context




Obtain a whole universe of information (that you may not even
have thought of) about some entities that have always been of
interest to you.
Please enter such semantic keywords below.
Personalized Queries & Hot Topics
                           PERSONALIZATION

Personalized Queries

  1. My Stock Portfolio
       Microsoft suffers serious hack attack
       Cisco Systems Inc
       Analyst Safa Rashtchy on Yahoo!
       PeopleSoft, Inc
       AT&T Corp.
       more…
  2. My Football Fantasy Team
       Gators' Spurrier ready for 'big' game
       Tech's Vick looks to become complete QB
       Bucs excited about Hamilton
       Jasper Sanks rumbles into the end zone…
       Edwards explains reasons for leaving BYU
       more…
  3. Julia Roberts Collection
       Movie Trailer: "Notting Hill"
       Trailer - Runaway Bride
       Patrick
       Movie Trailer: "Stepmom"
       Conspiracy Theory
       more…
  4. Pink Floyd Collection
       Set the Controls for the Heart of the Sun…
       Wish You Were Here
       Round And Around
       Keep Talking
       The Post War Dream
       more…

HOT Topics!!!

  1. Election 2000
       Video: Explaining the electoral map
       Race for White House hots up
       Seniors Give Gore Florida Edge
       more…
  2. Middle East Peace Conflict
       More die as Israel steps up security
       Israel braces for suicide bombs
       Pentagon probes Cole's security
       more…
  3. Napster Controversy
       The Brain Behind Napster
       Napster Lawsuit
       Creative Nomad II
       more…
Metadata: Targeting




                      HP 80
Semantic/Interactive Targeting




                  Buy Al Pacino Videos
                  Buy Russell Crowe Videos
                  Buy Christopher Plummer Videos
                  Buy Diane Venora Videos
                  Buy Philip Baker Hall Videos
                  Buy The Insider Video

Precisely targeted through the use of Structured Metadata and integration from multiple sources
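
A rough sketch of how such offers could be derived: the metadata of the asset being viewed names a movie, and a second source supplies the people associated with it. The `CAST_KB` table and the function name are hypothetical; the names are the ones shown on the slide.

```python
# Hypothetical knowledgebase from a second source: movie title -> credited people.
CAST_KB = {
    "The Insider": [
        "Al Pacino",
        "Russell Crowe",
        "Christopher Plummer",
        "Diane Venora",
        "Philip Baker Hall",
    ],
}


def targeted_offers(asset_metadata: dict, cast_kb: dict) -> list:
    """Build offer links from the asset's structured metadata plus the knowledgebase."""
    movie = asset_metadata.get("Movie")
    if movie is None:
        return []  # no structured movie field, so nothing to target on
    offers = [f"Buy {person} Videos" for person in cast_kb.get(movie, [])]
    offers.append(f"Buy {movie} Video")
    return offers


offers = targeted_offers({"Movie": "The Insider"}, CAST_KB)
for line in offers:
    print(line)
```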
Web: Extreme Personalization

[Diagram labels: Realtime Feeds; Web sites and Pages; Databases; Content;
Time-Shifted Content Aggregator; Semantic Engine™; Structured, Hi-Quality
Semantic Metabase; Interests, Preferences; Personalized Content]
                                                      HP 82
Application of Semantic Metadata and
Automatic Content Enrichment



 MyMedia
 $   MyStocks
     News
     Sports
     Music

User has already completed Web-based registration and personalization at
Voquette’s Enterprise Customer site. User’s “Wireless Home page” shows the
categories for his interests. There is an alert (new content) for his stock
and sports categories.

                                                    HP 83
Application of Semantic Metadata and
Automatic Content Enrichment


 MyMedia        My Stocks
 $   MyStocks   CSCO
     News       NT
     Sports     IBM
     Music      Market

Clicking on MyStocks brings down user’s Personal Portfolio list. The user
wants to see news items about Cisco (see next slide).

Search at the bottom is a semantic search that understands the financial
domain, and the knowledge of user’s portfolio. Typically search can be done
by typing one word or selecting from a dynamic, personalized menu.

                                                               HP 84
Application of Semantic Metadata and
Automatic Content Enrichment
 MyMedia        My Stocks   CSCO
 $   MyStocks   CSCO        Analyst Call
     News       NT          Conf Call
     Sports     IBM         Earnings
     Music      Market

Different types of recent audio content about Cisco are available. The user
clicks to see a listing of Analyst Calls on Cisco (next slide).

Icons at the bottom of the screen enable contextually relevant functions:
listen, set alert on story, add to playlist.


                                                                    HP 85
Application of Semantic Metadata and
 Automatic Content Enrichment


 MyMedia        My Stocks   CSCO           CSCO Analysis
 $   MyStocks   CSCO        Analyst Call   11/08 ON24 Payne
     News       NT          Conf Call      11/07 ON24 H&Q
     Sports     IBM         Earnings       11/06 CBS Langlesis
     Music      Market

Clicking on the link for Cisco Analyst Calls displays a listing
sorted by date. Semantic filtering uses just the right metadata to
meet screen and other constraints. E.g., Analyst Call focuses on
the source and analyst name or company. The icons denote
additional metadata, such as “Strong Buy” by H&Q Analyst.
                                                                       HP 86
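
One way to picture semantic filtering for a small screen: each content type has a priority-ordered list of metadata fields, and fields are shown until the character budget runs out. The profile table, field names and budget values below are assumptions made for this sketch.

```python
# Hypothetical display profiles: content type -> fields in priority order.
DISPLAY_FIELDS = {
    "Analyst Call": ["date", "source", "analyst"],
    "Earnings": ["date", "company", "quarter"],
}
DEFAULT_FIELDS = ["date", "title"]


def filter_for_screen(item: dict, max_chars: int) -> str:
    """Return the highest-priority metadata values that fit in max_chars."""
    parts = []
    for name in DISPLAY_FIELDS.get(item["type"], DEFAULT_FIELDS):
        value = item.get(name)
        if value is None:
            continue
        candidate = " ".join(parts + [str(value)])
        if len(candidate) > max_chars:
            break  # lower-priority fields are dropped once the budget is hit
        parts.append(str(value))
    return " ".join(parts)


call = {
    "type": "Analyst Call",
    "date": "11/07",
    "source": "ON24",
    "analyst": "H&Q",
    "rating": "Strong Buy",  # kept as extra metadata, e.g. shown as an icon
}
print(filter_for_screen(call, max_chars=20))
print(filter_for_screen(call, max_chars=10))
```

The full metadata record stays the same for every device; only the selection shown changes with the content type and the screen constraint.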
iTV: Taalee’s Extreme Personalization

[Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content, “Programs”;
Semantic Engine™; Structured, Hi-Quality Semantic Metabase; Meta-Data Tagged
Content; Immediate Interests, Preferences; Personalized Content Capsules,
Redirects and Programming]
                                                          HP 87
Metadata for Automatic Content
      Enrichment
  Interactive Television
This screen is customizable with interactivity features, using metadata such
as whether there is a new Conference Call video on CSCO.

Part of the screen can be automatically customized to show conference call
specific information, including transcript, participation, etc., all of which
are relevant metadata.

The Conference Call itself can have embedded metadata to support
personalization and interactivity.

This segment has embedded or referenced metadata that is used by the
personalization application to show only the stocks that the user is
interested in.

                                                                                                             HP 88
Metadata in Enterprise Apps
Collection:          Sony Network Content, Affiliate Feeds, Public Sources
Processing:          Categorize, Catalog, Integrate
Rich Data Metabase
Production Support:  Filter, Search, Consolidate, Personalize, Archive,
                     Licensing, Syndication
                                                          HP 89
Customize: Page Settings | Content | Layout | Color

-- Breaking News for 11/30/2000 --
     Gore Demands That Recount Restart (9:40 PM)
     Gore Says Fla. Can't Name Electors (4:50 PM)
     Bush Meets Colin Powell at Ranch (1:22 PM)
     Market Tumbles on Earnings Warning (9:27 AM)
     Barak Outlines His Peace Plan (6:30 AM)

Video: Sixty Die In Nigeria Blast
     Produced by: Euronews
     Posted Date: 11/30/2000
     Event : Election 2000
     Location : Tallahassee, Florida, USA
     People : Al Gore, George W. Bush

A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near
Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes
engulfed in the inferno.

More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the
central business district of Lagos across a lagoon.

"At least 60 people died in this needless fire," senior local official Karimu Alabi said.

Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the
pipeline, were joined by other firemen from construction company Julius Berger in
battling the blaze.

Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the
line of the oil leak, ravaging a cluster of huts and log houses.

At about the same time, a second fire razed Makoko shantytown where thousands of
fishermen and their families live in wood cabins erected on stilts in the lagoon near
Lagos University.

Residents said fishermen from Makoko had been scavenging for gasoline from the leaking
pipeline and storing it in cans in the wooden huts for days. Many victims of the
Ebute-Oko fire were



                        • Greatly enhances news-room productivity and time-to-market


                                  • Value-add for production, broadcast & syndication


• Taalee’s semantic metadata enables powerful access to content used by Enterprise’s customers
                                                                                                                                                              HP 90
Description
     Produced by : CNN
     Posted Date : 12/07/2000
     Reporter    : David Lewis
     Event       : Election 2000
     Location    : Tallahassee, Florida, USA
     People      : Al Gore

TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon
Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome
of two trials set on the same day in Leon County, Florida, may offer Gore his best hope
for the presidency.

Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in
that heavily Republican jurisdiction -- a move that would give Gore a lead of up to
5,000 votes statewide.

Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because
they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out
voter identification numbers on 2,126 incomplete absentee ballot applications sent in by
GOP voters, while refusing to allow Democratic workers to do the same thing for
Democratic voters.

The GOP says that suit, and one similar to it from Martin County, demonstrates
Democratic Party politics at its most desperate. Gore is not a party to either of those
lawsuits. On Tuesday, the judge in the

Clip listing (duration - date - source):
     (1.33) – 12/06/00 - ABC
     (2.53) - 12/06/00 - CBS
     (5.16) - 12/06/00 - ABC
     (2.46) - 12/06/00 - FOX
     (1.33) - 12/06/00 - NBC
     (5.33) - 12/06/00
     (1.33) - 12/06/00 - CBS
     (3.57) - 12/06/00 - CBS
     (4.27) - 12/06/00 - ABC
     (3.44) - 12/06/00 - FOX
     (7.24) - 12/06/00 - CBS

-- Breaking News --
     Gore Demands That Recount Restart     (1.33) - 12/06/00 - ABC
     Gore Says Fla. Can't Name Electors    (2.33) - 12/06/00 - CBS
     Bush Meets Colin Powell at Ranch      (3.12) - 12/06/00 - NNS
     Market Tumbles on Earnings Warning    (0.32) - 12/06/00 - CBS
     Barak Outlines His Peace Plan         (1.33) - 12/06/00 - CBS
                                                                                                                               HP 91
Metadata’s role in emerging
                                           iTV infrastructure
[Diagram labels: Video; MPEG Encoder (Create Scene Description Tree); MPEG-2/4/7;
MPEG Decoder (Retrieve Scene Description Track); Enhanced Digital Cable;
GREAT USER EXPERIENCE]

Scene Description Tree: Node = AVO Object
“Cisco Systems” Node; Taalee Semantic Engine; “Cisco Systems” Metadata-rich
Value-added Node; Enhanced XML Description

Object Content Information (OCI):
     Produced by: Fox Sports
     Creation Date: 12/05/2000
     League: NFL
     Teams: Seattle Seahawks, Atlanta Falcons
     Players: Jon Kitna
     Coaches: Mike Holmgren, Dan Reeves
     Location: Atlanta

Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
License metadata decoder and semantic applications to device makers

                                                                                                                                HP 92
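
As a rough illustration of an XML description attached to a node, the sketch below serializes a few of the OCI fields with Python's standard library. The element and attribute names ("Description", "Field", "name") are assumptions made here; they are not the MPEG-7 schema or Taalee's format.

```python
import xml.etree.ElementTree as ET

# A subset of the Object Content Information shown on the slide.
oci = {
    "Produced by": "Fox Sports",
    "League": "NFL",
    "Teams": ["Seattle Seahawks", "Atlanta Falcons"],
    "Location": "Atlanta",
}


def to_xml(metadata: dict) -> str:
    """Serialize metadata as <Description><Field name="...">value</Field>...</Description>."""
    root = ET.Element("Description")
    for name, value in metadata.items():
        values = value if isinstance(value, list) else [value]
        for v in values:  # multi-valued fields become repeated elements
            field_el = ET.SubElement(root, "Field", {"name": name})
            field_el.text = str(v)
    return ET.tostring(root, encoding="unicode")


xml_text = to_xml(oci)
print(xml_text)

# A decoder-side application can read the fields back by name.
parsed = ET.fromstring(xml_text)
teams = [f.text for f in parsed.findall("Field[@name='Teams']")]
print(teams)
```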
Intelligent Metadata Creation



                             Usage


            Metadata for Intelligent Content
  Content which does contain the words the user asked for (Extractor Agents)
+ Content which does not contain the words the user asked for, but is about what he asked for (Value-added Metadata)
+ Content the user did not think to ask for, but which he needs to know (Semantic Associations)

Intelligent Content via Value-Added Metadata

Value-added Metadata
 Traditional methods rely solely on (syntactic) indexing of keywords to enable
 users to access content
     • If a keyword is not in the content, it cannot be found.
     • The burden is on the user to think of and ask for the “right” keyword.

  For example: If a story is about “Roger Clemens” but does not contain the
 words “New York Yankees”, that story cannot and will not be found if the user
 searches for “New York Yankees” or “Yankees”.

     Understanding of the content is needed to create new metadata.

 Taalee understands Roger Clemens is a PERSON who Plays a SPORT called
             Baseball for a TEAM from New York called the Yankees.
Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY)
         to add missing metadata to describe content more completely.
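
The enrichment step described on this slide can be sketched roughly as follows. This is only an illustration, not Taalee's implementation: the `KNOWLEDGE_BASE` dictionary, the relation names, and the functions `add_value_added_metadata` and `search` are all hypothetical, and the league entry is an assumption added to make the example complete.

```python
# Hypothetical miniature knowledge base: (subject, relation) -> object.
# The relation names are invented for this sketch.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "plays_sport"): "Baseball",
    ("Roger Clemens", "plays_for"): "New York Yankees",
    ("New York Yankees", "member_of_league"): "American League",  # assumed entry
}


def add_value_added_metadata(extracted):
    """Return a copy of the extracted metadata with inferred fields added."""
    enriched = {field: list(values) for field, values in extracted.items()}
    for player in extracted.get("Players", []):
        sport = KNOWLEDGE_BASE.get((player, "plays_sport"))
        team = KNOWLEDGE_BASE.get((player, "plays_for"))
        if sport and sport not in enriched.setdefault("Sport", []):
            enriched["Sport"].append(sport)
        if team:
            if team not in enriched.setdefault("Teams", []):
                enriched["Teams"].append(team)
            league = KNOWLEDGE_BASE.get((team, "member_of_league"))
            if league and league not in enriched.setdefault("League", []):
                enriched["League"].append(league)
    return enriched


def search(assets, field, value):
    """Plain field lookup: an asset matches only if the value is in its metadata."""
    return [asset for asset in assets if value in asset.get(field, [])]


# A story that names the player but never mentions the team.
story = {"Title": ["Clemens story"], "Players": ["Roger Clemens"]}
enriched_story = add_value_added_metadata(story)
```

With only the extracted metadata, `search([story], "Teams", "New York Yankees")` finds nothing; after enrichment the same search returns the story, which is the behaviour the slide describes.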
Guided Demo for Value-Added Metadata – Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson.
• Click on the first result (titled “Week 3 Top10: Anderson TD Run”) and view the metadata
 on the following RMR page
• Here is what you see:
     Produced by: NFL.com           Posted Date: 9/20/2000         League : NFL
     Teams : Atlanta Falcons        Players : Jamal Anderson

• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story, and locate this story with the
 heading “Week 3 top 10: Anderson TD run”
• Verify that neither Team=Atlanta Falcons nor League=NFL is present in the source content.
• Taalee attached this value-added metadata to this asset’s existing metadata so that a user
 searching for Atlanta Falcons will find this story on Jamal Anderson, who plays for the
 Atlanta Falcons.
Guided Demo for Value-Added Metadata – Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled “I want out!”) & view the metadata on the following RMR page
• Here is what you see:
     Produced by: ESPN               Posted Date: 3/03/2001          League : National League
     Teams : Los Angeles Dodgers     Players : Gary Sheffield

• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story, and locate this story with the
 heading “I want out!”
• Verify that neither Team=Los Angeles Dodgers nor League=National League is present in
 the source content.
• Taalee attached this value-added metadata to this asset’s existing metadata so that a user
 searching for Los Angeles Dodgers will find this story on Gary Sheffield, who plays for the
 Los Angeles Dodgers.
Example 1 – Snapshots (“Jamal Anderson”)




  Search for ‘Jamal
Anderson’ in ‘Football’


                                                         Click on first result for
                                                            Jamal Anderson




                           View the original source
                            HTML page. Verify that
                          the source page contains
                          no mention of Team name
                           and League name. They
                             were Taalee’s value-
                          additions to the metadata
                          to facilitate easier search.




                                                         View metadata. Note that
                                                         Team name and League
                                                          name are also included
                                                             in the metadata

Example 2 – Snapshots (“Gary Sheffield”)




  Search for ‘Gary
Sheffield’ in ‘Baseball’


                                                          Click on first result for
                                                              Gary Sheffield
                            View the original source
                             HTML page. Verify that
                           the source page contains
                           no mention of Team name
                            and League name. They
                              were Taalee’s value-
                           additions to the metadata
                           to facilitate easier search.




                                                          View metadata. Note that
                                                          Team name and League
                                                           name are also included
                                                              in the metadata

Intelligent Content – Value-Added Metadata
  Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added
  by Taalee using its semantic relationships.
  The asset is richly, fully described in the many ways the users chose to interact.

  Rich Media Sports Asset
    Producer Name: name of content provider that produced the asset
    Posted Date: date of asset posting; extracted automatically
    Player Names: names of players mentioned explicitly in the asset; extracted automatically
    Sport: name of sport
    Team Name: name of team for which player plays; not mentioned explicitly in asset; value-added using Taalee’s semantic relationships
    League Name: name of league to which the player’s team belongs; not mentioned explicitly in asset; value-added by Taalee’s processing based on semantic associations

  Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
Intelligent Content via Semantic Associations

Semantic Associations

• Traditional search engines rely solely on (syntactic) keywords to find content.
• They do not understand the meaning, context, or relationships of keywords.


For example: a search engine may see that the word “Commerce One” occurs,
but it does not know that Commerce One is a COMPANY which Participates in
the Corporate, Professional & Financial Software INDUSTRY and COMPETES
WITH Ariba.

As a result, search engines cannot go beyond returning a list (or directory view)
of what the user has asked for. Their ability to provide associated information is
extremely limited, static, and difficult to scale.
              Taalee’s Semantic Content Model
    goes beyond indexing keywords and classifying assets to
       Understand and Associate all content it catalogs.
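
As a rough illustration of how stored associations let a system return more than the literal query, the sketch below keeps the two Commerce One relationships from this slide as triples and follows the "competes with" link to collect competitor news. The triple layout, the `NEWS` placeholder headlines, the function names, and the decision to treat "competes with" as symmetric are all assumptions made for this example.

```python
# Associations stored as (subject, relation, object) triples.
ASSOCIATIONS = [
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Relations assumed to hold in both directions (an assumption of this sketch).
SYMMETRIC_RELATIONS = {"competes_with"}

# Placeholder catalog: company -> headlines. The headlines are made up.
NEWS = {
    "Commerce One": ["Commerce One story (placeholder)"],
    "Ariba": ["Ariba story (placeholder)"],
}


def related(entity, relation):
    """Entities linked to `entity` by `relation`, honoring symmetric relations."""
    found = []
    for subject, rel, obj in ASSOCIATIONS:
        if rel != relation:
            continue
        if subject == entity:
            found.append(obj)
        elif obj == entity and rel in SYMMETRIC_RELATIONS:
            found.append(subject)
    return found


def intelligent_results(company):
    """What the user asked for, plus news on the company's competitors."""
    competitors = related(company, "competes_with")
    return {
        "asked_for": NEWS.get(company, []),
        "competitor_news": {name: NEWS.get(name, []) for name in competitors},
    }


results = intelligent_results("Commerce One")
```

A keyword index alone would return only the `asked_for` list; the `competitor_news` part exists only because the relationship is modeled explicitly.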
Example (test on http://directory.mediaanywhere.com)

  Search for company ‘Commerce One’

  Links to news on companies that compete against Commerce One
  Links to news on companies Commerce One competes against
  (To view news on Ariba, click on the link for Ariba)

  Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically

Metadata centric Content Management Architecture (ASP/Enterprise hosted)

Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g. Reuters)
Extractor Agents 1, 2, and 3 (Story on Cisco, Story on Lucent)
Components: World Model, Taalee Metabase, Semantic Engine, Semantic Application
Output: XCM-compliant metadata, XML or other format, to Third-party Content Mgmt and Syndication

Flow:
  1. Cisco story from PW Source 1 passed on to add semantic associations
  2. Consults Knowledge Base for Cisco’s competition
  3. Returns result: Lucent is a competitor of Cisco
  4. Lucent story from external feeds picked for publishing as “semantically related” to Cisco story, passed on to Dashboard
Semantic Associations
  supported by Taalee Semantic Engine




Intelligent Content = What You Asked for + What you need to know!

 COMPANY
   Related Stock News: COMPANIES in Same or Related INDUSTRY
   Competition: COMPANIES in INDUSTRY with Competing PRODUCTS
   Regulations (EPA, SEC): Impacting INDUSTRY or Filed By COMPANY
   Technology Products: Important to INDUSTRY or COMPANY
   Industry News

Semantic Web Application Example:
       Financial Advisor Research Dashboard


Automatic
Collation of
semantically                                               Research
related digital                                            Inferred
media information                                          Automatically
from Multiple
Sources




Semantically
Related News
Not                                        Semantic Search/
Specifically                               Personalization, etc.
Asked For
A vision for future
Semantic Web, Complex Relationships
    and Knowledge Discovery,
   E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
Beyond RDF
– one proposal (cf: Ora Lassila)
Structural modeling obviously not enough
   we need a “logic layer” on top of RDF
   some type of description logic is a possibility

Exposing a wide variety of data sources as RDF is
useful, particularly if we have logic/rules which allow us
to draw inference from this data

RDF + DL = “Frame System for WWW”


                Source : www.ontoknowledge.org/oil

Semantic Web - next step in Web evolution

“A Web in which machine reasoning will be
  ubiquitous and devastatingly powerful.” [Berners-Lee]
 “A place where the whim of a human being and the
   reasoning of a machine coexist in an ideal, powerful
   mixture.” [Berners-Lee]

  “A semantic Web would permit more accurate and
    efficient Web searches, which are among the most
    important Web-based activities.” [Berners-Lee]

 A personal definition
 Semantic Web: The concept that Web-accessible
   content can be organized semantically, rather than
   through syntactic and structural methods.


What is DAML (DARPA Agent Markup Language)?

   a proposal to create technologies that will enable
   software agents to dynamically identify and
   understand information sources, and to provide
   interoperability between agents in a semantic
   manner.
   Based on RDF+XML
   Agent readable Tags




                   www.daml.org
DAML Example




Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three layered Architecture Of
Semantic Web



    Logical Layer
          Formal Semantics and Reasoning
          Support – OIL, DAML-O
    Schema Layer
          Definition of Vocabulary
          RDF Schema
    Data Layer
          Simple data model and syntax for
          metadata - RDF
OIL – as RDF Extension

<rdfs:Class rdf:ID="herbivore">
    <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
    <rdfs:subClassOf rdf:resource="#animal"/>
    <rdfs:subClassOf>
        <oil:NOT>
            <oil:hasOperand rdf:resource="#carnivore"/>
        </oil:NOT>
    </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL – Evolving
towards Semantic Web

OIL Mission
  OIL is a Web-based representation and inference
  layer for ontologies, which combines the widely used
  modeling primitives from frame-based languages with
  the formal semantics and reasoning services provided
  by description logics
Knowledge Discovery -
 Example




Earthquake Sources                          Nuclear Test Sources
    (USGS, NEIC)                           (Oklahoma Observatory, etc.)
              Nuclear Test May Cause Earthquakes



                      Is it really true?
Complex Relationships


A nuclear test could have caused an earthquake
if the earthquake occurred some time after the
nuclear test was conducted and in a nearby region.


     NuclearTest Causes Earthquake
      <= dateDifference( NuclearTest.eventDate,
                         Earthquake.eventDate ) < 30
         AND distance( NuclearTest.latitude,
                       NuclearTest.longitude,
                       Earthquake.latitude,
                       Earthquake.longitude ) < 10000
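
The rule above could be evaluated along the following lines. This is a sketch under stated assumptions rather than the InfoQuilt implementation: the `Event` class and function names are hypothetical, the thresholds are read as 30 days and 10000 miles (the units used in the query later in this deck), `dateDifference` is taken to require that the earthquake not precede the test, and `distance` is approximated with the haversine great-circle formula.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass(frozen=True)
class Event:
    """Hypothetical record for either a nuclear test or an earthquake."""
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance via the haversine formula (stand-in for `distance`)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True when the quake follows the test within the time and distance limits."""
    days_after = (quake.event_date - test.event_date).days
    if not 0 <= days_after < max_days:
        return False
    separation = distance_miles(test.latitude, test.longitude,
                                quake.latitude, quake.longitude)
    return separation < max_miles


def find_pairs(tests, quakes):
    """All (test, quake) pairs that satisfy the rule."""
    return [(t, q) for t in tests for q in quakes if may_have_caused(t, q)]
```

Note that the rule only selects candidate pairs; whether a causal link exists is exactly the question the following slides explore.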
Knowledge Discovery -
   Example

    When was the first recorded nuclear test conducted?
                             1950
Find the total number of earthquakes with a magnitude
5.8 or higher on the Richter scale per year starting from 1900



                                         Increase in number of
                                         earthquakes since 1945
Knowledge Discovery -
   Example…

For each group of earthquakes with magnitudes in the ranges
5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year
starting from 1900, find average number of earthquakes


                                 Number of earthquakes with
                                 magnitude > 7 almost constant.
                                 So nuclear tests probably only
                                 cause earthquakes with
                                 magnitude < 7
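
The grouping query on this slide can be sketched as a small aggregation. The input format (a list of `(year, magnitude)` pairs), the function names, and the bin edges are assumptions of this example: ranges are treated as half-open, so a magnitude of exactly 6.0 falls in "6-7" and exactly 9.0 falls in ">9". No real earthquake data is included.

```python
from collections import Counter, defaultdict

# (label, inclusive lower bound, exclusive upper bound)
MAGNITUDE_BINS = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, float("inf")),
]


def magnitude_bin(magnitude):
    """Label of the range containing `magnitude`, or None if below 5.8."""
    for label, low, high in MAGNITUDE_BINS:
        if low <= magnitude < high:
            return label
    return None


def counts_per_year(quakes, start_year=1900):
    """Map each range label to a Counter of {year: number of earthquakes}."""
    counts = defaultdict(Counter)
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if year >= start_year and label is not None:
            counts[label][year] += 1
    return counts


def average_per_year(quakes, start_year, end_year):
    """Average yearly count per range over start_year..end_year inclusive."""
    n_years = end_year - start_year + 1
    counts = counts_per_year(quakes, start_year)
    return {
        label: sum(n for year, n in counts[label].items() if year <= end_year) / n_years
        for label, _, _ in MAGNITUDE_BINS
    }
```

Comparing these per-range averages before and after a chosen year is one way to reproduce the kind of observation made on the slide.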
Knowledge Discovery -
    Example…

Find pairs of nuclear tests and earthquakes such that the earthquake
occurred within 30 days after the test was conducted and the test site was
within a radius of 10000 miles of the epicenter of the earthquake




                                                            Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax/
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999:
http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999:
http://cgi.omg.org/cgi-bin/doc?ad/9910-02
http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7/
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media,
Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN: 0-07-057735-8, 1998.

Mais conteúdo relacionado

Mais procurados

Object models and object representation
Object models and object representationObject models and object representation
Object models and object representationJulie Allinson
 
Building a Digital Library
Building a Digital LibraryBuilding a Digital Library
Building a Digital Librarytomasz
 
Corrib.org - OpenSource and Research
Corrib.org - OpenSource and ResearchCorrib.org - OpenSource and Research
Corrib.org - OpenSource and Researchadameq
 
Tutorial on Semantic Digital Libraries (WWW'2007)
Tutorial on Semantic Digital Libraries (WWW'2007)Tutorial on Semantic Digital Libraries (WWW'2007)
Tutorial on Semantic Digital Libraries (WWW'2007)Sebastian Ryszard Kruk
 
Semantic Web Technologies For Digital Libraries
Semantic Web Technologies For Digital LibrariesSemantic Web Technologies For Digital Libraries
Semantic Web Technologies For Digital LibrariesNikesh Narayanan
 
Mapping FRBR, ISBD, RDA, and other namespaces to DC for interoperability
Mapping FRBR, ISBD, RDA, and other namespaces to DC for interoperabilityMapping FRBR, ISBD, RDA, and other namespaces to DC for interoperability
Mapping FRBR, ISBD, RDA, and other namespaces to DC for interoperabilityGordon Dunsire
 
Dublin Core In Practice
Dublin Core In PracticeDublin Core In Practice
Dublin Core In PracticeMarcia Zeng
 
Metadata and Tagging
Metadata and TaggingMetadata and Tagging
Metadata and Taggingpauloshea
 
JeromeDL - the Semantic Digital Library
JeromeDL - the Semantic Digital LibraryJeromeDL - the Semantic Digital Library
JeromeDL - the Semantic Digital LibrarySebastian Ryszard Kruk
 
Knowledge Engineering for TELDAP
Knowledge Engineering for TELDAPKnowledge Engineering for TELDAP
Knowledge Engineering for TELDAPAAT Taiwan
 
AAT LOD Microthesauri
AAT LOD MicrothesauriAAT LOD Microthesauri
AAT LOD MicrothesauriMarcia Zeng
 
Webinar slides: Interoperability between resources involved in TDM at the lev...
Webinar slides: Interoperability between resources involved in TDM at the lev...Webinar slides: Interoperability between resources involved in TDM at the lev...
Webinar slides: Interoperability between resources involved in TDM at the lev...openminted_eu
 

Mais procurados (20)

Object models and object representation
Object models and object representationObject models and object representation
Object models and object representation
 
Metadata Standards
Metadata StandardsMetadata Standards
Metadata Standards
 
Semantic Digital Libraries
Semantic Digital LibrariesSemantic Digital Libraries
Semantic Digital Libraries
 
NISO/DCMI Webinar: Metadata for Managing Scientific Research Data
NISO/DCMI Webinar: Metadata for Managing Scientific Research DataNISO/DCMI Webinar: Metadata for Managing Scientific Research Data
NISO/DCMI Webinar: Metadata for Managing Scientific Research Data
 
Building a Digital Library
Building a Digital LibraryBuilding a Digital Library
Building a Digital Library
 
NISO/DCMI Webinar: Metadata for Public Sector Administration
NISO/DCMI Webinar: Metadata for Public Sector AdministrationNISO/DCMI Webinar: Metadata for Public Sector Administration
NISO/DCMI Webinar: Metadata for Public Sector Administration
 
Corrib.org - OpenSource and Research
Corrib.org - OpenSource and ResearchCorrib.org - OpenSource and Research
Corrib.org - OpenSource and Research
 
Tutorial on Semantic Digital Libraries (WWW'2007)
Tutorial on Semantic Digital Libraries (WWW'2007)Tutorial on Semantic Digital Libraries (WWW'2007)
Tutorial on Semantic Digital Libraries (WWW'2007)
 
Metadata: A concept
Metadata: A conceptMetadata: A concept
Metadata: A concept
 
Semantic Web Technologies For Digital Libraries
Semantic Web Technologies For Digital LibrariesSemantic Web Technologies For Digital Libraries
Semantic Web Technologies For Digital Libraries
 
Semantic web
Semantic web Semantic web
Semantic web
 
JeromeDL Tutorial
JeromeDL TutorialJeromeDL Tutorial
JeromeDL Tutorial
 
Understanding data -latest
Understanding data  -latestUnderstanding data  -latest
Understanding data -latest
 
Mapping FRBR, ISBD, RDA, and other namespaces to DC for interoperability
Mapping FRBR, ISBD, RDA, and other namespaces to DC for interoperabilityMapping FRBR, ISBD, RDA, and other namespaces to DC for interoperability
Mapping FRBR, ISBD, RDA, and other namespaces to DC for interoperability
 
Dublin Core In Practice
Dublin Core In PracticeDublin Core In Practice
Dublin Core In Practice
 
Metadata and Tagging
Metadata and TaggingMetadata and Tagging
Metadata and Tagging
 
JeromeDL - the Semantic Digital Library
JeromeDL - the Semantic Digital LibraryJeromeDL - the Semantic Digital Library
JeromeDL - the Semantic Digital Library
 
Knowledge Engineering for TELDAP
Knowledge Engineering for TELDAPKnowledge Engineering for TELDAP
Knowledge Engineering for TELDAP
 
AAT LOD Microthesauri
AAT LOD MicrothesauriAAT LOD Microthesauri
AAT LOD Microthesauri
 
Webinar slides: Interoperability between resources involved in TDM at the lev...
Webinar slides: Interoperability between resources involved in TDM at the lev...Webinar slides: Interoperability between resources involved in TDM at the lev...
Webinar slides: Interoperability between resources involved in TDM at the lev...
 

Semelhante a The Mysteries of Metadata

Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating...
Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating...Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating...
Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating...Artificial Intelligence Institute at UofSC
 
Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating...
Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating...Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating...
Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating...Artificial Intelligence Institute at UofSC
 
Understanding RDF: the Resource Description Framework in Context (1999)
Understanding RDF: the Resource Description Framework in Context  (1999)Understanding RDF: the Resource Description Framework in Context  (1999)
Understanding RDF: the Resource Description Framework in Context (1999)Dan Brickley
 
Structured Dynamics' Semantic Technologies Product Stack
Structured Dynamics' Semantic Technologies Product StackStructured Dynamics' Semantic Technologies Product Stack
Structured Dynamics' Semantic Technologies Product StackMike Bergman
 
Publishing data on the Semantic Web
Publishing data on the Semantic WebPublishing data on the Semantic Web
Publishing data on the Semantic WebPeter Mika
 
Semantic Interoperability and Information Brokering in Global Information Sys...
Semantic Interoperability and Information Brokering in Global Information Sys...Semantic Interoperability and Information Brokering in Global Information Sys...
Semantic Interoperability and Information Brokering in Global Information Sys...Amit Sheth
 
Metadata lecture(9 17-14)
Metadata lecture(9 17-14)Metadata lecture(9 17-14)
Metadata lecture(9 17-14)mhb120
 
Understanding Information Architecture
Understanding Information ArchitectureUnderstanding Information Architecture
Understanding Information ArchitectureScott Abel
 
Semantic - Based Querying Using Ontology in Relational Database of Library Ma...
Semantic - Based Querying Using Ontology in Relational Database of Library Ma...Semantic - Based Querying Using Ontology in Relational Database of Library Ma...
Semantic - Based Querying Using Ontology in Relational Database of Library Ma...dannyijwest
 
Metadata Workshop - Utrecht - November 5, 2008
Metadata Workshop - Utrecht - November 5, 2008Metadata Workshop - Utrecht - November 5, 2008
Metadata Workshop - Utrecht - November 5, 2008askamy
 
Linked Data Planet Key Note
Linked Data Planet Key NoteLinked Data Planet Key Note
Linked Data Planet Key Noterumito
 
Semantic Technolgy
Semantic TechnolgySemantic Technolgy
Semantic TechnolgyTalat Fakhri
 
Web 3 Mark Greaves
Web 3 Mark GreavesWeb 3 Mark Greaves
Web 3 Mark GreavesMediabistro
 
Linked Open Data in the World of Patents
Linked Open Data in the World of Patents Linked Open Data in the World of Patents
Linked Open Data in the World of Patents Dr. Haxel Consult
 
Harmony project - JISC Synthesis meeting 2001
Harmony project - JISC Synthesis meeting 2001Harmony project - JISC Synthesis meeting 2001
Harmony project - JISC Synthesis meeting 2001Dan Brickley
 

Semelhante a The Mysteries of Metadata (20)

Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating...
Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating...Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating...
Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating...
 
Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating...
Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating...Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating...
Relationships at the Heart of Semantic Web: Modeling, Discovering, Validating...
 
It's all semantics! -The premises and promises of the semantic web
It's all semantics! -The premises and promises of the semantic webIt's all semantics! -The premises and promises of the semantic web
It's all semantics! -The premises and promises of the semantic web
 
Semantic Web in Action
Semantic Web in ActionSemantic Web in Action
Semantic Web in Action
 
Understanding RDF: the Resource Description Framework in Context (1999)
Understanding RDF: the Resource Description Framework in Context  (1999)Understanding RDF: the Resource Description Framework in Context  (1999)
Understanding RDF: the Resource Description Framework in Context (1999)
 
Metadata
MetadataMetadata
Metadata
 
Structured Dynamics' Semantic Technologies Product Stack
Structured Dynamics' Semantic Technologies Product StackStructured Dynamics' Semantic Technologies Product Stack
Structured Dynamics' Semantic Technologies Product Stack
 
Publishing data on the Semantic Web
Publishing data on the Semantic WebPublishing data on the Semantic Web
Publishing data on the Semantic Web
 
Semantic Interoperability and Information Brokering in Global Information Sys...
Semantic Interoperability and Information Brokering in Global Information Sys...Semantic Interoperability and Information Brokering in Global Information Sys...
Semantic Interoperability and Information Brokering in Global Information Sys...
 
Metadata 101public
Metadata 101publicMetadata 101public
Metadata 101public
 
Metadata lecture(9 17-14)
Metadata lecture(9 17-14)Metadata lecture(9 17-14)
Metadata lecture(9 17-14)
 
Understanding Information Architecture
Understanding Information ArchitectureUnderstanding Information Architecture
Understanding Information Architecture
 
Semantic - Based Querying Using Ontology in Relational Database of Library Ma...
Semantic - Based Querying Using Ontology in Relational Database of Library Ma...Semantic - Based Querying Using Ontology in Relational Database of Library Ma...
Semantic - Based Querying Using Ontology in Relational Database of Library Ma...
 
Metadata Workshop - Utrecht - November 5, 2008
Metadata Workshop - Utrecht - November 5, 2008Metadata Workshop - Utrecht - November 5, 2008
Metadata Workshop - Utrecht - November 5, 2008
 
Semantic web
Semantic webSemantic web
Semantic web
 
Linked Data Planet Key Note
Linked Data Planet Key NoteLinked Data Planet Key Note
Linked Data Planet Key Note
 
Semantic Technolgy
Semantic TechnolgySemantic Technolgy
Semantic Technolgy
 
Web 3 Mark Greaves
Web 3 Mark GreavesWeb 3 Mark Greaves
Web 3 Mark Greaves
 
Linked Open Data in the World of Patents
Linked Open Data in the World of Patents Linked Open Data in the World of Patents
Linked Open Data in the World of Patents
 
Harmony project - JISC Synthesis meeting 2001
Harmony project - JISC Synthesis meeting 2001Harmony project - JISC Synthesis meeting 2001
Harmony project - JISC Synthesis meeting 2001
 


The Mysteries of Metadata

  • 1. The Mysteries of Metadata. Workshop at Content World 2001, Burlingame, CA. May 15, 2001. Amit Sheth, amit@taalee.com, Founder/CEO, Taalee (www.taalee.com) [Taalee is now Semagix: www.semagix.com]. Also Director, Large Scale Distributed Information Systems (LSDIS) Lab, University of Georgia (lsdis.cs.uga.edu). Metadata Extraction is a patented technology of Taalee, Inc. Semantic Engine and WorldModel are trademarks of Taalee, Inc. Confidential
  • 2. Workshop Agenda: What is Metadata?; Metadata Descriptions and Standards; Metadata Storage/Exchange/Infrastructure; (Automated) Metadata Creation/Extraction/Tagging; Metadata Usage/Applications
  • 3. What is Metadata? Data about data Statements, contexts Recursive – data about “data about data” Applications Content management Cataloguing Information retrieval, search … "A Web content repository without metadata is like a library without an index," - Jack Jia, IWOV HP 3
  • 4. Information Interoperability: key metadata objective and benefit System Syntax Structure Semantics Protocols Metadata Domain Modeling, Ontologies HP 4
  • 5. Semantics Meaning, Understanding Facts, Context, Reasoning Related to: exchange, usage, application HP 5
  • 6. A metadata classification (layers listed from the user at the top down to the data; move up this list, toward domain models and ontologies, to tackle information overload!!)
      User
      Ontologies, Classifications, Domain Models
      Domain Specific Metadata: area, population (Census); land-cover, relief (GIS); metadata concept descriptions from ontologies
      Domain Independent (structural) Metadata (C++ class-subclass relationships, HTML/SGML Document Type Definitions, C program structure...)
      Direct Content Based Metadata (inverted lists, document vectors, WAIS, Glimpse, LSI)
      Content Dependent Metadata (size, max colors, rows, columns...)
      Content Independent Metadata (creation-date, location, type-of-sensor...)
      Data (Heterogeneous Types/Media)
  • 7. Types of Metadata for digital media: Media type-specific metadata, e.g., texture of images, font size…; Media processing-specific metadata, e.g., search, retrieval, personalized filtering; Content-specific metadata, e.g., rocket-related video and documents
  • 8. Metadata for Digital Data
      Metadata | Data Type | Metadata Type
      Q-Features [Jain and Hampapur] | Image, Video | Domain Specific
      R-Features [Jain and Hampapur] | Image, Video | Domain Independent
      Meta-Features [Jain and Hampapur] | Image, Video | Content Independent
      Impression Vector [Kiyoki et al.] | Image | Content Descriptive
      NDVI, Spatial Registration [Anderson and Stonebraker] | Image | Domain Specific
      Speech Feature Index [Glavitsch et al.] | Audio | Direct Content Based
      Topic Change Indices [Chen et al.] | Audio | Direct Content Based
      Document Vectors [Deerwester et al.] | Text | Direct Content Based
      Inverted Indices [Kahle and Medlar] | Text | Direct Content Based
      Content Classification Metadata [Bohm and Rakow] | MultiMedia | Domain Specific
      Document Composition Metadata [Bohm and Rakow] | MultiMedia | Domain Independent
      Metadata Templates [Ordille and Miller] | Media Independent | Domain Specific
      Land Cover, Relief [Sheth and Kashyap] | Media Independent | Domain Specific
      Parent Child Relationships [Shklar et al.] | Text | Domain Independent
      Contexts [Sciore et al., Kashyap and Sheth] | Structured | Domain Specific
      Concepts from Cyc [Collet et al.] | Structured | Domain Specific
      User’s Data Attributes [Shoens et al.] | Text, Structured | Domain Specific
      Domain Specific Ontologies [Mena et al.] | Media Independent | Domain Specific
  • 9. Types of Specs and Standards (or MetaModels) Domain Independent: (MCF), RDF, MOF, DublinCore Media Specific: MPEG4, MPEG7, VoiceXML Domain/Industry Specific (metamodels): MARC (Library), FGDC and UDK (Geographic), NewsML (News), PRISM (Publishing) Application Specific: ICE (Syndication) Exchange/Sharing: XCM, XMI Orthogonal/(Other): RDFS, namespaces, ontologies, domain models, (DAML, OIL) HP 9
  • 10. What can RDF do for metadata? Designed to impose structural constraints on syntax to support consistent encoding, exchange and processing of metadata. A domain-independent metadata standard.
  • 11. RDF (Resource Description Framework). Model: Resource, Property, Value. •RDF data consists of nodes and attached attribute/value pairs •Nodes can be any web resources (pages, servers, basically anything for which you can give a URI), even other instances of metadata. •Attributes are named properties of the nodes, and their values are either atomic (text strings, numbers, etc.) or other resources or metadata instances.
  • 12. RDF Example 1. Graph: URI:TALK has dc:title "Mysteries of Metadata" and dc:creator URI:AMIT. <?xml version="1.0"?> <rdf:RDF xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax#" xmlns:dc="http://purl.org/dc/elements/1.0"> <rdf:Description rdf:about="URI:TALK"> <dc:title>Mysteries of Metadata</dc:title> <dc:creator rdf:resource="URI:AMIT"/> </rdf:Description> </rdf:RDF>
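The graph in Example 1 can be treated as a small set of (resource, property, value) statements. The following is a minimal sketch, not the workshop's own tooling, that serializes those two statements to RDF/XML with Python's standard library. The function name `to_rdf_xml` and the tuple layout are assumptions made for illustration, and the namespace URIs used here are the current W3C and Dublin Core ones rather than the older URIs shown on the slide.

```python
import xml.etree.ElementTree as ET

# Assumption: current namespace URIs, not the older ones printed on the slide.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Each statement: (resource, (namespace, property), value, value_is_resource)
triples = [
    ("URI:TALK", (DC_NS, "title"), "Mysteries of Metadata", False),
    ("URI:TALK", (DC_NS, "creator"), "URI:AMIT", True),
]


def to_rdf_xml(statements):
    """Group statements by resource and emit one rdf:Description per resource."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    descriptions = {}
    for subject, (ns, name), value, is_resource in statements:
        desc = descriptions.get(subject)
        if desc is None:
            desc = ET.SubElement(
                root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": subject}
            )
            descriptions[subject] = desc
        prop = ET.SubElement(desc, f"{{{ns}}}{name}")
        if is_resource:
            # The value is another node in the graph, so point at it by URI.
            prop.set(f"{{{RDF_NS}}}resource", value)
        else:
            # The value is an atomic literal.
            prop.text = value
    return ET.tostring(root, encoding="unicode")


print(to_rdf_xml(triples))
```

Keeping the statements as plain tuples makes the node/attribute/value structure visible before any XML syntax is involved.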
  • 13. RDF Example 2. Graph: URI:TALK has dc:title "Mysteries of Metadata" and dc:creator URI:AMIT; URI:AMIT has BIB:Aff URI:LIB, BIB:Email amit@taalee.com and BIB:Name "Amit Sheth".
  • 14. RDFS (RDF Schema) Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce…) Vocabulary (in RDFS) = the meaning, characteristics, and relationships of a set of properties.
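Example 2 shows a value that is itself a resource with properties of its own. As a rough sketch of what that enables, the statements can be held in a dictionary and traversed by following a path of properties. The pairing of BIB:Aff, BIB:Email and BIB:Name with their values is read off the slide's label order and is an assumption; `follow` is a hypothetical helper name.

```python
from collections import defaultdict

# Statements from the slide; the BIB:* pairings are inferred from label order.
triples = [
    ("URI:TALK", "dc:title", "Mysteries of Metadata"),
    ("URI:TALK", "dc:creator", "URI:AMIT"),
    ("URI:AMIT", "BIB:Name", "Amit Sheth"),
    ("URI:AMIT", "BIB:Email", "amit@taalee.com"),
    ("URI:AMIT", "BIB:Aff", "URI:LIB"),
]

# Index: resource -> {property: value}
graph = defaultdict(dict)
for subject, prop, value in triples:
    graph[subject][prop] = value


def follow(start, *path):
    """Walk from a resource along a sequence of properties and return the final value."""
    node = start
    for prop in path:
        node = graph[node][prop]
    return node


# Who created the talk, and what is that creator's email address?
print(follow("URI:TALK", "dc:creator", "BIB:Email"))
```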
  • 15. RDF Based Web RDF Schemas RDF/XML Descriptions Resources HTML Source:http://www.w3c.rl.ac.uk HP 15
  • 16. Dublin Core Metadata Initiative Simple element set designed for resource description International, inter-discipline, W3C community consensus “Semantic” interface among resource description communities (very limited form of semantics) Source:www.desire.org HP 16
  • 17. Dublin Core RDF <xml> <?namespace href = "http://w3.org/rdf-schema" as = "RDF"> <?namespace href = "http://metadata.net/DC" as = "DC"> <RDF:Abbreviated> <RDF:Assertion RDF:HREF = "http://www.mysite.com/mydoc.html" DC:Title = "I've Never Metadata I've Never Liked" DC:Creator = "Mary Crystal" DC:Subject = "Metadata, Dublin Core, Stuff"/> </RDF:Abbreviated> </xml>
  • 18. MOF (Metadata Object Facility) and XMI MOF models metadata using a subset of UML that is relevant to modeling metadata (class models - classes, associations and subtyping), a set of rules for mapping the elements of the MOF Core to CORBA IDL XML Metadata Interchange (XMI) is an extension of the MOF into the XML space HP 18
  • 19. NewsML NewsML is a packaging and metadata format for news content. NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries. Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE. HP 19
  • 20. NewsML… It can be used by news providers to combine their pictures, video, text, graphics and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television and any other device. It provides an accurate, objective set of description tools, which help qualify the information and make the search more precise. NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about.
  • 21. Example of the end-to-end flow - NewsML
      1. The content provider supplies NewsML packaged media content to the operator. The content is categorized as current events, finance, sport, etc. and updated hourly.
      2. The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.
      3. Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.
      Source: http://www.mediabricks.com
  • 22. PRISM Publishing Requirements for Industry Standard Metadata Version: 1.0, April 2001 Authors: IDEAlliance (Adobe, Vignette, Kinecta et al.) Idea: “a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts” Web site: http://www.prismstandard.org HP 22
  • 23. PRISM Design Built on existing standards like Dublin Core (DC), RDF, XML Designed to be used in a simple, straightforward way over the Internet Compatible with NewsML Integrates easily with ICE (for syndication) Vocabulary: Basic: DC; Extensions: “Controlled Vocabularies”, e.g., “North American Industry Classification System” (NAICS)
  • 24. PRISM Example <?xml version="1.0" encoding="UTF-8"?> <rdf:RDF xmlns:prism="http://prismstandard.org/1.0#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/"> <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg"> <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" /> <dc:description>Photograph taken at 6:00 am on Corfu with two models </dc:description> <dc:title>Walking on the Beach in Corfu</dc:title> <dc:creator>John Peterson</dc:creator> <dc:contributor>Sally Smith, lighting</dc:contributor> <dc:format>image/jpeg</dc:format> </rdf:Description> </rdf:RDF> (Source: PRISM spec v. 1; http://www.prismstandard.org/techdev/prismspec1.asp) HP 24
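Because a PRISM description is ordinary RDF/XML carrying Dublin Core elements, a consumer can read it with a generic XML parser. The sketch below parses the slide's record (with the XML declaration omitted) and collects the Dublin Core fields per described resource. `dublin_core_fields` is a hypothetical helper name, and a full RDF parser would handle many syntax variants this does not.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# The record from the PRISM example slide, re-flowed over several lines.
PRISM_RECORD = """<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:description>Photograph taken at 6:00 am on Corfu with two models
    </dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>"""


def dublin_core_fields(rdf_xml):
    """Return {resource URI: {dc element name: [values]}} for each rdf:Description."""
    root = ET.fromstring(rdf_xml)
    records = {}
    for desc in root.findall("rdf:Description", NS):
        about = desc.get(f"{{{NS['rdf']}}}about")
        fields = {}
        for child in desc:
            # ElementTree tags look like "{namespace}localname".
            ns_uri, _, local = child.tag[1:].partition("}")
            if ns_uri != NS["dc"]:
                continue
            # A value is either a referenced resource or literal text.
            value = child.get(f"{{{NS['rdf']}}}resource") or (child.text or "").strip()
            fields.setdefault(local, []).append(value)
        records[about] = fields
    return records


for uri, fields in dublin_core_fields(PRISM_RECORD).items():
    print(uri, fields)
```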
  • 25. VoiceXML A language for specifying voice dialogs. Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input. Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications. High-level voice-specific language simplifies application development. Source: http://www.voicexml.org
  • 26. Voice Based Internet Applications Source: http://www.voicexml.org HP 26
  • 27. VoiceXML Metadata: Voice-specific metadata; supports syntactic interoperability; text data to voice data; VoiceXML = XML + Voice Metadata
  • 28. VoiceXML – Possible Services Information retrieval – News, sports, traffic, stock quotes. e-Transactions (e-commerce, e-tailing, etc.) Financial: banking, stock trading. Catalog browsing (generally as an adjunct to paper). Telephone services: personal voice dialing, one-number find-me services. Intranet – Inventory, HR services, corporate portals. Unification – My Whatever: personal portals, personal agents, unified messaging. Source: http://www.voicexml.org
  • 29. MPEG7 A set of description schemes and descriptors to describe the content of multimedia data. Provides a language to specify description schemes. A scheme for coding the description.
  • 30. Application Examples for MPEG7 A few application examples are: Digital libraries (image catalog, musical dictionary,...) Multimedia directory services (e.g. yellow pages) Broadcast media selection (radio channel, TV channel,...) HP 30
  • 31. Information and Content Exchange (ICE) Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax Authors: Adobe, Kinecta, MS, Sun, Vignette et al. Status: latest spec version 1.1, May 2000; submitted to W3C for review Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, … Web Site: http://www.icestandard.org HP 31
  • 32. What is the ICE Protocol? Syndication Protocol for communication between Syndicators and Subscribers Metadata to define roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver format and method of content exchange (e.g., sequenced packages, pull vs. push model) HP 32
  • 33. ICE Applications ICE vocabulary + domain vocabulary = complete application ICE establishes and manages the syndication delivers data logs events => content-independent metadata industry-specific vocabulary defines the content => domain-specific metadata Source: http://www.icestandard.org HP 33
  • 34. ICE Explained ICE: Information and Content Exchange protocol Syndicator: A content aggregator and distributor Subscriber: A content consumer Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement Collection: The current content of a subscription ICE Package: A delivery of commands to update a collection such as the addition of content items ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs and other management information Sources: InternetWeek; "ICE Cookbook, version 1.0" http://www.internetweek.com/ebizapps01/ebiz050701-3.htm HP 34
  • 35. ICE payload example. The ice-* elements and attributes are protocol metadata; the comic-strip element inside ice-item is the content (domain-specific metadata).
      <?xml version="1.0"?>
      <!DOCTYPE ice-payload SYSTEM "http://.../ice.dtd">
      <ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
        <ice-response response-id="irp-20010515181600">
          <ice-item-group group-id="grp-8610">
            <ice-item item-id="4321" subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
              <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515"> PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image) </comic-strip>
            </ice-item>
          </ice-item-group>
        </ice-response>
      </ice-payload>
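The payload above mixes two layers: ICE's own delivery metadata and the domain-specific content it carries. A subscriber might separate them roughly as follows. This is an illustrative sketch only: the DOCTYPE is left out because its URL is elided in the source, the encoded image stays truncated as on the slide, and `split_payload` is a hypothetical helper, not part of any ICE implementation.

```python
import xml.etree.ElementTree as ET

# Payload from the slide, without the DOCTYPE (its URL is elided in the source).
ICE_PAYLOAD = """<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""


def split_payload(xml_text):
    """Separate each item's ICE envelope attributes from its domain-specific content."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("ice-item"):
        envelope = dict(item.attrib)  # delivery metadata defined by ICE
        content = [{"tag": child.tag, **child.attrib} for child in item]  # domain vocabulary
        items.append({"envelope": envelope, "content": content})
    return items


for entry in split_payload(ICE_PAYLOAD):
    print("ICE envelope:", entry["envelope"])
    print("Domain content:", entry["content"])
```

The split mirrors the point made on the ICE Applications slide: the ICE vocabulary manages delivery, while an industry vocabulary describes the content.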
  • 36. XCM (eXtended Content Management) a framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval. Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle. Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability. Source :http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html HP 36
  • 37. XCM eXtended Content Management Content Development Application Content Content Delivery Management Management Content Authoring Metadata Management Edge Network Digital Asset Management Recombination Delivery Software Configuration Personalization Streaming Media Management Delivery Document Process Caching Management Source :http://www.vignette.com/ HP 37
  • 38. Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
      Kansas State FGDC Metadata Model | UDK Metadata Model
      Theme keywords: digital line graph, hydrography, transportation... | Search terms: digital line graph, hydrography, transportation...
      Title: Dakota Aquifer | Title Topic: Dakota Aquifer
      Online linkage: http://gisdasc.kgs.ukans.edu/dasc/ | Adress Id: http://gisdasc.kgs.ukans.edu/dasc/
      Direct Spatial Reference Method: Vector | Measuring Techniques: Vector
      Horizontal Coordinate System Definition: Universal Transverse Mercator | Co-ordinate System: Universal Transverse Mercator
      … | …
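One common way to reconcile two models like these is to map each model's tag names onto a shared vocabulary before comparing records. The sketch below assumes hand-written mapping tables; the common field names (`keywords`, `url`, and so on) are invented for illustration, and only the tag pairs that are unambiguous on the slide are included.

```python
# Hand-written mappings from each model's tag names to a shared vocabulary.
# The shared names on the right are assumptions made for this example.
FGDC_TO_COMMON = {
    "Theme keywords": "keywords",
    "Online linkage": "url",
    "Direct Spatial Reference Method": "spatial_method",
    "Horizontal Coordinate System Definition": "coordinate_system",
}
UDK_TO_COMMON = {
    "Search terms": "keywords",
    "Adress Id": "url",  # tag spelled as on the slide
    "Measuring Techniques": "spatial_method",
    "Co-ordinate System": "coordinate_system",
}

fgdc_record = {
    "Theme keywords": "digital line graph, hydrography, transportation",
    "Online linkage": "http://gisdasc.kgs.ukans.edu/dasc/",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = {
    "Search terms": "digital line graph, hydrography, transportation",
    "Adress Id": "http://gisdasc.kgs.ukans.edu/dasc/",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}


def normalize(record, mapping):
    """Rename a record's tags to the shared vocabulary, dropping unmapped tags."""
    return {mapping[tag]: value for tag, value in record.items() if tag in mapping}


# Once normalized, the two descriptions of the same data set compare equal.
print(normalize(fgdc_record, FGDC_TO_COMMON) == normalize(udk_record, UDK_TO_COMMON))
```

A table of this kind captures only tag-level correspondences; differences in units, granularity or meaning would need richer domain models or ontologies.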
  • 39. Different views of Metadata Domain Independent Specifications (RDF) Frameworks/Infrastructures (XCM) Application Specific Media Specific Metadata ICE MPEG7, VoiceXML Domain Specific NewsML, FGDC/UDK HP 39
  • 40. Creating and Serving Metadata to Power the Life-cycle of Content. Taalee Infrastructure Services and Taalee Content Applications. Life-cycle activities: Produce, Aggregate, Catalog/Index, Integrate, Syndicate, Personalize, Interactive Marketing. Questions answered along the way: Where is the content? Whose is it? What other content is it related to? What is the right content for this user? What is the best way to monetize this interaction? Delivery channels: Broadcast, Wireline, Wireless, Interactive TV. Foundation: Taalee Semantic MetaBase
  • 42. Metadata Creation and Semanticization • Automatic Content Classification/Categorization • Metadata Creation/Extraction: Types of metadata created Semantic Engine and WorldModel are trademarks of Taalee, Inc. Metadata Extraction is a patented technology of Taalee, Inc. HP 42
  • 43. Forms/Types/Ingest of Content Sources: Web Sites, Content Feeds and Private Repositories Types: Text, Graphics, Audio, Video, Multimedia Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic Ingest: Feed (push), Web (pull), Repository/Database (usually pull) HP 43
  • 44. Content Handling/Ingest Infrastructure/Exchange Feed Handlers Crawlers/Screen Scrapers/Bots Software Agents Centralized, Distributed, Mobile/Migratory HP 44
  • 45. Information Extraction for Metadata Creation Nexis Digital Videos UPI AP ... ... Documents Data Stores Global/Enterprise Digital Maps Web Repositories ... Digital Images Digital Audios EXTRACTORS METADATA HP 45
  • 46. Extracting a Text Document: Syntactic approach. Example extraction rule: Date => day month int ‘,’ int. Sample document: INCIDENT MANAGEMENT SITUATION REPORT LAYOUT Friday August 1, 1997 - 0530 MDT NATIONAL PREPAREDNESS LEVEL II CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection. SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues. CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop-up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend,
  • 47. Traditional Text Categorization. Inputs: customer training set, customer article feed (Article 4715). Process: statistical/AI techniques classify each article and place it in a taxonomy for routing/distribution. Output: standard metadata classification of Article 4715: Feed Source: iSyndicate; Posted Date: 11/20/2000
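The rule on this slide, Date => day month int ',' int, is a purely syntactic pattern, so it translates directly into a regular expression. The following is a minimal sketch under the assumption that "day" means a weekday name and "month" a full English month name; `extract_dates` is a hypothetical helper name.

```python
import re
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Date => day month int ',' int
DATE_RULE = re.compile(
    r"\b(?P<day>" + "|".join(DAYS) + r")\s+"
    r"(?P<month>" + "|".join(MONTHS) + r")\s+"
    r"(?P<dom>\d{1,2}),\s*(?P<year>\d{4})\b"
)


def extract_dates(text):
    """Return a datetime.date for every span of text that matches the Date rule."""
    found = []
    for match in DATE_RULE.finditer(text):
        month_number = MONTHS.index(match.group("month")) + 1
        found.append(date(int(match.group("year")), month_number, int(match.group("dom"))))
    return found


report_header = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1, 1997 - 0530 MDT"
print(extract_dates(report_header))
```

A pattern like this finds the date but says nothing about what the date means (report date, fire start, planned demobilization), which is where the syntactic approach stops.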
  • 48. Taalee’s Categorization & Automatic Metadata Creation. Inputs: Taalee training set, customer training set, article feed (Article 4715). Process: knowledge-base and statistical/AI techniques classify each article, place it in a taxonomy, and apply Automated Content Enrichment (ACE) to produce catalog metadata. Taxonomy categories shown: Company Analysis, Conference Calls, Earnings, Stock Analysis; NYSE Member Companies, Market News, IPOs. Article 4715 metadata: standard metadata: Feed Source: iSyndicate; Posted Date: 11/20/2000; semantic metadata: Company Name: France Telecom, Equant; Ticker Symbol: FTE, ENT; Exchange: NYSE; Topic: Company News. Uses: Taalee Enterprise Content Manager Customization Suite; precise syndication/filtering; routing/distribution; map to another taxonomy
  • 49. Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V). Figure: a video segment with associated text, a segment description, and the resulting auto categorization and semantic metadata. Associated text: ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN.
  • 50. Automatic Categorization & Metadata Tagging (Web page). Figure: video with editorialized text on the Web, with auto categorization and semantic metadata for each example.
  • 51. Automatic Categorization & Metadata Tagging (Feed). Figure: text from Bloomberg, with auto categorization and semantic metadata for each example.
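The difference between the standard metadata and the semantic metadata for Article 4715 can be illustrated with a toy enrichment step: look for known entities in the text and attach what a knowledge base says about them. This is only a sketch of the general idea using naive substring matching, not Taalee's patented extraction method; the knowledge-base entries, the sample article sentence and the `enrich` function are assumptions built from the values shown on the slide.

```python
# Toy knowledge base; entries mirror the values on the slide (an assumption).
KNOWLEDGE_BASE = {
    "France Telecom": {"ticker": "FTE", "exchange": "NYSE"},
    "Equant": {"ticker": "ENT", "exchange": "NYSE"},
}


def enrich(article_text, standard_metadata):
    """Add semantic metadata for every knowledge-base company named in the text."""
    lowered = article_text.lower()
    companies = [name for name in KNOWLEDGE_BASE if name.lower() in lowered]
    semantic = {
        "Company Name": companies,
        "Ticker Symbol": [KNOWLEDGE_BASE[c]["ticker"] for c in companies],
        "Exchange": sorted({KNOWLEDGE_BASE[c]["exchange"] for c in companies}),
    }
    return {**standard_metadata, **semantic}


# Hypothetical article text; the slide does not show the article body.
article = "France Telecom said it would take a majority stake in Equant."
standard = {"Feed Source": "iSyndicate", "Posted Date": "11/20/2000"}
print(enrich(article, standard))
```

Note that the ticker symbols and exchange never appear in the article text; they come from the knowledge base, which is what makes the added metadata semantic rather than syntactic.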
  • 52. Taalee Extraction and Knowledgebase Enhancement Web Page Enhanced Metadata Asset Extraction Agent HP 52
  • 53. Basis for Semantics A. Facts/Concepts/Terms/Entities Dictionary, Thesaurus, Reference Data, Vocabulary B. Facts with Relationships Taxonomy/(Categories), Ontology Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event) Knowledge Base HP 53
  • 54. Basis for Semantics C. Reasoning/Inference (Statistical) (Information Retrieval) Statistical Learning/AI (Bayesian, Neural Networks, HMM,…) Logic Based (Description Logic) Natural Language/Grammar (part of speech,..) HP 54
  • 55. Alternatives for Metadata Extraction (in order of deeper understanding): Statistical methods/Cluster Analysis, Learning/AI and Collab. Filtering: by word or phrase; Reference data/Concept-terms/Dictionary/Thesaurus: by topic/industry/subject/domain; Ontologies/Domain Models, KnowledgeBase: by entities and relationships
  • 56. Open Directory Project (ODP): Classification/Taxonomy & Directory HP 56
  • 57. Ontology Standardize meaning, description, representation of involved attributes Capture the semantics involved via domain characteristics Allow knowledge sharing and reuse (Ontological Commitment) HP 57
  • 58. Ontology Description includes Attributes Domain Rules Functional Dependencies HP 58
  • 59. An Ontology HP 59
  • 60. Example: Interrelated ontologies RECREATIONAL MILITARY LANDFILL LAND SITE (SITE) CULTIVATED AREA LAND AGRICULTURAL GREENLAND ZONING USE AREA COMMERCIAL LAND BANK INDUSTRIAL RESIDENTIAL WASTE RURAL DISPOSAL STORM SOLID SEWAGE FLOOD HAZARDOUS TSUNAMI RESOURCE REC. FIRE LANDFILL causes NATURAL RECYCLING VOLCANO DISASTER AVALANCHE washing shredding causes causes magnetic screening separation LANDSLIDE EARTHQUAKE causes
  • 61. Large Vocabularies/ Taxonomies/Ontologies WordNet The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms. HP 61
  • 62. Metadata enabled Applications Confidential HP
  • 63. Metadata Usage: Impact on Search & Query processing traditional queries based on keywords attribute based queries content-based queries HP 63
  • 64. Oingo.com Oingo Ontology – ODP based(?), the database of millions of concepts and relationships that powers Oingo's semantic technology Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction Assets catalogued are Web sites or Web pages. HP 64
  • 65. Use of Categories for Search After 3 or 4 clicks HP 65
  • 66. Metadata is the basis of making Content Intelligent Precisely what the user asked for Closely-related, high-value information beyond what was requested Ability to explore any dimension around the immediate point of interest Intelligent content helps the user “think” about and fulfill their information needs with less effort. Intelligent content can be more effectively managed, packaged and distributed HP 66
  • 67. Metadata and Intelligent Content. Taalee makes content more “intelligent” through automatic analysis of every individual asset to generate a catalog containing: • Context of the Content • Semantic Metadata describing entities (e.g., Company, Industry), and • Relationships (semantic associations) among all entities. The catalog is based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase. “Normal” Content can only be “found” if the user enters a keyword that exists within it. “Normal” Content + metadata and relationships = Intelligent Content. Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions.
  • 68. More than metadata. Taalee makes content more “intelligent” through automatic analysis of every individual content item to create: Context of the Content; Semantic Metadata describing entities (e.g., Company, Industry); and Relationships (semantic associations) among all entities. This is based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase.
  • 69. Metadata & Search. Metadata can improve search significantly, but metadata enables much more than search. Alternatives for improving search: clustering, link and other analysis (e.g., Google’s Link Flux analysis), classification as context, ontologies, metadata, knowledgebases …
  • 70. Metadata Usage: Keyword, Attribute and Content Based Access
  • 71. Keyword Search vs Attribute Search with Semantic Metadata. [Side-by-side comparison.] Metadata from typical cataloging of football assets, Virage search on “football touchdown”: (1) Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline... (2) Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline... Taalee metadata on football assets, Rich Media Reference Page: Baltimore 31, Pit 24 (http://www.nfl.com). Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder to extend the Ravens’ lead to 31-24 in the third quarter. League: Professional. Teams: Ravens, Steelers. Score: Bal 31, Pit 24. Players: Quandry Ismail, Tony Banks. Event: Touchdown. Produced by: NFL.com. Posted date: 2/02/2000.
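
The contrast on this slide can be sketched in a few lines of Python. This is an illustrative sketch only, not Taalee's or Virage's implementation: the `Asset` class and the `keyword_search` and `attribute_search` functions are hypothetical names, and the sample records are adapted from the slide. Keyword search scans only the text that accompanies an asset, while attribute search matches structured metadata fields.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """A catalogued media asset: free text plus optional structured metadata."""
    title: str
    description: str
    metadata: dict = field(default_factory=dict)


def keyword_search(assets, keyword):
    """Return assets whose title or description contains the keyword."""
    kw = keyword.lower()
    return [a for a in assets
            if kw in a.title.lower() or kw in a.description.lower()]


def attribute_search(assets, **criteria):
    """Return assets whose metadata satisfies every attribute=value pair."""
    def matches(asset):
        for attr, wanted in criteria.items():
            value = asset.metadata.get(attr)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True
    return [a for a in assets if matches(a)]


# Sample records adapted from the slide (the second has no semantic metadata).
ASSETS = [
    Asset(
        title="Baltimore 31, Pit 24",
        description=("Quandry Ismail and Tony Banks hook up for their third "
                     "long touchdown, this time on a 76-yarder."),
        metadata={
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": "Touchdown",
        },
    ),
    Asset(
        title="Brian Griese Interview Part Four",
        description="Brian Griese talks about the first touchdown he ever threw.",
    ),
]

by_keyword = keyword_search(ASSETS, "Steelers")
by_attribute = attribute_search(ASSETS, Teams="Steelers", Event="Touchdown")
print(len(by_keyword), [a.title for a in by_attribute])
```

In this sketch a keyword search for “Steelers” finds nothing, because the word never appears in the title or description, while the attribute search on `Teams` does find the asset.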
  • 72. Taalee’s Semantic Search: highly customizable, precise, and freshest A/V search; delightful, relevant information and exceptional targeting opportunity; context- and domain-specific attributes; uniform metadata for content from multiple sources, which can be sorted by any field
  • 73. What can a context do? Creating a Web of related information
  • 74. Taalee Directory Georgia Bulldogs System recognizes ENTITY & CATEGORY
  • 77. Metadata Application Example. Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing. Please contact Taalee for live demonstrations.
  • 78. Personalized Directory Change Context Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you. Please enter such semantic keywords below.
  • 79. Personalized Queries & Hot Topics Personalized Queries 1. My Stock Portfolio Microsoft suffers serious hack attack Cisco Systems Inc PERSONALIZATION Analyst Safa Rashtchy on Yahoo! PeopleSoft, Inc AT&T Corp. more… 2. My Football Fantasy Team Gators' Spurrier ready for 'big' game Tech's Vick looks to become complete QB Bucs excited about Hamilton HOT Topics!!! Jasper Sanks rumbles into the end zone… Edwards explains reasons for leaving BYU 1. Election 2000 more… Video: Explaining the electoral map 3. Julia Roberts Collection Race for White House hots up Movie Trailer: "Notting Hill" Gore Florida Edge Seniors Give more… Trailer - Runaway Bride 2. Middle East Peace Conflict Patrick Movie Trailer: "Stepmom" Israel steps up security More die as Israel braces for suicide bombs Conspiracy Theory more… Pentagon probes Cole's security more… 4. Pink Floyd Collection 3. Napster Controversy Set the Controls for the Heart of the Sun… Wish You Were Here Brain Behind Napster The Napster Lawsuit Round And Around Keep Talking Creative Nomad II more… The Post War Dream more…
  • 81. Semantic/Interactive Targeting: Buy Al Pacino Videos; Buy Russell Crowe Videos; Buy Christopher Plummer Videos; Buy Diane Venora Videos; Buy Philip Baker Hall Videos; Buy The Insider Video. Precisely targeted through the use of Structured Metadata and integration from multiple sources.
  • 82. Web: Extreme Personalization [Diagram labels: Interests, Preferences; Realtime Feeds; Web sites and Pages; Time-Shifted Content; Databases; Content Aggregator; Personalized Content; Semantic Engine™; Structured, Hi-Quality Semantic Metabase]
  • 83. Application of Semantic Metadata and Automatic Content Enrichment. [Device screen: MyMedia menu with $ MyStocks, News, Sports, Music.] User has already completed Web-based registration and personalization at Voquette’s Enterprise Customer site. User’s “Wireless Home page” shows the categories for his interests. There is an alert (new content) for his stock and sports categories.
  • 84. Application of Semantic Metadata and Automatic Content Enrichment. [Device screen: MyMedia menu with $ MyStocks, News, Sports, Music; My Stocks list with CSCO, NT, IBM, Market.] Clicking on MyStocks brings down the user’s Personal Portfolio list. The user wants to see news items about Cisco (see next slide). Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user’s portfolio. Typically search can be done by typing one word or selecting from a dynamic, personalized menu.
  • 85. Application of Semantic Metadata and Automatic Content Enrichment. [Device screen: My Stocks list with CSCO, NT, IBM, Market; CSCO submenu with Analyst Call, Conf Call, Earnings.] Different types of recent audio content about Cisco are available. The user clicks to see a listing of Analyst Calls on Cisco (next slide). Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.
  • 86. Application of Semantic Metadata and Automatic Content Enrichment. [Device screen: CSCO Analysis listing: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis.] Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as “Strong Buy” by H&Q Analyst.
  • 87. iTV: Taalee’s Extreme Personalization [Diagram labels: Immediate Interests, Preferences; Content Provider (DBS, DISH, Wink, AOL-TV); Personalized Content Capsules, Content, Redirects and “Programs”; Meta-Data Tagged Content; Programming; Semantic Engine™; Structured, Hi-Quality Semantic Metabase]
  • 88. Metadata for Automatic Content Enrichment: Interactive Television. Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata. This screen is customizable with an interactivity feature using metadata such as whether there is a new Conference Call video on CSCO. The Conference Call itself can have embedded metadata to support personalization and interactivity. This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
  • 89. Metadata in Enterprise Apps [Diagram. Collection: Sony Network Content, Affiliate Feeds, Public Sources. Processing: Categorize, Catalog, Integrate, feeding a Rich Data Metabase. Production Support: Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication.]
  • 90. Customize: Page Settings | Content | Layout | Color Video A leaking gasoline pipeline burst into flames Thursday, killing -- Breaking News for 11/30/2000 -- more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fisherman in wooden canoes engulfed in Gore Demands That Recount Restart (9:40 PM) the inferno. Gore Says Fla. Can't Name Electors (4:50 PM) Bush Meets Colin Powell at Ranch (1:22 PM) More than a dozen burned bodies lay on a beach at the village Market Tumbles on Earnings Warning (9:27 AM) of Ebute-Oko facing the central business district of Lagos across a lagoon. Barak Outlines His Peace Plan (6:30 AM) "At least 60 people died in this needless fire," senior local official Karimu Alabi said. Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze. t Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses. Sixty Die In Nigeria Blast At about the same time, a second fire razed Makoko shantytown Produced by: Euronews where thousands of fishermen and their families live in wood Posted Date: 11/30/2000 cabins erected on stilts in the lagoon near Lagos University. Event : Election 2000 Location : Tallahassee, Florida, USA Residents said fishermen from Makoko had been scavenging for People : Al Gore, George W. Bush gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were • Greatly enhances news-room productivity and time-to-market • Value-add for production, broadcast & syndication • Taalee’s semantic metadata enables powerful access to content used by Enterprise’s customers HP 90
  • 91. Description Produced by : CNN Posted Date : 12/07/2000 Reporter : David Lewis Event : Election 2000 Location : Tallahassee, Florida, USA (1.33) – 12/06/00 - ABC People : Al Gore TALLAHASSEE, Florida (CNN) – Though the two presidential candidates (2.53) - 12/06/00 - CBS have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme (5.16) - 12/06/00 - ABC Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. (2.46) - 12/06/00 - FOX Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out (1.33) - 12/06/00 - NBC in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to (5.33) - 12/06/00 -- Breaking News -- 5,000 votes statewide. Gore Demands That Recount Restart (1.33) - 12/06/00 - CBS Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they (1.33) - 12/06/00 - ABC say County Elections Supervisor Sandra Gore Says Fla. Can't Name Electors (3.57) - 12/06/00 - CBS Goard allowed Republican workers to fill out (2.33) - 12/06/00 - CBS voter identification numbers on 2,126 incomplete absentee ballot applications sent Bush Meets Colin Powell at Ranch (4.27) - 12/06/00 - ABC in by GOP voters, while refusing to allow (3.12) - 12/06/00 - NNS Democratic workers to do the same thing for Democratic voters. Market Tumbles on Earnings Warning (3.44) - 12/06/00 - FOX (0.32) - 12/06/00 - CBS The GOP says that suit, and one similar to it Barak Outlines His Peace Plan (7.24) - 12/06/00 - CBS from Martin County, demonstrates (1.33) - 12/06/00 - CBS Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the HP 91
  • 92. Metadata’s role in emerging iTV infrastructure [Diagram labels: Video; MPEG Encoder; Enhanced Digital Cable, MPEG-2/4/7; MPEG Decoder; GREAT USER EXPERIENCE. Create Scene Description Tree; Retrieve Scene Description Track; Node = AVO Object. Channel sales through Video Server Vendors, Video App Servers, and Broadcasters; license metadata decoder and semantic applications to device makers. Taalee Semantic Engine: Scene Description Tree; Enhanced XML Description; “Cisco Systems” node; metadata-rich, value-added node; Object Content Information (OCI). Sample description: Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta.]
  • 93. Intelligent Metadata Creation Usage: Metadata for Intelligent Content. Content which does contain the words the user asked for (Extractor Agents) + content which does not contain the words the user asked for, but is about what he asked for (Value-added Metadata) + content the user did not think to ask for, but which he needs to know (Semantic Associations).
  • 94. Intelligent Content via Value-Added Metadata
  • 95. Value-added Metadata. Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content. • If a keyword is not in the content, it cannot be found. • The burden is on the user to think of and ask for the “right” keyword. For example: if a story is about “Roger Clemens” but does not contain the words “New York Yankees”, that story cannot and will not be found if the user searches for “New York Yankees” or “Yankees”. Understanding of the content is needed to create new metadata. Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
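
One way to picture value-added metadata is as a lookup against a knowledge base of entity relationships. The sketch below is a simplified assumption about how such enrichment could work, not a description of Taalee's Semantic Engine: `KNOWLEDGE_BASE`, `RULES`, and `enrich` are hypothetical names, and the relation names are made up for illustration. The facts themselves come from the slides.

```python
# Hypothetical knowledge base: (entity, relation) -> related entity.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "plays_for"): "New York Yankees",
    ("Roger Clemens", "plays_sport"): "Baseball",
    ("Jamal Anderson", "plays_for"): "Atlanta Falcons",
    ("Atlanta Falcons", "member_of"): "NFL",
    ("Gary Sheffield", "plays_for"): "Los Angeles Dodgers",
    ("Los Angeles Dodgers", "member_of"): "National League",
}

# Each rule reads entities from one metadata field, follows one relation,
# and writes the result into another field. Order matters: teams must be
# derived before leagues can be looked up from them.
RULES = [
    ("Players", "plays_for", "Teams"),
    ("Players", "plays_sport", "Sport"),
    ("Teams", "member_of", "League"),
]


def enrich(metadata):
    """Return a copy of the metadata with value-added fields filled in."""
    enriched = {name: list(values) for name, values in metadata.items()}
    for source_field, relation, target_field in RULES:
        for entity in list(enriched.get(source_field, [])):
            value = KNOWLEDGE_BASE.get((entity, relation))
            if value is not None and value not in enriched.setdefault(target_field, []):
                enriched[target_field].append(value)
    return enriched


# Metadata extracted explicitly from a story that names only the player.
extracted = {"Players": ["Jamal Anderson"]}
print(enrich(extracted))
```

With the team and league attached, a search for “Atlanta Falcons” or “NFL” can now reach a story that mentions only the player.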
  • 96. Guided Demo for Value-Added Metadata: Example One • Go to http://www.mediaanywhere.com/Football.html and search for Player = Jamal Anderson. • Click on the first result (titled “Week 3 Top10: Anderson TD Run”) and view the metadata on the following RMR page. • Here is what you see: Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson. • Now click on the button to play the asset (button marked “REAL”). • View the source HTML page that has the original story, and locate this story under the heading “Week 3 top 10: Anderson TD run”. • Verify that neither Team = Atlanta Falcons nor League = NFL is present in the source content. • Taalee attached this value-added metadata to the asset’s existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who plays for the Atlanta Falcons.
  • 97. Guided Demo for Value-Added Metadata: Example Two • Go to http://www.mediaanywhere.com/Baseball.html and search for Player = Gary Sheffield. • Click on the first result (titled “I want out!”) and view the metadata on the following RMR page. • Here is what you see: Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield. • Now click on the button to play the asset (button marked “REAL”). • View the source HTML page that has the original story, and locate this story under the heading “I want out!”. • Verify that neither Team = Los Angeles Dodgers nor League = National League is present in the source content. • Taalee attached this value-added metadata to the asset’s existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who plays for the Los Angeles Dodgers.
  • 98. Example 1: Snapshots (“Jamal Anderson”). Search for ‘Jamal Anderson’ in ‘Football’. Click on the first result for Jamal Anderson. View the original source HTML page and verify that it contains no mention of the team name or league name; these were Taalee’s value additions to the metadata to facilitate easier search. View the metadata and note that the team name and league name are also included.
  • 99. Example 2: Snapshots (“Gary Sheffield”). Search for ‘Gary Sheffield’ in ‘Baseball’. Click on the first result for Gary Sheffield. View the original source HTML page and verify that it contains no mention of the team name or league name; these were Taalee’s value additions to the metadata to facilitate easier search. View the metadata and note that the team name and league name are also included.
  • 100. Intelligent Content: Value-Added Metadata. Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships. The asset is richly, fully described in the many ways the users chose to interact. [Diagram: Rich Media Sports Asset and its metadata.] League Name: name of the league to which the player’s team belongs; not mentioned explicitly in the asset; value-added by Taalee’s processing based on semantic associations. Team Name: name of the team for which the player plays; not mentioned explicitly in the asset; value-added using Taalee’s semantic relationships. Posted Date: date of asset posting; extracted automatically. Sport Name: name of the sport. Producer Name: name of the content provider that produced the asset. Player Names: names of players mentioned explicitly in the asset; extracted automatically. Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset.
  • 101. Intelligent Content via Semantic Associations
  • 102. Semantic Associations • Traditional search engines rely solely on (syntactic) keywords to find content. • They do not understand the meaning, context, or relationships of keywords. For example: a search engine may see that the phrase “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba. As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee’s Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
  • 103. Example (test on http://directory.mediaanywhere.com). Search for company ‘Commerce One’. The results include links to news on companies that Commerce One competes against (to view news on Ariba, click on the link for Ariba). Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically.
  • 104. Metadata-centric Content Management Architecture (ASP/Enterprise hosted). [Diagram. Sources and extractors: Internal Source 1 (Research) with Extractor Agent 1; Internal Source 2 with Extractor Agent 2; External feeds/Web (e.g. Reuters) with Extractor Agent 3. Components: World Model, Knowledge Base, Semantic Engine, Semantic Application, Dashboard, Taalee Metabase, Third-party Content Mgmt and Syndication (XCM-compliant metadata, XML or other format).] Flow: (1) Story on Cisco: Cisco story from PW Source 1 is passed on to add semantic associations. (2) The Semantic Engine consults the Knowledge Base for Cisco’s competition. (3) The Knowledge Base returns the result: Lucent is a competitor of Cisco. (4) Story on Lucent: a Lucent story from external feeds is picked for publishing as “semantically related” to the Cisco story and passed on to the Dashboard.
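
The numbered flow on this slide can be sketched as a single lookup, assuming a knowledge base that simply maps a company to its competitors. The names `COMPETITORS` and `semantically_related` are hypothetical and the story titles are placeholders; only the competitor facts (Lucent and Cisco, Ariba and Commerce One) come from the slides. A real engine would sit behind extractor agents and a much richer world model.

```python
# Hypothetical knowledge base of "competes with" associations.
COMPETITORS = {
    "Cisco": {"Lucent"},
    "Commerce One": {"Ariba"},
}


def semantically_related(story, feed, competitors=COMPETITORS):
    """Return feed stories about companies that compete with the story's company."""
    rivals = competitors.get(story["company"], set())
    return [other for other in feed if other["company"] in rivals]


# Step 1: a story on Cisco arrives from an internal source.
cisco_story = {"company": "Cisco", "title": "Story on Cisco"}

# Candidate stories from external feeds (placeholder titles).
external_feed = [
    {"company": "Lucent", "title": "Story on Lucent"},
    {"company": "IBM", "title": "Story on IBM"},
]

# Steps 2-4: consult the knowledge base, then pick related stories to publish.
for related in semantically_related(cisco_story, external_feed):
    print(f"Related to '{cisco_story['title']}': {related['title']}")
```

The design point is that the Lucent story is selected without the word “Cisco” appearing anywhere in it; the link comes entirely from the association stored in the knowledge base.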
  • 105. Semantic Associations supported by Taalee Semantic Engine. Intelligent Content = What You Asked for + What you need to know! [Diagram labels around COMPANY: Related Stock News; Competition (COMPANIES in INDUSTRY with Competing PRODUCTS); COMPANIES in Same or Related INDUSTRY; Regulations (EPA, SEC) Impacting INDUSTRY or Filed By COMPANY; Technology Products Important to INDUSTRY or COMPANY; Industry News]
  • 106. Semantic Web Application Example: Financial Advisor Research Dashboard. [Screenshot callouts: Automatic collation of semantically related digital media information from multiple sources; Research inferred automatically; Semantically related news not specifically asked for; Semantic search/personalization, etc.]
  • 107. A vision for future Semantic Web, Complex Relationships and Knowledge Discovery, E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
  • 108. Beyond RDF: one proposal (cf. Ora Lassila). Structural modeling is obviously not enough; we need a “logic layer” on top of RDF; some type of description logic is a possibility. Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data. RDF + DL = “Frame System for WWW”. Source: www.ontoknowledge.org/oil
  • 109. Semantic Web: next step in Web evolution. “A Web in which machine reasoning will be ubiquitous and devastatingly powerful.” [Berners-Lee] “A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture.” [Berners-Lee] “A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities.” [Berners-Lee] A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.
  • 110. What is DAML (DARPA Agent Markup Language)? A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner. Based on RDF+XML. Agent-readable tags. www.daml.org
  • 112. Three-layered Architecture of the Semantic Web. Logical Layer: formal semantics and reasoning support (OIL, DAML-O). Schema Layer: definition of vocabulary (RDF Schema). Data Layer: simple data model and syntax for metadata (RDF).
  • 113. OIL as RDF Extension: <rdfs:Class rdf:ID="herbivore"> <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/> <rdfs:subClassOf rdf:resource="#animal"/> <rdfs:subClassOf> <oil:NOT> <oil:hasOperand rdf:resource="#carnivore"/> </oil:NOT> </rdfs:subClassOf> </rdfs:Class>
  • 114. DAML and OIL – Evolving towards Semantic Web OIL Mission OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
  • 115. Knowledge Discovery: Example. Earthquake Sources (USGS, NEIC); Nuclear Test Sources (Oklahoma Observatory, etc.). Nuclear tests may cause earthquakes: is it really true?
  • 116. Complex Relationships. A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region. NuclearTest Causes Earthquake <= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30 AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
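
The rule above can be written as an executable predicate. This is a minimal sketch under stated assumptions, not the InfoQuilt implementation: the `Event` class and the function names are hypothetical, the thresholds are read as 30 days and 10000 miles (the units given on a later slide), the distance is computed with the standard haversine great-circle formula, and the check that the earthquake does not precede the test is added from the prose rather than from the rule as written.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, approximate


@dataclass
class Event:
    """A dated, located event (nuclear test or earthquake)."""
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the slide's rule: quake follows the test closely in time and space."""
    days_after = (quake.event_date - test.event_date).days
    if not 0 <= days_after < max_days:  # assumption: quake must not precede test
        return False
    return distance_miles(test.latitude, test.longitude,
                          quake.latitude, quake.longitude) < max_miles


# Made-up events for illustration only.
test = Event(date(2000, 1, 1), 0.0, 0.0)
quake = Event(date(2000, 1, 11), 0.0, 10.0)
print(may_have_caused(test, quake))
```

Both thresholds are parameters, so the same predicate can be re-run with a tighter radius or a shorter time window when exploring the data.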
  • 117. Knowledge Discovery: Example. When was the first recorded nuclear test conducted? 1950. Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900. Result: increase in number of earthquakes since 1945.
  • 118. Knowledge Discovery: Example… For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, find the average number of earthquakes per year, starting from 1900. The number of earthquakes with magnitude > 7 is almost constant, so nuclear tests probably only cause earthquakes with magnitude < 7.
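
The grouping query on this slide amounts to bucketing earthquakes by magnitude and dividing each bucket's count by the number of years covered. The sketch below assumes the input is a plain list of `(year, magnitude)` pairs; the function names are hypothetical and the sample data is made up. Because the slide's ranges share endpoints, the buckets are treated as half-open intervals, and a magnitude of exactly 9 is placed in the top bucket; both choices are assumptions.

```python
from collections import Counter

# (label, inclusive lower bound, exclusive upper bound)
BUCKETS = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, float("inf")),
]


def bucket_label(magnitude):
    """Return the label of the magnitude range, or None if below 5.8."""
    for label, low, high in BUCKETS:
        if low <= magnitude < high:
            return label
    return None


def average_counts_per_year(quakes, start_year, end_year):
    """Average number of earthquakes per year in each magnitude range.

    Years with no earthquakes in a range still count toward the average.
    """
    n_years = end_year - start_year + 1
    counts = Counter()
    for year, magnitude in quakes:
        if start_year <= year <= end_year:
            label = bucket_label(magnitude)
            if label is not None:
                counts[label] += 1
    return {label: counts[label] / n_years for label, _, _ in BUCKETS}


# Made-up sample data: (year, magnitude).
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.1), (1901, 7.2), (1899, 8.0), (1901, 5.0)]
print(average_counts_per_year(sample, 1900, 1901))
```

Running the same function over two periods, for example before and after a chosen year, gives the per-range comparison that the slide's conclusion relies on.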
  • 119. Knowledge Discovery: Example… Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site lay within a radius of 10000 miles of the epicenter of the earthquake. Demo
  • 120. Resources/References. RDF: www.w3.org/TR/REC-rdf-syntax/; ICE: www.icestandard.org; Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05; XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02 and http://cgi.omg.org/cgi-bin/doc?ad/99-10-03; DAML: www.daml.org; NEWSML: newsshowcase.reuters.com; PRISM: www.prismstandard.org/techdev/prismspec1.asp; XCM: www.vignette.com; OIL: www.ontoknowledge.org/oil; SEMANTICWEB: www.semanticweb.org; VOICEXML: www.voicexml.org; MPEG7: www.darmstadt.gmd.de/mobile/MPEG7/; Taalee: www.taalee.com; Oingo: www.oingo.com
  • 121. Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN: 0-07-057735-8, 1998.