Towards Collaborative Annotation for Video Accessibility
Pierre-Antoine Champin, Benoît Encelle, Magali O. Beldame, Yannick Prié,
Nick Evans and Raphaël Troncy <raphael.troncy@eurecom.fr>
The consortium
• Dailymotion (Paris, FR): video sharing website
  - Promotes HTML5 using the video tag, http://openvideo.dailymotion.com/
• LIRIS (Lyon, FR): CS research group
  - Silex team: expertise in semantic web, annotation models, video annotation and HCI for disabled people
• EURECOM (Sophia Antipolis, FR): research center in communications systems
  - Multimedia team: expertise in multimedia analysis (speaker diarization/recognition, speech recognition) and semantic web
• INS HEA + school (Lyon, FR)
  - Experience with physical disabilities: blindness, visual impairment, deafness and hearing loss
  - Blind and deaf high-school students



26/04/2010 - Towards Collaborative Annotations for Video Accessibility - W4A 2010, Raleigh, USA
Goals and Motivations
• What is required to make video accessible on the Web?
• How to increase the number of accessible videos?
• Technologies:
  - Annotating: automatic (speech transcription) and manual (social collaborative annotation tool)
  - Addressing: pointing to, retrieving, and transmitting only parts of media
  - Rendering: video visualization for the impaired, Braille output

• Expected benefits for:
  - disabled people, who get better access to video
  - video providers, who reach a wider audience
  - the Web in general, through the use of semantic annotations



Accessibility Features for Visually Impaired and Blind People

[Figure: annotation tracks laid out along the video timeline]
• Man's actions: "Put on his shoes", "Walk in the street"
• Son's actions: "Look at his mother"
• Characters: "The mother, her son"; "The son, the man"; "The man and his friend"
• Scenery: "In the shop", "In the street"

The multimodal presentation of annotations depends on the video context and user preferences.
Output channels: audio track, auditory icons, audio description, Braille.

Accessibility Features for Deaf People

[Figure: annotation tracks laid out along the video timeline]
• Mother's dialogues: "How are you?"
• Son's dialogues: "Hi mom", "Fine and you?"
• Sound: "Car horn"

The presentation of annotations depends on the video context and user preferences.
Output channels: video track, subtitles, surtitles.
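A minimal sketch of how time-coded dialogue and sound annotations like those above could be rendered as subtitles. The `Annotation` record, the timings, and the choice of SRT as output format are assumptions made for illustration; the slides do not specify the ACAV annotation schema or its subtitle format.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    kind: str      # hypothetical annotation type, e.g. "dialogue" or "sound"
    source: str    # who or what produced it, e.g. "Son"
    begin: float   # start time in seconds (made-up values below)
    end: float     # end time in seconds
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(annotations: list[Annotation]) -> str:
    """Render annotations as SRT cues, ordered by start time."""
    cues = []
    ordered = sorted(annotations, key=lambda a: a.begin)
    for index, ann in enumerate(ordered, start=1):
        # Non-speech sounds are bracketed, a common captioning convention.
        body = f"[{ann.text}]" if ann.kind == "sound" else f"{ann.source}: {ann.text}"
        cues.append(
            f"{index}\n{srt_timestamp(ann.begin)} --> {srt_timestamp(ann.end)}\n{body}\n"
        )
    return "\n".join(cues)


# Timings are invented; only the texts come from the slide.
annotations = [
    Annotation("dialogue", "Mother", 4.0, 5.5, "How are you?"),
    Annotation("dialogue", "Son", 1.0, 2.0, "Hi mom"),
    Annotation("sound", "Street", 3.0, 3.5, "Car horn"),
    Annotation("dialogue", "Son", 6.0, 7.5, "Fine and you?"),
]
print(to_srt(annotations))
```

Keeping the annotation type separate from its rendering is what would let the same data feed subtitles for one user and surtitles for another, depending on preferences.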


Producing Video Annotations

Automatic annotations
• Speaker diarization: who spoke and when?
• Speech recognition: transcription
• Resulting annotations:
  - Mother: "How are you?"
  - Son: "Ho mom", "Fine"

Social annotations
• Annotation corrections, enhancement
• Audio description (for visually impaired)
• Resulting annotations:
  - Mother: "How are you?"
  - Son: "Hi mom", "Fine and you?"
  - Sound: "Car horn"
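One way to combine the two automatic steps is to attach each recognized utterance to the diarization turn it overlaps most. The sketch below assumes simple tuples with made-up timings; it is not the ACAV pipeline, and real diarization and recognition toolkits expose their own output formats.

```python
from typing import Optional

# Hypothetical diarization output: (speaker, begin, end) in seconds.
SpeakerTurn = tuple[str, float, float]
# Hypothetical speech recognition output: (text, begin, end) in seconds.
Utterance = tuple[str, float, float]


def overlap(a_begin: float, a_end: float, b_begin: float, b_end: float) -> float:
    """Length in seconds of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_begin, b_begin))


def attribute_speaker(utterance: Utterance, turns: list[SpeakerTurn]) -> Optional[str]:
    """Return the speaker whose turn overlaps the utterance most, or None."""
    _, begin, end = utterance
    best_speaker, best_overlap = None, 0.0
    for speaker, t_begin, t_end in turns:
        shared = overlap(begin, end, t_begin, t_end)
        if shared > best_overlap:
            best_speaker, best_overlap = speaker, shared
    return best_speaker


# Invented timings; the texts mirror the automatic transcription on the slide.
turns = [("Son", 0.0, 2.5), ("Mother", 2.5, 5.0), ("Son", 5.0, 8.0)]
utterances = [("Ho mom", 1.0, 2.0), ("How are you?", 3.0, 4.5), ("Fine", 6.0, 6.5)]

for utterance in utterances:
    print(attribute_speaker(utterance, turns), "-", utterance[0])
```

The imperfect transcriptions ("Ho mom", "Fine") are kept as they appear in the automatic output; on the slide they seem to illustrate the kind of errors that social annotation is meant to correct.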




The Advene prototype
[Figure: annotated screenshot of the prototype]
• Braille emulation
• Rendering views
• Enriched media player
• Timeline with typed annotations




Preliminary study (1/2)
• Semi-structured interviews with blind users (n=2)
  - Participants' habits when watching programs with audio description
  - Audio description process
  - Multimodal presentations of descriptions

• Requirements:
  - R1: generate additional descriptions and provide unobtrusive access to descriptions (tactile access for blind Braille readers)
  - R2: descriptions at various levels of granularity and verbosity
  - R3: use the system's multimodal output to provide two or more descriptions (e.g. speech synthesis and Braille display)




Preliminary study (2/2)
• Goal: see whether auditory icons can convey the rhythm of a movie's editing to blind users
  - e.g. the sound of a locomotive arriving from the right to convey the concept of a traveling shot from right to left

• Experiment and questionnaires (n=16+9)
  - Viewing with headsets of 5 min of Ratatouille, http://www.imdb.com/title/tt0382932/

• Results:
  - Rhythm and movie dynamics better perceived
  - Auditory icons are useful but must be limited (5 max) and be very different from the main soundtrack of the movie
    - Editing cues: change of scenes, camera movement, flashback (e.g. NCIS)
    - Audio zoom (e.g. Survivor)


ACAV Architecture

Benchmarking: Sphinx, HTK, Julius




Media Fragments URI

Provide URI-based mechanisms for uniquely identifying fragments for media objects on the Web, such as video, audio, and images.


Photo credit: Robert Freund




Media Fragments Processing

http://www.example.com/video.ogv#t=10,20
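To make the example URI concrete, the following sketch extracts the temporal range from a `#t=begin,end` fragment. It handles only values given in plain seconds and is an illustrative assumption, not the reference implementation announced in the conclusion; the Media Fragments syntax also allows clock-time forms and other dimensions that are not covered here. The function name is hypothetical.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def parse_temporal_fragment(uri: str) -> Optional[tuple[float, Optional[float]]]:
    """Return (begin, end) in seconds from a '#t=begin,end' fragment.

    Only plain-second values are handled. A missing begin defaults to 0,
    a missing end is returned as None (play to the end of the media).
    Returns None if the URI has no usable 't' dimension.
    """
    fragment = urlsplit(uri).fragment
    values = parse_qs(fragment).get("t")
    if not values:
        return None
    value = values[-1]  # if 't' is repeated, this sketch keeps the last one
    if value.startswith("npt:"):  # optional time-format prefix
        value = value[len("npt:"):]
    begin_text, _, end_text = value.partition(",")
    try:
        begin = float(begin_text) if begin_text else 0.0
        end = float(end_text) if end_text else None
    except ValueError:
        return None
    if end is not None and end <= begin:
        return None  # empty or inverted range
    return begin, end


print(parse_temporal_fragment("http://www.example.com/video.ogv#t=10,20"))
```

A player or server that understands the fragment could then seek to the returned begin time and stop at the end time, instead of fetching and playing the whole file.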




Conclusion

• ACAV will bring:
  - Dedicated annotation schemas for video accessibility
  - Social network model for video annotations
  - Web integration of state-of-the-art speech technologies
  - GUI models for authoring and rendering video annotations
  - Media Fragments reference implementation
  - Open source Braille plugin for the most used Web browsers

http://www.acavideo.fr/


Mais conteúdo relacionado

Semelhante a Towards Collaborative Annotation for Video Accessibility

Git yo'self video lit'rit (annotated)
Git yo'self video lit'rit (annotated)Git yo'self video lit'rit (annotated)
Git yo'self video lit'rit (annotated)David Evan Harris
 
Bryan J. Hogan
Bryan J. HoganBryan J. Hogan
Bryan J. HoganVideoguy
 
User Needs and Project Plans for Library-Managed Media Assets
User Needs and Project Plans for Library-Managed Media AssetsUser Needs and Project Plans for Library-Managed Media Assets
User Needs and Project Plans for Library-Managed Media AssetsJon W. Dunn
 
Extract the Audio from Video by using python
Extract the Audio from Video by using pythonExtract the Audio from Video by using python
Extract the Audio from Video by using pythonIRJET Journal
 
Content Modelling for Human Action Detection via Multidimensional Approach
Content Modelling for Human Action Detection via Multidimensional ApproachContent Modelling for Human Action Detection via Multidimensional Approach
Content Modelling for Human Action Detection via Multidimensional ApproachCSCJournals
 
What can users do for multimedia?
What can users do for multimedia?What can users do for multimedia?
What can users do for multimedia?Lora Aroyo
 
Adding audio and video presentation
Adding audio and video presentationAdding audio and video presentation
Adding audio and video presentationLaura Hollinshead
 
Streaming Video in Academic Libraries: Preliminary Results from a National Su...
Streaming Video in Academic Libraries: Preliminary Results from a National Su...Streaming Video in Academic Libraries: Preliminary Results from a National Su...
Streaming Video in Academic Libraries: Preliminary Results from a National Su...Charleston Conference
 
MediaEval 2018: Eyes and ears together
MediaEval 2018: Eyes and ears togetherMediaEval 2018: Eyes and ears together
MediaEval 2018: Eyes and ears togethermultimediaeval
 
Video Accessibility Toolkit for Success in a Virtual Environment
Video Accessibility Toolkit for Success in a Virtual EnvironmentVideo Accessibility Toolkit for Success in a Virtual Environment
Video Accessibility Toolkit for Success in a Virtual Environment3Play Media
 

Semelhante a Towards Collaborative Annotation for Video Accessibility (20)

Git yo'self video lit'rit
Git yo'self video lit'ritGit yo'self video lit'rit
Git yo'self video lit'rit
 
Git yo'self video lit'rit (annotated)
Git yo'self video lit'rit (annotated)Git yo'self video lit'rit (annotated)
Git yo'self video lit'rit (annotated)
 
Bryan J. Hogan
Bryan J. HoganBryan J. Hogan
Bryan J. Hogan
 
User Needs and Project Plans for Library-Managed Media Assets
User Needs and Project Plans for Library-Managed Media AssetsUser Needs and Project Plans for Library-Managed Media Assets
User Needs and Project Plans for Library-Managed Media Assets
 
Extract the Audio from Video by using python
Extract the Audio from Video by using pythonExtract the Audio from Video by using python
Extract the Audio from Video by using python
 
Content Modelling for Human Action Detection via Multidimensional Approach
Content Modelling for Human Action Detection via Multidimensional ApproachContent Modelling for Human Action Detection via Multidimensional Approach
Content Modelling for Human Action Detection via Multidimensional Approach
 
What can users do for multimedia?
What can users do for multimedia?What can users do for multimedia?
What can users do for multimedia?
 
Why Not Video?
Why Not Video?Why Not Video?
Why Not Video?
 
imovie ice 2013
imovie ice 2013imovie ice 2013
imovie ice 2013
 
Video Accessibility
Video Accessibility Video Accessibility
Video Accessibility
 
Adding audio and video presentation
Adding audio and video presentationAdding audio and video presentation
Adding audio and video presentation
 
Streaming Video in Academic Libraries: Preliminary Results from a National Su...
Streaming Video in Academic Libraries: Preliminary Results from a National Su...Streaming Video in Academic Libraries: Preliminary Results from a National Su...
Streaming Video in Academic Libraries: Preliminary Results from a National Su...
 
Feeley i movie islma pp
Feeley i movie islma ppFeeley i movie islma pp
Feeley i movie islma pp
 
On Linked Open Data (LOD)-based Semantic Video Annotation Systems
On Linked Open Data (LOD)-based  Semantic Video Annotation SystemsOn Linked Open Data (LOD)-based  Semantic Video Annotation Systems
On Linked Open Data (LOD)-based Semantic Video Annotation Systems
 
Arneb
ArnebArneb
Arneb
 
MediaEval 2018: Eyes and ears together
MediaEval 2018: Eyes and ears togetherMediaEval 2018: Eyes and ears together
MediaEval 2018: Eyes and ears together
 
Video Accessibility Toolkit for Success in a Virtual Environment
Video Accessibility Toolkit for Success in a Virtual EnvironmentVideo Accessibility Toolkit for Success in a Virtual Environment
Video Accessibility Toolkit for Success in a Virtual Environment
 
Athabasca
AthabascaAthabasca
Athabasca
 
Athabasca
AthabascaAthabasca
Athabasca
 
Accessible Video
Accessible VideoAccessible Video
Accessible Video
 

Mais de Raphael Troncy

K CAP 2019 Opening Ceremony
K CAP 2019 Opening CeremonyK CAP 2019 Opening Ceremony
K CAP 2019 Opening CeremonyRaphael Troncy
 
Semantic Technologies for Connected Vehicles in a Web of Things Environment
Semantic Technologies for Connected Vehicles in a Web of Things EnvironmentSemantic Technologies for Connected Vehicles in a Web of Things Environment
Semantic Technologies for Connected Vehicles in a Web of Things EnvironmentRaphael Troncy
 
HyperTED: exploring video lectures at the fragment levels for enhancing learning
HyperTED: exploring video lectures at the fragment levels for enhancing learningHyperTED: exploring video lectures at the fragment levels for enhancing learning
HyperTED: exploring video lectures at the fragment levels for enhancing learningRaphael Troncy
 
Location Embeddings for Next Trip Recommendation
Location Embeddings for Next Trip RecommendationLocation Embeddings for Next Trip Recommendation
Location Embeddings for Next Trip RecommendationRaphael Troncy
 
A replication study of the top performing systems in SemEval twitter sentimen...
A replication study of the top performing systems in SemEval twitter sentimen...A replication study of the top performing systems in SemEval twitter sentimen...
A replication study of the top performing systems in SemEval twitter sentimen...Raphael Troncy
 
Contextualizing Events in TV News Shows - SNOW 2014
Contextualizing Events in TV News Shows - SNOW 2014Contextualizing Events in TV News Shows - SNOW 2014
Contextualizing Events in TV News Shows - SNOW 2014Raphael Troncy
 
Modeling Geometry and Reference Systems on the Web of Data - LGD 2014
Modeling Geometry and Reference Systems on the Web of Data - LGD 2014Modeling Geometry and Reference Systems on the Web of Data - LGD 2014
Modeling Geometry and Reference Systems on the Web of Data - LGD 2014Raphael Troncy
 
NERD: an open source platform for extracting and disambiguating named entitie...
NERD: an open source platform for extracting and disambiguating named entitie...NERD: an open source platform for extracting and disambiguating named entitie...
NERD: an open source platform for extracting and disambiguating named entitie...Raphael Troncy
 
Deep-linking into Media Assets at the Fragment Level SMAM 2013
Deep-linking into Media Assets at the Fragment Level SMAM 2013Deep-linking into Media Assets at the Fragment Level SMAM 2013
Deep-linking into Media Assets at the Fragment Level SMAM 2013Raphael Troncy
 
Describing Media Assets: Media Fragment Specification and Description
Describing Media Assets: Media Fragment Specification and DescriptionDescribing Media Assets: Media Fragment Specification and Description
Describing Media Assets: Media Fragment Specification and DescriptionRaphael Troncy
 
Semantics at the multimedia fragment level SSSW 2013
Semantics at the multimedia fragment level SSSW 2013Semantics at the multimedia fragment level SSSW 2013
Semantics at the multimedia fragment level SSSW 2013Raphael Troncy
 
Semantic structuring and linking of event-centric data in the social web
Semantic structuring and linking of event-centric data in the social webSemantic structuring and linking of event-centric data in the social web
Semantic structuring and linking of event-centric data in the social webRaphael Troncy
 
Live topic generation from event streams
Live topic generation from event streamsLive topic generation from event streams
Live topic generation from event streamsRaphael Troncy
 
MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd
MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the CrowdMediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd
MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the CrowdRaphael Troncy
 
EventMedia Live: Exploring Events Connections in Real-Time to Enhance Content
EventMedia Live: Exploring Events Connections in Real-Time to Enhance ContentEventMedia Live: Exploring Events Connections in Real-Time to Enhance Content
EventMedia Live: Exploring Events Connections in Real-Time to Enhance ContentRaphael Troncy
 
Extracting Media Items from Multiple Social Networks
Extracting Media Items from Multiple Social NetworksExtracting Media Items from Multiple Social Networks
Extracting Media Items from Multiple Social NetworksRaphael Troncy
 
Semantics at the multimedia fragment level or how enabling the remixing of on...
Semantics at the multimedia fragment level or how enabling the remixing of on...Semantics at the multimedia fragment level or how enabling the remixing of on...
Semantics at the multimedia fragment level or how enabling the remixing of on...Raphael Troncy
 
MediaEval 2012 SED Opening
MediaEval 2012 SED OpeningMediaEval 2012 SED Opening
MediaEval 2012 SED OpeningRaphael Troncy
 
DeRiVE 2011 workshop opening
DeRiVE 2011 workshop openingDeRiVE 2011 workshop opening
DeRiVE 2011 workshop openingRaphael Troncy
 
MediaEval 2011 SED Opening
MediaEval 2011 SED OpeningMediaEval 2011 SED Opening
MediaEval 2011 SED OpeningRaphael Troncy
 

Mais de Raphael Troncy (20)

K CAP 2019 Opening Ceremony
K CAP 2019 Opening CeremonyK CAP 2019 Opening Ceremony
K CAP 2019 Opening Ceremony
 
Semantic Technologies for Connected Vehicles in a Web of Things Environment
Semantic Technologies for Connected Vehicles in a Web of Things EnvironmentSemantic Technologies for Connected Vehicles in a Web of Things Environment
Semantic Technologies for Connected Vehicles in a Web of Things Environment
 
HyperTED: exploring video lectures at the fragment levels for enhancing learning
HyperTED: exploring video lectures at the fragment levels for enhancing learningHyperTED: exploring video lectures at the fragment levels for enhancing learning
HyperTED: exploring video lectures at the fragment levels for enhancing learning
 
Location Embeddings for Next Trip Recommendation
Location Embeddings for Next Trip RecommendationLocation Embeddings for Next Trip Recommendation
Location Embeddings for Next Trip Recommendation
 
A replication study of the top performing systems in SemEval twitter sentimen...
A replication study of the top performing systems in SemEval twitter sentimen...A replication study of the top performing systems in SemEval twitter sentimen...
A replication study of the top performing systems in SemEval twitter sentimen...
 
Contextualizing Events in TV News Shows - SNOW 2014
Contextualizing Events in TV News Shows - SNOW 2014Contextualizing Events in TV News Shows - SNOW 2014
Contextualizing Events in TV News Shows - SNOW 2014
 
Modeling Geometry and Reference Systems on the Web of Data - LGD 2014
Modeling Geometry and Reference Systems on the Web of Data - LGD 2014Modeling Geometry and Reference Systems on the Web of Data - LGD 2014
Modeling Geometry and Reference Systems on the Web of Data - LGD 2014
 
NERD: an open source platform for extracting and disambiguating named entitie...
NERD: an open source platform for extracting and disambiguating named entitie...NERD: an open source platform for extracting and disambiguating named entitie...
NERD: an open source platform for extracting and disambiguating named entitie...
 
Deep-linking into Media Assets at the Fragment Level SMAM 2013
Deep-linking into Media Assets at the Fragment Level SMAM 2013Deep-linking into Media Assets at the Fragment Level SMAM 2013
Deep-linking into Media Assets at the Fragment Level SMAM 2013
 
Describing Media Assets: Media Fragment Specification and Description
Describing Media Assets: Media Fragment Specification and DescriptionDescribing Media Assets: Media Fragment Specification and Description
Describing Media Assets: Media Fragment Specification and Description
 
Semantics at the multimedia fragment level SSSW 2013
Semantics at the multimedia fragment level SSSW 2013Semantics at the multimedia fragment level SSSW 2013
Semantics at the multimedia fragment level SSSW 2013
 
Semantic structuring and linking of event-centric data in the social web
Semantic structuring and linking of event-centric data in the social webSemantic structuring and linking of event-centric data in the social web
Semantic structuring and linking of event-centric data in the social web
 
Live topic generation from event streams
Live topic generation from event streamsLive topic generation from event streams
Live topic generation from event streams
 
MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd
MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the CrowdMediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd
MediaFinder: Collect, Enrich and Visualize Media Memes Shared by the Crowd
 
EventMedia Live: Exploring Events Connections in Real-Time to Enhance Content
EventMedia Live: Exploring Events Connections in Real-Time to Enhance ContentEventMedia Live: Exploring Events Connections in Real-Time to Enhance Content
EventMedia Live: Exploring Events Connections in Real-Time to Enhance Content
 
Extracting Media Items from Multiple Social Networks
Extracting Media Items from Multiple Social NetworksExtracting Media Items from Multiple Social Networks
Extracting Media Items from Multiple Social Networks
 
Semantics at the multimedia fragment level or how enabling the remixing of on...
Semantics at the multimedia fragment level or how enabling the remixing of on...Semantics at the multimedia fragment level or how enabling the remixing of on...
Semantics at the multimedia fragment level or how enabling the remixing of on...
 
MediaEval 2012 SED Opening
MediaEval 2012 SED OpeningMediaEval 2012 SED Opening
MediaEval 2012 SED Opening
 
DeRiVE 2011 workshop opening
DeRiVE 2011 workshop openingDeRiVE 2011 workshop opening
DeRiVE 2011 workshop opening
 
MediaEval 2011 SED Opening
MediaEval 2011 SED OpeningMediaEval 2011 SED Opening
MediaEval 2011 SED Opening
 

Último

Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Bhuvaneswari Subramani
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 

Último (20)

Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)

Towards Collaborative Annotation for Video Accessibility

  • 1. Towards Collaborative Annotation for Video Accessibility
      Pierre-Antoine Champin, Benoît Encelle, Magali O. Beldame, Yannick Prié, Nick Evans and Raphaël Troncy <raphael.troncy@eurecom.fr>
  • 2. The consortium
      • Dailymotion (Paris, FR): video sharing website
          • Promotes HTML 5 using the video tag, http://openvideo.dailymotion.com/
      • LIRIS (Lyon, FR): CS research group
          • Silex Team: expertise in semantic web, annotation models, video annotation and HCI for disabled people
      • EURECOM (Sophia Antipolis, FR): research center in communications systems
          • Multimedia team: expertise in multimedia analysis (speaker diarization/recognition, speech recognition) and semantic web
      • INS HEA + school (Lyon, FR)
          • Experience with physical disabilities: blindness, visual impairment, deafness and hearing loss
          • Blind and deaf high-school students
      26/04/2010 - Towards Collaborative Annotations for Video Accessibility - W4A 2010, Raleigh, USA
  • 3. Goals and Motivations
      • What is required to make video accessible on the Web?
      • How to increase the number of accessible videos?
      • Technologies:
          • Annotating: automatic (speech transcription) and manual (social collaborative annotation tool)
          • Addressing: pointing to, retrieving, transmitting only parts of media
          • Rendering: video visualization for the impaired, Braille output
      • Expected benefits for:
          • disabled people, getting better access to video
          • video provider, reaching a wider audience
          • the Web in general, using semantic annotations
  • 4. Accessibility Features for Visually Impaired and Blind People
      • Annotations (by track):
          • Man's actions: "Put on his shoes", "Walk in the street"
          • Son's actions: "Look at his mother"
          • Characters: "The mother, her son", "The son, the man", "The man and his friend"
          • Scenery: "In the shop", "In the street"
      • Multimodal presentation of annotations: depends on video context and user preferences
      • Output channels: audio track, auditory icons, audio description, Braille
  • 5. Accessibility Features for Deaf People
      • Annotations (by track):
          • Mother's dialogues: "How are you?"
          • Son's dialogues: "Hi mom", "Fine and you?"
          • Sound: "Car horn"
      • Presentation of annotations: depends on video context and user preferences
      • Output channels: video track, subtitles, surtitles
  • 6. Producing Video Annotations
      • Automatic annotations
          • Speaker diarization: who spoke and when?
          • Speech recognition: transcription
          • Annotations: Mother: "How are you?"; Son: "Ho mom", "Fine"
      • Social annotations
          • Annotation corrections, enhancement
          • Audio description (for visually impaired)
          • Annotations: Mother: "How are you?"; Son: "Hi mom", "Fine and you?"; Sound: "Car horn"
  • 7. The Advene prototype
      • Screenshot labels: Braille rendering emulation, views, enriched media player, timeline with typed annotations
  • 8. Preliminary study (1/2)
      • Semi-structured interviews with blind users (n=2)
          • Participants' habits when watching programs with audio description
          • Audio description process
          • Multimodal presentations of descriptions
      • Requirements:
          • R1: generate additional descriptions and provide unobtrusive access to descriptions (tactile access for blind Braille readers)
          • R2: descriptions at various levels of granularity and verbosity
          • R3: use the system's multimodal output to provide two or more descriptions (e.g. speech synthesis and Braille display)
  • 9. Preliminary study (2/2)
      • Goal: see whether we can use auditory icons to convey the rhythm of the editing of a movie to blind users
          • e.g.: sound of a locomotive arriving from the right to convey the concept of a traveling shot from right to left
      • Experiment and questionnaires (n=16+9)
          • Viewing 5 min of Ratatouille with headsets, http://www.imdb.com/title/tt0382932/
      • Results:
          • Rhythm and movie dynamics better perceived
          • Auditory icons are useful but must be limited (5 max) and very different from the main soundtrack of the movie
          • Editing cues: change of scenes, camera movement, flashback (e.g. NCIS)
          • Audio zoom (e.g. Survivor)
  • 10. ACAV Architecture
      • Benchmarking: Sphinx, HTK, Julius
  • 11. Media Fragments URI
      • Provide URI-based mechanisms for uniquely identifying fragments for media objects on the Web, such as video, audio, and images.
      • Photo credit: Robert Freund
  • 12. Media Fragments Processing
      • http://www.example.com/video.ogv#t=10,20
  • 13. Conclusion
      • ACAV will bring:
          • Dedicated annotation schemas for video accessibility
          • Social network model for video annotations
          • Web integration of state of the art speech technologies
          • GUI models for authoring and rendering video annotations
          • Media Fragments reference implementation
          • Open source Braille plugin for most used Web browsers
      • http://www.acavideo.fr/
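
To make the "Addressing" idea concrete, the URI on slide 12 (http://www.example.com/video.ogv#t=10,20) can be read as "seconds 10 to 20 of the video". The Python sketch below parses that temporal fragment and selects the annotations that overlap it. It is only an illustration under stated assumptions, not the ACAV implementation: the names `TimeRange`, `Annotation`, `parse_temporal_fragment` and `annotations_in` are hypothetical, the annotation timings are invented, and only plain seconds (optionally prefixed with `npt:`) are handled, not clock values such as hh:mm:ss.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import parse_qs, urlsplit


@dataclass
class TimeRange:
    start: float          # seconds from the start of the media
    end: Optional[float]  # None means "until the end of the media"


@dataclass
class Annotation:         # hypothetical model, not the ACAV annotation schema
    start: float
    end: float
    track: str
    text: str


def parse_temporal_fragment(uri: str) -> Optional[TimeRange]:
    """Return the t= range of a media fragment URI, or None if absent or invalid."""
    values = parse_qs(urlsplit(uri).fragment).get("t")
    if not values:
        return None
    value = values[-1]  # if t= is repeated, this sketch keeps the last one
    if value.startswith("npt:"):
        value = value[len("npt:"):]
    start_text, _, end_text = value.partition(",")
    try:
        start = float(start_text) if start_text else 0.0
        end = float(end_text) if end_text else None
    except ValueError:
        return None  # e.g. hh:mm:ss values, which this sketch does not handle
    if start < 0 or (end is not None and end <= start):
        return None
    return TimeRange(start, end)


def annotations_in(fragment: TimeRange, annotations: List[Annotation]) -> List[Annotation]:
    """Keep the annotations whose time span overlaps the fragment."""
    end = fragment.end if fragment.end is not None else float("inf")
    return [a for a in annotations if a.start < end and a.end > fragment.start]


# Texts come from the slides; the timings are invented for illustration.
SAMPLE = [
    Annotation(8.0, 11.0, "Mother's dialogues", "How are you?"),
    Annotation(11.0, 13.0, "Son's dialogues", "Hi mom"),
    Annotation(13.0, 15.0, "Son's dialogues", "Fine and you?"),
    Annotation(25.0, 26.0, "Sound", "Car horn"),
]

if __name__ == "__main__":
    fragment = parse_temporal_fragment("http://www.example.com/video.ogv#t=10,20")
    if fragment is not None:
        for annotation in annotations_in(fragment, SAMPLE):
            print(f"{annotation.track}: {annotation.text}")
```

Selecting annotations by overlap rather than strict containment means a subtitle that starts just before the requested fragment is still rendered, which is usually what a viewer would expect.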