USE OF SUPRASEGMENTAL FEATURES PRESENT IN LP RESIDUAL FOR AUDIO CLIP CLASSIFICATION

Anvita Bajpai
anvita@mailcity.com
Applied Research Group
Satyam Computer Services Ltd.
Bangalore

Exploding information
• Recent studies show that most stored data is in the form of multimedia.
• The large volume of multimedia data makes it difficult to handle manually.
• An automatic method is needed to organize and use it appropriately.

[Figure: 1 hr of TV broadcast across the world is 100 Petabyte. Source: http://www.sims.berkeley.edu/research/projects/how-much-info/summary.html#tv]

Audio indexing
● Reasons for choosing audio data for study
  – Easier to process
  – Contains significant information
● Indexing – a method of organizing data for further search and retrieval. Example – book indexing
● Audio indexing – indexing non-text data using its audio part
● Audio classification – an important step in building an audio indexing system

[Figure: An audio indexing system. Source: J. Makhoul et al., “Speech and language technologies for audio indexing and retrieval”, in Proc. of the IEEE, 88(8), pp. 1338-1353, 2000.]

Audio clip classification
● Closed set problem
● To classify a given audio clip into one of the following predefined categories
  – Advertisement, Cartoon, Cricket, Football, News
● Issues in audio clip classification
  – Feature extraction
    ● Effective representation of data to capture all significant properties of audio for the task
    ● Robust under various conditions
  – Classification
    ● Formulation of a distance measure and rule/models
      – Training models for the task
      – Testing – actual classification task
      – Combining evidence from different systems

Levels of information in audio signal
● Subsegmental information
  – Related to excitation source characteristics
● Segmental information
  – Related to system / physiological characteristics
● Suprasegmental information
  – Related to behavioural characteristics of audio

Missing component in existing approaches and its importance
● Features derived from spectral analysis
  – Carry significant properties of audio data at the segmental level
  – Miss information present at the subsegmental and suprasegmental levels
● Perceptually significant information in the linear prediction (LP) residual of the signal
  – Complementary in nature to the spectral information
  – Suprasegmental information is not being used in current systems

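The LP residual mentioned above is what remains after the predictable (spectral envelope) part of the signal is removed by inverse filtering. The following is a minimal sketch only, assuming the autocorrelation method of linear prediction and an LP order of 10; neither choice is stated on the slides, and the function names `lp_coefficients` and `lp_residual` are illustrative.

```python
import numpy as np

def lp_coefficients(frame, order):
    """Autocorrelation-method LP coefficients a[1..order],
    such that s[n] is approximated by sum_k a[k] * s[n - k]."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation values for lags 0..order
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    # Tiny diagonal loading keeps R invertible for silent frames
    # (an assumption of this sketch, not taken from the slides)
    R = R + 1e-9 * (r[0] + 1.0) * np.eye(order)
    return np.linalg.solve(R, r[1:])

def lp_residual(frame, order=10):
    """Prediction error e[n] = s[n] - sum_k a[k] * s[n - k]."""
    frame = np.asarray(frame, dtype=float)
    a = lp_coefficients(frame, order)
    # Inverse filter A(z) = 1 - sum_k a[k] z^-k, applied with zero initial state
    inverse_filter = np.concatenate(([1.0], -a))
    return np.convolve(frame, inverse_filter)[: len(frame)]
```

In practice the coefficients are usually estimated on short frames rather than on a whole clip; the frame length and any windowing would be further assumptions.
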
Presence of audio-specific information in LP residual
[Figure: original signal (Aa1.wav) and its LP residual (Aa_res.wav)]

Suprasegmental information in Hilbert envelope of LP residual of audio signal

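The Hilbert envelope of a real sequence is the magnitude of its analytic signal. As a hedged illustration, the sketch below builds the analytic signal with an FFT using NumPy only; the function names are illustrative, and the input would be an LP residual computed elsewhere.

```python
import numpy as np

def analytic_signal(x):
    """FFT-based analytic signal x + j*H{x} of a real 1-D sequence."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0                    # keep the DC bin unchanged
    if n % 2 == 0:
        weights[n // 2] = 1.0           # keep the Nyquist bin unchanged
        weights[1 : n // 2] = 2.0       # double the positive frequencies
    else:
        weights[1 : (n + 1) // 2] = 2.0
    # Negative-frequency bins stay at zero weight
    return np.fft.ifft(spectrum * weights)

def hilbert_envelope(residual):
    """Magnitude of the analytic signal of the given sequence."""
    return np.abs(analytic_signal(residual))
```

`scipy.signal.hilbert` returns the same kind of analytic signal and could be used in place of `analytic_signal` where SciPy is available.
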
Suprasegmental information in LP residual for audio clip classification
[Figure: Autocorrelation samples of Hilbert envelope of LP residual for 5 audio classes]

Statistics of autocorrelation sequence
[Figure: statistics of autocorrelation sequence peaks]
Correction – the statistics shown here are of the autocorrelation sequence peaks of the Hilbert envelope (HE), not of the LP residual.

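One way the autocorrelation sequence and its peak statistics could be computed is sketched below. The slides do not say which statistics, lag range, or peak-picking rule were used, so the mean and variance of simple local maxima, the `max_lag` parameter, and both function names are assumptions of this sketch.

```python
import numpy as np

def normalized_autocorrelation(x, max_lag):
    """Autocorrelation of the mean-removed sequence for lags 0..max_lag,
    scaled so that the lag-0 value equals 1."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    full = np.correlate(x, x, mode="full")
    mid = len(x) - 1                    # index of lag 0 in the full output
    acf = full[mid : mid + max_lag + 1]
    if acf[0] == 0.0:                   # constant input: no structure to measure
        return np.zeros_like(acf)
    return acf / acf[0]

def peak_statistics(acf):
    """Mean and variance of the local maxima of the sequence,
    excluding its first and last samples."""
    acf = np.asarray(acf, dtype=float)
    inner = acf[1:-1]
    is_peak = (inner > acf[:-2]) & (inner > acf[2:])
    peaks = inner[is_peak]
    if peaks.size == 0:
        return 0.0, 0.0
    return float(peaks.mean()), float(peaks.var())
```

Here `x` would be the Hilbert envelope of the LP residual. `np.correlate` is quadratic in the sequence length, so an FFT-based autocorrelation may be preferable for long clips.
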
Classification results based on suprasegmental features using SVM

  Audio Class      # of clips correctly classified
                   (out of 20 clips for each class)
  Advertisement    11
  Cartoon          19
  Cricket          16
  Football          4
  News             10

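The counts in the table can be turned into per-class and overall accuracy with simple arithmetic: 60 of the 100 test clips are classified correctly. A short sketch of that calculation, using only the numbers in the table (the variable names are illustrative):

```python
# Correctly classified clips per class, taken from the results table
correct = {"Advertisement": 11, "Cartoon": 19, "Cricket": 16, "Football": 4, "News": 10}
clips_per_class = 20

# Accuracy in percent for each class and over all classes
per_class_accuracy = {name: 100.0 * n / clips_per_class for name, n in correct.items()}
overall_accuracy = 100.0 * sum(correct.values()) / (clips_per_class * len(correct))

for name, acc in per_class_accuracy.items():
    print(f"{name}: {acc:.0f}%")
print(f"Overall: {overall_accuracy:.0f}%")
```
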
Conclusions and future work
● Multimedia data needs to be organized because of its large volume and its use in real-life applications
● Showed the presence of audio-specific suprasegmental information in the LP residual and its Hilbert envelope
● Statistics of the autocorrelation sequence of the Hilbert envelope are shown to enhance these features
● Demonstrated the use of SVM to classify audio based on the variance of the autocorrelation sequence of the Hilbert envelope
● Need to extend the framework to other audio indexing applications
● Need to explore methods to combine the suprasegmental information with systems based on segmental and subsegmental features for the audio clip classification task
● (Longer term) Building a multimedia indexing system

Publications
1. Anvita Bajpai and B. Yegnanarayana, “Exploring Suprasegmental Features using LP Residual for Audio Clip Classification”, Workshop on Image and Signal Processing (WISP-2007), IIT Guwahati, India, 28-29 December 2007.
2. Anvita Bajpai, “HTB Security Administration using UMX”, An Oracle Technical White Paper, March 2006.
3. Anvita Bajpai and B. Yegnanarayana, “Audio Clip Classification using LP Residual and Neural Networks Models”, European Signal and Image Processing Conference (EUSIPCO-2004), Vienna, Austria, 6-10 September 2004.
4. Anvita Bajpai and B. Yegnanarayana, “Exploring Features for Audio Indexing using LP Residual and AANN Models”, accepted for The 17th International FLAIRS Conference (FLAIRS-2004), Miami Beach, Florida, 17-19 May 2004.
5. Anvita Bajpai and B. Yegnanarayana, “Exploring Features for Audio Clip Classification using LP Residual and Neural Networks Models”, International Conference on Intelligent Signal and Image Processing (ICISIP-2004), Chennai, India, 4-7 January 2004.
6. Gaurav Aggarwal, Anvita Bajpai and B. Yegnanarayana, “Exploring Features for Audio Indexing”, in Indian Research Scholar Seminar (IRIS-2002), Indian Institute of Science, Bangalore, India, March 2002.
7. Anvita Bajpai, “State of the art in Web Design and Content Creation for Indian Languages”, in National level Workshop on Translation Support Systems (STRANS-2001), IIT Kanpur, February 2001.
8. Anvita Bajpai, “Web Developmental Issues – A Case Study of GITASUPERSITE”, in National level Workshop on Building Large Websites, IIT Kanpur, November 2000.

Anvita Ncvpripg 2008 Presentation
