SlideShare uma empresa Scribd logo
1 de 4
Optimized Audio Classification and Segmentation Algorithm by
Using Ensemble Methods
Abstract:
Audio segmentation is a basis for multimedia content analysis which is the most important
and widely used application nowadays. An optimized audio classification and segmentation
algorithm is presented in this paper that segments a superimposed audio stream on the
basis of its content into four main audio types: pure-speech, music, environment sound,
and silence. An algorithm is proposed that preserves important audio content and reduces
the misclassification rate without using large amount of training data, which handles noise
and is suitable for use for real-time applications. Noise in an audio stream is segmented out
as environment sound. A hybrid classification approach is used, bagged support vector
machines (SVMs) with artificial neural networks (ANNs). Audio stream is classified, firstly,
into speech and nonspeech segment by using bagged support vector machines; nonspeech
segment is further classified into music and environment sound by using artificial neural
networks and lastly, speech segment is classified into silence and pure-speech segments on
the basis of rule-based classifier. Minimum data is used for training classifier; ensemble
methods are used for minimizing misclassification rate and approximately 98% accurate
segments are obtained. A fast and efficient algorithm is designed that can be used with real-
time multimedia applications.
Existing System:
Applications of audio content analysis can be categorized in two parts. One part is to
discriminate an audio stream into homogenous regions and the other part is to discriminate a
speech stream into segments, of different speakers. Lu et al. discriminate an audio stream into
different audio types. Classifier support vector machines and -nearest neighbor integrated with
linear spectral pairs-vector quantization are used respectively. The training is done on 2-hour
data.
An audio signal is divided into homogeneous regions by finding time boundaries also called
change points detection. In audio segmentation, with the help of change detection a sound signal
is segmented in homogenous and continuous temporal regions. The problem arises in defining
the criteria of homogeneity. By computing exact generalized likelihood ratio statistics, the audio
stream segmentation can be done without any prior knowledge of the classes. Mel-frequency
cepstral coefficients are used as feature. For calculating statistics large amount of training data is
required.
Proposed System
Janku and Hyniová proposed that MMI-supervised tree-based vector quantizer and
feedforward neural network can be used on a sound stream in order to detect
environmental sounds and speech. Regularized kernel based method based on kernel
Fisher discriminant can be used for unsupervised change detection.
Speech is not only a mode of transmitting word messages; it also emphasizes emotions,
personality, and so forth. Words contain vowel regions, which are of vital importance in
many speech applications mainly in speech segmentation and verification of speaker.
Vowel regions initiate when the vowel onset point occurs and ends when vowel offset point
occurs. Audio segmentation is also possible, by dividing an audio stream into segments, on
the basis of vowel regions.
Audio segmentation algorithms can be divided into three general categories. In the first
category, classifiers are designed. The features are extracted in time domain and frequency
domain; then classifier is used to discriminate audio signals on the basis of its content. The
second category of audio segmentation extracts features on statistics that is used by
classifier for discrimination. These types of features are called posterior probability based
features. Large amount of training data is required by the classifier to give accurate results.
The third category of audio segmentation algorithm emphasizes setting up effective
classifiers. The classifiers used in this category are Bayesian information criterion,
Gaussian likelihood ratio, and a hidden Markov model (HMM) classifier. These classifiers
also give good results when large training data is provided .
Audio segmentation and classification have many applications. Content-based audio
classification and retrieval are mostly used in entertainment industry, audio archive
management, commercial music usage, surveillance, and so forth. Nowadays, on the World
Wide Web, millions of databases are present; for audio searching and indexing audio
segmentation and classification are used. In monitoring broadcast news programs, audio
classification is used, helping in efficient and accurate navigation through broadcast news
archives.
The analysis of superimposed speech is a complex problem and improved performance
systems are required. In many audio processing applications, audio segmentation plays a
vital role in preprocessing step. It also has a significant impact on speech recognition
performance. That is why a fast and optimized audio classification and segmentation
algorithm is proposed which can be used for real-time applications of multimedia. The
audio input is classified and segmented into four basic audio types: pure-speech, music,
environment sound, and silence. An algorithm is proposed that requires less training data
and from which high accuracy can be achieved; that is, misclassification rate is minimum.
The organization of paper is as follows: Audio classification and segmentation algorithm
(proposed), preclassification step, feature extraction step, hybrid classifier approach
(bagged SVMs (support vector machines) with ANNs (artificial neural networks)), and
steps used for discrimination are discussed. In Results and Discussion the experimental
results are discussed.
Architecture:
SYSTEM CONFIGURATION:
Hardware requirements:
Processer : Any Update Processer
Ram : Min 4 GB
Hard Disk : Min 100 GB
Software requirements:
Operating System : Windows family
Technology : Python 3.6
IDE : PyCharm
Optimized audio classification and segmentation algorithm by using ensemble methods

Mais conteúdo relacionado

Semelhante a Optimized audio classification and segmentation algorithm by using ensemble methods

IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD Editor
 
Ijarcet vol-2-issue-4-1347-1351
Ijarcet vol-2-issue-4-1347-1351Ijarcet vol-2-issue-4-1347-1351
Ijarcet vol-2-issue-4-1347-1351
Editor IJARCET
 
survey on Hybrid recommendation mechanism to get effective ranking results fo...
survey on Hybrid recommendation mechanism to get effective ranking results fo...survey on Hybrid recommendation mechanism to get effective ranking results fo...
survey on Hybrid recommendation mechanism to get effective ranking results fo...
Suraj Ligade
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
AIRCC Publishing Corporation
 

Semelhante a Optimized audio classification and segmentation algorithm by using ensemble methods (20)

A Survey on: Sound Source Separation Methods
A Survey on: Sound Source Separation MethodsA Survey on: Sound Source Separation Methods
A Survey on: Sound Source Separation Methods
 
IRJET- Music Genre Classification using Machine Learning Algorithms: A Compar...
IRJET- Music Genre Classification using Machine Learning Algorithms: A Compar...IRJET- Music Genre Classification using Machine Learning Algorithms: A Compar...
IRJET- Music Genre Classification using Machine Learning Algorithms: A Compar...
 
Development of Algorithm for Voice Operated Switch for Digital Audio Control ...
Development of Algorithm for Voice Operated Switch for Digital Audio Control ...Development of Algorithm for Voice Operated Switch for Digital Audio Control ...
Development of Algorithm for Voice Operated Switch for Digital Audio Control ...
 
Knn a machine learning approach to recognize a musical instrument
Knn  a machine learning approach to recognize a musical instrumentKnn  a machine learning approach to recognize a musical instrument
Knn a machine learning approach to recognize a musical instrument
 
H42045359
H42045359H42045359
H42045359
 
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
 
Snorm–A Prototype for Increasing Audio File Stepwise Normalization
Snorm–A Prototype for Increasing Audio File Stepwise NormalizationSnorm–A Prototype for Increasing Audio File Stepwise Normalization
Snorm–A Prototype for Increasing Audio File Stepwise Normalization
 
Dy36749754
Dy36749754Dy36749754
Dy36749754
 
IRJET- A Survey on Sound Recognition
IRJET- A Survey on Sound RecognitionIRJET- A Survey on Sound Recognition
IRJET- A Survey on Sound Recognition
 
Ijarcet vol-2-issue-4-1347-1351
Ijarcet vol-2-issue-4-1347-1351Ijarcet vol-2-issue-4-1347-1351
Ijarcet vol-2-issue-4-1347-1351
 
IRJET- Machine Learning and Noise Reduction Techniques for Music Genre Classi...
IRJET- Machine Learning and Noise Reduction Techniques for Music Genre Classi...IRJET- Machine Learning and Noise Reduction Techniques for Music Genre Classi...
IRJET- Machine Learning and Noise Reduction Techniques for Music Genre Classi...
 
Utterance based speaker identification
Utterance based speaker identificationUtterance based speaker identification
Utterance based speaker identification
 
Speech emotion recognition with light gradient boosting decision trees machine
Speech emotion recognition with light gradient boosting decision trees machineSpeech emotion recognition with light gradient boosting decision trees machine
Speech emotion recognition with light gradient boosting decision trees machine
 
Utterance Based Speaker Identification Using ANN
Utterance Based Speaker Identification Using ANNUtterance Based Speaker Identification Using ANN
Utterance Based Speaker Identification Using ANN
 
Utterance Based Speaker Identification Using ANN
Utterance Based Speaker Identification Using ANNUtterance Based Speaker Identification Using ANN
Utterance Based Speaker Identification Using ANN
 
SPEECH ENHANCEMENT USING KERNEL AND NORMALIZED KERNEL AFFINE PROJECTION ALGOR...
SPEECH ENHANCEMENT USING KERNEL AND NORMALIZED KERNEL AFFINE PROJECTION ALGOR...SPEECH ENHANCEMENT USING KERNEL AND NORMALIZED KERNEL AFFINE PROJECTION ALGOR...
SPEECH ENHANCEMENT USING KERNEL AND NORMALIZED KERNEL AFFINE PROJECTION ALGOR...
 
Audio Features Based Steganography Detection in WAV File
Audio Features Based Steganography Detection in WAV FileAudio Features Based Steganography Detection in WAV File
Audio Features Based Steganography Detection in WAV File
 
Speech Signal Analysis
Speech Signal AnalysisSpeech Signal Analysis
Speech Signal Analysis
 
survey on Hybrid recommendation mechanism to get effective ranking results fo...
survey on Hybrid recommendation mechanism to get effective ranking results fo...survey on Hybrid recommendation mechanism to get effective ranking results fo...
survey on Hybrid recommendation mechanism to get effective ranking results fo...
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
 

Mais de Venkat Projects

Mais de Venkat Projects (20)

1.AUTOMATIC DETECTION OF DIABETIC RETINOPATHY USING CNN.docx
1.AUTOMATIC DETECTION OF DIABETIC RETINOPATHY USING CNN.docx1.AUTOMATIC DETECTION OF DIABETIC RETINOPATHY USING CNN.docx
1.AUTOMATIC DETECTION OF DIABETIC RETINOPATHY USING CNN.docx
 
12.BLOCKCHAIN BASED MILK DELIVERY PLATFORM FOR STALLHOLDER DAIRY FARMERS IN K...
12.BLOCKCHAIN BASED MILK DELIVERY PLATFORM FOR STALLHOLDER DAIRY FARMERS IN K...12.BLOCKCHAIN BASED MILK DELIVERY PLATFORM FOR STALLHOLDER DAIRY FARMERS IN K...
12.BLOCKCHAIN BASED MILK DELIVERY PLATFORM FOR STALLHOLDER DAIRY FARMERS IN K...
 
10.ATTENDANCE CAPTURE SYSTEM USING FACE RECOGNITION.docx
10.ATTENDANCE CAPTURE SYSTEM USING FACE RECOGNITION.docx10.ATTENDANCE CAPTURE SYSTEM USING FACE RECOGNITION.docx
10.ATTENDANCE CAPTURE SYSTEM USING FACE RECOGNITION.docx
 
9.IMPLEMENTATION OF BLOCKCHAIN IN FINANCIAL SECTOR TO IMPROVE SCALABILITY.docx
9.IMPLEMENTATION OF BLOCKCHAIN IN FINANCIAL SECTOR TO IMPROVE SCALABILITY.docx9.IMPLEMENTATION OF BLOCKCHAIN IN FINANCIAL SECTOR TO IMPROVE SCALABILITY.docx
9.IMPLEMENTATION OF BLOCKCHAIN IN FINANCIAL SECTOR TO IMPROVE SCALABILITY.docx
 
8.Geo Tracking Of Waste And Triggering Alerts And Mapping Areas With High Was...
8.Geo Tracking Of Waste And Triggering Alerts And Mapping Areas With High Was...8.Geo Tracking Of Waste And Triggering Alerts And Mapping Areas With High Was...
8.Geo Tracking Of Waste And Triggering Alerts And Mapping Areas With High Was...
 
Image Forgery Detection Based on Fusion of Lightweight Deep Learning Models.docx
Image Forgery Detection Based on Fusion of Lightweight Deep Learning Models.docxImage Forgery Detection Based on Fusion of Lightweight Deep Learning Models.docx
Image Forgery Detection Based on Fusion of Lightweight Deep Learning Models.docx
 
6.A FOREST FIRE IDENTIFICATION METHOD FOR UNMANNED AERIAL VEHICLE MONITORING ...
6.A FOREST FIRE IDENTIFICATION METHOD FOR UNMANNED AERIAL VEHICLE MONITORING ...6.A FOREST FIRE IDENTIFICATION METHOD FOR UNMANNED AERIAL VEHICLE MONITORING ...
6.A FOREST FIRE IDENTIFICATION METHOD FOR UNMANNED AERIAL VEHICLE MONITORING ...
 
WATERMARKING IMAGES
WATERMARKING IMAGESWATERMARKING IMAGES
WATERMARKING IMAGES
 
4.LOCAL DYNAMIC NEIGHBORHOOD BASED OUTLIER DETECTION APPROACH AND ITS FRAMEWO...
4.LOCAL DYNAMIC NEIGHBORHOOD BASED OUTLIER DETECTION APPROACH AND ITS FRAMEWO...4.LOCAL DYNAMIC NEIGHBORHOOD BASED OUTLIER DETECTION APPROACH AND ITS FRAMEWO...
4.LOCAL DYNAMIC NEIGHBORHOOD BASED OUTLIER DETECTION APPROACH AND ITS FRAMEWO...
 
Application and evaluation of a K-Medoidsbased shape clustering method for an...
Application and evaluation of a K-Medoidsbased shape clustering method for an...Application and evaluation of a K-Medoidsbased shape clustering method for an...
Application and evaluation of a K-Medoidsbased shape clustering method for an...
 
OPTIMISED STACKED ENSEMBLE TECHNIQUES IN THE PREDICTION OF CERVICAL CANCER US...
OPTIMISED STACKED ENSEMBLE TECHNIQUES IN THE PREDICTION OF CERVICAL CANCER US...OPTIMISED STACKED ENSEMBLE TECHNIQUES IN THE PREDICTION OF CERVICAL CANCER US...
OPTIMISED STACKED ENSEMBLE TECHNIQUES IN THE PREDICTION OF CERVICAL CANCER US...
 
1.AUTOMATIC DETECTION OF DIABETIC RETINOPATHY USING CNN.docx
1.AUTOMATIC DETECTION OF DIABETIC RETINOPATHY USING CNN.docx1.AUTOMATIC DETECTION OF DIABETIC RETINOPATHY USING CNN.docx
1.AUTOMATIC DETECTION OF DIABETIC RETINOPATHY USING CNN.docx
 
2022 PYTHON MAJOR PROJECTS LIST.docx
2022 PYTHON MAJOR  PROJECTS LIST.docx2022 PYTHON MAJOR  PROJECTS LIST.docx
2022 PYTHON MAJOR PROJECTS LIST.docx
 
2022 PYTHON PROJECTS LIST.docx
2022 PYTHON PROJECTS LIST.docx2022 PYTHON PROJECTS LIST.docx
2022 PYTHON PROJECTS LIST.docx
 
2021 PYTHON PROJECTS LIST.docx
2021 PYTHON PROJECTS LIST.docx2021 PYTHON PROJECTS LIST.docx
2021 PYTHON PROJECTS LIST.docx
 
2021 python projects list
2021 python projects list2021 python projects list
2021 python projects list
 
10.sentiment analysis of customer product reviews using machine learni
10.sentiment analysis of customer product reviews using machine learni10.sentiment analysis of customer product reviews using machine learni
10.sentiment analysis of customer product reviews using machine learni
 
9.data analysis for understanding the impact of covid–19 vaccinations on the ...
9.data analysis for understanding the impact of covid–19 vaccinations on the ...9.data analysis for understanding the impact of covid–19 vaccinations on the ...
9.data analysis for understanding the impact of covid–19 vaccinations on the ...
 
6.iris recognition using machine learning technique
6.iris recognition using machine learning technique6.iris recognition using machine learning technique
6.iris recognition using machine learning technique
 
5.local community detection algorithm based on minimal cluster
5.local community detection algorithm based on minimal cluster5.local community detection algorithm based on minimal cluster
5.local community detection algorithm based on minimal cluster
 

Último

Integrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - NeometrixIntegrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - Neometrix
Neometrix_Engineering_Pvt_Ltd
 
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments""Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
mphochane1998
 
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Hospital management system project report.pdf
Hospital management system project report.pdfHospital management system project report.pdf
Hospital management system project report.pdf
Kamal Acharya
 

Último (20)

Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptx
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
 
Integrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - NeometrixIntegrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - Neometrix
 
Block diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.pptBlock diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.ppt
 
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
 
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
 
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLEGEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
 
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments""Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
 
Thermal Engineering Unit - I & II . ppt
Thermal Engineering  Unit - I & II . pptThermal Engineering  Unit - I & II . ppt
Thermal Engineering Unit - I & II . ppt
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS Lambda
 
Online food ordering system project report.pdf
Online food ordering system project report.pdfOnline food ordering system project report.pdf
Online food ordering system project report.pdf
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equation
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the start
 
Hospital management system project report.pdf
Hospital management system project report.pdfHospital management system project report.pdf
Hospital management system project report.pdf
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torque
 
Employee leave management system project.
Employee leave management system project.Employee leave management system project.
Employee leave management system project.
 
Wadi Rum luxhotel lodge Analysis case study.pptx
Wadi Rum luxhotel lodge Analysis case study.pptxWadi Rum luxhotel lodge Analysis case study.pptx
Wadi Rum luxhotel lodge Analysis case study.pptx
 
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKARHAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
 

Optimized audio classification and segmentation algorithm by using ensemble methods

  • 1. Optimized Audio Classification and Segmentation Algorithm by Using Ensemble Methods Abstract: Audio segmentation is a basis for multimedia content analysis which is the most important and widely used application nowadays. An optimized audio classification and segmentation algorithm is presented in this paper that segments a superimposed audio stream on the basis of its content into four main audio types: pure-speech, music, environment sound, and silence. An algorithm is proposed that preserves important audio content and reduces the misclassification rate without using large amount of training data, which handles noise and is suitable for use for real-time applications. Noise in an audio stream is segmented out as environment sound. A hybrid classification approach is used, bagged support vector machines (SVMs) with artificial neural networks (ANNs). Audio stream is classified, firstly, into speech and nonspeech segment by using bagged support vector machines; nonspeech segment is further classified into music and environment sound by using artificial neural networks and lastly, speech segment is classified into silence and pure-speech segments on the basis of rule-based classifier. Minimum data is used for training classifier; ensemble methods are used for minimizing misclassification rate and approximately 98% accurate segments are obtained. A fast and efficient algorithm is designed that can be used with real- time multimedia applications. Existing System: Applications of audio content analysis can be categorized in two parts. One part is to discriminate an audio stream into homogenous regions and the other part is to discriminate a speech stream into segments, of different speakers. Lu et al. discriminate an audio stream into different audio types. Classifier support vector machines and -nearest neighbor integrated with linear spectral pairs-vector quantization are used respectively. The training is done on 2-hour data. An audio signal is divided into homogeneous regions by finding time boundaries also called change points detection. In audio segmentation, with the help of change detection a sound signal is segmented in homogenous and continuous temporal regions. The problem arises in defining the criteria of homogeneity. By computing exact generalized likelihood ratio statistics, the audio stream segmentation can be done without any prior knowledge of the classes. Mel-frequency cepstral coefficients are used as feature. For calculating statistics large amount of training data is required.
  • 2. Proposed System Janku and Hyniová proposed that MMI-supervised tree-based vector quantizer and feedforward neural network can be used on a sound stream in order to detect environmental sounds and speech. Regularized kernel based method based on kernel Fisher discriminant can be used for unsupervised change detection. Speech is not only a mode of transmitting word messages; it also emphasizes emotions, personality, and so forth. Words contain vowel regions, which are of vital importance in many speech applications mainly in speech segmentation and verification of speaker. Vowel regions initiate when the vowel onset point occurs and ends when vowel offset point occurs. Audio segmentation is also possible, by dividing an audio stream into segments, on the basis of vowel regions. Audio segmentation algorithms can be divided into three general categories. In the first category, classifiers are designed. The features are extracted in time domain and frequency domain; then classifier is used to discriminate audio signals on the basis of its content. The second category of audio segmentation extracts features on statistics that is used by classifier for discrimination. These types of features are called posterior probability based features. Large amount of training data is required by the classifier to give accurate results. The third category of audio segmentation algorithm emphasizes setting up effective classifiers. The classifiers used in this category are Bayesian information criterion, Gaussian likelihood ratio, and a hidden Markov model (HMM) classifier. These classifiers also give good results when large training data is provided . Audio segmentation and classification have many applications. Content-based audio classification and retrieval are mostly used in entertainment industry, audio archive management, commercial music usage, surveillance, and so forth. Nowadays, on the World Wide Web, millions of databases are present; for audio searching and indexing audio segmentation and classification are used. In monitoring broadcast news programs, audio classification is used, helping in efficient and accurate navigation through broadcast news archives. The analysis of superimposed speech is a complex problem and improved performance systems are required. In many audio processing applications, audio segmentation plays a vital role in preprocessing step. It also has a significant impact on speech recognition performance. That is why a fast and optimized audio classification and segmentation algorithm is proposed which can be used for real-time applications of multimedia. The audio input is classified and segmented into four basic audio types: pure-speech, music, environment sound, and silence. An algorithm is proposed that requires less training data and from which high accuracy can be achieved; that is, misclassification rate is minimum. The organization of paper is as follows: Audio classification and segmentation algorithm (proposed), preclassification step, feature extraction step, hybrid classifier approach (bagged SVMs (support vector machines) with ANNs (artificial neural networks)), and
  • 3. steps used for discrimination are discussed. In Results and Discussion the experimental results are discussed. Architecture: SYSTEM CONFIGURATION: Hardware requirements: Processer : Any Update Processer Ram : Min 4 GB Hard Disk : Min 100 GB Software requirements: Operating System : Windows family Technology : Python 3.6 IDE : PyCharm