International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-
6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 4, July-August (2013), © IAEME
TEXTUAL QUERY BASED SPORTS VIDEO RETRIEVAL BY EMBEDDED
TEXT RECOGNITION
Vilas Naik¹, Sagar Savalagi²
¹Department of CSE, Basaveshwar Engineering College, Bagalkot, India
²Department of CSE, Basaveshwar Engineering College, Bagalkot, India
ABSTRACT
With the growing popularity of sites like YouTube, video recording and sharing have become widespread over the last several years. Unlike text documents, this multimedia content is difficult to search and index, so content-based video retrieval systems are needed. Content-Based Video Retrieval (CBVR) is an active research discipline focused on computational strategies for finding relevant videos through multimodal content analysis, using visual, audio and text features to represent and index video. Recent research on CBVR has presented many solutions based on these features. Textual content appears in video as embedded text and scene text, and both are helpful for indexing. The proposed work is a content-based video retrieval system based on textual cues. Text-based video retrieval enables search based on the textual information present in the video. Regions of textual information are identified within the frames of the video, OCR is used to extract the text from those regions, and the video is then annotated with the extracted text. This also enables applications such as keyword-based search in multimedia databases. Video indexing and retrieval are performed with the help of these annotations. Results show that the system achieves an accuracy of around 90%. A textual query returns higher accuracy than visual queries, which supports the concept.
INTERNATIONAL JOURNAL OF COMPUTER ENGINEERING & TECHNOLOGY (IJCET)
ISSN 0976-6367 (Print), ISSN 0976-6375 (Online)
Volume 4, Issue 4, July-August (2013), pp. 556-565
© IAEME: www.iaeme.com/ijcet.asp
Journal Impact Factor (2013): 6.1302 (Calculated by GISI), www.jifactor.com

1. INTRODUCTION
With the development of various multimedia compression standards and significant increases in desktop computer performance and storage, the widespread exchange of multimedia information is becoming a reality. Video is arguably the most popular means of communication and entertainment. With this popularity comes an increase in the volume of video and an increased need to sift automatically through large video databases in search of relevant material. Even though increased hardware capabilities make video distribution possible, factors such as algorithm speed and storage costs must still be addressed. A first step should therefore be to increase speed when working with existing compression standards. Performing analysis in the compressed domain reduces the effort spent on decompression, and abstracting the data keeps the storage cost of the resulting feature set low. Both of these problems are active areas of research. The aim of the proposed work is to develop a new detection algorithm that speeds up search and, in turn, reduces storage cost.
Every day, both military and civilian equipment generates gigabytes of images. A huge amount of information is available, but it is impossible to access or make use of it unless it is organized to allow efficient browsing, searching and retrieval. Image retrieval has been a very active research area since the 1970s, driven by two major research communities: database management and computer vision. These communities study image retrieval from different angles, one text-based and the other visual-based. Many advances, such as data modelling, multidimensional indexing and query evaluation, have been made along this research direction. Two major difficulties remain, however, especially when the image collection is large (tens or hundreds of thousands of images). The first is the vast amount of labour required for manual image annotation. The other, which is more essential, results from the rich content of images and the subjectivity of human perception: different people may perceive the same image content differently. Perception subjectivity and annotation imprecision may cause unrecoverable mismatches in later retrieval processes.
The proposed mechanism is a scheme aimed at alleviating these hurdles: a new detection algorithm with boosting that provides a text-based retrieval system. The work proceeds in the following steps. First, frames are collected from the video clip, and the text regions are segmented from these frames. Character segmentation then isolates the individual characters, which are recognized by Optical Character Recognition (OCR). To increase identification accuracy, color features are additionally extracted from the video clip, combined with the text features and stored in the database. When the user enters a text query, it is matched against the stored characters and the matching videos are displayed.
2. RELATED WORK
Video retrieval is important in applications related to multimedia search engines, and recognizing text is a crucial task in such applications. Over the last decade many researchers have proposed methods for video retrieval; some of the related work is summarized below.
An approach that enables search based on the textual information present in the video is introduced in [1]. Regions of textual information are identified within the frames of the video, and the video is then annotated with the textual content present in the images. An approach that matches at the image level, thereby avoiding OCR, is also addressed. Videos containing the query string are retrieved from a video database and sorted by relevance. Results are shown for video collections in English, Hindi and Telugu.
In [2] a method is proposed to automatically localize captions in JPEG compressed images and the I-frames of MPEG compressed videos. Caption text regions are segmented from background images using their distinguishing texture characteristics. Unlike previously published methods, which fully decompress the video sequence before extracting the text regions, this method locates candidate caption text regions directly in the DCT compressed domain using the intensity variation information encoded there. Therefore, only a very small amount of decoding is required.
The method in [3] is a news video retrieval solution that targets specific news videos based on their contents as described by overlay text. The approach uses overlay text, which conveys the direct meaning of the video, as a source of complementary information. The process is divided into two steps. First, "metadata labels" are built by detecting and extracting the overlay text. Second, these labels are used to index the news videos. The experiments are carried out on news videos from NDTV News and on a large data set of video images containing artificial text developed at the Image Processing Centre (IPC), a research facility at the National University of Sciences and Technology (NUST), Pakistan. The FFMPEG library is used to extract the frames from news videos. An overlay scene is inserted onto the video scene in the same way as overlay text, so a transition region is also observed there.
In [4] the authors combine three main elements: (1) integration of image and audio analysis results to identify news segments; (2) video OCR technology to detect text in frames, which provides a good source of textual information for story classification when transcripts and closed captions are not available; and (3) natural language processing (NLP) technologies to automatically categorize news stories based on the text obtained from closed captions or the video OCR process. Based on these video structure and content analysis technologies, two advanced video browsers are developed for home users: an intelligent highlight player and an HTML-based video browser.
An annotation-based indexing method that allows the user to retrieve video using textual annotations is proposed in [5]. It takes a text-based query and compares it with the tags used for indexing, and the event-based video is retrieved from a cricket video database. Experiments show that annotation-based event retrieval can potentially improve retrieval accuracy using different searching techniques, such as binary search or indexing, when the database is very large, so video retrieval can be carried out efficiently with this type of system.
Another technique addresses the problem of extracting text from a video and designs algorithms for each phase of text extraction using Java libraries and classes. The input video, either real-time or taken from a database, is first split into a stream of images using the Java Media Framework (JMF). Pre-processing algorithms then convert each image to gray scale and remove disturbances such as lines superimposed over the text, discontinuities and dots. Localization, segmentation and recognition follow, the last using a neural network pattern matching technique. The performance of the approach is demonstrated by experimental results on a set of static images.
The work "Improving Multimedia Retrieval with a Video OCR" presents a set of experiments with a video OCR system (VOCR) tailored for video information retrieval and establishes its importance in multimedia search in general and for some specific queries in particular. The method in [7] details the analysis of video frames that produces candidate text regions. The text regions are then binarized and sent to a commercial OCR engine, resulting in ASCII text that is finally used to create search indexes. The system is evaluated using the TRECVID data. The effectiveness of various textual sources for multimedia retrieval is evaluated by combining the VOCR outputs with automatic speech recognition (ASR) transcripts. For general search queries, the VOCR system coupled with ASR sources outperforms the other system by a large margin. For search queries that involve named entities, especially people's names, the VOCR system even outperforms speech transcripts, demonstrating that source selection for particular query types is essential.
Another important consideration is the quality and complexity of the pictures containing text that are used for evaluation. Some methods consider large fonts in images, advertisements and video clips. The methods also have limitations: the method in [8] does not detect low-contrast text or small fonts, the techniques in [9] use text with different complex motions, and the methods in [10] and [11] detect only caption text in news video clips.
The proposed work extracts text from video frames by separating the text region from the background and employs conventional OCR for text recognition.
3. PROPOSED ALGORITHM FOR VIDEO RETRIEVAL
This section gives an overview and a detailed description of all the blocks of the proposed system.
3.1 Overview of the Approach
The proposed mechanism is a scheme that offers a video retrieval system based on embedded text. The method uses the information conveyed by embedded text to recognize the video to be retrieved from the collection in response to a text query, and it matches the query against the text presented in the video frames based on the features explained below. First, frames are extracted from the video and the text part is segmented. Character segmentation extracts the characters, and character recognition recognizes them. Color features are extracted from the video scene, combined with the text features and stored in the database. The user can input a text query; a query in text form is matched against the stored characters and the matched videos are displayed. The overall flow is shown in Figure 1.
Fig. 1 Proposed algorithm for video retrieval by a text query
3.2 The Text Query Based Video Retrieval Algorithm
The proposed algorithm is summarized in the following steps.
Step 1. Input a video and convert it into frames.
Step 2. Apply a median filter to each frame and perform Sobel edge detection to detect text region edges in the frame. Then calculate the sum graph, i.e. add up the rows and columns of the binary image.
Step 3. Perform text region segmentation by applying the threshold
Threshold = (sum(sum(B'))/prod(size(sum(B')))*50 + max(max(sum(B')))*30)/100
where B' is the input image.
Step 4. Apply OCR to recognize the text characters in the frames. The recognized characters and the color features are stored in the database as text features. Characters are normalized to size 32x32.
Step 5. Given a text query, extract its characters. Match them in one direction against the character set associated with each video, and calculate the total number of character matches for each video.
Step 6. Retrieve the videos with the highest number of matches.
3.3 Text region localization
As a first step, frames are extracted individually from each video in the collection. A video frame is stored in compressed format, so it is decoded into an image and then converted to a greyscale image. A median filter is then applied to the image; its output is shown in fig 4.2. The median filter considers each pixel in the image in turn and looks at its nearby neighbours to decide whether or not it is representative of its surroundings. Instead of replacing the pixel value with the mean of the neighbouring pixel values, it replaces it with their median. The median is calculated by sorting all the pixel values from the surrounding neighbourhood into numerical order and then replacing the pixel being considered with the middle value. Next, the Sobel operator, an edge detection technique, is applied to the greyscale image to detect the edges of text regions.
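The two localization operations described above can be sketched in a few lines of dependency-free Python. This is a minimal illustration rather than the authors' implementation: the 3x3 window, the border handling (edge pixels are replicated) and the function names `median_filter3` and `sobel_magnitude` are assumptions made here, and a practical system would normally rely on an image-processing library instead of nested lists.

```python
import math
from statistics import median
from typing import List

Image = List[List[float]]  # greyscale image as rows of pixel intensities

# Standard 3x3 Sobel kernels for the horizontal and vertical gradients.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))
OFFSETS = (-1, 0, 1)


def _pixel(img: Image, r: int, c: int) -> float:
    """Read a pixel, clamping coordinates so borders replicate the edge (assumption)."""
    r = min(max(r, 0), len(img) - 1)
    c = min(max(c, 0), len(img[0]) - 1)
    return img[r][c]


def median_filter3(img: Image) -> Image:
    """Replace every pixel with the median of its 3x3 neighbourhood."""
    return [
        [median(_pixel(img, r + dr, c + dc) for dr in OFFSETS for dc in OFFSETS)
         for c in range(len(img[0]))]
        for r in range(len(img))
    ]


def sobel_magnitude(img: Image) -> Image:
    """Gradient magnitude from the two Sobel kernels; large values mark edges."""
    out = []
    for r in range(len(img)):
        row = []
        for c in range(len(img[0])):
            gx = gy = 0.0
            for dr in OFFSETS:
                for dc in OFFSETS:
                    p = _pixel(img, r + dr, c + dc)
                    gx += SOBEL_X[dr + 1][dc + 1] * p
                    gy += SOBEL_Y[dr + 1][dc + 1] * p
            row.append(math.hypot(gx, gy))
        out.append(row)
    return out
```

Applied in sequence, `sobel_magnitude(median_filter3(frame))` yields an edge map in which isolated noise pixels have been suppressed before the gradients are taken.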
3.4 Text detection and Segmentation
After the text region is localized, the text area is segmented for recognition. The output of this step is a binary image in which black text characters appear on a white background. This stage extracts the actual text regions as follows. A median filter is applied again, this time to the edge-detected image, to give a smooth image, and the horizontal and vertical histograms are then taken. These are the column-wise and row-wise histograms respectively: they represent the sum of differences of gray values between neighbouring pixels of the image, column by column and row by row. The horizontal histogram is calculated first. The algorithm traverses each column of the image, starting with the second pixel from the top. The difference between the second and first pixels is calculated, and if it exceeds a certain threshold it is added to the total sum of differences. The algorithm then moves downwards to calculate the difference between the third and second pixels, and so on until the end of the column, accumulating the total sum of differences between neighbouring pixels. At the end, an array containing the column-wise sums is created. The same process is carried out to find the vertical histogram, except that rows are processed instead of columns. A threshold value is then calculated from the normalized sum as shown below.
Threshold = (sum(sum(B'))/prod(size(sum(B')))*50 + max(max(sum(B')))*30)/100
where B' is the input image. In other words, the threshold is 0.5 times the mean of the entries of sum(B') plus 0.3 times their maximum.
The rows and columns that satisfy the threshold are retained; they give the rows and columns where text appears. The text block is then extracted, as shown in Figure 2 (d), and the image is stored in a result folder. All regions are extracted separately. The sum graph is then computed and its maxima are used to extract the characters, which are normalized to size 32x32.
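The accumulation of neighbouring-pixel differences and the threshold expression can be illustrated with a short Python sketch. Several details are assumptions made for illustration only: absolute differences are used, the per-difference cut-off `min_diff` defaults to zero, a row or column is kept when its sum strictly exceeds the threshold, and the threshold is applied directly to the difference sums. The function names are hypothetical.

```python
from typing import List, Sequence

Image = Sequence[Sequence[float]]


def row_difference_sums(img: Image, min_diff: float = 0.0) -> List[float]:
    """For each row, add up the absolute differences between horizontally
    adjacent pixels, counting only differences larger than min_diff."""
    sums = []
    for row in img:
        total = 0.0
        for prev, cur in zip(row, row[1:]):
            diff = abs(cur - prev)
            if diff > min_diff:
                total += diff
        sums.append(total)
    return sums


def column_difference_sums(img: Image, min_diff: float = 0.0) -> List[float]:
    """Same accumulation down each column, done by transposing the image."""
    return row_difference_sums(list(zip(*img)), min_diff)


def projection_threshold(sums: Sequence[float]) -> float:
    """(mean * 50 + max * 30) / 100, mirroring the threshold expression in the text."""
    mean = sum(sums) / len(sums)
    return (mean * 50 + max(sums) * 30) / 100


def select_indices(sums: Sequence[float]) -> List[int]:
    """Indices whose sum exceeds the threshold (strict comparison is an assumption)."""
    limit = projection_threshold(sums)
    return [i for i, s in enumerate(sums) if s > limit]


# Toy frame: a small high-contrast "text" patch on a flat background.
frame = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 255, 0, 255, 0, 0],
    [0, 255, 0, 255, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
text_rows = select_indices(row_difference_sums(frame))
text_cols = select_indices(column_difference_sums(frame))
print(text_rows, text_cols)
```

On the toy frame only the rows and columns that cross the high-contrast patch are selected, and their bounding box would be the candidate text block.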
Fig. 2 Overview of text detection and segmentation. (a) Original frame. (b) Gray scale image with noise reduction and edge detection. (c) Feature vector graph when text is detected in the frame. (d) Detected text.
3.5 Text Recognition with Optical Character Recognition (OCR)
This stage performs the actual recognition of the extracted characters, combining the various features extracted in the previous stages to give the actual text. The output of the segmentation stage is given as input to this stage. An Optical Character Recognition (OCR) engine takes an input image and recognizes the characters in it. When a text image is given as input to OCR, the image undergoes four stages of processing: pre-processing, feature extraction, classification and post-processing. Feature extraction is the most important of the four, because recognition depends on the extracted features. We have used template matching for feature extraction, which is one of the simplest approaches to pattern recognition.
Template matching: This process involves the use of a database of characters or templates. There exists a template for every possible input character. For recognition to occur, the current input character is compared to each template to find either an exact match or the template with the closest representation of the input character. If I(x, y) is the input character and Tn(x, y) is template n, then the matching function s(I, Tn) returns a value indicating how well template n matches the input character. The outputs generated by the OCR are ASCII characters, which are used as keywords for future indexing and retrieval. Figure 3 (a) shows a region identified as a text block. It is separated from the rest of the image and binarized. When this detected block is given as input to the OCR, the corresponding ASCII output is as shown in Figure 3 (c). It is observed that the text extraction part of the system detects the text blocks accurately even against a complex background, and the OCR recognizes 90% of the text correctly. As seen in Figure 3 (d), some words were misrecognized due to the presence of noise. The mean and standard deviation of the R, G and B components of the frames are also extracted, and this color feature is stored in the text database along with the text features.
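The text does not define the matching function s(I, Tn) further. As one possible instance, the sketch below scores a binarized character against each template by the fraction of pixels on which they agree and returns the best-scoring label. The scoring rule, the tiny 3x3 glyphs (the paper normalizes characters to 32x32) and the names `match_score` and `recognize` are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Glyph = List[List[int]]  # binarized character: 1 = ink, 0 = background


def match_score(glyph: Glyph, template: Glyph) -> float:
    """Assumed form of s(I, Tn): fraction of pixels where glyph and template agree."""
    if len(glyph) != len(template) or any(
        len(g_row) != len(t_row) for g_row, t_row in zip(glyph, template)
    ):
        raise ValueError("glyph and template must have the same size")
    total = sum(len(row) for row in glyph)
    agree = sum(
        1
        for g_row, t_row in zip(glyph, template)
        for g, t in zip(g_row, t_row)
        if g == t
    )
    return agree / total


def recognize(glyph: Glyph, templates: Dict[str, Glyph]) -> Tuple[str, float]:
    """Compare the glyph with every template and return the closest (label, score)."""
    scored = ((label, match_score(glyph, tpl)) for label, tpl in templates.items())
    return max(scored, key=lambda pair: pair[1])


# Hypothetical two-character template database.
templates = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}
noisy_i = [[0, 1, 0], [0, 1, 0], [0, 1, 1]]  # an "I" with one noise pixel
label, score = recognize(noisy_i, templates)
print(label, round(score, 3))
```

A pixel-agreement score tolerates a few noisy pixels, which matches the behaviour described above where the closest template is accepted when no exact match exists.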
Fig. 3 (a) Frame containing text. (b) Original frame. (c) Text extraction done using OCR. (d) Text recognition by OCR.
3.6 Text querying
A text query entered by the user is processed as shown in Figure 4: the query text is extracted, recognized and sent to the matching process, which is the next stage, as shown in fig 3.3. In the database, each video has its own character set, which was recognized by OCR, and the matching process has direct access to the database as shown in fig 3.3. The character set associated with each video is stored in the database together with the color features (mean and standard deviation) extracted at the first level, during frame extraction. The process matches the query characters against the character set in one direction: for example, it matches the character 'C' followed by 'R', and in this way it compares each character from the query with the characters in the video text dataset. The total number of character matches is then calculated for each video, and the names of the videos with the highest number of matches are displayed, as shown in Figure 5.
Fig. 4 Block diagram of query processing (blocks: Query, Query text recognition, Matching process, Database, Recognized text from video, Video names)
Fig 5. Result of query
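The query matching stage is described only informally, so the following Python sketch shows one plausible reading of it: query characters are looked up in each video's recognized text strictly from left to right, the matches are counted per video, and the videos are ranked by that count. Skipping characters that cannot be found, the case-insensitive comparison, the tie-break by name and all names in the code (`ordered_match_count`, `rank_videos`, the file names) are assumptions, not details taken from the paper.

```python
from typing import Dict, List, Tuple


def ordered_match_count(query: str, video_text: str) -> int:
    """Count query characters found in video_text in left-to-right order.

    A character that cannot be found after the previous match is skipped
    (an assumption, so that a single OCR error does not end the scan).
    """
    text = video_text.upper()
    position = 0
    matches = 0
    for ch in query.upper():
        found = text.find(ch, position)
        if found != -1:
            matches += 1
            position = found + 1
    return matches


def rank_videos(query: str, index: Dict[str, str], top_k: int = 3) -> List[Tuple[str, int]]:
    """Score every video and return the top_k (name, matches) pairs."""
    scored = [(name, ordered_match_count(query, text)) for name, text in index.items()]
    scored.sort(key=lambda item: (-item[1], item[0]))  # best score first, then by name
    return scored[:top_k]


# Hypothetical index: video name -> text recognized by OCR in its frames.
index = {
    "match1.mpg": "CRICKET WORLD CUP",
    "news.mpg": "BREAKING NEWS",
    "ad.mpg": "COLA",
}
print(rank_videos("cricket", index, top_k=2))
```

With this toy index the clip whose recognized text contains every query character in order is ranked first, which corresponds to retrieving the videos with the highest number of matches.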
4. EXPERIMENTAL RESULTS AND DISCUSSIONS
This section presents quantitative results on the performance of the text extraction system. Performance is measured in terms of true positives (TP), text regions correctly identified as text regions; false positives (FP), non-text regions identified as text regions; and false negatives (FN), text regions missed by the system. Using these basic definitions, recall and precision of retrieval are defined as follows:
Recall = TP/(TP+FN) and
Precision = TP/(TP+FP)
While the above definitions are generic, different researchers use different units of text for calculating recall and precision. Wong and Chen consider the number of characters, while some other authors count the number of text boxes or text regions. Jain and Yu calculate recall and precision by considering either characters or blocks, depending on the type of image. This work adopts the second definition and counts text regions as units. The ground truth is obtained by manually marking the correct text regions. Recall and precision were calculated on a large number of text-rich images. For video processing, the system was tested on different types of MPEG videos such as news clips, sports clips and commercials. The videos contain both caption text and scene text of different fonts, colors and intensities. Table 1 shows the performance of the proposed method on four types of video. The method has an overall average recall of 82% and precision of 87%. It is able to detect text under a large number of different conditions, such as small fonts, low intensity, different colors, cluttered backgrounds, noisy video, news captions with horizontal scrolling, and frames containing both caption text and scene text.
Table 1 Recall and precision of text block extraction
Video          No. of text blocks   TP    FP   FN   Recall %   Precision %
Sports video   780                  624   60   24   80%        92%
where TP = true positive, FP = false positive, FN = false negative
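As a worked check of the recall and precision definitions, the sketch below evaluates them on the counts listed in Table 1. The function names are illustrative. Taken literally, the listed counts give a recall of about 96.3% and a precision of about 91.2%, whereas dividing TP by the 780 text blocks gives exactly 80%; this suggests, though the text does not say so, that the reported recall was measured against the total number of text blocks.

```python
def recall(tp: int, fn: int) -> float:
    """Recall = TP / (TP + FN)."""
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Precision = TP / (TP + FP)."""
    return tp / (tp + fp)


# Counts for the sports video row of Table 1.
tp, fp, fn, text_blocks = 624, 60, 24, 780

print(f"recall from TP and FN    : {recall(tp, fn):.1%}")
print(f"precision from TP and FP : {precision(tp, fp):.1%}")
print(f"TP / number of text blocks: {tp / text_blocks:.1%}")
```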
Table 2 Execution time for retrieval
Videos with different background   Text extraction           OCR      Retrieval   Total time in sec
Complex                            57 sec for 100 frames     20 sec   1.55 sec    1:08:55 sec
Plain                              23.78 sec for 60 frames   10 sec   1.20 sec    00:34:98 sec
The primary advantage of the proposed method is that it is very fast, since most of the computationally intensive algorithms are applied only to the regions of interest. Table 2 shows the processing time for different types of video clips on a 1.83 GHz Intel Core 2 Duo machine. As shown, the total time required by the algorithms, including retrieval, is 1:08:55 sec for a complex background and nearly half of that for a plain one. The average is taken over a number of different image sizes. Frames are processed at a rate of about 5.6 per second, OCR takes about 20 sec for a complex background and 10 sec for a plain one, and retrieval itself takes about 1.55 sec. The algorithm therefore needs little time for processing each frame and for retrieval.
5. CONCLUSION
The proposed work uses the textual content of a video as the basis for a retrieval system: text is extracted from the video, recognized from the frame images, and then matched, via the text stored in the database, against the query text. Besides this text matching, the system performs a matching based on color features so that irrelevant videos are not retrieved. The proposed work uses a median filter and the Sobel operator for text region localization, a histogram for text segmentation, and OCR for recognition of embedded text from sports video. Results show effective text region detection, with 80% recall and 92% precision. Retrieval takes 1.55 sec for a complex background and 1.20 sec for a simple background. The system can be further improved by adopting a better OCR technique, with the aim of 100% accuracy in text recognition from videos, which would significantly improve the quality of the process.
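The localization stage named above (median filter, Sobel operator, histogram) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the 3x3 window, the |Gx| + |Gy| magnitude, the row-wise edge histogram, the function names, and the parameters edge_threshold and min_edges are all assumptions chosen for illustration.

```python
from statistics import median


def median_filter3(img):
    """3x3 median filter; border pixels are copied unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out


def sobel_magnitude(img):
    """Gradient magnitude |Gx| + |Gy| from 3x3 Sobel kernels; border is 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            out[y][x] = abs(gx) + abs(gy)
    return out


def text_rows(img, edge_threshold=1020, min_edges=4):
    """Rows whose histogram count of strong edge pixels reaches min_edges.

    edge_threshold and min_edges are hypothetical tuning parameters.
    """
    edges = sobel_magnitude(median_filter3(img))
    histogram = [sum(1 for v in row if v >= edge_threshold) for row in edges]
    return [y for y, count in enumerate(histogram) if count >= min_edges]


# Synthetic 12x13 grayscale frame: two bright strokes on a dark background,
# plus one isolated noise pixel that the median filter removes.
frame = [[0] * 13 for _ in range(12)]
for y in range(2, 7):
    for x in (2, 3, 4, 8, 9, 10):
        frame[y][x] = 255
frame[9][6] = 255

print(text_rows(frame))  # [2, 3, 4, 5, 6]
```

On this synthetic frame the rows containing the two strokes are reported as a candidate text band, while the row with the isolated noise pixel is not. A real system would operate on decoded video frames and would still need column-wise segmentation and OCR after this step.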
REFERENCES
[1]. C. V. Jawahar, Balakrishna Chennupati, Balamanohar Paluri, Nataraj Jammalamadaka,2006
“Video Retrieval Based on Textual Queries”
[2]. Yu Zhong, Hongjiang Zhang, and Anil K. Jain, April 2000. “Automatic Caption Localization
in Compressed Video” IEEE transactions on pattern analysis and machine intelligence
[3]. Nilesh Bhojne, Pravinkumar Kamde and Dr. S. P. Algur , 2012 “News Video Indexing and
Retrieval using Overlay Text”.
[4]. Wei Qi, Lie Gu, Hao Jiang, Xiang-Rong Chen and Hong-Jiang Zhang, 1998 “Integrating Vis-
ual, Audio and Text analysis for news video”.
[5]. Shi-Yong Neo, Jin Zhao, Min-Yen Kan, and Tat-Seng Chua, 1998 “Video Retrieval using
High Level Features: Exploiting Query Matching and Confidence-based Weighting”.
[6]. Pranali Kosamkar, Vikram Wathodkar,Rajendra Shinde , April 2012 “Annotation Based
Event Retrieval in Cricket Video”, International Journal of Advances in Computing and In-
formation Researches
[7]. Jayshree Ghorpade, Raviraj Palvankar, Ajinkya Patankar and Snehal Rathi, June 2011 “Ex-
tracting Text From Video” Signal & Image Processing An International Journal (SIPIJ).
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-
6367(Print), ISSN 0976 – 6375(Online) Volume 4, Issue 4, July-August (2013), © IAEME
565
[8] D. Xu and Shih-Fu Chang, 2007 “Visual Event Recognition in News Video using Kernel Me-
thods with Multi-Level Temporal Alignment”, IEEE Conference. on Computer Vision and
Pattern Recognition.
[9] H-K. Kim, , Dec 1996 “Efficient Automatic Text Location Method and Content-Based Index-
ing and Structuring of Video Database”. Journal of Visual Communication and Image Repre-
sentation,
[10] H. Li, D. Doerman and O. Kia, Jan. 2000 “Automatic Text Detection and Tracking in Digital
Video” IEEE Transactions on Image Processing.
[11] T. Sato, T. Kanade, E. Hughes and M. Smith, 1999 “Video OCR Indexing Digital News Li-
braries by Recognition of Superimposed Captions”. Multimedia Systems, Vol. 7,pp. 385-394.
[12] Vilas Naik, Prasanna Patil and Vishwanath Chikaraddi, “Action Event Retrieval from Cricket
Video using Audio Energy Feature for Event Summarization”, International Journal of
Computer Engineering & Technology (IJCET), Volume 4, Issue 4, 2013, pp. 267 - 274,
ISSN Print: 0976 – 6367, ISSN Online: 0976 – 6375.
[13] Vilas Naik, Vishwanath Chikaraddi and Prasanna Patil, “Query Clip Genre Recognition using
Tree Pruning Technique for Video Retrieval”, International Journal of Computer Engineering
& Technology (IJCET), Volume 4, Issue 4, 2013, pp. 257 - 266, ISSN Print: 0976 – 6367,
ISSN Online: 0976 – 6375.
[14] Vilas Naik and Raghavendra Havin, “Entropy Features Trained Support Vector Machine
Based Logo Detection Method for Replay Detection and Extraction from Sports Videos”,
International Journal of Graphics and Multimedia (IJGM), Volume 4, Issue 1, 2013,
pp. 20 - 30, ISSN Print: 0976 – 6448, ISSN Online: 0976 –6456.

Mais conteúdo relacionado

Mais procurados

Multimedia Content Understanding: Bringing Context to Content
Multimedia Content Understanding: Bringing Context to ContentMultimedia Content Understanding: Bringing Context to Content
Multimedia Content Understanding: Bringing Context to Content
Benoit HUET
 
Book of abstract volume 8 no 9 ijcsis december 2010
Book of abstract volume 8 no 9 ijcsis december 2010Book of abstract volume 8 no 9 ijcsis december 2010
Book of abstract volume 8 no 9 ijcsis december 2010
Oladokun Sulaiman
 
Ac02417471753
Ac02417471753Ac02417471753
Ac02417471753
IJMER
 
Content Modelling for Human Action Detection via Multidimensional Approach
Content Modelling for Human Action Detection via Multidimensional ApproachContent Modelling for Human Action Detection via Multidimensional Approach
Content Modelling for Human Action Detection via Multidimensional Approach
CSCJournals
 
Multidimensional approach in cbmmirs full paper v4.0
Multidimensional approach in cbmmirs  full paper  v4.0Multidimensional approach in cbmmirs  full paper  v4.0
Multidimensional approach in cbmmirs full paper v4.0
Albaar Rubhasy
 

Mais procurados (20)

IRJET- Reversible Image Data Hiding in an Encrypted Domain with High Level of...
IRJET- Reversible Image Data Hiding in an Encrypted Domain with High Level of...IRJET- Reversible Image Data Hiding in an Encrypted Domain with High Level of...
IRJET- Reversible Image Data Hiding in an Encrypted Domain with High Level of...
 
Video copy detection using segmentation method and
Video copy detection using segmentation method andVideo copy detection using segmentation method and
Video copy detection using segmentation method and
 
Mtech Second progresspresentation ON VIDEO SUMMARIZATION
Mtech Second progresspresentation ON VIDEO SUMMARIZATIONMtech Second progresspresentation ON VIDEO SUMMARIZATION
Mtech Second progresspresentation ON VIDEO SUMMARIZATION
 
Simulation based Performance Analysis of Histogram Shifting Method on Various...
Simulation based Performance Analysis of Histogram Shifting Method on Various...Simulation based Performance Analysis of Histogram Shifting Method on Various...
Simulation based Performance Analysis of Histogram Shifting Method on Various...
 
Multimedia Content Understanding: Bringing Context to Content
Multimedia Content Understanding: Bringing Context to ContentMultimedia Content Understanding: Bringing Context to Content
Multimedia Content Understanding: Bringing Context to Content
 
Book of abstract volume 8 no 9 ijcsis december 2010
Book of abstract volume 8 no 9 ijcsis december 2010Book of abstract volume 8 no 9 ijcsis december 2010
Book of abstract volume 8 no 9 ijcsis december 2010
 
Overview of Video Concept Detection using (CNN) Convolutional Neural Network
Overview of Video Concept Detection using (CNN) Convolutional Neural NetworkOverview of Video Concept Detection using (CNN) Convolutional Neural Network
Overview of Video Concept Detection using (CNN) Convolutional Neural Network
 
IRJET- Wearable AI Device for Blind
IRJET- Wearable AI Device for BlindIRJET- Wearable AI Device for Blind
IRJET- Wearable AI Device for Blind
 
Impact of Message Size on Least Significant Bit and Chaotic Logistic Mapping ...
Impact of Message Size on Least Significant Bit and Chaotic Logistic Mapping ...Impact of Message Size on Least Significant Bit and Chaotic Logistic Mapping ...
Impact of Message Size on Least Significant Bit and Chaotic Logistic Mapping ...
 
Optimized WES-System with Image Bit Embedding for Enhancing the Security of H...
Optimized WES-System with Image Bit Embedding for Enhancing the Security of H...Optimized WES-System with Image Bit Embedding for Enhancing the Security of H...
Optimized WES-System with Image Bit Embedding for Enhancing the Security of H...
 
76201950
7620195076201950
76201950
 
Ac02417471753
Ac02417471753Ac02417471753
Ac02417471753
 
Android Based Image Steganography
Android Based Image SteganographyAndroid Based Image Steganography
Android Based Image Steganography
 
Content Modelling for Human Action Detection via Multidimensional Approach
Content Modelling for Human Action Detection via Multidimensional ApproachContent Modelling for Human Action Detection via Multidimensional Approach
Content Modelling for Human Action Detection via Multidimensional Approach
 
Multidimensional approach in cbmmirs full paper v4.0
Multidimensional approach in cbmmirs  full paper  v4.0Multidimensional approach in cbmmirs  full paper  v4.0
Multidimensional approach in cbmmirs full paper v4.0
 
IRJET- Review on Human Action Detection in Stored Videos using Support Vector...
IRJET- Review on Human Action Detection in Stored Videos using Support Vector...IRJET- Review on Human Action Detection in Stored Videos using Support Vector...
IRJET- Review on Human Action Detection in Stored Videos using Support Vector...
 
Iris Encryption using (2, 2) Visual cryptography & Average Orientation Circul...
Iris Encryption using (2, 2) Visual cryptography & Average Orientation Circul...Iris Encryption using (2, 2) Visual cryptography & Average Orientation Circul...
Iris Encryption using (2, 2) Visual cryptography & Average Orientation Circul...
 
Image Steganography V2 i11 0143
Image Steganography V2 i11 0143Image Steganography V2 i11 0143
Image Steganography V2 i11 0143
 
IRJET- Object Detection in an Image using Convolutional Neural Network
IRJET- Object Detection in an Image using Convolutional Neural NetworkIRJET- Object Detection in an Image using Convolutional Neural Network
IRJET- Object Detection in an Image using Convolutional Neural Network
 
IRJET-Securing High Capacity Data Hiding using Combined Data Hiding Techniques
IRJET-Securing High Capacity Data Hiding using Combined Data Hiding TechniquesIRJET-Securing High Capacity Data Hiding using Combined Data Hiding Techniques
IRJET-Securing High Capacity Data Hiding using Combined Data Hiding Techniques
 

Destaque

Media evaluation
Media evaluationMedia evaluation
Media evaluation
sarah
 
Audience Survey Analysis
Audience Survey AnalysisAudience Survey Analysis
Audience Survey Analysis
Laura Davies
 
G4 2013 itinerário e percurso
G4 2013 itinerário e percursoG4 2013 itinerário e percurso
G4 2013 itinerário e percurso
Mauricio Luis
 
Enxugue até 6 kg em 14 dias com a dieta antibarriga
Enxugue até 6 kg em 14 dias com a dieta antibarrigaEnxugue até 6 kg em 14 dias com a dieta antibarriga
Enxugue até 6 kg em 14 dias com a dieta antibarriga
Noma do Brasil
 
Optics presentation
Optics presentationOptics presentation
Optics presentation
Ian Summers
 

Destaque (15)

մաս 2 րդ
մաս 2 րդմաս 2 րդ
մաս 2 րդ
 
Opinión documentada
Opinión documentadaOpinión documentada
Opinión documentada
 
Tutorial de slideshare
Tutorial de slideshareTutorial de slideshare
Tutorial de slideshare
 
Technology Expertise
Technology ExpertiseTechnology Expertise
Technology Expertise
 
Presentación sin título (2)
Presentación sin título (2)Presentación sin título (2)
Presentación sin título (2)
 
Media evaluation
Media evaluationMedia evaluation
Media evaluation
 
Audience Survey Analysis
Audience Survey AnalysisAudience Survey Analysis
Audience Survey Analysis
 
G4 2013 itinerário e percurso
G4 2013 itinerário e percursoG4 2013 itinerário e percurso
G4 2013 itinerário e percurso
 
Opinión documentada
Opinión documentadaOpinión documentada
Opinión documentada
 
I-Succeed Program Overview
I-Succeed Program OverviewI-Succeed Program Overview
I-Succeed Program Overview
 
Enxugue até 6 kg em 14 dias com a dieta antibarriga
Enxugue até 6 kg em 14 dias com a dieta antibarrigaEnxugue até 6 kg em 14 dias com a dieta antibarriga
Enxugue até 6 kg em 14 dias com a dieta antibarriga
 
Optics presentation
Optics presentationOptics presentation
Optics presentation
 
Converter suhu
Converter suhuConverter suhu
Converter suhu
 
Quoziente Familiare - estratto
Quoziente Familiare - estrattoQuoziente Familiare - estratto
Quoziente Familiare - estratto
 
2015.11.21 Scrum:用一半的時間做兩倍的事
2015.11.21 Scrum:用一半的時間做兩倍的事2015.11.21 Scrum:用一半的時間做兩倍的事
2015.11.21 Scrum:用一半的時間做兩倍的事
 

Semelhante a 50120130404055

Query clip genre recognition using tree pruning technique for video retrieval
Query clip genre recognition using tree pruning technique for video retrievalQuery clip genre recognition using tree pruning technique for video retrieval
Query clip genre recognition using tree pruning technique for video retrieval
IAEME Publication
 
Review on content based video lecture retrieval
Review on content based video lecture retrievalReview on content based video lecture retrieval
Review on content based video lecture retrieval
eSAT Journals
 
Action event retrieval from cricket video using audio energy feature for event
Action event retrieval from cricket video using audio energy feature for eventAction event retrieval from cricket video using audio energy feature for event
Action event retrieval from cricket video using audio energy feature for event
IAEME Publication
 
Action event retrieval from cricket video using audio energy feature for even...
Action event retrieval from cricket video using audio energy feature for even...Action event retrieval from cricket video using audio energy feature for even...
Action event retrieval from cricket video using audio energy feature for even...
IAEME Publication
 
Key frame extraction methodology for video annotation
Key frame extraction methodology for video annotationKey frame extraction methodology for video annotation
Key frame extraction methodology for video annotation
IAEME Publication
 
Video Liveness Verification
Video Liveness VerificationVideo Liveness Verification
Video Liveness Verification
ijtsrd
 

Semelhante a 50120130404055 (20)

Query clip genre recognition using tree pruning technique for video retrieval
Query clip genre recognition using tree pruning technique for video retrievalQuery clip genre recognition using tree pruning technique for video retrieval
Query clip genre recognition using tree pruning technique for video retrieval
 
Review on content based video lecture retrieval
Review on content based video lecture retrievalReview on content based video lecture retrieval
Review on content based video lecture retrieval
 
Content based video retrieval using discrete cosine transform
Content based video retrieval using discrete cosine transformContent based video retrieval using discrete cosine transform
Content based video retrieval using discrete cosine transform
 
Action event retrieval from cricket video using audio energy feature for event
Action event retrieval from cricket video using audio energy feature for eventAction event retrieval from cricket video using audio energy feature for event
Action event retrieval from cricket video using audio energy feature for event
 
Action event retrieval from cricket video using audio energy feature for even...
Action event retrieval from cricket video using audio energy feature for even...Action event retrieval from cricket video using audio energy feature for even...
Action event retrieval from cricket video using audio energy feature for even...
 
RECURRENT FEATURE GROUPING AND CLASSIFICATION MODEL FOR ACTION MODEL PREDICTI...
RECURRENT FEATURE GROUPING AND CLASSIFICATION MODEL FOR ACTION MODEL PREDICTI...RECURRENT FEATURE GROUPING AND CLASSIFICATION MODEL FOR ACTION MODEL PREDICTI...
RECURRENT FEATURE GROUPING AND CLASSIFICATION MODEL FOR ACTION MODEL PREDICTI...
 
RECURRENT FEATURE GROUPING AND CLASSIFICATION MODEL FOR ACTION MODEL PREDICTI...
RECURRENT FEATURE GROUPING AND CLASSIFICATION MODEL FOR ACTION MODEL PREDICTI...RECURRENT FEATURE GROUPING AND CLASSIFICATION MODEL FOR ACTION MODEL PREDICTI...
RECURRENT FEATURE GROUPING AND CLASSIFICATION MODEL FOR ACTION MODEL PREDICTI...
 
Sub1577
Sub1577Sub1577
Sub1577
 
Content based video retrieval system
Content based video retrieval systemContent based video retrieval system
Content based video retrieval system
 
System analysis and design for multimedia retrieval systems
System analysis and design for multimedia retrieval systemsSystem analysis and design for multimedia retrieval systems
System analysis and design for multimedia retrieval systems
 
An Stepped Forward Security System for Multimedia Content Material for Cloud ...
An Stepped Forward Security System for Multimedia Content Material for Cloud ...An Stepped Forward Security System for Multimedia Content Material for Cloud ...
An Stepped Forward Security System for Multimedia Content Material for Cloud ...
 
Key frame extraction methodology for video annotation
Key frame extraction methodology for video annotationKey frame extraction methodology for video annotation
Key frame extraction methodology for video annotation
 
An Intelligent Approach for Effective Retrieval of Content from Large Data Se...
An Intelligent Approach for Effective Retrieval of Content from Large Data Se...An Intelligent Approach for Effective Retrieval of Content from Large Data Se...
An Intelligent Approach for Effective Retrieval of Content from Large Data Se...
 
Video Liveness Verification
Video Liveness VerificationVideo Liveness Verification
Video Liveness Verification
 
IRJET - Applications of Image and Video Deduplication: A Survey
IRJET -  	  Applications of Image and Video Deduplication: A SurveyIRJET -  	  Applications of Image and Video Deduplication: A Survey
IRJET - Applications of Image and Video Deduplication: A Survey
 
Video Data Visualization System : Semantic Classification and Personalization
Video Data Visualization System : Semantic Classification and Personalization  Video Data Visualization System : Semantic Classification and Personalization
Video Data Visualization System : Semantic Classification and Personalization
 
Video Data Visualization System : Semantic Classification and Personalization
Video Data Visualization System : Semantic Classification and Personalization  Video Data Visualization System : Semantic Classification and Personalization
Video Data Visualization System : Semantic Classification and Personalization
 
VIDEO OBJECTS DESCRIPTION IN HINDI TEXT LANGUAGE
VIDEO OBJECTS DESCRIPTION IN HINDI TEXT LANGUAGE VIDEO OBJECTS DESCRIPTION IN HINDI TEXT LANGUAGE
VIDEO OBJECTS DESCRIPTION IN HINDI TEXT LANGUAGE
 
Profile based Video segmentation system to support E-learning
Profile based Video segmentation system to support E-learningProfile based Video segmentation system to support E-learning
Profile based Video segmentation system to support E-learning
 
On Annotation of Video Content for Multimedia Retrieval and Sharing
On Annotation of Video Content for Multimedia  Retrieval and SharingOn Annotation of Video Content for Multimedia  Retrieval and Sharing
On Annotation of Video Content for Multimedia Retrieval and Sharing
 

Mais de IAEME Publication

A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURSA STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
IAEME Publication
 
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURSBROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
IAEME Publication
 
GANDHI ON NON-VIOLENT POLICE
GANDHI ON NON-VIOLENT POLICEGANDHI ON NON-VIOLENT POLICE
GANDHI ON NON-VIOLENT POLICE
IAEME Publication
 
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
IAEME Publication
 
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
IAEME Publication
 
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
IAEME Publication
 
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
IAEME Publication
 
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
IAEME Publication
 
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
IAEME Publication
 
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
IAEME Publication
 

Mais de IAEME Publication (20)

IAEME_Publication_Call_for_Paper_September_2022.pdf
IAEME_Publication_Call_for_Paper_September_2022.pdfIAEME_Publication_Call_for_Paper_September_2022.pdf
IAEME_Publication_Call_for_Paper_September_2022.pdf
 
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
 
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURSA STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
 
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURSBROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
 
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONSDETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
 
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONSANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
 
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINOVOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
 
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
 
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMYVISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
 
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
 
GANDHI ON NON-VIOLENT POLICE
GANDHI ON NON-VIOLENT POLICEGANDHI ON NON-VIOLENT POLICE
GANDHI ON NON-VIOLENT POLICE
 
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
 
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
 
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
 
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
 
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
 
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
 
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
 
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
 
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENTA MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
 

Último

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 

Último (20)

Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 

50120130404055

International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367 (Print), ISSN 0976-6375 (Online), Volume 4, Issue 4, July-August (2013), pp. 556-565, © IAEME: www.iaeme.com/ijcet.asp. Journal Impact Factor (2013): 6.1302 (calculated by GISI), www.jifactor.com

TEXTUAL QUERY BASED SPORTS VIDEO RETRIEVAL BY EMBEDDED TEXT RECOGNITION
Vilas Naik (1), Sagar Savalagi (2)
(1) Department of CSE, Basaveshwar Engineering College, Bagalkot, India
(2) Department of CSE, Basaveshwar Engineering College, Bagalkot, India

ABSTRACT
With the growing popularity of sites like YouTube, video recording and sharing have become widespread in the last several years. Unlike text documents, such multimedia content is difficult to search and index. Hence content-based video retrieval systems are the need of the hour. Content-Based Video Retrieval (CBVR) is an active research discipline focused on computational strategies for finding relevant videos through multimodal content analysis, using visual, audio and text features to represent and index video. Recent research on CBVR has presented many solutions based on these features. The textual content of a video, in the form of embedded text and scene text, is quite helpful for indexing. The proposed work is a content-based video retrieval system based on textual cues. Text-based video retrieval is an approach that enables search based on the textual information present in the video. Regions of textual information are identified within the frames of the video, and the video is then annotated with the textual content present in the images. Traditionally, OCR is used to extract the text within the video. This also enables applications such as keyword-based search in multimedia databases. Video indexing and retrieval are carried out with the help of this text. Results show that the system is quite efficient, with an accuracy of around 90%. A textual query returns higher accuracy than visual queries, which supports the concept.

1. INTRODUCTION
With the development of various multimedia compression standards and significant increases in desktop computer performance and storage, the widespread exchange of multimedia information is becoming a reality. Video is arguably the most popular means of communication and entertainment. With this popularity comes an increase in the volume of video and an increased need for the ability to automatically sift through and search for relevant material stored in large video databases. Even with the increase in hardware capabilities that makes video distribution possible, factors such as algorithm speed and storage costs are concerns that must still be addressed. Considering this, a first step should be an attempt to increase speed when using existing compression standards. Performing analysis in the compressed domain reduces the effort involved in decompression, and providing a means of abstracting the data keeps the storage cost of the resulting feature set low. Both of these problems are active areas of research. The aim of the proposed work is to develop a new detection algorithm that speeds up search and, in turn, reduces the cost of storage.
Every day, both military and civilian equipment generates gigabytes of images. A huge amount of information is out there. However, it is impossible to access or make use of this information unless it is organized so as to allow efficient browsing, searching and retrieval. Image retrieval has been a very active research area since the 1970s, with the thrust coming from two major research communities, database management and computer vision. These two communities study image retrieval from different angles, one being text-based and the other visual-based. Many advances, such as data modelling, multidimensional indexing and query evaluation, have been made along this research direction. There exist two major difficulties, especially when the image collection is large (tens or hundreds of thousands of images). The first is the vast amount of labour required for manual image annotation. The other difficulty, which is more essential, results from the rich content of the images and the subjectivity of human perception. That is, different people may perceive the same image content differently. Perception subjectivity and annotation impreciseness may cause unrecoverable mismatches in later retrieval processes.
The proposed mechanism is a scheme aimed at alleviating these hurdles: a new detection algorithm with boosting that offers a text-based retrieval system. The work proceeds in the following steps. Initially, frames are collected from the video clip. From these frames the text part is segmented. Character segmentation then identifies the characters, which are recognized by Optical Character Recognition (OCR). To increase the accuracy of identification, color features are additionally extracted from the video clip. These color features are combined with the text features and stored in the database. When the user enters a text query, it is matched against the stored characters and the matching videos are displayed.

2. RELATED WORK
Video retrieval is important in multimedia search engine applications, and recognizing text is a crucial task in such applications. In the last decade, researchers have proposed many different methods for video retrieval; some of the related work is summarized below.
An approach that enables search based on the textual information present in the video is introduced in [1]. In this method, regions of textual information are identified within the frames of the video, and the video is then annotated with the textual content present in the images. An approach that enables matching at the image level, thereby avoiding OCR, is also addressed. Videos containing the query string are retrieved from a video database and sorted by relevance. Results are shown for video collections in English, Hindi and Telugu. In [2] a method is proposed to automatically localize captions in JPEG compressed images and in the I-frames of MPEG compressed videos. In this method, caption text regions are segmented from background images using their distinguishing texture characteristics.
Unlike previously published methods, which fully decompress the video sequence before extracting the text regions, this method locates candidate caption text regions directly in the DCT compressed domain using the intensity variation information encoded there. Therefore, only a very small amount of decoding is required. The method in [3] is a news video retrieval solution that targets specific news videos based on their contents as described by overlay text. The approach uses overlay text, which conveys the direct meaning of the video, as a source of complementary information. The whole process is divided into two steps. First, "metadata labels" are built by detecting and extracting the overlay text. Second, these labels are used to index the news videos. The experiments are carried out on news videos from NDTV News and on a large data set of video images containing artificial text developed at the Image Processing Centre (IPC), a research facility at the National University of Sciences and Technology (NUST), Pakistan. The FFMPEG library is used to extract the frames from the news videos. An overlay scene is inserted on the video scene in the same way as overlay text, so a transition region is also observed there.
In [4] the authors propose three main factors: (1) the integration of image and audio analysis results to identify news segments; (2) video OCR technology to detect text in frames, which provides a good source of textual information for story classification when transcripts and closed captions are not available; (3) natural language processing (NLP) technologies, which are used to perform automated categorization of news stories based on the text obtained from closed captions or from the video OCR process. Based on these video structure and content analysis technologies, two advanced video browsers are developed for home users: an intelligent highlight player and an HTML-based video browser.
An annotation-based indexing method that lets the user retrieve video using textual annotations is proposed in [5]. It takes a text-based query and compares it with the tags used for indexing; the event-based video is then retrieved from a cricket video database. Experiments show that annotation-based event retrieval can potentially improve retrieval accuracy using different search techniques, such as binary search or indexing, when the database is very large, so video retrieval can be carried out efficiently with this type of system. A technique has also been proposed to address the problems of extracting text from a video and to design algorithms for each phase of text extraction using Java libraries and classes.
In this technique, the input video is first split into a stream of images using the Java Media Framework (JMF), the input being either real-time video or a video from the database. Pre-processing algorithms are then applied to convert the image to gray scale and to remove disturbances (superimposed lines over the text, discontinuities and dots). The algorithms for localization, segmentation and recognition follow, the last of which uses a neural network pattern matching technique. The performance of the approach is demonstrated by experimental results for a set of static images.
"Improving Multimedia Retrieval with a Video OCR" presents a set of experiments with a video OCR system (VOCR) tailored for video information retrieval and establishes its importance in multimedia search in general and for some specific queries in particular. The method in [7] details the analysis of video frames that produces candidate text regions. The text regions are then binarized and sent to a commercial OCR, resulting in ASCII text that is finally used to create search indexes. The system is evaluated using the TRECVID data. The effectiveness of various textual sources for multimedia retrieval is evaluated by combining the VOCR outputs with automatic speech recognition (ASR) transcripts. For general search queries, the VOCR system coupled with ASR sources outperforms the other system by a very large margin. For search queries that involve named entities, especially people's names, the VOCR system even outperforms speech transcripts, demonstrating that source selection for particular query types is essential.
Another important consideration is the quality and complexity of the pictures containing text that are used for evaluation. Some methods consider large fonts in images, advertisements and video clips. The methods also have limitations: the method in [8] does not detect low-contrast text or small fonts, and the techniques in [9] use text with different complex motions.
The methods in [10] and [11] detect only caption text in news video clips. The proposed work extracts text from video frames by separating the text region from the background and employs conventional OCR for text recognition.

3. PROPOSED ALGORITHM FOR VIDEO RETRIEVAL
This section gives an overview and a detailed description of all the blocks of the proposed system.
3.1 Overview of the Approach
The proposed mechanism is a video retrieval scheme based on embedded text. The method uses the information conveyed by the embedded text to identify the video to be retrieved from a collection in response to a text query; it matches the query against the text present in the video frames, using the features explained below. First, frames are extracted from the video and the text part is segmented. Character segmentation extracts the characters, and character recognition recognizes them. Color features are also extracted from the video scene. The color features, combined with the text features, are stored in the database. The user inputs a text query, which is matched against the stored characters, and the matching videos are displayed. The overall flow is shown in Figure 1.

Fig. 1 Proposed algorithm for video retrieval by a text query

3.2 The Text Query Based Video Retrieval Algorithm
The proposed algorithm is summarized in the following steps.
Step 1. Input a video and convert it into frames.
Step 2. Apply a median filter to each frame and perform Sobel edge detection to detect text region edges in the frame. Then calculate the sum graph, i.e. add up the rows and the columns of the binary image.
Step 3. Perform text region segmentation by applying a threshold:
Threshold = (sum(sum(B'))/prod(size(sum(B')))*50 + max(max(sum(B')))*30)/100
where B' is the input image. Read as a MATLAB expression, this is 0.5 times the mean of the sum profile plus 0.3 times its maximum.
Step 4. Apply OCR to recognize the text characters in the frames; the color features are stored in the database together with the text features. Normalize characters to size 32x32.
Step 5. Given a text query, extract its characters. Match them against the character set associated with each video, in one direction. Calculate the total number of character matches for each video.
Step 6. Retrieve the videos with the highest number of matches.

3.3 Text region localization
As a first step, frames are extracted from each video in the collection individually. Each video frame is in a compressed format, so it is first decoded into an image and then converted into a greyscale image. A median filter is then applied to the image; the output of the median filter is shown in fig 4.2. The median filter considers each pixel in the image in turn and looks at its nearby neighbours to decide whether or not it is representative of its surroundings. Instead of simply replacing the pixel value with the mean of the neighbouring pixel values, it replaces it with the median of those values. The median is calculated by first sorting all the pixel values from the surrounding neighbourhood into numerical order and then replacing the pixel being considered with the middle pixel value. Next, the Sobel operator is used. It is an edge detection technique that is applied to the greyscale image to detect the edges of text regions.

3.3 Text detection and Segmentation
After the text region is localized, the text area is segmented for subsequent recognition. The output of this step is a binary image in which black text characters appear on a white background. This stage extracts the actual text regions as follows. A median filter is again applied, this time to the edge-detected image, which gives a smooth image. The horizontal and vertical histograms are then taken; these are the column-wise and row-wise histograms respectively. They represent the sum of differences of gray values between neighbouring pixels of the image, column-wise and row-wise. First the horizontal histogram is calculated. To find it, the algorithm traverses each column of the image. In each column, the algorithm starts with the second pixel from the top and calculates the difference between the second and first pixels. If the difference exceeds a certain threshold, it is added to the total sum of differences. The algorithm then moves downwards to calculate the difference between the third and second pixels, and so on until the end of the column, accumulating the total sum of differences between neighbouring pixels. At the end, an array containing the column-wise sums is created. The same process is carried out to find the vertical histogram; in this case, rows are processed instead of columns. A threshold value is then calculated from the normalized sum:
Threshold = (sum(sum(B'))/prod(size(sum(B')))*50 + max(max(sum(B')))*30)/100;
where B' is the input image. The rows and columns that satisfy the threshold are retained; they give the rows and columns in which text appears. The text block is then extracted, as shown in Figure 2(d), and the resulting image is stored in a result folder. All regions are extracted separately. The sum graph is then computed for each region, its maxima are used to extract the characters, and the characters are normalized to size 32x32.
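The column-wise and row-wise histograms and the threshold described above can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the use of absolute differences, the `min_diff` parameter and the `>=` comparison are illustrative choices. Only the 50/30 weighting is taken from the threshold formula in the text.

```python
def difference_profile(image, axis, min_diff=0):
    """Sum of gray-level differences between neighbouring pixels.

    axis="columns": walk down each column, giving one sum per column.
    axis="rows":    walk along each row, giving one sum per row.
    Differences that do not exceed min_diff are ignored (assumed rule).
    """
    n_rows, n_cols = len(image), len(image[0])
    if axis == "columns":
        lines = [[image[r][c] for r in range(n_rows)] for c in range(n_cols)]
    else:
        lines = [list(row) for row in image]

    profile = []
    for line in lines:
        total = 0
        for prev, cur in zip(line, line[1:]):
            diff = abs(cur - prev)  # absolute difference is an assumption
            if diff > min_diff:
                total += diff
        profile.append(total)
    return profile


def profile_threshold(profile):
    """(mean * 50 + max * 30) / 100, following the formula in the text."""
    mean = sum(profile) / len(profile)
    return (mean * 50 + max(profile) * 30) / 100


def text_band(profile):
    """Indices of rows (or columns) whose sum reaches the threshold."""
    threshold = profile_threshold(profile)
    return [i for i, value in enumerate(profile) if value >= threshold]


# Toy 6x6 frame: rows 2 and 3 alternate dark/bright like a text line.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 255, 0, 255, 0, 255],
    [255, 0, 255, 0, 255, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
print(text_band(difference_profile(image, "rows")))  # [2, 3]
```

On this toy frame the row profile is 0 everywhere except 1275 in rows 2 and 3, so the threshold (595) keeps exactly the striped band. The same two functions applied with `axis="columns"` give the horizontal extent of the block.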
Fig. 2 Overview of text detection and segmentation: (a) original frame; (b) gray scale image with noise reduction and edge detection; (c) feature vector graph when text is detected in the frame; (d) detected text

3.4 Text Recognition with Optical Character Recognition (OCR)
This stage performs the actual recognition of the extracted characters, combining the various features extracted in the previous stages to give the actual text. The output of the segmentation stage is given as input to this stage. Optical Character Recognition (OCR) takes an input image and recognizes the characters in it. When a text image is given as input to the OCR, the image undergoes four stages of processing: pre-processing, feature extraction, classification and post-processing. Of these four stages, feature extraction is the most important, since it is on the basis of the extracted features that the OCR is able to recognize characters. We have used template matching for feature extraction; it is one of the simplest approaches to pattern recognition.
Template matching: This process involves the use of a database of characters, or templates. There exists a template for every possible input character. For recognition to occur, the current input character is compared with each template to find either an exact match or the template with the closest representation of the input character. If I(x, y) is the input character and Tn(x, y) is template n, then the matching function s(I, Tn) returns a value indicating how well template n matches the input character.
The outputs generated by the OCR are ASCII characters, which are used as keywords for future indexing and retrieval. Figure 3(a) shows a frame containing a region identified as a text block. This block is separated from the rest of the image and binarized. When the detected block is given as input to the OCR, the corresponding ASCII output is as shown in Figure 3(c). It is observed that the text extraction part of the system detects the text blocks accurately even against a complex background, and the OCR recognizes about 90% of the text correctly. As seen in Figure 3(d), some words were misrecognized due to the presence of noise. The mean and standard deviation of the R, G and B components of the frames are also extracted; these color features are stored in the database together with the text features.
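The template matching step can be illustrated with a small sketch. The paper does not state which similarity measure s(I, Tn) was used, so the sketch assumes binarized characters of equal size and takes s to be the fraction of pixels on which the input and the template agree. The function names and the 3x3 glyphs (standing in for the 32x32 normalized characters) are hypothetical.

```python
def match_score(char_img, template):
    """s(I, Tn): fraction of pixels on which two binary images agree.

    The agreement-ratio measure is an assumption; the source only says
    that s(I, Tn) indicates how well template n matches the input.
    """
    same_shape = len(char_img) == len(template) and all(
        len(a) == len(b) for a, b in zip(char_img, template)
    )
    if not same_shape:
        raise ValueError("input and template must have the same size")

    total = sum(len(row) for row in char_img)
    agree = sum(
        p == q
        for row_i, row_t in zip(char_img, template)
        for p, q in zip(row_i, row_t)
    )
    return agree / total


def recognize(char_img, templates):
    """Return (label, score) for the closest template."""
    scored = ((label, match_score(char_img, t)) for label, t in templates.items())
    return max(scored, key=lambda item: item[1])


# Hypothetical 3x3 templates; a real system would hold one 32x32
# template per possible character.
TEMPLATES = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
}

# A "T" with one noisy pixel in the bottom-right corner.
noisy_t = [[1, 1, 1], [0, 1, 0], [0, 1, 1]]
label, score = recognize(noisy_t, TEMPLATES)
print(label)  # T
```

Here the noisy input agrees with the "T" template on 8 of 9 pixels, against 6 of 9 for "I" and 4 of 9 for "L", so the closest template is still selected despite the noise.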
Fig. 3 (a) Frame containing text. (b) Original frame. (c) Text extraction done using OCR. (d) Text recognition by OCR

3.4 Text querying
A text query entered by the user is processed as shown in Figure 4: the query text is extracted and recognized and then sent to the matching process, which is the next stage, as shown in fig 3.3. In the database, each video has its own character set, recognized by the OCR. The matching process has direct access to the database, as shown in fig 3.3. The character set associated with each video is stored in the database together with the color features (mean and standard deviation) that were extracted at the first level, during frame extraction. The process matches each query character against the stored character set, and the matching takes place in one direction. For example, the matching process matches the character 'C' followed by 'R'; in this way it matches characters from the query to characters from the video text dataset. The total number of character matches is then calculated for each video, and the names of the videos with the highest number of matches are displayed, as shown in Figure 5.

Fig. 4 Block diagram of query processing (blocks: Query, Query text recognition, Matching process, Database, Recognized text from video, Videos names)
Fig. 5 Result of query

4. EXPERIMENTAL RESULTS AND DISCUSSIONS
This section presents quantitative results on the performance of the text extraction system. The performance can be measured in terms of true positives (TP), text regions identified correctly as text regions; false positives (FP), non-text regions identified as text regions; and false negatives (FN), text regions missed by the system. Using these basic definitions, recall and precision of retrieval can be defined as follows:
Recall = TP/(TP+FN)
Precision = TP/(TP+FP)
While the above definitions are generic, different researchers use different units of text for calculating recall and precision. Wong and Chen consider the number of characters, while some other authors count the number of text boxes or text regions. Jain and Yu calculate recall and precision by considering either characters or blocks, depending on the type of image. We adopt the second definition, in which text regions are the units for counting. The ground truth is obtained by manually marking the correct text regions. Recall and precision were calculated on a large number of text-rich images. For video processing, the system was tested on different types of MPEG videos, such as news clips, sports clips and commercials. The videos contain both caption text and scene text of different fonts, colors and intensities. Table 1 shows the performance of our proposed method on four types of video. It is seen that our method has an overall average recall of 82% and precision of 87%.
The method is able to detect text under a large number of different conditions, such as text with small fonts, low intensity, different colors and cluttered backgrounds, text from noisy video, news captions with horizontal scrolling, and both caption text and scene text.

Table 1 Recall and precision of text block extraction

Video type     No. of text blocks   TP    FP   FN   Recall %   Precision %
SPORTS VIDEO   780                  624   60   24   80%        92%

where TP = true positive, FP = false positive, FN = false negative.
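The recall and precision definitions from Section 4 can be checked against the counts in Table 1 with a short sketch. The function names are illustrative. The sketch only reproduces the arithmetic: with the counts as printed, TP/(TP+FN) comes to about 96%, whereas the reported 80% equals TP divided by the 780 total text blocks, and TP/(TP+FP) comes to about 91%. Which counting convention the authors intended is not stated in the paper.

```python
def recall(tp, fn):
    """Recall = TP / (TP + FN)."""
    return tp / (tp + fn)


def precision(tp, fp):
    """Precision = TP / (TP + FP)."""
    return tp / (tp + fp)


# Counts as printed in Table 1 (sports video).
TOTAL_BLOCKS, TP, FP, FN = 780, 624, 60, 24

print(f"recall from TP and FN:    {recall(TP, FN):.3f}")     # 0.963
print(f"TP over all text blocks:  {TP / TOTAL_BLOCKS:.3f}")  # 0.800
print(f"precision from TP and FP: {precision(TP, FP):.3f}")  # 0.912
```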
Table 2 Execution time for retrieval

Video background   Text extraction              OCR      Retrieval   Total time in sec
Complex            57 sec for 100 frames        20 sec   1.55 sec    1:08:55 sec
Plain              23.78 sec for 60 frames      10 sec   1.20 sec    00:34:98 sec

The primary advantage of the proposed method is that it is very fast, since most of the computationally intensive algorithms are applied only to the regions of interest. Table 2 shows the processing time for different types of video clips on a 1.83 GHz Intel Core 2 Duo machine. As shown, the total time required by the algorithms, including retrieval, is 1:08:55 sec for a complex background and roughly half of that (00:34:98 sec) for a plain one. An average is taken over a number of different image sizes. Frames are processed at a rate of about 5.6 per second, OCR takes about 20 sec for a complex background and 10 sec for a plain one, and retrieval itself takes about 1.55 sec. The algorithm therefore requires little time for processing each frame and for retrieval.

5. CONCLUSION
The proposed work uses textual content as the basis of a video retrieval system that extracts text from video, recognizes the text in each image and then matches the query text against the text stored in the database. Besides this matching, the system performs matching based on color features, so that irrelevant videos are not retrieved. The proposed work uses a median filter and the Sobel operator for text region localization, histograms for text segmentation, and OCR for recognition of embedded text in sports video. Results show significant efficiency in detection, with 80% recall and 92% precision for text regions.
The time taken for a retrieval is 1.55 sec for a complex background and 1.20 sec for a simple background. The system can be further improved by implementing a better OCR technique, aiming at 100% accuracy in text recognition from videos. That would significantly improve the quality of the process.

REFERENCES
[1] C. V. Jawahar, Balakrishna Chennupati, Balamanohar Paluri and Nataraj Jammalamadaka, 2006, "Video Retrieval Based on Textual Queries".
[2] Yu Zhong, Hongjiang Zhang and Anil K. Jain, April 2000, "Automatic Caption Localization in Compressed Video", IEEE Transactions on Pattern Analysis and Machine Intelligence.
[3] Nilesh Bhojne, Pravinkumar Kamde and Dr. S. P. Algur, 2012, "News Video Indexing and Retrieval using Overlay Text".
[4] Wei Qi, Lie Gu, Hao Jiang, Xiang-Rong Chen and Hong-Jiang Zhang, 1998, "Integrating Visual, Audio and Text analysis for news video".
[5] Shi-Yong Neo, Jin Zhao, Min-Yen Kan and Tat-Seng Chua, 1998, "Video Retrieval using High Level Features: Exploiting Query Matching and Confidence-based Weighting".
[6] Pranali Kosamkar, Vikram Wathodkar and Rajendra Shinde, April 2012, "Annotation Based Event Retrieval in Cricket Video", International Journal of Advances in Computing and Information Researches.
[7] Jayshree Ghorpade, Raviraj Palvankar, Ajinkya Patankar and Snehal Rathi, June 2011, "Extracting Text From Video", Signal & Image Processing: An International Journal (SIPIJ).
[8] D. Xu and Shih-Fu Chang, 2007, "Visual Event Recognition in News Video using Kernel Methods with Multi-Level Temporal Alignment", IEEE Conference on Computer Vision and Pattern Recognition.
[9] H-K. Kim, Dec 1996, "Efficient Automatic Text Location Method and Content-Based Indexing and Structuring of Video Database", Journal of Visual Communication and Image Representation.
[10] H. Li, D. Doermann and O. Kia, Jan. 2000, "Automatic Text Detection and Tracking in Digital Video", IEEE Transactions on Image Processing.
[11] T. Sato, T. Kanade, E. Hughes and M. Smith, 1999, "Video OCR: Indexing Digital News Libraries by Recognition of Superimposed Captions", Multimedia Systems, Vol. 7, pp. 385-394.
[12] Vilas Naik, Prasanna Patil and Vishwanath Chikaraddi, "Action Event Retrieval from Cricket Video using Audio Energy Feature for Event Summarization", International Journal of Computer Engineering & Technology (IJCET), Volume 4, Issue 4, 2013, pp. 267-274, ISSN Print: 0976-6367, ISSN Online: 0976-6375.
[13] Vilas Naik, Vishwanath Chikaraddi and Prasanna Patil, "Query Clip Genre Recognition using Tree Pruning Technique for Video Retrieval", International Journal of Computer Engineering & Technology (IJCET), Volume 4, Issue 4, 2013, pp. 257-266, ISSN Print: 0976-6367, ISSN Online: 0976-6375.
[14] Vilas Naik and Raghavendra Havin, "Entropy Features Trained Support Vector Machine Based Logo Detection Method for Replay Detection and Extraction from Sports Videos", International Journal of Graphics and Multimedia (IJGM), Volume 4, Issue 1, 2013, pp. 20-30, ISSN Print: 0976-6448, ISSN Online: 0976-6456.