Abstract: Session MMSP-1 |
|
MMSP-1.1
|
Time-Series Active Search for Quick Retrieval of Audio and Video
Kunio Kashino,
Gavin Smith,
Hiroshi Murase (NTT Basic Research Laboratories)
This paper proposes a search method that can quickly detect and locate a known sound (video) in a long audio (video) stream. The method is based on active search, which reduces the number of candidate matches between reference and input signals by a factor of approximately 10 to 100 compared to exhaustive search, while guaranteeing the same retrieval accuracy. We proposed a quick search method in our previous paper; here we focus on improving its accuracy. To this end, the feature has been extended to the audio power spectrum, and temporal division of the histogram windows has been introduced to incorporate time information. Tests carried out under practical circumstances clearly show the accuracy improvement. The proposed method remains fast: it can correctly retrieve a 15-s commercial from a 6-h recording of TV broadcasting within 2 s, once the features are calculated.
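The skipping idea behind active search can be illustrated with a minimal sketch. It assumes that each frame of the signal has already been quantized to a discrete code and that similarity is measured by histogram intersection over a sliding window; neither detail is stated in the abstract, and the function names and the `need` parameter are hypothetical. Shifting the window by one frame changes the intersection count by at most one, so a window that falls `need - overlap` frames short of the threshold can be advanced by that many frames without missing a match.

```python
from collections import Counter

def intersection(ref_hist, win_hist):
    """Histogram intersection, counted in frames shared by both histograms."""
    return sum(min(count, win_hist[code]) for code, count in ref_hist.items())

def active_search(stream, reference, need):
    """Find window positions whose intersection with the reference is >= need.

    Returns (hit positions, number of windows actually evaluated).
    """
    width = len(reference)
    ref_hist = Counter(reference)
    hits, evaluated, t = [], 0, 0
    while t + width <= len(stream):
        overlap = intersection(ref_hist, Counter(stream[t:t + width]))
        evaluated += 1
        if overlap >= need:
            hits.append(t)
            t += 1
        else:
            # The overlap can grow by at most 1 per frame of shift, so the
            # next (need - overlap - 1) positions cannot reach the threshold.
            t += need - overlap
    return hits, evaluated

def exhaustive_search(stream, reference, need):
    """Baseline that evaluates every window position."""
    width = len(reference)
    ref_hist = Counter(reference)
    positions = range(len(stream) - width + 1)
    hits = [t for t in positions
            if intersection(ref_hist, Counter(stream[t:t + width])) >= need]
    return hits, len(positions)
```

Because the skip width is derived from an upper bound on the similarity, this sketch returns the same hits as the exhaustive baseline while evaluating fewer windows. The temporal division of histogram windows mentioned in the abstract is not modeled here.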
|
MMSP-1.2
|
Content-Based Video Indexing of TV Broadcast News Using Hidden Markov Models
Stefan Eickeler,
Stefan Mueller (Gerhard-Mercator-University Duisburg)
This paper presents a new approach to content-based video indexing using
Hidden Markov Models (HMMs). In this approach one feature vector is
calculated for each image of the video sequence. These feature vectors are
modeled and classified using HMMs. This approach has many advantages
compared to other video indexing approaches. The system has automatic
learning capabilities. It is trained by presenting manually indexed video
sequences. To improve the system we use a video model that allows the
classification of complex video sequences. The presented approach works
three times faster than real-time. We tested our system on TV broadcast
news. The rate of 97.3% correctly classified frames shows the efficiency of
our system.
|
MMSP-1.3
|
Hierarchical Classification of Audio Data for Archiving and Retrieving
Tong Zhang,
C.-C. Jay Kuo (Integrated Media Systems Center and Department of Electrical Engineering-Systems, University of Southern California)
A hierarchical system for audio classification and
retrieval based on audio content analysis is presented
in this paper. The system consists of three stages.
The first stage is called the coarse-level audio
segmentation and classification, where audio recordings
are segmented and classified into speech, music,
several types of environmental sounds, and silence,
based on morphological and statistical analysis of
temporal curves of short-time features of audio signals.
In the second stage, environmental sounds are further
classified into finer classes such as applause, rain,
birds' sound, etc. This fine-level classification is
based on time-frequency analysis of audio signals and
use of the hidden Markov model (HMM) for classification.
In the third stage, the query-by-example audio retrieval
is implemented where similar sounds can be found
according to an input sample audio. It is shown that
the proposed system has achieved an accuracy higher
than 90% for coarse-level audio classification.
Examples of audio fine classification and audio
retrieval are also provided.
|
MMSP-1.4
|
A Fast Audio Classification from MPEG Coded Data
Yasuyuki Nakajima (KDD R&D Labs.),
Yang Lu (University of Electro-Communications),
Masaru Sugano,
Akio Yoneyama (KDD R&D Labs.),
Hiromasa Yanagihara (KDD R&D Labs.),
Akira Kurematsu (University of Electro-Communications)
Audio classification has become an important task for purposes such as automatic keyword spotting and other content-based audio-visual query systems. In this paper, we describe a fast and accurate audio classification method that operates in the MPEG coded data domain. First, silent segments are detected using an approach that is robust to different recording conditions. Then the non-silent segments are classified into three types (music, speech, and applause) using the temporal density, bandwidth, and center frequency of subband energy. To be as robust as possible to a variety of audio sources, we use a Bayes discriminant function for multivariate Gaussian distributions instead of manually adjusting a threshold for each discriminator. In the experiment, each one-second segment of MPEG audio data is classified, and about 90% of audio and speech segments are successfully detected. As for detection speed, less than 20% of the processing power of MPEG audio decoding is required.
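As an illustration of the classification step, the following is a minimal sketch of the textbook Bayes discriminant for multivariate Gaussian classes, g_i(x) = -1/2 (x - m_i)^T C_i^{-1} (x - m_i) - 1/2 ln|C_i| + ln P_i. The abstract does not give the exact form the authors use, so the function names, the Cholesky-based evaluation, and the model layout are assumptions made for this example.

```python
import math

def cholesky(a):
    """Lower-triangular L with L * L^T == a, for a symmetric positive-definite a."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def fit_class(samples):
    """Estimate the mean vector and the Cholesky factor of the covariance."""
    n, d = len(samples), len(samples[0])
    mean = [sum(x[k] for x in samples) / n for k in range(d)]
    cov = [[sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in samples) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mean, cholesky(cov)

def discriminant(x, mean, L, prior):
    """Gaussian Bayes discriminant g(x); the largest value wins."""
    # Solve L z = (x - mean) by forward substitution; z . z is then the
    # squared Mahalanobis distance of x from the class mean.
    z = []
    for i in range(len(x)):
        z.append((x[i] - mean[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    mahalanobis_sq = sum(v * v for v in z)
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(len(x)))
    return -0.5 * mahalanobis_sq - 0.5 * log_det + math.log(prior)

def classify(x, models):
    """models maps a class name to a (mean, L, prior) tuple."""
    return max(models, key=lambda name: discriminant(x, *models[name]))
```

In use, one model would be fitted per class (for example music, speech, and applause) from labeled feature vectors, which replaces hand-tuned per-feature thresholds with parameters estimated from data.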
|
MMSP-1.5
|
Image Retrieval Based on Energy Histograms of The Low Frequency DCT Coefficients
Jose A Lay,
Ling Guan (School of Electrical and Information Engineering, University of Sydney)
With the increasing popularity of compressed images, an intuitive way to lower computational complexity and obtain a practically efficient image retrieval system is to perform retrieval computation directly in the compressed domain. In this paper, we investigate the use of energy histograms of the low frequency DCT coefficients as features for the retrieval of DCT compressed images. We propose a feature set that is able to identify similarities despite changes of image representation due to several lossless DCT transformations. We then use the features to construct an image retrieval system based on the real-time image retrieval model. We observe that the proposed features are sufficient for performing high-level retrieval on medium-size image databases. By introducing transpositional symmetry, the features can be made to accommodate several lossless DCT transformations such as horizontal and vertical mirroring, rotating, transposing, and transversing.
|
MMSP-1.6
|
Texture Features for DCT-Coded Image Retrieval and Classification
Yu-Len Huang,
Ruey-Feng Chang (Department of Computer Science and Information Engineering, National Chung Cheng University, Taiwan, R.O.C.)
The multiresolution wavelet transform has been shown to be an effective technique for texture analysis and achieves very good performance. However, a large number of images are compressed by methods based on the discrete cosine transform (DCT). Hence, the image must first be decompressed with the inverse DCT before wavelet-based texture features can be obtained for a DCT-coded image. This paper proposes the use of multiresolution reordered features for texture analysis. The proposed features are generated directly from the DCT coefficients of the DCT-coded image. Comparisons with the subband-energy features extracted from the wavelet transform and from the conventional DCT, using the Brodatz texture database, indicate that the proposed method provides the best texture pattern retrieval accuracy and a much better correct classification rate. The proposed DCT-based features are expected to be very useful and efficient for texture pattern retrieval and classification in large DCT-coded image databases.
Detailed simulation results can be found on the web page: http://www.cs.ccu.edu.tw/~hyl/mrdct/.
|
MMSP-1.7
|
An Efficient Low-Dimensional Color Indexing Scheme for Region-Based Image Retrieval
Yining Deng,
B. S. Manjunath (ECE Dept., Univ. of California, Santa Barbara)
In this work, an efficient low-dimensional color indexing scheme for region-based
image retrieval is presented. The colors in each image region are first quantized
so that only a small number of cluster centroids are needed to represent the region
color information. The proposed color feature descriptor consists of these quantized
colors and their percentages in the region. A similarity distance measure is defined
and shown to be equivalent to the quadratic color histogram distance measure. The
quantized colors are indexed in the 3-D color space so that high-dimensional indexing
can be avoided. During the search process, each quantized color in the query is used
as a separate cue to find matches containing that color. The matches from all the
query colors are then joined to obtain the final retrievals. Experimental results
show that the proposed scheme is fast and accurate compared to the color histogram
approach.
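The descriptor and its matching can be sketched as follows. This is only an illustration under stated assumptions: a descriptor is taken to be a list of (color, percentage) pairs, the distance is the quadratic-form histogram distance evaluated only over the colors present, the similarity coefficient `1 - d / d_max` and the value of `d_max` are hypothetical choices, and "joined" is read here as a union of per-color matches. A linear scan stands in for the 3-D color index.

```python
import math

def color_similarity(c1, c2, d_max):
    """Coefficient in [0, 1]: 1 for identical colors, 0 at or beyond d_max."""
    return max(0.0, 1.0 - math.dist(c1, c2) / d_max)

def cross_term(desc_a, desc_b, d_max):
    """Sum of p_a * p_b * similarity over all color pairs of two descriptors."""
    return sum(pa * pb * color_similarity(ca, cb, d_max)
               for ca, pa in desc_a for cb, pb in desc_b)

def quadratic_distance(desc_p, desc_q, d_max=50.0):
    """(h - g)^T A (h - g) expanded over the colors actually present."""
    return (cross_term(desc_p, desc_p, d_max)
            + cross_term(desc_q, desc_q, d_max)
            - 2.0 * cross_term(desc_p, desc_q, d_max))

def candidate_regions(query, database, radius):
    """Use each query color as a separate cue and take the union of matches.

    database maps a region name to its descriptor.
    """
    found = set()
    for query_color, _ in query:
        for name, descriptor in database.items():
            if any(math.dist(query_color, c) <= radius for c, _ in descriptor):
                found.add(name)
    return found
```

Only the candidates returned by the per-color lookup would then need to be ranked with `quadratic_distance`, which is what keeps the search in the low-dimensional color space.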
|
MMSP-1.8
|
Vector-Wavelet Based Scalable Indexing And Retrieval Systems For Large Color Image Archives
Elif Albuz,
Erturk D Kocalar,
Ashfaq A Khokhar (University of Delaware)
This paper presents an efficient content based
indexing and retrieval mechanism based on vector
wavelet coefficients of color images. We use highly
decorrelated wavelet coefficient planes to acquire
a search efficient feature space. The feature space
is subsequently indexed using properties of all the
images in the database. Therefore the feature key
of an image corresponds not only to the content
of the image itself but also to how much the image
differs from the other images being stored in the
database. The search time depends only on the number
of images similar to the query image but not on the
size of the entire database. The system is scalable
and provides fast retrievals. We show that in a
database of 1000 images, query search takes less than
50 msec, on a 266 MHz Pentium processor compared to
several seconds of retrieval time in the earlier
systems proposed in the literature.
|
|