Chair: A. Katsaggelos, Northwestern University (USA)
Ezzatollah Salari, University of Toledo (USA)
Sheng Lin, University of Toledo (USA)
In this paper, a new motion compensated predictive coding scheme based on object region segmentation is proposed for image sequence coding at low bit-rates. The motion compensated prediction involves segmentation, motion detection, and motion estimation for moving objects. Segmentation is carried out on the reconstructed images in both the encoder and the decoder, which eliminates the need to transmit region shape information. Motion vector prediction is also performed in both the encoder and decoder, leading to a significant reduction of the overhead for motion information. Motion compensated prediction errors are transformed using the Discrete Cosine Transform (DCT), and the coefficients are quantized and entropy coded as recommended by CCITT. Computer simulations show that the proposed coding algorithm significantly reduces the blocking artifacts, which are the dominant distortion associated with conventional block matching algorithms at low bit-rates.
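As a point of reference, the following is a minimal sketch of the conventional baseline the paper improves upon: full-search block matching followed by an 8x8 DCT of the prediction error and uniform quantization. It is not the authors' segmentation-based scheme; the function names, search range, and quantizer step are illustrative assumptions.

```python
import numpy as np

B = 8  # transform/block size (illustrative)

def dct_matrix(n=B):
    """Orthonormal DCT-II basis matrix."""
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0, :] /= np.sqrt(2)
    return C * np.sqrt(2.0 / n)

C = dct_matrix()

def block_match(cur, ref, y, x, search=7):
    """Full-search SAD block matching; returns the best (dy, dx)."""
    blk = cur[y:y+B, x:x+B].astype(float)
    best, mv = np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            yy, xx = y + dy, x + dx
            if 0 <= yy <= ref.shape[0] - B and 0 <= xx <= ref.shape[1] - B:
                sad = np.abs(blk - ref[yy:yy+B, xx:xx+B]).sum()
                if sad < best:
                    best, mv = sad, (dy, dx)
    return mv

def encode_block(cur, ref, y, x, qstep=16):
    """Motion-compensate one block and return (motion vector, quantized DCT)."""
    dy, dx = block_match(cur, ref, y, x)
    resid = cur[y:y+B, x:x+B].astype(float) - ref[y+dy:y+dy+B, x+dx:x+dx+B]
    coeff = C @ resid @ C.T                  # 2-D DCT of the prediction error
    return (dy, dx), np.round(coeff / qstep).astype(int)
```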
Kristine Matthews, Catholic University of America (USA)
Nader Namazi, Catholic University of America (USA)
We have formulated and evaluated a binary hypothesis test for the detection of uncovered background pixels between image frames in a noisy image sequence where we assume additive, white, gaussian noise. We have extended the binary hypothesis test to a 3-ary hypothesis test to allow for the segmentation of the image into three regions: uncovered background, stationary and moving pixels. We have evaluated both the binary and 3-ary hypothesis tests using a single measurement and multiple measurements for classifying each pixel on synthetic images, and we have evaluated the 3-ary hypothesis test on the Trevor image sequence.
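For illustration, a minimal sketch of a per-pixel change test on the frame difference under additive white Gaussian noise of known variance follows. The threshold form and the single-measurement rule are illustrative assumptions; the paper's exact test and its multiple-measurement and 3-ary extensions are not reproduced here.

```python
import numpy as np

def detect_change(prev, cur, sigma, k=3.0):
    """Per-pixel binary test: 1 = changed, 0 = unchanged (stationary).

    Under H0 (no change) the frame difference d = cur - prev is zero-mean
    Gaussian with variance 2*sigma^2, so |d| > k*sqrt(2)*sigma rejects H0
    at roughly the k-sigma level.
    """
    d = cur.astype(float) - prev.astype(float)
    return (np.abs(d) > k * np.sqrt(2.0) * sigma).astype(int)
```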
Chung-Lin Huang, National Tsing Hua University (REPUBLIC OF CHINA)
Jhy-Gau Chen, National Tsing Hua University (REPUBLIC OF CHINA)
A new method called the Complex Subband Transform (CST) is introduced for subband signal decomposition, motion estimation, and subband image sequence coding. Two subband-based methods have been proposed that combine the process of motion compensation with the process of subband decomposition, namely out-band and in-band compensation. In this paper, we modify the M-band Perfect Reconstruction Modulation Filter (PRMF) and propose the Complex Subband Transform (CST). We also derive a method using CST-based phase correlation functions to calculate motion vectors for overlapped windowed regions. Experiments show that CST-based motion estimation generates a more accurate and coherent motion field than conventional block matching methods.
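As background, a minimal sketch of classical full-band phase-correlation motion estimation between two image windows is given below; the paper computes a related correlation in the complex subband (CST) domain, which is not reproduced here. Window handling and peak interpolation details are omitted.

```python
import numpy as np

def phase_correlation_shift(win1, win2):
    """Integer displacement d such that win2 is approximately win1 shifted by d."""
    F1 = np.fft.fft2(win1)
    F2 = np.fft.fft2(win2)
    cross = F2 * np.conj(F1)
    cross /= np.abs(cross) + 1e-12            # keep only the phase information
    corr = np.fft.ifft2(cross).real           # correlation surface with a peak at d
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    dy = peak[0] - corr.shape[0] if peak[0] > corr.shape[0] // 2 else peak[0]
    dx = peak[1] - corr.shape[1] if peak[1] > corr.shape[1] // 2 else peak[1]
    return dy, dx
```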
Wu Chou, AT&T Bell Laboratories (USA)
Homer H. Chen, AT&T Bell Laboratories (USA)
In this paper, we discuss issues related to acoustic-assisted image coding and animation. A talker-independent acoustic-assisted image coding and animation scheme is studied, and a perceptually based sliding-window encoder is proposed. It utilizes the high-rate (or oversampled) viseme sequence from the audio domain for image-domain viseme interpolation and smoothing. The image-domain visemes in our approach are dynamically constructed from a set of basic visemes. The look-ahead and look-back moving interpolations of the proposed approach provide an effective way to compensate for the mismatch between auditory and visual perception.
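To illustrate the sliding-window idea only, the sketch below blends each frame's viseme mixing vector with its look-back and look-ahead neighbours. The window length and the triangular weighting are illustrative assumptions, not the paper's perceptual weighting.

```python
import numpy as np

def smooth_visemes(weights, look_back=2, look_ahead=2):
    """Blend each frame's viseme mixing vector with its neighbours in a window.

    weights: (T, V) array, one mixing vector over V basic visemes per frame.
    """
    T = weights.shape[0]
    taps = np.arange(-look_back, look_ahead + 1)
    kern = 1.0 - np.abs(taps) / (max(look_back, look_ahead) + 1.0)  # triangular
    kern /= kern.sum()
    out = np.zeros_like(weights, dtype=float)
    for t in range(T):
        for k, w in zip(taps, kern):
            out[t] += w * weights[int(np.clip(t + k, 0, T - 1))]
    return out
```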
Yui-Lam Chan, Hong Kong Polytechnic University (HONG KONG)
Wan-Chi Siu, Hong Kong Polytechnic University (HONG KONG)
A new adaptive technique based on pixel decimation for estimating motion vectors is presented. In the traditional approach, uniform pixel decimation is used; since some of the pixels in each block do not enter into the matching criterion, the accuracy of the motion vector is limited. In this paper, we select the most representative pixels in each block, based on image content, for the matching criterion. This exploits the fact that high-activity areas of the luminance signal, such as edges and texture, contribute most to the matching criterion. Our approach compensates for this drawback of standard pixel decimation techniques. Computer simulations show that the technique approaches the performance of the exhaustive search with a significant reduction in computational complexity.
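A minimal sketch of content-adaptive pixel decimation follows: only the pixels with the highest local luminance activity (here, a simple gradient magnitude) in each block enter the SAD criterion. The activity measure and the fraction of pixels kept are illustrative assumptions, not the paper's selection rule.

```python
import numpy as np

def active_pixel_mask(block, keep_fraction=0.25):
    """Boolean mask selecting the highest-activity pixels of a block."""
    gy, gx = np.gradient(block.astype(float))
    activity = np.hypot(gx, gy)                    # simple gradient magnitude
    k = max(1, int(keep_fraction * block.size))
    thresh = np.partition(activity.ravel(), -k)[-k]
    return activity >= thresh

def decimated_sad(cur_block, ref_block, mask):
    """SAD evaluated only on the selected pixels."""
    return np.abs(cur_block[mask].astype(float) - ref_block[mask]).sum()
```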
Fabrice Moscheni, Swiss Federal Institute of Technology (SWITZERLAND)
Frederic Dufaux, Massachusetts Institute of Technology (USA)
Murat Kunt, Swiss Federal Institute of Technology (SWITZERLAND)
In the framework of sequence coding, motion estimation and compensation have been shown to be very efficient at removing temporal redundancy. The motion in a scene can mainly be seen as arising from local motions superimposed on the camera motion. In this paper, a new two-stage global/local motion estimation approach is presented. The global motion estimation relies only on the background information; it is based on a matching technique, and the global motion model is chosen to be affine. Simulation results show significant improvements of the proposed method over conventional methods.
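To illustrate the affine model only, the sketch below fits a 6-parameter affine global-motion model to point correspondences taken from background regions by linear least squares. The paper estimates the global motion with a matching technique; this is not that estimator, and the variable names are assumptions.

```python
import numpy as np

def fit_affine(points, displaced):
    """Fit x' = a1*x + a2*y + a3, y' = a4*x + a5*y + a6 by least squares.

    points, displaced: (N, 2) arrays of (x, y) background positions in the
    reference and current frames.
    """
    x, y = points[:, 0].astype(float), points[:, 1].astype(float)
    A = np.column_stack([x, y, np.ones_like(x)])
    ax, _, _, _ = np.linalg.lstsq(A, displaced[:, 0], rcond=None)
    ay, _, _, _ = np.linalg.lstsq(A, displaced[:, 1], rcond=None)
    return np.concatenate([ax, ay])                # (a1, a2, a3, a4, a5, a6)
```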
Laurent Bonnaud, IRISA/INRIA (FRANCE)
Claude Labit, IRISA/INRIA (FRANCE)
Janusz Konrad, INRS-Telecom (FRANCE)
This paper presents a new temporal interpolation algorithm based on segmentation of images into polygonal regions undergoing affine motion. The goal of this work is to improve upon the block-based interpolation used in MPEG (B-frames). In the first part, we briefly describe the region-based framework and the temporal linking algorithm that jointly provide the segmentation and motion parameters. In the second part, we present various applications of the proposed algorithm to temporal interpolative prediction. We examine one of these schemes in detail, including the special processing of occlusion areas. Results are illustrated with predicted images, and their quality is compared against other schemes using the MSE criterion.
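As an illustration of motion-compensated temporal interpolation of a single region under an affine motion model, a minimal sketch follows. The nearest-neighbour pixel fetching and the simple occlusion-free blend are illustrative assumptions; the paper's occlusion processing and region linking are not reproduced.

```python
import numpy as np

def interpolate_region(f0, f1, mask, affine, alpha=0.5):
    """Predict one region of the frame at temporal fraction alpha between f0 and f1.

    mask: boolean region mask on the interpolated frame's grid.
    affine: 2x3 matrix mapping homogeneous (x, y, 1) in f0 to (x', y') in f1.
    """
    h, w = f0.shape
    out = np.zeros_like(f0, dtype=float)
    ys, xs = np.nonzero(mask)
    p = np.stack([xs, ys, np.ones_like(xs)]).astype(float)   # 3 x N
    d = affine @ p - p[:2]                                    # displacement f0 -> f1
    x0 = np.clip(np.round(xs - alpha * d[0]).astype(int), 0, w - 1)
    y0 = np.clip(np.round(ys - alpha * d[1]).astype(int), 0, h - 1)
    x1 = np.clip(np.round(xs + (1 - alpha) * d[0]).astype(int), 0, w - 1)
    y1 = np.clip(np.round(ys + (1 - alpha) * d[1]).astype(int), 0, h - 1)
    out[ys, xs] = (1 - alpha) * f0[y0, x0] + alpha * f1[y1, x1]
    return out
```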
Baldine-Brunel Paul, Georgia Institute of Technology (USA)
Monson H. Hayes III, Georgia Institute of Technology (USA)
An approach is presented for the compression of motion video sequences using Iterated Function Systems (IFS). In the proposed approach, the video stream is partitioned into three-dimensional range regions. Each range region consists of a variable number of rectangular blocks that belong to consecutive frames along a motion trajectory. Our approach exploits the correlation between consecutive blocks in the direction of motion by predicting the IFS map of a given range block from that of a parent range block along the trajectory of motion. The proposed approach shows good promise for efficient modeling and compression of motion video sequences at an affordable computational cost.
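For orientation, a minimal sketch of the intensity part of a single IFS (fractal) map is given below: for a range block R and a spatially contracted domain block D, find the contrast scaling s and brightness offset o minimizing ||s*D + o - R||^2. The 3-D range partitioning along motion trajectories and the prediction of maps between consecutive blocks are not reproduced; the names and block sizes are illustrative.

```python
import numpy as np

def contract(domain):
    """2x spatial contraction of a domain block by 2x2 averaging."""
    return 0.25 * (domain[0::2, 0::2] + domain[1::2, 0::2] +
                   domain[0::2, 1::2] + domain[1::2, 1::2])

def fit_ifs_map(range_block, domain_block):
    """Least-squares contrast/brightness map (s, o) and its collage error.

    domain_block is assumed to be twice the size of range_block in each dimension.
    """
    D = contract(domain_block.astype(float)).ravel()
    R = range_block.ravel().astype(float)
    A = np.column_stack([D, np.ones_like(D)])
    (s, o), *_ = np.linalg.lstsq(A, R, rcond=None)
    err = np.sum((s * D + o - R) ** 2)
    return s, o, err
```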
Yuzo Senda, NEC Corporation (JAPAN)
Hidenobu Harasaki, NEC Corporation (JAPAN)
Mitsuharu Yano, NEC Corporation (JAPAN)
We propose a simplified motion estimation method which provides motion vectors for all types of motion compensation used in MPEG-2. The method results from applying a newly introduced approximation to the canonical three-step method described in the MPEG-2 Test Model. It reduces the number of computations required in the second and third steps to less than 1%, and the number of data transfers to about 8%, of those of the canonical method. The total complexity of the proposed method is nearly that of full-pel search motion estimation.
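For reference, a minimal sketch of the canonical three-step search baseline that the paper approximates is shown below; the proposed simplification itself is not reproduced. The block size, step sizes, and SAD cost are the usual textbook choices.

```python
import numpy as np

def sad(cur, ref, y, x, dy, dx, B=16):
    """SAD cost of displacing the block at (y, x) by (dy, dx); inf if out of frame."""
    yy, xx = y + dy, x + dx
    if yy < 0 or xx < 0 or yy + B > ref.shape[0] or xx + B > ref.shape[1]:
        return np.inf
    return np.abs(cur[y:y+B, x:x+B].astype(float) - ref[yy:yy+B, xx:xx+B]).sum()

def three_step_search(cur, ref, y, x, B=16):
    """Refine the motion vector with successive step sizes 4, 2, 1."""
    best_dy, best_dx = 0, 0
    for step in (4, 2, 1):
        candidates = [(best_dy + sy * step, best_dx + sx * step)
                      for sy in (-1, 0, 1) for sx in (-1, 0, 1)]
        costs = [sad(cur, ref, y, x, dy, dx, B) for dy, dx in candidates]
        best_dy, best_dx = candidates[int(np.argmin(costs))]
    return best_dy, best_dx
```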
Yucel Altunbasak, University of Rochester (USA)
A. Murat Tekalp, University of Rochester (USA)
Gozde Bozdagi, University of Rochester (USA)
We present a new framework for combining maximum likelihood (ML) stereo-motion fusion with adaptive iterated extended Kalman filtering (IEKF) for 3-D motion tracking. The ML stereo-motion fusion step, using two stereo pairs, generates observations of 3-D feature matches to be used by the IEKF step. The IEKF step, in turn, computes updated 3-D motion parameter estimates to be used by the ML stereo-motion fusion step. The covariance of the observation noise process is regulated by the value of the ML cost function to address occlusion-related problems. The proposed simultaneous approach is compared with performing the 3-D feature correspondence estimation and the Kalman filtering separately, using simulated stereo imagery.
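To illustrate only the cost-regulated noise idea, the sketch below performs one linear Kalman measurement update in which the observation noise covariance is inflated according to a scalar residual cost, echoing the regulation of the noise covariance by the ML cost to de-weight unreliable (e.g., occluded) feature observations. The linear observation model and the inflation rule are illustrative assumptions, not the paper's IEKF.

```python
import numpy as np

def adaptive_kalman_update(x, P, z, H, R_base, ml_cost, gamma=1.0):
    """One linear Kalman measurement update with cost-regulated observation noise.

    x: state estimate, P: state covariance, z: observation vector,
    H: observation matrix, R_base: nominal observation noise covariance,
    ml_cost: scalar value of the fit cost for this observation.
    """
    R = R_base * (1.0 + gamma * ml_cost)     # inflate R when the match is poor
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x_new = x + K @ (z - H @ x)
    P_new = (np.eye(len(x)) - K @ H) @ P
    return x_new, P_new
```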