Chair: Michael Orchard, University of Illinois at Urbana-Champaign (USA)
Hiroshi Ito, University of Maryland (USA)
Nariman Farvardin, University of Maryland (USA)
New intra- and inter-frame coding techniques for coding a wavelet-decomposed video signal are proposed. These methods are based on the required accuracy of motion compensation for each of the different frequency subbands. For each subband, our methods prohibit inter-frame coding and switch to intra-frame coding when the estimated motion vector is not sufficiently accurate. The simple switching mechanism adopted results in a low overhead for coding the adaptation parameters while keeping the loss in coding gain negligible, compared to an unconstrained adaptive strategy in which intra- and inter-frame coding can be switched independently in each subband. The required accuracy of motion compensation is derived analytically for equally distributed pass-band signals and used throughout the paper. Simulation results for practical video sequences are presented.
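A minimal sketch of the per-subband switching rule described above; the function name and threshold values are illustrative assumptions, not the accuracy bounds derived in the paper:

```python
# Illustrative sketch: per-subband intra/inter mode switching based on
# motion-vector accuracy. Threshold values below are assumptions chosen
# for demonstration, not the analytically derived requirements.

def choose_mode(subband_level, mv_error, thresholds):
    """Switch to intra-frame coding when the estimated motion vector is
    not accurate enough for this subband; otherwise use inter-frame coding.
    `thresholds` maps subband level -> maximum tolerable MV error."""
    return "intra" if mv_error > thresholds[subband_level] else "inter"

# Higher-frequency subbands tolerate less motion-vector error, so their
# thresholds are tighter (hypothetical values, in pixels).
thresholds = {0: 1.0, 1: 0.5, 2: 0.25}
```

Because the decision per subband is a single binary flag, the side information needed to signal the chosen mode stays small, consistent with the low-overhead claim above.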
R.S. Jasinschi, Carnegie Mellon University (USA)
J.M.F. Moura, Carnegie Mellon University (USA)
J.C. Cheng, Carnegie Mellon University (USA)
A. Asif, Carnegie Mellon University (USA)
Current video compression standards, such as MPEG-1 and MPEG-2, compress video sequences at NTSC quality with factors in the range of 10-100. To operate beyond this range, as targeted by MPEG-4, radically new techniques are needed. We discuss here one such technique, called Generative Video (GV). Video compression is realized in GV in two steps. First, the video sequence is reduced to constructs. Constructs are world images, i.e., augmented images containing the non-redundant information in the sequence, together with window, figure, motion, and signal-processing operators representing video sequence properties. Second, the world images are spatially compressed. The video sequence is reconstructed by applying the various operators to the decompressed world images. We apply GV to a 10-second video sequence of a real 3-D scene and obtain compression ratios of about 2260 and 4520 in two experiments with different quantization codebooks. The reconstructed video sequence exhibits very good perceptual quality.
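A toy sketch of the reconstruction step above, assuming a pure horizontal camera pan so that the window operator reduces to a translated crop of the world image (function names and the crop model are illustrative, not GV's actual operators):

```python
# Illustrative sketch of Generative-Video-style reconstruction under the
# simplifying assumption of a horizontal pan: each frame is recovered by
# applying a window operator (here, a crop) translated by the per-frame
# motion offset to the decompressed world image.

def reconstruct_sequence(world_image, offsets, win_w):
    """Cut each frame out of the world image with a window of width
    `win_w`, shifted by the per-frame horizontal offset."""
    return [[row[dx:dx + win_w] for row in world_image] for dx in offsets]

# A 2x10 'world image' and three frames panning across it.
world = [list(range(10)) for _ in range(2)]
frames = reconstruct_sequence(world, offsets=[0, 2, 4], win_w=4)
```

The point of the sketch is that only the world image plus a handful of operator parameters (here, three offsets) need be stored, rather than every frame.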
Richard R. Schultz, University of Notre Dame (USA)
Robert L. Stevenson, University of Notre Dame (USA)
The human visual system seems to be capable of temporally integrating information in a video sequence in such a way that the perceived spatial resolution of a sequence appears much higher than the spatial resolution of an individual frame. This paper addresses how to utilize both the spatial and temporal information present in an image sequence to create a high-resolution video still. A novel observation model based on motion-compensated subsampling is proposed for a video sequence. Since the reconstruction problem is ill-posed, Bayesian restoration with an edge-preserving prior image model is used to extract a high-resolution video frame from a low-resolution sequence. Estimates computed from an image sequence containing a camera pan show dramatic improvement over bilinear, cubic B-spline, and Bayesian single-frame interpolations. Improved definition is also shown for a video sequence containing objects moving with independent trajectories.
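The idea can be sketched in one dimension: several subsampled, sub-pixel-shifted observations of the same signal are fused by minimizing a MAP cost. This is a simplified illustration, assuming known integer shifts on the high-resolution grid and substituting a quadratic smoothness prior for the paper's edge-preserving model:

```python
# 1-D sketch of MAP super-resolution from motion-compensated, subsampled
# frames. Assumptions (not from the paper): integer hi-res shifts, circular
# boundaries, and a quadratic smoothness prior instead of the
# edge-preserving prior used in the actual work.

def map_restore(lowres_frames, shifts, factor, iters=200, lam=0.1, step=0.2):
    """Gradient descent on a MAP cost: a data term ties the high-res
    estimate to each motion-compensated, subsampled frame; `lam` weights
    the smoothness prior."""
    n = len(lowres_frames[0]) * factor
    x = [0.0] * n
    for _ in range(iters):
        grad = [0.0] * n
        for frame, s in zip(lowres_frames, shifts):
            for i, y in enumerate(frame):
                j = (i * factor + s) % n          # motion-compensated site
                grad[j] += x[j] - y               # data-fidelity gradient
        for j in range(n):                        # smoothness-prior gradient
            grad[j] += lam * (2 * x[j] - x[j - 1] - x[(j + 1) % n])
        x = [xj - step * g for xj, g in zip(x, grad)]
    return x
```

With two frames whose shifts interleave on the fine grid, every high-resolution sample is observed and the estimate lands close to the underlying signal, which is the mechanism behind the improvement over single-frame interpolation.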
Haluk Aydinoglu, Georgia Institute of Technology (USA)
Faouzi Kossentini, Georgia Institute of Technology (USA)
Monson H. Hayes III, Georgia Institute of Technology (USA)
In this paper, we propose a new framework for the compression of multi-view image sequences. We define three types of frames, and each type is coded with a different strategy. The first type of frame is independently coded and is called an I-frame. The second is a B-frame and is coded using a bi-directional disparity estimator and a modified version of the Subspace Projection Technique (SPT), as proposed in [1]. The SPT algorithm compensates for the photometric variations between the multi-view frames. The projection block size is chosen to be small so that coding of the residual image is not necessary. On the other hand, to decrease the overhead, both disparity vectors and projection coefficients are coded with a lossy scheme. Finally, the third type of frame is a P-frame and is coded by employing a uni-directional disparity estimator and dc level compensation.
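The three frame types can be assigned by position within a group of views; the dispatch below is an illustrative policy (first view intra, last view uni-directionally predicted, middle views bi-directionally predicted), not necessarily the exact assignment used in the paper:

```python
# Illustrative frame-type assignment for a multi-view group. The grouping
# policy is an assumption for demonstration purposes.

def frame_type(view_index, group_size=3):
    """Assign I/B/P types across a group of views."""
    pos = view_index % group_size
    if pos == 0:
        return "I"   # coded independently
    if pos == group_size - 1:
        return "P"   # uni-directional disparity + dc level compensation
    return "B"       # bi-directional disparity + SPT projection
```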
Fu-Huei Lin, Georgia Institute of Technology (USA)
Russell M. Mersereau, Georgia Institute of Technology (USA)
In this paper, we present an objective video quality measure that correlates well with subjective tests. We then incorporate this objective measure into the design of an MPEG encoder. The new MPEG encoder extracts four features (bit rate, a feature that measures blockiness, one that measures false edges, and one that measures blurred edges) from the input and output video sequences and feeds them into a four-layer back-propagation neural network that has been trained by subjective testing. The system then uses a simple feedback technique to adjust the GOP (group of pictures) bit-rate to achieve output video of constant subjective quality.
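The feedback step might look like the proportional controller sketched below; the gain, clamping range, and function name are illustrative assumptions, since the abstract does not specify the control law:

```python
# Illustrative proportional feedback on the GOP bit-rate: raise the rate
# when the network's predicted subjective quality falls short of the
# target, lower it when quality exceeds the target. Gain and rate limits
# are assumed values for demonstration.

def adjust_gop_bitrate(bitrate, predicted_quality, target_quality,
                       gain=0.1, lo=0.5e6, hi=8e6):
    """Return the bit-rate (bits/s) for the next GOP, clamped to a
    valid range."""
    new_rate = bitrate * (1.0 + gain * (target_quality - predicted_quality))
    return max(lo, min(hi, new_rate))
```

Driving the loop with the neural network's quality prediction, rather than a pixel-error metric like PSNR, is what steers the encoder toward constant perceived quality.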
P.B. Penafiel, Catholic University of America (USA)
N.M. Namazi, Catholic University of America (USA)
This paper introduces a new framework for video compression. The proposed method accounts for noise directly in the video sequence and seeks the best trade-off between compression ratio and video quality. Compression is achieved by eliminating the spatial and temporal redundancies found in the intensity and motion fields of the video. Processing is performed on blocks of N frames stored in a video buffer. The encoder and decoder are synchronized prior to the transmission of a new block. A reference frame is chosen from each block and encoded before transmission. Spatial redundancies in the intensity domain are reduced by a wavelet filter. The pixel-motion field between the reference frame and the other frames in a block is evaluated using a Kalman filter that estimates the pixel motion in the presence of noise. Video frames are predicted from the reference frame and the corresponding motion field. Prediction errors, motion vectors, and the reference frame are compressed in the wavelet domain before transmission. The compression system includes quantization and entropy coding.
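The role of the Kalman filter can be illustrated with a scalar toy version: the state is a pixel's displacement modeled as a random walk, observed through noisy measurements. This is a simplification for intuition, not the paper's actual filter formulation:

```python
# Scalar Kalman step for a pixel's motion (toy model): state = displacement,
# random walk with process variance q, noisy measurement z with variance r.
# The model and variance values are illustrative assumptions.

def kalman_motion_step(x, p, z, q=0.01, r=0.5):
    """One predict/correct cycle; returns the updated estimate and its
    error variance."""
    p_pred = p + q                    # predict: motion drifts as a random walk
    k = p_pred / (p_pred + r)         # Kalman gain
    x_new = x + k * (z - x)           # correct with the noisy measurement
    p_new = (1.0 - k) * p_pred
    return x_new, p_new
```

Repeated updates pull the estimate toward the true displacement while the error variance shrinks, which is why the motion field stays usable even when individual measurements are noisy.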
Oh-Jin Kwon, University of Maryland (USA)
Rama Chellappa, University of Maryland (USA)
Carlos Morimoto, University of Maryland (USA)
We improve the performance of conventional motion-compensated Discrete Cosine Transform video coding. For motion compensation, we employ a two-step algorithm in which the camera motion is compensated first and then the motion of moving objects is estimated. We use a feature-matching algorithm for camera motion compensation. Motion-compensated frame differences are divided into three regions: stationary background, moving objects, and newly emerging areas. A region-adaptive subband image coding scheme is used for spatial coding of these regions.
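A sketch of the three-way region labeling on the camera-compensated frame difference; the energy threshold and the visibility flag are illustrative assumptions, not the paper's actual decision rule:

```python
# Illustrative classifier for blocks of the camera-compensated frame
# difference. `t_low` and the `covered_before` flag (was this area visible
# in the previous frame?) are assumptions for demonstration.

def classify_region(block_energy, covered_before, t_low=10.0):
    """Low residual energy -> stationary background; high energy in an
    area visible before -> moving object; high energy in an area not
    previously visible -> newly emerging area."""
    if block_energy < t_low:
        return "stationary background"
    return "moving object" if covered_before else "newly emerging area"
```

Separating the three region types lets the subband coder spend bits where prediction fails (moving objects, emerging areas) rather than on the well-predicted background.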
Jiro Katto, NEC Corporation (JAPAN)
Mutsumi Ohta, NEC Corporation (JAPAN)
This paper presents a novel framework that proves the superiority of overlapped motion compensation. The window design problem is revisited by introducing a statistical model of the motion estimation process. The result clarifies the relationship between the optimum window and image characteristics in an explicit formula and quantifies the prediction error reduction achieved by overlapped motion compensation. Experimental results using real image sequences support the proposed theory and demonstrate its superiority. Overlapping in warping prediction is also considered and its effectiveness is shown.
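The key constraint on any such window is that the shifted copies used by adjacent blocks sum to one in the overlap region. The raised-cosine window below is a common illustrative choice satisfying that constraint, not the paper's derived optimum:

```python
import math

# Illustrative raised-cosine window for overlapped motion compensation.
# Shifted copies (step = block) sum to one, so blended block predictions
# keep unit gain. This is a standard example window, not the optimum
# window derived in the paper.

def raised_cosine_window(block, overlap):
    """Window of length block + overlap: a cosine ramp-up over `overlap`
    samples, a flat unity section, then the matching ramp-down."""
    n = block + overlap
    w = []
    for i in range(n):
        if i < overlap:                                   # ramp up
            w.append(0.5 - 0.5 * math.cos(math.pi * (i + 0.5) / overlap))
        elif i < block:                                   # flat interior
            w.append(1.0)
        else:                                             # ramp down
            w.append(0.5 + 0.5 * math.cos(math.pi * (i - block + 0.5) / overlap))
    return w
```

Each block's motion-compensated prediction is multiplied by this window and accumulated, so pixels in the overlap receive a weighted mix of two predictions, smoothing blocking artifacts at motion boundaries.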
Kostantinos Konstantinides, Hewlett-Packard Labs (USA)
Gregory S. Yovanof, Hewlett-Packard Labs (USA)
Video capture devices, such as CCD cameras, are a significant source of noise in image sequences. Pre-processing video sequences with spatial filtering techniques usually improves their compressibility. In this paper we present a block-based, non-linear filtering algorithm based on singular value decomposition (SVD) and compression-based filtering. A novel noise estimation algorithm allows us to operate on the input data without any prior knowledge of either the noise or signal characteristics. Experiments with real video sequences and an MPEG codec have shown that SVD-based filters preserve edge details and can improve nearly-lossless compression ratios by about 15%.
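The core SVD-filtering idea is to keep a block's strongest singular components and discard the weak ones as noise. The pure-Python sketch below computes only the dominant rank-1 component via power iteration; a real implementation would keep as many components as the estimated noise level allows:

```python
import math

# Illustrative SVD-based block filtering: approximate a block by its
# dominant singular component (power iteration), treating the discarded
# subspace as noise. Rank-1 truncation is a simplification; the actual
# retained rank would depend on the estimated noise level.

def rank1_approx(block, iters=100):
    """Return sigma_1 * u_1 * v_1^T, the best rank-1 approximation of
    the block (a list of lists of floats)."""
    rows, cols = len(block), len(block[0])
    v = [1.0] * cols
    for _ in range(iters):
        u = [sum(block[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        nu = math.sqrt(sum(x * x for x in u)) or 1.0
        u = [x / nu for x in u]                 # left singular vector
        v = [sum(block[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        sigma = math.sqrt(sum(x * x for x in v)) or 1.0
        v = [x / sigma for x in v]              # right singular vector
    return [[sigma * u[i] * v[j] for j in range(cols)] for i in range(rows)]
```

On a block whose second singular value is small relative to the first, the weak component (here standing in for noise) is suppressed while the dominant structure survives.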
Andrew J. Patti, University of Rochester (USA)
M. Ibrahim Sezan, Eastman Kodak Company (USA)
A. Murat Tekalp, University of Rochester (USA)
With the advent of frame grabbers capable of acquiring multiple video frames, a great deal of attention is being directed at creating high-resolution (hi-res) imagery from interlaced or low-resolution (low-res) video. This is a multi-faceted problem, which generally necessitates standards conversion and hi-res reconstruction. Standards conversion is the problem of converting from one spatio-temporal sampling lattice to another, while hi-res image reconstruction involves increasing the spatial sampling density. Also of interest is removing degradations that occur during the image acquisition process. These tasks have all received considerable, yet separate, treatment in the literature. Here, a unifying video formation model is presented which addresses these problems simultaneously. Then, a POCS-based algorithm for generating high-resolution imagery from video is delineated. Results with real imagery are included.
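The POCS iteration can be sketched in one dimension by treating each low-resolution sample as a convex constraint on the high-resolution estimate. The box-average acquisition model and the projection form below are assumed simplifications of the paper's video formation model:

```python
# 1-D POCS sketch: each low-res observation constrains the mean of the
# `factor` hi-res pixels it covers to lie within `delta` of the observed
# value. We cycle projections onto these convex sets. The box-average
# model is an illustrative stand-in for the paper's acquisition model.

def pocs_restore(obs, factor, n_iters=50, delta=0.0):
    """Iteratively project a hi-res estimate onto the constraint set of
    each low-res observation."""
    x = [0.0] * (len(obs) * factor)
    for _ in range(n_iters):
        for i, y in enumerate(obs):
            idx = range(i * factor, (i + 1) * factor)
            m = sum(x[j] for j in idx) / factor
            r = m - y
            if abs(r) > delta:                  # outside the set: project
                corr = (abs(r) - delta) * (1 if r < 0 else -1)
                for j in idx:
                    x[j] += corr
    return x
```

A nonzero `delta` widens each constraint set to account for noise in the acquisition, which is how degradations enter the same framework rather than needing separate treatment.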