Session: IMDSP-L3
Time: 1:00 - 3:00, Wednesday, May 9, 2001
Location: Room 151
Title: Video Coding
Chair: Phil Chou

1:00, IMDSP-L3.1
ρ-DOMAIN SOURCE MODELING AND RATE CONTROL FOR VIDEO CODING AND TRANSMISSION
Z. HE, Y. KIM, S. MITRA
In this work, the coding bit rate R is considered as a function of ρ, the percentage of zeros among the quantized DCT coefficients. We find that the rate function R(ρ) has some very interesting properties in the ρ-domain. By introducing the new concepts of characteristic rate curves and rate curve decomposition, we propose a novel framework for source modeling. Using the proposed source model, we can estimate the rate-quantization (R-Q) curve before quantization and coding, with a relative error of less than 5%. Based on the estimated R-Q curve, the output bit rate of the video encoder can be accurately controlled. Our extensive simulation results show that the proposed algorithm outperforms the TMN8 and VM7 rate control algorithms by providing more accurate and robust rate control.
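
As a rough illustration of the quantity this abstract is built on, the sketch below computes the fraction of transform coefficients that quantize to zero for several quantizer step sizes. The function name `zero_fraction`, the `deadzone` parameter, and the sample coefficients are assumptions made for illustration; they are not taken from the paper.

```python
def zero_fraction(coeffs, step, deadzone=1.0):
    """Fraction of coefficients a uniform quantizer would map to zero.

    Assumption: a coefficient becomes zero when |c| < deadzone * step.
    Real encoders use codec-specific dead zones; this is a simplification.
    """
    threshold = deadzone * step
    zeros = sum(1 for c in coeffs if abs(c) < threshold)
    return zeros / len(coeffs)


# Hypothetical DCT coefficients of one block, for illustration only.
coeffs = [52.0, -31.5, 12.0, -7.5, 4.0, 3.0, -2.0, 1.5, 0.5, 0.0]

# The zero fraction can be tabulated for every candidate step size
# before any actual quantization or entropy coding takes place.
table = {step: zero_fraction(coeffs, step) for step in (2, 4, 8, 16)}
for step, fraction in table.items():
    print(f"step={step:2d}  zero fraction={fraction:.2f}")
```

Because the zero fraction never decreases as the step size grows, such a table gives a one-to-one mapping between step sizes and zero fractions, which is what makes it usable as the independent variable of a rate model.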

1:20, IMDSP-L3.2
A NOVEL LINEAR SOURCE MODEL AND A UNIFIED RATE CONTROL ALGORITHM FOR H.263 / MPEG-2 / MPEG-4
Y. KIM, Z. HE, S. MITRA
Let ρ be the percentage of zeros among the quantized transform coefficients. We find that, in any typical video coding system, there is always a strictly linear relationship between ρ and the actual coding bit rate R. This linearity leads to a novel and unified source model for different types of source data and different coding systems, such as H.263, MPEG-2, and MPEG-4. The proposed linear source model is much simpler, yet much more accurate, than other source models reported in the literature. Based on this source model, a unified rate control algorithm is proposed for the above three video coding systems. Despite its extreme simplicity, the proposed algorithm outperforms other rate control algorithms by providing more accurate and robust rate control.
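
A minimal sketch of how a linear rate model of this kind could drive rate control is given below. It assumes the line passes through zero bits when every coefficient is zero, so that bits = theta * (1 - zero fraction); that specific form, the helper names, and the sample numbers are assumptions for illustration, not the authors' algorithm.

```python
def zero_fraction(coeffs, step):
    """Fraction of coefficients with |c| < step (simplified dead zone)."""
    return sum(1 for c in coeffs if abs(c) < step) / len(coeffs)


def fit_slope(bits_used, zero_frac):
    """Estimate the slope theta from one coded frame.

    Assumption: bits = theta * (1 - zero_frac), i.e. zero bits are
    spent when every coefficient quantizes to zero.
    """
    return bits_used / (1.0 - zero_frac)


def pick_step(coeffs, target_bits, theta, steps):
    """Return the finest step whose predicted bits fit the budget."""
    for step in sorted(steps):
        predicted = theta * (1.0 - zero_fraction(coeffs, step))
        if predicted <= target_bits:
            return step
    return max(steps)  # budget cannot be met; fall back to coarsest step


# Hypothetical data: the previous frame cost 700 bits at zero fraction 0.3.
theta = fit_slope(bits_used=700, zero_frac=0.3)
coeffs = [52.0, -31.5, 12.0, -7.5, 4.0, 3.0, -2.0, 1.5, 0.5, 0.0]
chosen = pick_step(coeffs, target_bits=350, theta=theta, steps=(2, 4, 8, 16))
print(chosen)
```

The appeal of a one-parameter linear model is that the slope can be re-estimated after every coded frame or macroblock, so the controller adapts without any curve fitting.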

1:40, IMDSP-L3.3
REAL-TIME FOVEATION TECHNIQUES FOR H.263 VIDEO ENCODING IN SOFTWARE
H. SHEIKH, S. LIU, A. BOVIK, B. EVANS
Video coding techniques employ characteristics of the Human Visual System (HVS) to achieve high coding efficiency. S. Lee has exploited foveation, which is a non-uniform resolution representation of an image reflecting the sampling in the retina, for low bit-rate video coding. In this paper, we develop a fast approximation of the foveation model and demonstrate real-time foveation techniques in the spatial domain and the Discrete Cosine Transform (DCT) domain. We demonstrate that fast DCT domain foveation can be incorporated into the baseline H.263 video encoding standard with very low computational overhead. We also present a comparison of the two techniques in terms of their computational complexity as well as resulting bit rates. Our techniques do not require any modifications of the decoder.

2:00, IMDSP-L3.4
RATE SCALABLE VIDEO CODING USING A FOVEATION-BASED HUMAN VISUAL SYSTEM MODEL
Z. WANG, L. LU, A. BOVIK
Recently, there have been two interesting trends in image and video coding research. One is to use human visual system (HVS) models to improve current state-of-the-art coding algorithms by better exploiting the properties of the intended receiver. The other is to design rate scalable video codecs, which allow the extraction of coded visual information at continuously varying bit rates from a single compressed bitstream. In this paper, we follow these two trends and propose a foveation scalable video coding (FSVC) algorithm, which supplies good quality-compression performance as well as effective rate scalability to support simple and precise bit rate control. A foveation-based HVS model plays a key role in the algorithm. The algorithm is amenable to the inclusion of various HVS models and adaptable to different video communication applications.

2:20, IMDSP-L3.5
A MULTI-FRAME BLOCKING ARTIFACT REDUCTION METHOD FOR TRANSFORM-CODED VIDEO
Y. ALTUNBASAK, B. GUNTURK, R. MERSEREAU
A major drawback of block-based still image or video compression methods at low rates is visible block boundaries, also known as blocking artifacts. Several methods have been proposed in the literature to reduce these artifacts for video sequences. However, most are simply adaptations of still image blocking artifact reduction methods, which do not exploit temporal information. In this paper, we propose a novel multi-frame blocking artifact reduction method that incorporates temporal information effectively. This method uses the spatial correlations that exist between successive frames to define constraint sets at multiple frames and provides a Projections Onto Convex Sets (POCS) solution. The proposed method operates solely on transform domain (DCT) data, and hence provides a solution that is compatible with the observed video. It does not need to make any spatial smoothness assumptions, which are typical of blocking artifact reduction algorithms for still images.
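
To make the idea of a transform-domain constraint set concrete, the sketch below shows the projection onto a generic quantization constraint set, a common building block of POCS-based artifact reduction: a DCT coefficient is clipped back into the interval that is consistent with its decoded quantization level. This is the standard single-frame projection under an assumed uniform quantizer, not the multi-frame constraint sets proposed in the paper, and the function name is hypothetical.

```python
def project_onto_quantization_set(coeff, level, step):
    """Clip a DCT coefficient to the interval implied by its quantized level.

    Assumption: a uniform quantizer with rounding, so the original
    coefficient must lie in [(level - 0.5) * step, (level + 0.5) * step].
    Clipping to an interval is the projection onto that convex set.
    """
    lower = (level - 0.5) * step
    upper = (level + 0.5) * step
    return min(max(coeff, lower), upper)


# A coefficient that some other processing step pushed out of its bin
# is pulled back to the nearest admissible value.
print(project_onto_quantization_set(13.0, level=1, step=8))
# A coefficient already inside its bin is left unchanged.
print(project_onto_quantization_set(7.0, level=1, step=8))
```

In a POCS iteration this projection would alternate with projections onto the other constraint sets until the estimate settles in their intersection.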

2:40, IMDSP-L3.6
THREE-DIMENSIONAL LIFTING SCHEMES FOR MOTION COMPENSATED VIDEO COMPRESSION
B. PESQUET-POPESCU, V. BOTTREAU
Three-dimensional wavelet decompositions are efficient tools for scalable video coding. In this paper, we show the benefits of a lifting formulation of these decompositions. The temporal wavelet transform is inherently non-linear, due to the motion estimation step, and the lifting formalism allows us to provide several improvements to the classical scheme: a better processing of the uncovered areas is proposed, and an overlapped motion compensated temporal filtering method is introduced in the multiresolution decomposition. As shown by simulations, the proposed method results in higher coding efficiency while preserving scalability.
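
For readers unfamiliar with lifting, the sketch below shows one level of a temporal Haar transform written as a predict step followed by an update step, applied to a pair of frames represented as flat lists of pixel values. Motion compensation is deliberately omitted (in a motion compensated scheme the reference frame would be warped before prediction), and the function names are hypothetical; this is not the scheme proposed in the paper.

```python
def haar_lift_forward(frame_a, frame_b):
    """One temporal Haar lifting step on two frames (no motion compensation)."""
    # Predict: the second frame is predicted from the first; keep the residual.
    high = [b - a for a, b in zip(frame_a, frame_b)]
    # Update: add half the residual so the low band becomes the pair average.
    low = [a + h / 2 for a, h in zip(frame_a, high)]
    return low, high


def haar_lift_inverse(low, high):
    """Undo the lifting steps in reverse order with opposite signs."""
    frame_a = [l - h / 2 for l, h in zip(low, high)]
    frame_b = [a + h for a, h in zip(frame_a, high)]
    return frame_a, frame_b


frame_a = [10.0, 20.0, 30.0]
frame_b = [12.0, 20.0, 26.0]
low, high = haar_lift_forward(frame_a, frame_b)
restored_a, restored_b = haar_lift_inverse(low, high)
print(low, high)
```

Because each lifting step is inverted simply by subtracting what was added, the transform stays invertible even when the predict step is replaced by a non-linear, motion compensated predictor.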