28th Picture Coding Symposium
Technical Program
Session Schedule
Wednesday, December 8, 2010
Thursday, December 9, 2010
Friday, December 10, 2010
List of Papers
Wednesday, December 8, 2010
Session K1 Keynote Speech 1
Time: 8:45 - 9:30 Wednesday, December 8, 2010
Chair: Takahiro Saito (Kanagawa University, Japan)
K1-1
(Time: 8:45 - 9:30)
Title | (Keynote Speech) Research and Activities on Ultra-realistic Communications |
Author | Kazumasa Enami (National Institute of Information and Communications Technology, Japan) |
Abstract | Ultra-realistic communications are future means of communication that provide users with a highly realistic sense of presence. The technologies needed to achieve them are varied and include ultra-high-definition/3-dimensional images, reproduction of highly realistic surround sound, and multisensory communication that includes touch and smell.
Japan Broadcasting Corporation (NHK) is developing an Ultra-HDTV system, called Super Hi-Vision, which has four times the pixels of HDTV both horizontally and vertically. The images can be viewed from as close as 0.75 times the height of the screen; at that distance, the viewing angle is 100 degrees, which confers an enhanced sense of reality.
The National Institute of Information and Communications Technology (NICT) is researching ultra-realistic communication technologies that provide natural and realistic information to everybody. The main research subjects are holographic 3D video, glasses-free super multi-view 3D video, and super surround audio systems. We are also researching the requirements for ultra-realistic systems based on the underlying principles of human information processing.
The Ultra-Realistic Communications Forum (URCF) was established in 2007 to ensure efficient industry-academia-government collaboration in research, development, verification experiments, and standardization of ultra-realistic technologies. In particular, standardization of FTV (Free-viewpoint TV) is being actively discussed in a URCF working group.
Ultra-realistic communication systems involve huge amounts of data. For example, the data rates of Super Hi-Vision, super multi-view 3D video, and electronic holography video are 72 Gbps, more than 200 Gbps, and more than several hundred Tbps, respectively. Efficient video coding algorithms are therefore required for ultra-realistic communications.
In this talk, I will introduce research on ultra-realistic communications and the activities of the URCF, and describe expectations for new picture coding technology. |
Session O1 Oral Session 1: FTV
Time: 9:45 - 11:45 Wednesday, December 8, 2010
Chair: Marek Domanski (Poznan University of Technology, Poland)
O1-1
(Time: 9:45 - 10:15)
Title | Free Viewpoint Image Generation with Super Resolution |
Author | Norishige Fukushima, Yutaka Ishibashi (Graduate School of Engineering, Nagoya Institute of Technology, Japan) |
Page | pp. 1 - 4 |
Keyword | Free Viewpoint TV, Image Based Rendering, Ray Space, Light Field, Super Resolution |
Abstract | In this paper, we propose a method of free viewpoint image generation with super resolution. With conventional approaches, such as nearest neighbor and linear interpolation, the image synthesized at a zoomed virtual view tends to have low resolution. To overcome this problem, we combine a super resolution process with free viewpoint image generation. The experimental results show that the synthesized image within the effective range has higher PSNR than with the conventional method. |
O1-2
(Time: 10:15 - 10:45)
Title | Reducing Bitrates of Compressed Video with Enhanced View Synthesis for FTV |
Author | Lu Yang, Meindert Onno Wildeboer, Tomohiro Yendo, Mehrdad Panahpour Tehrani (Nagoya University, Japan), Toshiaki Fujii (Tokyo Institute of Technology, Japan), Masayuki Tanimoto (Nagoya University, Japan) |
Page | pp. 5 - 8 |
Keyword | Free-viewpoint TV (FTV), video compression, bitrates, view synthesis, MVC |
Abstract | We deal with the bitrates of view synthesis at the decoder side of FTV, which would use compressed depth maps and views. The focus is on reducing the bitrates required for generating a high-quality virtual view. We employ a reliable view synthesis method, which is compared with the standard MPEG view synthesis software. The experimental results show that the bitrates required for synthesizing a high-quality virtual view can be reduced by utilizing our enhanced view synthesis technique to improve the PSNR at medium bitrates. |
O1-3
(Time: 10:45 - 11:15)
Title | Depth Map Processing with Iterative Joint Multilateral Filtering |
Author | PoLin Lai (Samsung Telecommunications America, U.S.A.), Dong Tian (Mitsubishi Electric Research Laboratories, U.S.A.), Patrick Lopez (Technicolor, Research and Innovation, France) |
Page | pp. 9 - 12 |
Keyword | 3D video, depth maps, joint filtering, iterative filtering, view synthesis |
Abstract | Depth maps estimated using stereo matching between frames from different video views typically exhibit false contours and noisy artifacts around object boundaries. In this paper, iterative joint multilateral filtering is proposed to deal with these artifacts. The proposed filter consists of multiple filter kernels. Knowing that the estimated depth maps are erroneous, besides the kernels that measure the proximity of depth samples and the similarity between depth sample values, we further develop kernels that measure similarity between the corresponding video pixel values. To increase reliability, these novel kernels operate on the color (RGB) domain instead of only on the luminance domain. Furthermore, the filter shapes are designed to adapt to brightness variations. Finally, to tackle large misalignments between boundaries in depth maps and in the corresponding video frames, an iterative approach is utilized. Our results demonstrate that the proposed method can significantly improve the boundaries in depth maps and can reduce false contours. With the processed depth maps, it is observed that the quality of object boundaries in synthesized views can be improved. |
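The multi-kernel idea described in this abstract can be sketched as a single filtering pass: a spatial kernel, a depth-similarity kernel, and an RGB-similarity kernel on the co-located video frame, multiplied together. This is a simplified illustration, not the authors' implementation; all parameter names and values (`radius`, `sigma_s`, `sigma_d`, `sigma_c`) are assumptions.

```python
import numpy as np

def joint_multilateral_filter(depth, rgb, radius=2, sigma_s=2.0, sigma_d=8.0, sigma_c=12.0):
    """One pass of a joint multilateral filter on a depth map: spatial,
    depth-similarity, and RGB-similarity kernels (names are illustrative)."""
    h, w = depth.shape
    depth = depth.astype(np.float64)
    rgb = rgb.astype(np.float64)
    out = np.empty_like(depth)
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            dwin = depth[y0:y1, x0:x1]
            cwin = rgb[y0:y1, x0:x1, :]
            yy, xx = np.mgrid[y0:y1, x0:x1]
            w_s = np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * sigma_s ** 2))
            w_d = np.exp(-((dwin - depth[y, x]) ** 2) / (2 * sigma_d ** 2))
            # colour kernel over all three RGB channels, as in the abstract
            w_c = np.exp(-np.sum((cwin - rgb[y, x, :]) ** 2, axis=2) / (2 * sigma_c ** 2))
            wgt = w_s * w_d * w_c
            out[y, x] = np.sum(wgt * dwin) / np.sum(wgt)
    return out
```

Repeating the pass, e.g. `for _ in range(3): depth = joint_multilateral_filter(depth, rgb)`, mirrors the iterative scheme the abstract mentions for large texture-depth misalignments.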
O1-4
(Time: 11:15 - 11:45)
Title | Stereoscopic Depth Estimation Using Fuzzy Segment Matching |
Author | Krzysztof Wegner, Olgierd Stankiewicz, Marek Domański (Poznan University of Technology, Poland) |
Page | pp. 13 - 16 |
Keyword | Stereo matching, depth estimation, soft segmentation, disparity calculation, fuzzy set |
Abstract | Stereo matching techniques usually match segments or blocks of pixels. This paper proposes to match segments defined as fuzzy sets of pixels. The proposed matching method is applicable to various techniques of stereo matching as well as to different measures of differences between pixels. In the paper, embedment of this approach into the state-of-the-art depth estimation software is described. Obtained experimental results show that the proposed way of stereo matching increases reliability of various depth estimation techniques. |
Session S1 Special Session 1: 3DTV/FTV
Time: 13:00 - 15:30 Wednesday, December 8, 2010
Chair: Toshiaki Fujii (Tokyo Institute of Technology, Japan)
S1-1
(Time: 13:00 - 13:30)
Title | (Invited Talk) Challenges in Multiview Video - The 3 D's |
Author | Antonio Ortega (University of Southern California, U.S.A.) |
Page | p. 17 |
Keyword | multiview video, multiview video coding, depth estimation |
Abstract | We highlight some of the challenges facing wide adoption of multiview video technologies by focusing on design issues involving displays, depth data, and tools for content delivery. Our goal is to show that uncertainty about the choices that will be made in each of these areas has a significant impact on the design of coding tools. |
S1-2
(Time: 13:30 - 14:00)
Title | (Invited Talk) Vision Field Capturing and Its Applications in 3DTV |
Author | Qionghai Dai (Tsinghua University, China) |
Page | p. 18 |
Keyword | multiview capturing, vision field capturing, 3DTV |
Abstract | 3D video capturing acquires visual information in a 3D manner and constitutes the first step of the entire 3DTV system chain, before 3D coding, transmission, and visualization. As the cost of sensors has decreased in recent years, many systems utilize multiple cameras to acquire visual information, an approach called multiview capturing. We start by reviewing the multiview systems built in the past decades. Then, a new concept of vision field capturing is introduced. Finally, some new techniques using TOF (time-of-flight) cameras and 3D scanners are also presented. |
S1-3
(Time: 14:00 - 14:30)
Title | (Invited Talk) 3D Information Coding |
Author | Joern Ostermann (Hannover University, Germany) |
Page | p. 19 |
Keyword | MPEG, 3D reconstruction, 3DTV, stereo video, multiview display |
Abstract | 3D visual information can be presented to viewers using stereo displays. For these displays, the two video sequences of a stereo pair are jointly coded using MVC. Extensions of coding standards for use with the existing MPEG-2 or AVC infrastructure are currently under development. Autostereoscopic displays enabling motion parallax will be supported by the future 3DV standard, in which several video streams and a depth map will enable the display to render the correct views of the scene. |
S1-4
(Time: 14:30 - 15:00)
Title | (Invited Talk) 3D Television System Based on Integral Photography |
Author | Tomoyuki Mishina (NHK, Japan) |
Page | p. 20 |
Keyword | 3D image, integral photography, full-parallax, Super Hi-Vision |
Abstract | Integral photography is a photographic technique that can reconstruct three-dimensional (3D) images with full parallax; the reconstructed 3D images can be seen without wearing special glasses. We have been researching a 3D television system based on integral photography. This paper describes our system, in which an ultra-high-definition imaging system called Super Hi-Vision is applied to integral photography. |
S1-5
(Time: 15:00 - 15:30)
Title | (Invited Talk) Progress from Stereoscopic to Three-Dimensional Displays Based on Visual Perception |
Author | Sumio Yano (NHK, formerly NICT, Japan) |
Page | p. 21 |
Keyword | stereoscopic displays, multi-view stereoscopic displays, three-dimensional display, light field reproduction, human visual field |
Abstract | The characteristics of stereoscopic and multi-view displays are first described in the context of human visual perception. Next, the development of three-dimensional displays matched to the function of the human visual field is described, taking these points into account. These three-dimensional displays were developed based on the principle of light field reproduction. These considerations show that processing technology for large-capacity image information will be important for the development of future three-dimensional image systems. |
Session D1 Panel Discussion 1: 3DTV/FTV
Time: 15:30 - 16:30 Wednesday, December 8, 2010
Chair: Toshiaki Fujii (Tokyo Institute of Technology, Japan)
D1-1
(Time: 15:30 - 16:30)
Title | (Panel Discussion) 3DTV/FTV |
Author | Chair: Toshiaki Fujii (Tokyo Institute of Technology, Japan), Panelists: Kazumasa Enami (National Institute of Information and Communications Technology, Japan), Antonio Ortega (University of Southern California, U.S.A.), Qionghai Dai (Tsinghua University, China), Joern Ostermann (Hannover University, Germany), Tomoyuki Mishina (NHK, Japan), Sumio Yano (NHK, formerly NICT, Japan) |
Session P1 Poster Session 1
Time: 16:45 - 18:15 Wednesday, December 8, 2010
Chair: Jiro Katto (Waseda University, Japan)
[3DTV/FTV/multi-view-related topics]
P1-1
Title | 3D Space Representation Using Epipolar Plane Depth Image |
Author | Takashi Ishibashi, Tomohiro Yendo, Mehrdad Panahpour Tehrani (Graduate School of Engineering, Nagoya University, Japan), Toshiaki Fujii (Graduate School of Science and Engineering, Tokyo Institute of Technology, Japan), Masayuki Tanimoto (Graduate School of Engineering, Nagoya University, Japan) |
Page | pp. 22 - 25 |
Keyword | FTV, EPI, EPDI, Ray-Space, GDM |
Abstract | We propose a novel 3D space representation for multi-view video, using epipolar plane depth images (EPDI). Multi-view video plus depth (MVD) is used as the common data format for FTV (Free-viewpoint TV), which enables synthesizing virtual view images. Due to the large amount of data and the complexity of multi-view video coding (MVC), compression of MVD is a challenging issue. We address this problem and propose a new representation that is constructed from MVD using ray-space. MVD is converted into image and depth ray-spaces. The proposed representation is obtained by converting each of the ray-spaces into a global depth map and a texture map using EPDI. Experiments demonstrate the analysis of this representation and its efficiency. |
P1-2
Title | View Synthesis Error Analysis for Selecting the Optimal QP of Depth Map Coding in 3D Video Application |
Author | Yanwei Liu, Song Ci, Hui Tang (Institute of Acoustics, Chinese Academy of Sciences, China) |
Page | pp. 26 - 29 |
Keyword | view synthesis error analysis, depth map coding, selecting QP |
Abstract | In 3D video communication, selecting the appropriate quantization parameter (QP) for depth map coding is very important for obtaining the optimal view synthesis quality. This paper first analyzes two kinds of view synthesis errors induced by depth uncertainty, namely the view synthesis error induced by the original depth error and that induced by depth compression, and then proposes a quadratic model to characterize the relationship between the view synthesis quality and the depth quantization step size. The proposed model can find the inflexion point in the curve of view synthesis quality versus increasing depth quantization step size. Experimental results show that, given the rate constraint for the depth map, the proposed model can accurately find the optimal QP for depth map coding. |
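The quadratic-model idea above can be illustrated with a tiny least-squares fit: fit synthesis quality against depth quantization step size and read off the stationary point of the parabola. The model form follows the abstract, but the function name, the sample data, and the use of `np.polyfit` are illustrative assumptions, not the paper's procedure.

```python
import numpy as np

def fit_quadratic_qstep_model(qsteps, psnrs):
    """Least-squares fit of psnr ~ a*q**2 + b*q + c over measured
    (Qstep, synthesis-PSNR) samples; returns the coefficients and the
    stationary point q* = -b / (2a) of the fitted parabola, which marks
    the inflexion in synthesis quality versus depth quantization step."""
    a, b, c = np.polyfit(qsteps, psnrs, 2)
    return (a, b, c), -b / (2 * a)
```

For synthetic samples drawn from psnr = 40 − 0.01·(q − 30)², the fit recovers q* ≈ 30.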
P1-3
Title | Suppressing Texture-Depth Misalignment for Boundary Noise Removal in View Synthesis |
Author | Yin Zhao (Zhejiang University, China), Zhenzhong Chen (Nanyang Technological University, Singapore), Dong Tian (Mitsubishi Electric Research Labs, U.S.A.), Ce Zhu (Nanyang Technological University, Singapore), Lu Yu (Zhejiang University, China) |
Page | pp. 30 - 33 |
Keyword | 3D video, boundary noise, DIBR, texture-depth misalignment, view synthesis |
Abstract | During view synthesis based on depth maps, also known as Depth-Image-Based Rendering (DIBR), annoying artifacts are often generated around foreground objects, yielding the visual effect that slim silhouettes of foreground objects are scattered into the background. These artifacts are referred to as boundary noise. We investigate the cause of boundary noise and find that it results from the misalignment between texture and depth information along object boundaries. Accordingly, we propose a novel solution that removes such boundary noise by applying restrictions during forward warping to the pixels within the texture-depth misalignment regions. Experiments show that this algorithm can effectively eliminate most boundary noise and that it is also robust for view synthesis with compressed depth and texture information. |
P1-4
Title | A High Efficiency Coding Framework for Multiple Image Compression of Circular Camera Array |
Author | Dongming Xue (Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology, Japan), Akira Kubota (Faculty of Science and Engineering, Chuo University, Japan), Yoshinori Hatori (Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology, Japan) |
Page | pp. 34 - 37 |
Keyword | multi-view video coding, virtual plane, polar axis, poxel framework |
Abstract | Many existing multi-view video coding techniques remove inter-viewpoint redundancy by applying disparity compensation in conventional video coding frameworks, e.g., H.264/MPEG-4. However, the conventional methodology works ineffectively because it ignores the special features of inter-viewpoint disparity. This paper proposes a framework using a virtual plane (VP) for multi-view image compression, with which the disparity compensation cost can be largely reduced. Based on this VP predictor, we design a poxel (probabilistic voxelized volume) framework, which can integrate the information of cameras at different viewpoints along the polar axis to obtain more effective compression performance. In addition, considering the convenience of replaying the multi-view video at the receiving side, we reform the overhead information along the polar axis at the sending side in advance. |
P1-5
Title | Exploiting Depth Information for Fast Multi-View Video Coding |
Author | Brian W. Micallef, Carl J. Debono, Reuben A. Farrugia (University of Malta, Malta) |
Page | pp. 38 - 41 |
Keyword | 3DTV, disparity vector estimation, geometric disparity vector predictor, multi-view video coding |
Abstract | Multi-view video coding exploits inter-view redundancies to compress the video streams and their associated depth information. These techniques utilize disparity estimation to obtain disparity vectors (DVs) across different views; however, disparity estimation accounts for the majority of the computational power needed for multi-view video encoding. This paper proposes a solution for fast disparity estimation based on multi-view geometry and depth information. A DV predictor is first calculated, followed by an iterative or a fast search estimation process that finds the optimal DV in the search area dictated by the predictor. Simulation results demonstrate that this predictor is reliable enough to locate the area of the optimal DVs with a smaller search range. Furthermore, results show that the proposed approach achieves a speedup of 2.5 while preserving the original rate-distortion performance. |
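A depth-based DV predictor of the kind described here typically rests on the textbook rectified-stereo relation d = f·B/Z. The sketch below shows only that standard relation, not the authors' full multi-view-geometry predictor; the function names, parameters, and rounding choice are illustrative assumptions.

```python
def disparity_from_depth(depth_z, focal_px, baseline):
    """Rectified-stereo relation d = f * B / Z: a point at depth Z in one
    view appears shifted by d pixels in the neighbouring view, so a block's
    depth value can predict its disparity vector."""
    return focal_px * baseline / depth_z

def search_center(depth_z, focal_px, baseline):
    """Round the geometric disparity to integer pixels to center a small
    DV search window around it (usage is illustrative)."""
    return round(disparity_from_depth(depth_z, focal_px, baseline))
```

Centering the search window on this prediction is what allows the smaller search range, and hence the speedup, reported in the abstract.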
P1-6
Title | An Adaptive Early Skip Mode Decision Scheme for Multiview Video Coding |
Author | Bruno Zatt (Federal University of Rio Grande do Sul, Brazil), Muhammad Shafique (Karlsruhe Institute of Technology, Germany), Sergio Bampi (Federal University of Rio Grande do Sul, Brazil), Jörg Henkel (Karlsruhe Institute of Technology, Germany) |
Page | pp. 42 - 45 |
Keyword | MVC, Early Skip, Video Coding, Mode Decision |
Abstract | In this work, a novel scheme is proposed for adaptive early SKIP mode decision in multiview video coding, based on mode correlation in the 3D neighborhood, variance, and rate-distortion properties. Our scheme employs an adaptive thresholding mechanism in order to react to changing values of the Quantization Parameter (QP). Experimental results demonstrate that our scheme provides a consistent time saving over a wide range of QP values. Compared to the exhaustive mode decision, our scheme provides a significant reduction in encoding complexity (up to 77%) at the cost of a small PSNR loss (0.172 dB on average). Compared to the state of the art, our scheme provides on average a 2x higher complexity reduction with a relatively higher PSNR (avg. 0.2 dB). |
P1-7
Title | Electronic Hologram Generation Using High Quality Color and Depth Information of Natural Scene |
Author | Kousuke Nomura (Tokyo University of Science, Japan), Ryutaro Oi, Taiichiro Kurita (National Institute of Information and Communications Technology, Japan), Takayuki Hamamoto (Tokyo University of Science, Japan) |
Page | pp. 46 - 49 |
Keyword | CGH, color hologram, real scene, natural light, optical reconstruction |
Abstract | Recently, computer-generated hologram (CGH) methods have been heavily researched. In conventional CGH, however, it is not common to provide a hologram of a real scene. In this paper, we report that proper image reconstruction was obtained using our proposed color hologram method. We confirmed that the proposed color hologram properly recorded both the color and the 3-D information of the scene through an assessment experiment that optically reconstructs the hologram. |
[Beyond H.264/MPEG-4 AVC and related topics]
P1-8
Title | Complementary Coding Mode Design Based on R-D Cost Minimization for Extending H.264 Coding Technology |
Author | Tomonobu Yoshino, Sei Naito, Shigeyuki Sakazawa, Shuichi Matsumoto (KDDI R&D Laboratories, Japan) |
Page | pp. 50 - 53 |
Keyword | low bit-rate video coding, high resolution video coding, H.264, SKIP mode |
Abstract | To improve high-resolution video coding efficiency under low bit-rate conditions, an appropriate coding mode is required from an R-D optimization (RDO) perspective, although the coding modes defined within the H.264 standard are not always optimal by RDO criteria. With this in mind, we previously proposed extended SKIP modes with close-to-optimal R-D characteristics. However, the additional modes did not always satisfy the optimal R-D characteristics, especially for low bit-rate coding. In response, in this paper, we propose an enhanced coding mode capable of providing a candidate corresponding to the minimum R-D cost by controlling the residual signal associated with the extended SKIP mode. The experimental results show that the PSNR improvement over H.264 and over our previous approach reaches up to 0.42 dB and 0.24 dB, respectively. |
P1-9
Title | Enhanced Video Compression with Region-Based Texture Models |
Author | Fan Zhang, David Bull (University of Bristol, U.K.) |
Page | pp. 54 - 57 |
Keyword | video, compression, texture, warping, synthesis |
Abstract | This paper presents a region-based video compression algorithm based on texture warping and synthesis. Instead of encoding whole images or prediction residuals after translational motion estimation, this algorithm employs a perspective motion model to warp static textures and uses a texture synthesis approach to synthesise dynamic textures. Spatial and temporal artefacts are prevented by an in-loop video quality assessment module. The proposed method has been integrated into an H.264 video coding framework. The results show significant bitrate savings, up to 55%, compared with H.264, for similar visual quality. |
P1-10
Title | Coding Efficiency Improvement by Adaptive Search Center Definition |
Author | Kyohei Oba, Takahiro Bandou, Tian Song, Takashi Shimamoto (Tokushima University, Japan) |
Page | pp. 58 - 61 |
Keyword | H.264/AVC, Motion Estimation, Inter Prediction, RDO |
Abstract | In this paper, an efficient search center definition algorithm is proposed for H.264/AVC. H.264/AVC achieves high coding efficiency by introducing new coding tools, including a new definition of the search center. However, this definition of the search center is not efficient in the case of significant motion. This work proposes new search center candidates that use the spatial and temporal correlations of motion vectors to improve coding efficiency. Simulation results show that the proposed search centers can achieve substantial bit savings but induce high computational complexity. An additional complexity reduction algorithm is therefore introduced to improve the trade-off between bit savings and implementation performance. This work achieves a maximum bit saving of 19%. |
P1-11
Title | A Novel Coding Scheme for Intra Pictures of H.264/AVC |
Author | Jin Young Lee, Jaejoon Lee, Hochen Wey, Du-Sik Park (Samsung Advanced Institute of Technology, Samsung Electronics Co., Ltd., Republic of Korea) |
Page | pp. 62 - 65 |
Keyword | H.264/AVC, intra coding, intra prediction |
Abstract | A novel intra coding scheme is proposed to improve coding performance for intra pictures of H.264/AVC. The proposed method generates two sub-images from an original image, defined in this paper as a sampled image and a prediction error image, and then encodes them separately. In particular, in the intra prediction process, the sampled image employs the original intra prediction modes, while the prediction error image uses four newly defined intra prediction modes. Experimental results demonstrate that the proposed method achieves significantly higher intra coding performance and reduces encoding complexity by requiring a smaller number of rate-distortion (RD) cost calculations, as compared with the original intra coding method of H.264/AVC. |
P1-12
Title | Entropy Coding in Video Compression Using Probability Interval Partitioning |
Author | Detlev Marpe, Heiko Schwarz (Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Germany), Thomas Wiegand (Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute/Technical University of Berlin, Germany) |
Page | pp. 66 - 69 |
Keyword | entropy coding, variable length coding |
Abstract | We present a novel approach to entropy coding, which provides the coding efficiency and simple probability modeling capability of arithmetic coding at the complexity level of Huffman coding. The key element of the proposed approach is a partitioning of the unit interval into a small set of probability intervals. An input sequence of discrete source symbols is mapped to a sequence of binary symbols, and each of the binary symbols is assigned to one of the probability intervals. The binary symbols that are assigned to a particular probability interval are coded at a fixed probability using a simple code that maps a variable number of binary symbols to variable-length codewords. The probability modeling is decoupled from the actual binary entropy coding. The coding efficiency of probability interval partitioning entropy (PIPE) coding is comparable to that of arithmetic coding. |
P1-13
Title | Separable Wiener Filter Based Adaptive In-Loop Filter for Video Coding |
Author | Mischa Siekmann, Sebastian Bosse, Heiko Schwarz, Thomas Wiegand (Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Germany) |
Page | pp. 70 - 73 |
Keyword | Video Compression, In-loop Filtering, Wiener Filter, Separable Image Filtering, Regional Filtering |
Abstract | Recent investigations have shown that a non-separable Wiener filter applied inside the motion-compensation loop can improve the coding efficiency of hybrid video coding designs. In this paper, we study the application of separable Wiener filters. Our design includes the possibility of adaptively choosing between the application of the vertical, horizontal, or combined filter. The simulation results verify that a separable in-loop Wiener filter is capable of providing virtually the same increase in coding efficiency as a non-separable Wiener filter, but at a significantly reduced decoder complexity. |
[Image/video coding and related topics]
P1-14
Title | Hyperspectral Image Compression Suitable for Spectral Analysis Application |
Author | Kazuma Shinoda, Yukio Kosugi, Yuri Murakami, Masahiro Yamaguchi, Nagaaki Ohyama (Tokyo Institute of Technology, Japan) |
Page | pp. 74 - 77 |
Keyword | Hyperspectral image, Image compression, Vegetation index, JPEG2000 |
Abstract | This paper presents a hyperspectral image (HSI) compression method that considers the errors of both the vegetation index and the spectral data. The proposed method separates the hyperspectral data into spectral data for the vegetation index and residual data, and encodes each individually using seamless coding. By placing the spectral channels required for the vegetation index at the head of the code-stream, precise vegetation analysis can be performed at a low bit rate. Additionally, by decoding the residual data, the spectral data can be reconstructed with low distortion. |
P1-15
Title | A Background Model Based Method for Transcoding Surveillance Videos Captured by Stationary Camera |
Author | Xianguo Zhang (Peking University, China), Luhong Liang, Qian Huang (Institute of Computing Technology, Chinese Academy of Sciences, China), Tiejun Huang, Wen Gao (Peking University, China) |
Page | pp. 78 - 81 |
Keyword | surveillance, archive, transcode, background model, difference frame |
Abstract | Real-world video surveillance applications require storing videos, without neglecting any part of the scenario, for weeks or months. To reduce the storage cost, the high bit-rate videos from cameras should be transcoded into a more efficiently compressed format with little quality loss. We propose a background model based method to improve the transcoding efficiency for surveillance videos captured by stationary cameras, and evaluate it objectively. Experimental results show that this method saves about half of the bits compared with the full-decoding-full-encoding method. |
P1-16
Title | Analysis of In-Loop Denoising in Lossy Transform Coding |
Author | Eugen Wige, Gilbert Yammine (Multimedia Communications and Signal Processing / University of Erlangen-Nuremberg, Germany), Peter Amon, Andreas Hutter (Siemens Corporate Technology / Information and Automation Technologies, Germany), Andre Kaup (Multimedia Communications and Signal Processing / University of Erlangen-Nuremberg, Germany) |
Page | pp. 82 - 85 |
Keyword | high quality compression, predictor denoising, quantization effects |
Abstract | When compressing noisy image sequences, the compression efficiency is limited by the amount of noise within the sequences, as the noise component cannot be predicted. In this paper, we investigate the influence of noise within the reference frame on lossy video coding of noisy image sequences. We estimate how much noise is left within a lossy coded reference frame. To this end, we analyze the transform and quantization steps inside a hybrid video coder, specifically H.264/AVC. The noise power after transform, quantization, and inverse transform is calculated analytically. We use knowledge of the noise power within the reference frame to improve the inter-frame prediction. For noise filtering of the reference frame, we implemented a simple denoising algorithm inside the H.264/AVC reference software JM15.1. We show that the bitrate can be decreased by up to 8.1% compared to the H.264/AVC standard for high-resolution noisy image sequences. |
P1-17
Title | An Efficient Side Information Generation Using Seed Blocks for Distributed Video Coding |
Author | DongYoon Kim, DongSan Jun, HyunWook Park (KAIST, Republic of Korea) |
Page | pp. 86 - 89 |
Keyword | Distributed video coding, Wyner-Ziv video coding, Side information |
Abstract | Recently, distributed video coding (DVC) has emerged as a research area for low-power video coding applications. In DVC, the encoder is much simpler than in a conventional video codec, whereas the decoder is very heavy. The DVC decoder exploits side information, generated by motion-compensated frame interpolation, to reconstruct the Wyner-Ziv frame. This paper proposes an efficient side information generation algorithm using seed blocks for DVC. Seed blocks are first selected and used for motion estimation of the other blocks. As the side information is close to the target image, the final reconstructed image in the DVC decoder has better quality and the compression ratio becomes higher. The proposed method improves DVC compression performance with reduced computing time. Experimental results show that accurate motion vectors are estimated by the proposed method and that its computational complexity for motion estimation is significantly reduced in comparison with previous methods. |
P1-18
Title | Compressive Video Sensing Based on User Attention Model |
Author | Jie Xu (Advanced Computing Research Laboratory, Institute of Computing Technology, Chinese Academy of Sciences, China), Jianwei Ma (School of Computational Science, Florida State University, U.S.A.), Dongming Zhang, Yongdong Zhang, Shouxun Lin (Advanced Computing Research Laboratory, Institute of Computing Technology, Chinese Academy of Sciences, China) |
Page | pp. 90 - 93 |
Keyword | compressive sensing, video, user attention model, ROI |
Abstract | We propose a compressive video sensing scheme based on user attention model (UAM) for real video sequences acquisition. In this work, for every group of consecutive video frames, we set the first frame as reference frame and build a UAM with visual rhythm analysis (VRA) to automatically determine region-of-interest (ROI) for non-reference frames. The determined ROI usually has significant movement and attracts more attention. Each frame of the video sequence is divided into non-overlapping blocks of 16x16 pixel size. Compressive video sampling is conducted in a block-by-block manner on each frame through a single operator and in a whole region manner on the ROIs through a different operator. Our video reconstruction algorithm involves alternating direction l1-norm minimization algorithm (ADM) for the frame difference of non-ROI blocks and minimum total-variance (TV) method for the ROIs. Experimental results showed that our method could significantly enhance the quality of reconstructed video and reduce the errors accumulated during the reconstruction. |
P1-19
Title | Advanced Inpainting-Based Macroblock Prediction with Regularized Structure Propagation in Video Compression |
Author | Yang Xu, Hongkai Xiong (Shanghai Jiao Tong University, China) |
Page | pp. 94 - 97 |
Keyword | belief propagation, tensor voting, inpainting, model selection, H.264 |
Abstract | In this paper, we propose an optimized inpainting-based macroblock (MB) prediction mode (IP-mode) in the state-of-the-art H.264/AVC video compression engine, and investigate a natural extension of structured sparsity over the ordered Belief Propagation (BP) inference in inpainting-based prediction. The IP-mode is regularized by a global spatio-temporal consistency between the predicted content and the co-located known texture, and can be adopted in both Intra and Inter frames without redundant assistant information. It is solved as an optimization problem under a Markov Random Field (MRF), and the structured sparsity of the predicted macroblock region is inferred by tensor voting projected from the decoded regions, tuning the priority of message scheduling in BP in a more convergent manner. Rate-distortion optimization is maintained to select the optimal mode among the inpainting-based (IP-), intra-, and inter- modes. Compared to the existing prediction modes in H.264/AVC, the proposed inpainting-based prediction scheme is validated to achieve better R-D performance for homogeneous visual patterns and to exhibit more robust error resilience with an intrinsic probabilistic inference. |
P1-20
Title | An Improved Iterative Algorithm for Calculating the Rate-Distortion Performance of Causal Video Coding for Continuous Sources and its Application to Real Video Data |
Author | En-hui Yang, Chang Sun, Lin Zheng (University of Waterloo, Canada) |
Page | pp. 98 - 101 |
Keyword | video coding, causal video coding, rate-distortion performance, soft decision quantization, iterative algorithm |
Abstract | An improved iterative algorithm is first proposed to calculate the rate-distortion performance of causal video coding for any continuous source. Instead of using continuous reproduction alphabets, it utilizes finite reproduction alphabets and iteratively updates them along with the transition probabilities from the continuous source to the reproduction letters, thus overcoming the computational complexity problem encountered when applying the algorithm recently proposed by Yang et al. for discrete sources to continuous sources. The proposed algorithm converges in the sense that the rate-distortion cost is monotonically decreasing until a stationary point is reached. It is then applied to practical video data to establish a theoretical coding performance benchmark. In comparison with H.264, experiments show that under the same motion compensation setting, causal video coding offers roughly a 1 dB coding gain on average over H.264 for the IPPIPP…GOP structure. This suggests that an area one could explore to further improve the rate-distortion performance of H.264 is how quantization and coding should be performed conditionally given previous frames, coded frames, and motion compensation. |
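For context, the classical discrete-alphabet iterative algorithm that this line of work extends alternates between updating the conditional distribution and the output marginal (Blahut-Arimoto style). The sketch below is the textbook discrete case with a Lagrangian slope `s`, not the authors' continuous-source extension:

```python
import numpy as np

def blahut_arimoto(p_x, dist, s, iters=200):
    """Alternate Q(xhat|x) and the output marginal q(xhat) for slope s;
    returns (rate in bits, expected distortion) on the R(D) curve."""
    n, m = dist.shape
    q = np.full(m, 1.0 / m)
    for _ in range(iters):
        Q = q * np.exp(-s * dist)            # unnormalized Q(xhat|x)
        Q /= Q.sum(axis=1, keepdims=True)    # normalize each row
        q = p_x @ Q                          # update output marginal
    D = np.sum(p_x[:, None] * Q * dist)
    R = np.sum(p_x[:, None] * Q * np.log2(Q / q))
    return R, D

# binary uniform source with Hamming distortion: R(D) = 1 - h(D)
p = np.array([0.5, 0.5])
d = np.array([[0.0, 1.0], [1.0, 0.0]])
R, D = blahut_arimoto(p, d, s=3.0)
print(R, D)
```

For this symmetric example the iteration reproduces the known closed form R(D) = 1 - h(D), which makes it a convenient sanity check.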
P1-21
Title | Technical Design & IPR Analysis for Royalty-Free Video Codecs |
Author | Cliff Reader (none, U.S.A.) |
Page | pp. 102 - 105 |
Keyword | Royalty-free, video codec, performance, complexity, IPR |
Abstract | Royalty-free standards for image and video coding have been actively discussed for over 20 years. This paper breaks down the issues of designing royalty-free codecs into the major topics of requirements, video coding tools, classes of patents and performance. By dissecting the codec using a hierarchy of major to minor coding tools, it is possible to pinpoint where a patent impacts the video coding, and what the consequence will be of avoiding the patented tool. |
P1-22
Title | Accelerating Pixel Predictor Evolution Using Edge-Based Class Separation |
Author | Seishi Takamura, Masaaki Matsumura, Hirohisa Jozawa (NTT Cyber Space Laboratories, NTT Corporation, Japan) |
Page | pp. 106 - 109 |
Keyword | Genetic programming, lossless image coding, pixel prediction, divide and conquer |
Abstract | Evolutionary methods based on genetic programming (GP) enable dynamic algorithm generation and have been successfully applied to many areas such as plant control, robot control, and stock market prediction. Conventional image/video coding methods such as JPEG and H.264 all use fixed (non-dynamic) algorithms without exception. However, one of the challenges of the evolutionary approach is its high computational complexity. In this article, we introduce a GP-based image predictor that is specifically evolved for each input image as well as for local image properties such as edge direction. In simulations, the proposed method demonstrated roughly 180 times faster evolution and a 0.02-0.1 bit/pel lower bit rate than the previous method. |
P1-23
Title | Avoidance of Singular Point in Reversible KLT |
Author | Masahiro Iwahashi (Nagaoka University of Technology, Japan), Hitoshi Kiya (Tokyo Metropolitan University, Japan) |
Page | pp. 110 - 113 |
Keyword | KLT, error, reversible, coding |
Abstract | Permutations of the order and sign of signals are introduced to avoid the singular point problem of a reversible transform. Even though a transform in the lifting structure can be "reversible" in spite of rounding operations, its multiplier coefficients have singular points (SPs). Around an SP, rounding errors are magnified enormously and the coding efficiency decreases. We investigate the effect of the permutations on rotation angles and confirm that PSNR is improved by 14 dB for RGB color components. |
[Image/video processing and related topics]
P1-24
Title | Super-Resolution Decoding of JPEG-Compressed Image Data with the Shrinkage in the Redundant DCT Domain |
Author | Takashi Komatsu, Yasutaka Ueda, Takahiro Saito (Kanagawa University, Japan) |
Page | pp. 114 - 117 |
Keyword | JPEG, decoding, redundant DCT, shrinkage, super-resolution |
Abstract | Alter, Durand and Froment introduced the total-variation (TV) minimization approach to the artifact-free JPEG decoding. They formulated the decoding problem as the constrained TV restoration problem, in which the TV semi-norm of its restored color image is minimized under the constraint that each DCT coefficient of the restored color image should be in the quantization interval of its corresponding DCT coefficient of the JPEG-compressed data. This paper proposes a new restoration approach to the JPEG decoding. Instead of the TV regularization, our new JPEG-decoding method employs a shrinkage operation in the redundant DCT domain, to mitigate degradations caused by the JPEG coding. |
P1-25
Title | Image Interpolation via Regularized Local Linear Regression |
Author | Xianming Liu, Debin Zhao (Harbin Institute of Technology, China), Ruiqin Xiong, Siwei Ma, Wen Gao (Peking University, China) |
Page | pp. 118 - 121 |
Keyword | Image interpolation, regularized local linear regression, edge preservation |
Abstract | In this paper, we present an efficient image interpolation scheme using regularized local linear regression (RLLR). On one hand, we introduce a robust estimator of local image structure based on moving least squares, which can efficiently handle statistical outliers compared with ordinary least squares based methods. On the other hand, motivated by recent progress on manifold-based semi-supervised learning, the intrinsic manifold structure is explicitly considered by making use of both measured and unmeasured data points. In particular, the geometric structure of the marginal probability distribution induced by unmeasured samples is incorporated as an additional locality-preserving constraint. The optimal model parameters can be obtained in closed form by solving a convex optimization problem. Experimental results demonstrate that our method outperforms existing methods in both objective and subjective visual quality over a wide range of test images. |
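The closed-form regularized local fit can be conveyed with a minimal ridge-regularized local linear regression. This is an illustrative simplification: the paper's RLLR additionally uses moving least squares weighting and a manifold-based locality constraint, which are omitted here:

```python
import numpy as np

def rllr_predict(coords, values, target, lam=0.1):
    """Fit v ~ a + b.T @ (c - target) over local neighbors by ridge
    (regularized) least squares; the intercept a is the interpolated value."""
    X = np.hstack([np.ones((len(coords), 1)), coords - target])
    A = X.T @ X + lam * np.eye(X.shape[1])   # regularized normal equations
    beta = np.linalg.solve(A, X.T @ values)  # closed-form solution
    return beta[0]

# four neighbors lying on the plane v = 2x + 3y + 1; interpolate at (0.5, 0.5)
c = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], float)
v = 2 * c[:, 0] + 3 * c[:, 1] + 1
print(round(rllr_predict(c, v, np.array([0.5, 0.5]), lam=1e-6), 3))  # -> 3.5
```

The regularizer `lam` stabilizes the solve when neighbors are nearly collinear, at the cost of a small bias.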
P1-26
Title | Fast and Efficient Gaussian Noise Image Restoration Algorithm by Spatially Adaptive Filtering |
Author | Tuan Anh Nguyen, Myoung Jin Kim, Min Cheol Hong (Soongsil University, Republic of Korea) |
Page | pp. 122 - 125 |
Keyword | noise, detection, removal, constraint, smoothness |
Abstract | In this paper, we propose a spatially adaptive noise removal algorithm using local statistics that consists of two stages: noise detection and removal. To incorporate desirable properties into the denoising process, the local weighted mean, local weighted activity, and local maximum are defined. With these local statistics, a noise detection function is defined, and a modified Gaussian filter is used to suppress the detected noise components. The experimental results demonstrate the effectiveness of the proposed algorithm. |
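A minimal sketch of the two-stage detect-then-filter idea follows. The plain local mean/variance statistics, window size, and threshold `k` are assumptions for illustration; the paper defines weighted local statistics and a modified Gaussian filter:

```python
import numpy as np
from scipy.ndimage import uniform_filter, gaussian_filter

def adaptive_denoise(img, win=5, k=1.5, sigma=1.0):
    """Stage 1: flag pixels deviating from the local mean by more than
    k local standard deviations. Stage 2: replace only flagged pixels
    with a Gaussian-filtered value; all other pixels are kept untouched."""
    mean = uniform_filter(img, win)
    var = uniform_filter(img * img, win) - mean * mean
    noisy = np.abs(img - mean) > k * np.sqrt(np.maximum(var, 0.0))
    smoothed = gaussian_filter(img, sigma)
    return np.where(noisy, smoothed, img), noisy

rng = np.random.default_rng(0)
img = 0.5 + 0.02 * rng.standard_normal((32, 32))
img[16, 16] = 1.0                       # inject one strong outlier
out, mask = adaptive_denoise(img)
print(bool(mask[16, 16]))               # the outlier is detected -> True
```

Restricting the filter to detected pixels is what makes the smoothing spatially adaptive: flat noise-free regions pass through unchanged.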
P1-27
Title | Image Simplification by Frequency-Selective Means Filtering |
Author | Johannes Ballé (RWTH Aachen University, Germany) |
Page | pp. 126 - 129 |
Keyword | narrowband filters, non-local means filtering, phase congruency |
Abstract | In this paper, we present an algorithm to remove high-frequency texture and detail from images. The algorithm effectively removes texture regardless of its contrast, without destroying high-level image structure or introducing artificial edges. While based on the same general framework, this “image simplification” filter differs from noise filtering methods such as bilateral filtering by its frequency selectivity and edge awareness. Applications include artistic filtering of images (edge-preserving smoothing), image reconstruction, and preprocessing for low-bitrate image compression. |
P1-28
Title | Theoretical Analysis of Trend Vanishing Moments for Directional Orthogonal Transforms |
Author | Shogo Muramatsu, Dandan Han, Tomoya Kobayashi, Hisakazu Kikuchi (Niigata University, Japan) |
Page | pp. 130 - 133 |
Keyword | M-D filter banks, M-D Wavelets, Directional transforms, Nonseparable filter design, Image coding |
Abstract | This work investigates theoretical properties of the trend vanishing moments (TVMs), which the authors defined in a previous work and applied to the directional design of 2-D nonseparable GenLOTs. Some significant properties of TVMs are shown theoretically and experimentally. |
P1-29
Title | A Study on Memorability and Shoulder-Surfing Robustness of Graphical Password Using DWT-Based Image Blending |
Author | Takao Miyachi, Keita Takahashi, Madoka Hasegawa, Yuichi Tanaka, Shigeo Kato (Utsunomiya University, Japan) |
Page | pp. 134 - 137 |
Keyword | Graphical password, discrete wavelet transform, authentication, usable security |
Abstract | We propose a graphical password method that makes it difficult for attackers to steal the original pass-image, exploiting characteristics of the human visual system. In our method, we combine the low-frequency components of a decoy picture with the high-frequency components of a pass-image. It is easy for legitimate users to recognize the pass-image in the blended image; for attackers, on the other hand, this task is difficult.
We used discrete wavelet transform (DWT) to blend a decoy image and a pass-image. User studies are conducted to evaluate memorability and shoulder-surfing robustness of this method. |
P1-30
Title | Scalable Image Scrambling Method Using Unified Constructive Permutation Function on Diagonal Blocks |
Author | KokSheik Wong (University of Malaya, Malaysia), Kiyoshi Tanaka (Shinshu University, Japan) |
Page | pp. 138 - 141 |
Keyword | scalable scrambling, UCPF, diagonal block, image encryption |
Abstract | In this paper, an extension of ScaScra [1] is proposed to scalably scramble an image in the diagonal direction for achieving distorted scanline-like effect. The non-overlapping diagonal blocks are first defined and the unified constructive permutation function is applied to scramble pixels in each diagonal block. Scalability in scrambling is achieved by varying the block size. Experiments were carried out to objectively and subjectively verify the basic performance of the proposed extension and compare them to the results of ScaScra by using standard test images. Evaluations on pixel correlation and entropy are also carried out to verify the performance of both ScaScra and the proposed extension. |
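A generic keyed block-permutation scramble conveys the reversibility and block-size scalability at stake here. Note this sketch uses a seeded random permutation on square blocks, not the paper's unified constructive permutation function on diagonal blocks:

```python
import numpy as np

def scramble(img, key, block=8):
    """Permute pixels within each non-overlapping block using a keyed
    permutation; the same key regenerates the inverse for descrambling.
    Varying `block` scales the strength of the scrambling."""
    rng = np.random.default_rng(key)
    out = img.copy()
    h, w = img.shape
    for i in range(0, h, block):
        for j in range(0, w, block):
            blk = out[i:i + block, j:j + block]
            perm = rng.permutation(blk.size)
            out[i:i + block, j:j + block] = blk.reshape(-1)[perm].reshape(blk.shape)
    return out

def descramble(img, key, block=8):
    """Replay the same keyed permutations and apply their inverses."""
    rng = np.random.default_rng(key)
    out = img.copy()
    h, w = img.shape
    for i in range(0, h, block):
        for j in range(0, w, block):
            blk = out[i:i + block, j:j + block]
            inv = np.argsort(rng.permutation(blk.size))
            out[i:i + block, j:j + block] = blk.reshape(-1)[inv].reshape(blk.shape)
    return out

img = np.arange(256, dtype=np.uint8).reshape(16, 16)
print(np.array_equal(descramble(scramble(img, key=42), key=42), img))  # -> True
```

The key point carried over from the abstract is that the permutation is constructive: the receiver rebuilds it from the key alone, with no side information per image.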
[Quality, system, applications, and other topics]
P1-31
Title | Memory-Efficient Parallelization of JPEG-LS with Relaxed Context Update |
Author | Simeon Wahl, Zhe Wang, Chensheng Qiu, Marek Wroblewski, Lars Rockstroh, Sven Simon (University of Stuttgart, Germany) |
Page | pp. 142 - 145 |
Keyword | JPEG-LS, Lossless Image Coding, Parallelization, Context Update |
Abstract | A relaxation of the context update of JPEG-LS that delays the update procedure is proposed, in order to achieve a guaranteed degree of parallelism with a negligible effect on the compression ratio. The lossless mode of JPEG-LS, including the run mode, is considered. A deskewing scheme is provided that generates a bit-stream preserving the order needed for the decoder to mimic the prediction in a consistent way. |
P1-32
Title | H.264 Hierarchical P Coding in the Context of Ultra-Low Delay, Low Complexity Applications |
Author | Danny Hong, Michael Horowitz, Alexandros Eleftheriadis, Thomas Wiegand (Vidyo, Inc., U.S.A.) |
Page | pp. 146 - 149 |
Keyword | H.264, hierarchical P coding |
Abstract | Despite the attention that hierarchical B picture coding has received, little attention has been given to a related technique called hierarchical P picture coding. P-picture-only coding without reverse prediction is necessary in constrained bit rate applications that require ultra-low delay and/or low complexity, such as videoconferencing. Such systems, however, have used the traditional IPPP picture coding structure almost exclusively. In this paper, we investigate the use of hierarchical P coding vs. traditional IPPP coding and demonstrate that it has significant advantages which have not yet been well documented or understood. From a pure coding efficiency point of view, we show that for encoders configured to use ultra-low delay and low complexity coding tools, hierarchical P coding achieves an average advantage of 7.86% BD-rate and 0.34 dB BD-SNR. |
P1-33
Title | Correlation Modeling with Decoder-Side Quantization Distortion Estimation for Distributed Video Coding |
Author | Jozef Skorupa, Jan De Cock, Jürgen Slowack, Stefaan Mys (Ghent University -- IBBT, Belgium), Nikos Deligiannis (Vrije Universiteit Brussel -- IBBT, Belgium), Peter Lambert (Ghent University -- IBBT, Belgium), Adrian Munteanu (Vrije Universiteit Brussel -- IBBT, Belgium), Rik Van de Walle (Ghent University -- IBBT, Belgium) |
Page | pp. 150 - 153 |
Keyword | distributed video coding, correlation modeling, quantization distortion |
Abstract | Aiming for low-complexity encoding, distributed video coders still fail to achieve the performance of current industrial standards for video coding. One of the most important problems in this area is the accurate modeling of the correlation between the predicted signal and the original video. In our previous work we showed that exploiting the quantization distortion can significantly improve the accuracy of a correlation estimator. In this paper we describe how the quantization distortion can be exploited purely at the decoder side without any performance penalty compared to an encoder-aided system. As a result, the proposed correlation estimator delivers state-of-the-art modeling accuracy while neatly fitting the low-encoder-complexity characteristic of distributed video coding. |
P1-34
Title | Content-Based Retrieval by Multiple Image Examples for Sign Board Retrieval |
Author | Atsuo Yoshitaka (JAIST, Japan), Terumasa Hyoudou (Hiroshima University, Japan) |
Page | pp. 154 - 157 |
Keyword | Multiple image examples, QBE, image retrieval |
Abstract | In the area of image database retrieval, one of the promising approaches is retrieval by specifying an image example. However, a single image example is not always sufficient to obtain a satisfactory result, since one example does not give comprehensive ranges of values reflecting the various aspects of the object to be retrieved.
In this paper, we propose a method of retrieving images by specifying multiple image examples that is designed for retrieving sign boards. Features of color, shape, and spatial relation of color regions are extracted from example images, and they are clustered so as to obtain proper range of values.
Compared with QBE systems that accept only a single image as the query condition, MIERS (Multi-Image Example-based Retrieval System) returns better retrieval result, where the experimental result showed that specifying more examples helps to improve recall with little deterioration of precision. |
Thursday, December 9, 2010 |
Session K2 Keynote Speech 2
Time: 8:45 - 9:30 Thursday, December 9, 2010
Chair: Kiyoharu Aizawa (University of Tokyo, Japan)
K2-1
(Time: 8:45 - 9:30)
Title | (Keynote Speech) Decoding Visual Perception from Human Brain Activity |
Author | Yukiyasu Kamitani (ATR Computational Neuroscience Laboratories, Japan) |
Abstract | In modern neuroscience, brain activity is considered as "codes" that encode mental and behavioral contents. Recent advances in human neuroimaging, in particular functional magnetic resonance imaging (fMRI), have revealed brain regions that appear to encode specific behavior and cognition. Despite the wide-spread use of human neuroimaging in such "functional brain mapping", its potential to read out, or "decode", mental contents from brain activity has not been fully explored.
Mounting evidence from animal neurophysiology has revealed the roles of the early visual cortex in representing visual features such as orientation and motion direction. However, non-invasive neuroimaging methods have been thought to lack the resolution to probe into these putative feature representations in the human brain. In this talk, I present methods for decoding early visual representations from fMRI voxel patterns based on machine learning techniques. I first show how early visual features represented in "subvoxel" neural structures could be decoded from ensemble fMRI responses. Decoding of stimulus features is extended to the method for neural mind-reading, which attempts to predict a person's subjective state using a decoder trained with unambiguous stimulus presentation. We discuss how a multivoxel pattern can represent more information than the sum of individual voxels, and how an effective set of voxels for decoding can be selected from all available voxels. Based on these methods, we have recently proposed a modular decoding approach, in which a wide variety of percepts can be predicted by combining the outputs of multiple modular decoders. I show an example of visual image reconstruction where arbitrary visual images can be accurately reconstructed by the decoding model trained on fMRI responses to several hundred random images. Finally, I discuss potential applications of neural decoding for brain-based communications. |
Session P2 Poster Session 2
Time: 9:30 - 11:00 Thursday, December 9, 2010
Chair: Kazunori Kotani (Japan Advanced Institute of Science and Technology, Japan)
[3DTV/FTV/multi-view-related topics]
P2-1
Title | Focus on Visual Rendering Quality through Content-Based Depth Map Coding |
Author | Emilie Bosc, Luce Morin, Muriel Pressigout (INSA of Rennes, France) |
Page | pp. 158 - 161 |
Keyword | 3D video coding, adaptive coding, depth coding |
Abstract | Multi-view video plus depth (MVD) data is a set of multiple sequences capturing the same scene at different viewpoints, with their associated per-pixel depth values. Handling this large amount of data requires an effective coding framework. Yet a simple but essential question concerns the means of assessing the proposed coding methods. While the challenge in compression is the optimization of the rate-distortion trade-off, a widely used objective metric for evaluating distortion is the Peak Signal-to-Noise Ratio (PSNR), because of its simplicity and mathematical tractability. This paper points out the reliability problem of this metric when estimating 3D video codec performance. We investigated the visual performance of two methods, namely H.264/MVC and the Locally Adaptive Resolution (LAR) method, by encoding depth maps and reconstructing existing views from those degraded depth images.
The experiments revealed that lower coding efficiency, in terms of PSNR, does not imply lower rendering visual quality, and that the LAR method preserves the depth map properties correctly. |
P2-2
Title | Bit Allocation of Vertices and Colors for Patch-Based Coding in Time-Varying Meshes |
Author | Toshihiko Yamasaki, Kiyoharu Aizawa (The University of Tokyo, Japan) |
Page | pp. 162 - 165 |
Keyword | 3DTV, Time-varying mesh (TVM), inter-frame compression, vector quantization (VQ) |
Abstract | This paper discusses the optimal bit rate assignment for vertices and color, and for reference frames (I frames) and target frames (P frames), in the patch-based compression method for Time-Varying Meshes (TVMs). TVMs are non-isomorphic 3D mesh sequences generated from multi-view images. Experimental results demonstrated that the bit rate for vertices strongly affects the visual quality of the rendered 3D model, whereas the bit rate for color does not contribute to the quality improvement. Therefore, as many bits as possible should be assigned to vertices, and 8-10 bits per vertex (bpv) is enough for color. In inter-frame coding, the bit rate for the target frames improves the visual quality proportionally, but it is also demonstrated that fewer bits (5-6 bpv) are enough to achieve the same visual quality as the intra frames. |
P2-3
Title | Motion Activity-Based Block Size Decision for Multi-view Video Coding |
Author | Huanqiang Zeng, Kai-Kuang Ma (School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore), Canhui Cai (Institute of Information Science and Technology, Huaqiao University, China) |
Page | pp. 166 - 169 |
Keyword | multi-view video coding, motion estimation, disparity estimation, block size decision, motion activity |
Abstract | Motion estimation and disparity estimation using variable block sizes have been exploited in multi-view video coding to effectively improve the coding efficiency, but at the expense of yielding higher computational complexity. In this paper, a fast block size decision algorithm, called motion activity-based block size decision (MABSD), is proposed. In our approach, the various motion estimation and disparity estimation block sizes are classified into four classes, and only one of them will be chosen to further identify the optimal block size within that class according to the measured motion activity of the current macroblock. The above-mentioned motion activity can be measured by the maximum city-block distance of a set of motion vectors taken from the adjacent macroblocks in the current view and its neighboring view. Experimental results have shown that compared with exhaustive block size decision, which is a default approach set in the JMVM reference software, the proposed MABSD algorithm achieves a reduction of computational complexity by 42% on average, while incurring only 0.01 dB loss in peak signal-to-noise ratio (PSNR) and 1% increment on the total bit rate. |
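The activity-to-class mapping described above can be sketched as follows. The threshold values and the four-way mapping here are hypothetical; the paper measures activity over motion vectors of adjacent macroblocks in the current and neighboring views:

```python
def motion_activity(mvs):
    """Maximum city-block (L1) length over the neighboring motion vectors."""
    return max(abs(mx) + abs(my) for mx, my in mvs)

def block_size_class(mvs, thresholds=(1, 4, 8)):
    """Map measured activity to one of four classes (0 = static/low motion
    .. 3 = high motion); only the block sizes of the chosen class would
    then be searched, skipping the exhaustive variable-block-size test."""
    act = motion_activity(mvs)
    for c, t in enumerate(thresholds):
        if act <= t:
            return c
    return 3

print(block_size_class([(0, 0), (1, 0)]))  # low activity -> 0
```

Restricting the mode search to one class is where the roughly 42% complexity reduction reported in the abstract would come from.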
P2-4
Title | Color Based Depth Up-Sampling for Depth Compression |
Author | Meindert Onno Wildeboer, Tomohiro Yendo, Mehrdad Panahpour Tehrani (Nagoya University, Japan), Toshiaki Fujii (Tokyo Institute of Technology, Japan), Masayuki Tanimoto (Nagoya University, Japan) |
Page | pp. 170 - 173 |
Keyword | depth up-sampling, depth coding, depth-map, FTV, 3DTV |
Abstract | 3D scene information can be represented in several ways. In applications based on an (N-)view plus (N-)depth representation, both view and depth data are compressed. In this paper we present a depth compression method containing a depth up-sampling filter which uses the color view as a prior. Our method of depth down-/up-sampling is able to maintain clear object boundaries in the reconstructed depth maps. Our experimental results show that the proposed depth re-sampling filter, used in combination with a standard state-of-the-art video encoder, can increase both the coding efficiency and the rendering quality. |
P2-5
Title | Efficient Free Viewpoint Video-On-Demand Scheme Realizing Walk-Through Experience |
Author | Akio Ishikawa, Hiroshi Sankoh, Sei Naito, Shigeyuki Sakazawa (KDDI R&D Laboratories Inc., Japan) |
Page | pp. 174 - 177 |
Keyword | Freeviewpoint Television, Multi-view Video, Freeviewpoint Video Transmission, Walk-through, Multi-texturing |
Abstract | This paper presents an efficient VOD scheme for FTV, and proposes a data format and its data generation method to provide a walk-through experience. We employ a hybrid rendering approach to describe a 3D scene using 3D model data for objects and textures. In this paper we propose an efficient texture data format, which removes the redundancy due to occlusion of objects by employing an orthogonal projection image for each object. The advantage of this data format is that it greatly simplifies the server's task of choosing the transmitted images that correspond to the requested viewpoint. |
P2-6
Title | 3-D Video Coding Using Depth Transition Data |
Author | Woo-Shik Kim, Antonio Ortega (University of Southern California, U.S.A.), Jaejoon Lee, HoCheon Wey (Samsung Electronics Co., Ltd., Republic of Korea) |
Page | pp. 178 - 181 |
Keyword | 3-D video coding, multiview plus depth, view synthesis |
Abstract | The objective is to develop a new 3-D video coding system that provides better coding efficiency with improved subjective quality. We analyzed rendered view distortions in a DIBR system and found that depth map coding distortion leads to “erosion artifacts”, which cause significant perceptual quality degradation. To solve this, we propose a solution using depth transition data, which indicates the camera position where the depth changes. Simulation results show significant subjective quality improvement, with maximum PSNR gains of 0.5 dB. |
P2-7
Title | Subjective Assessment of Frame Loss Concealment Methods in 3D Video |
Author | Joao Carreira, Luis Pinto, Nuno Rodrigues, Sergio Faria, Pedro Assuncao (Institute of Telecommunications / Polytechnic Institute of Leiria - ESTG, Portugal) |
Page | pp. 182 - 185 |
Keyword | 3D video, Frame loss concealment, Subjective Assessment |
Abstract | This paper investigates the subjective impact resulting from different concealment methods for coping with lost frames in 3D video communication systems. It is assumed that a high priority channel is assigned to the main view and only the auxiliary view is subject to either transmission errors or packet loss, leading to missing frames at the decoder output. Three methods are used for frame concealment under different loss ratios. The results show that depth is well perceived by users and the subjective impact of frame loss not only depends on the concealment method but also exhibits high correlation with the disparity of the original sequence. It is also shown that under heavy loss conditions it is better to switch from 3D to 2D rather than presenting concealed 3D video to users. |
P2-8
Title | A Novel Upsampling Scheme for Depth Map Compression in 3DTV System |
Author | Yanjie Li, Lifeng Sun (Tsinghua University, China) |
Page | pp. 186 - 189 |
Keyword | Depth map compression, Resolution reduction, Upsampling, Rate-distortion |
Abstract | In this paper, we propose a novel two-step depth map upsampling scheme to address the depth map compression problem for 3D video. The first step utilizes the full-resolution 2D color map to help reconstruct a more accurate full-resolution depth map, and the second step further flattens the reconstructed depth map to ensure its local uniformity. Test results show that the proposed upsampling scheme achieves up to 2 dB coding gain for the rendering of free-viewpoint video and improves its perceptual quality significantly. |
[Beyond H.264/MPEG-4 AVC and related topics]
P2-9
Title | Adaptive Direct Vector Derivation for Video Coding |
Author | Yusuke Itani, Shunichi Sekiguchi, Yoshihisa Yamada (Information Technology R&D Center, Mitsubishi Electric Corporation, Japan) |
Page | pp. 190 - 193 |
Keyword | direct mode, motion vector predictor, HEVC, H.264, extended macroblock |
Abstract | This paper proposes a new method for improving direct prediction scheme that has been employed in conventional video coding standards such as AVC/H.264. We extend direct prediction concept to achieve better adaptation to local statistics of video source with the assumption of the use of larger motion blocks than conventional macroblock size. Experimental results show the proposed method provides up to 3.3% bitrate saving in low-bitrate coding. |
P2-10
Title | Inter Prediction Based on Spatio-Temporal Adaptive Localized Learning Model |
Author | Hao Chen, Ruimin Hu, Zhongyuan Wang, Rui Zhong (Wuhan University, China) |
Page | pp. 194 - 197 |
Keyword | Inter prediction, STALL, LSP |
Abstract | Inter prediction based on block-matching motion estimation is important for video coding, but this method suffers from the additional data-rate overhead of the motion information that must be transmitted to the decoder. To solve this problem, we present an improved implicit-motion-information inter prediction algorithm for P slices in H.264/AVC based on the spatio-temporal adaptive localized learning (STALL) model. Following the 4x4 block transform structure in H.264/AVC, we adaptively choose nine spatial neighbors and nine temporal neighbors, and a localized 3D causal cube is designed as the training window. Using this information, the model parameters can be adaptively computed with the least-squares prediction (LSP) method. Finally, we add a new inter prediction mode for P slices into the H.264/AVC standard. The experimental results show that our algorithm improves coding efficiency compared with the H.264/AVC standard, with a relative increase in complexity. |
P2-11
Title | Intra Picture Coding with Planar Representations |
Author | Jani Lainema, Kemal Ugur (Nokia Research Center, Finland) |
Page | pp. 198 - 201 |
Keyword | video coding, intra coding, planar coding, H.264/AVC, HEVC |
Abstract | In this paper we introduce a novel concept for Intra coding of pictures especially suitable for representing smooth image segments. Traditional block based transform coding methods cause visually annoying blocking artifacts for image segments with gradually changing smooth content. The proposed solution overcomes this drawback by defining a fully continuous surface of sample values approximating the original image. The gradient of the surface is indicated by transmitting values for selected control points within the image segment and the surface itself is obtained by interpolating sample values in-between the control points. This approach is found to provide up to 30 percent bitrate reductions in the case of natural imagery and it has also been adopted to the initial HEVC codec design by JCT-VC. |
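The continuous-surface idea can be illustrated with a bilinear interpolation from four corner control points. This is an illustrative sketch only; the planar mode actually adopted into the initial HEVC design signals and interpolates its control points differently:

```python
import numpy as np

def planar_block(tl, tr, bl, br, n):
    """Bilinearly interpolate an n x n block from four corner control
    points, giving a fully continuous surface of sample values with no
    block-internal discontinuities."""
    t = np.linspace(0.0, 1.0, n)
    top = tl + (tr - tl) * t                    # interpolated top edge
    bottom = bl + (br - bl) * t                 # interpolated bottom edge
    return top[None, :] * (1 - t)[:, None] + bottom[None, :] * t[:, None]

b = planar_block(10, 20, 30, 40, 5)
print(b[0, 0], b[-1, -1], b[2, 2])  # corners are matched; center -> 25.0
```

Because every sample is a smooth blend of the control points, a gradually changing segment is represented without the blocking artifacts that a block transform of the same region would produce.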
P2-12
Title | Adaptive Global Motion Temporal Prediction for Video Coding |
Author | Alexander Glantz, Andreas Krutz, Thomas Sikora (Technische Universität Berlin, Germany) |
Page | pp. 202 - 205 |
Keyword | H.264/AVC, video coding, global motion, temporal filtering, prediction |
Abstract | Depending on the content of a video sequence, the amount of bits spent for the transmission of motion vectors can be enormous. A global motion model can be a better representation of movement in these regions than a motion vector.
This paper presents a novel prediction technique that is based on global motion compensation and temporal filtering. The new approach is incorporated into H.264/AVC and outperforms the reference by up to 14%. |
P2-13
Title | Highly Efficient Video Compression Using Quadtree Structures and Improved Techniques for Motion Representation and Entropy Coding |
Author | Detlev Marpe, Heiko Schwarz, Thomas Wiegand, Sebastian Bosse, Benjamin Bross, Philipp Helle, Tobias Hinz, Heiner Kirchhoffer, Haricharan Lakshman, Tung Nguyen, Simon Oudin, Mischa Siekmann, Karsten Suehring, Martin Winken (Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Germany) |
Page | pp. 206 - 209 |
Keyword | Video coding, H.265, HEVC |
Abstract | This paper describes a novel video coding scheme that can be considered a generalization of the block-based hybrid video coding approach of H.264/AVC. While the individual building blocks of our approach are kept simple, as in H.264/AVC, the flexibility of the block partitioning for prediction and transform coding has been substantially increased. This is achieved by the use of nested and pre-configurable quadtree structures, such that the block partitioning for temporal and spatial prediction, as well as the space-frequency resolution of the corresponding prediction residual, can be adapted to the given video signal in a highly flexible way. In addition, techniques for an improved motion representation as well as a novel entropy coding concept are included. The presented video codec was submitted to a Call for Proposals of ITU-T VCEG and ISO/IEC MPEG and was ranked among the five best performing proposals in terms of both subjective and objective quality. |
[Image/video coding and related topics]
P2-14
Title | Dictionary Learning-Based Distributed Compressive Video Sensing |
Author | Hung-Wei Chen, Li-Wei Kang, Chun-Shien Lu (Academia Sinica, Taiwan) |
Page | pp. 210 - 213 |
Keyword | compressive sensing, sparse representation, dictionary learning, single-pixel camera, l1-minimization |
Abstract | We address the important issue of low-cost, low-complexity video compression for use in extremely resource-limited sensors/devices. Conventional motion estimation-based video compression and distributed video coding (DVC) techniques all rely on a high-cost mechanism in which sensing/sampling and compression are performed separately, resulting in unnecessary consumption of resources: most of the acquired raw video data will be discarded in the (possibly) complex compression stage. In this paper, we propose a dictionary learning-based distributed compressive video sensing (DCVS) framework to "directly" acquire compressed video data. Embedded in the compressive sensing (CS)-based single-pixel camera architecture, DCVS can compressively sense each video frame in a distributed manner. At the DCVS decoder, video reconstruction can be formulated as an l1-minimization problem, solving for the sparse coefficients with respect to some basis functions. We investigate adaptive dictionary/basis learning for each frame based on training samples extracted from previously reconstructed neighboring frames, and argue that a much better basis can be obtained to represent the frame, compared with fixed-basis representations and recent popular "CS-based DVC" approaches that do not rely on dictionary learning. |
P2-15
Title | Medium-Granularity Computational Complexity Control for H.264/AVC |
Author | Xiang Li, Mathias Wien, Jens-Rainer Ohm (Institute of Communications Engineering, RWTH Aachen University, Germany) |
Page | pp. 214 - 217 |
Keyword | Computational complexity control, computational scalability |
Abstract | Today, video applications on handheld devices are becoming more and more popular. Due to the limited computational capability of handheld devices, complexity-constrained video coding has drawn much attention. In this paper, a medium-granularity computational complexity control (MGCC) is proposed for H.264/AVC. First, a large dynamic range in complexity is achieved by taking a 16x16 motion estimation in a single reference frame as the basic computational unit. Then, high coding efficiency is obtained by adaptive computation allocation at the MB level. Simulations show that coarse-granularity methods fail when the normalized complexity is below 15%. In contrast, the proposed MGCC performs well even when the complexity is reduced to 8.8%. Moreover, an average BD-PSNR gain of 0.3 dB over coarse-granularity methods is obtained for 11 sequences when the complexity is around 20%. |
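As an illustrative aside (not the paper's MGCC rules, whose allocation criteria are more elaborate): MB-level computation allocation under a frame budget can be sketched as handing out basic units (16x16 motion searches in one reference frame) greedily, guided by a hypothetical per-MB activity measure.

```python
# Sketch of MB-level computation allocation: every macroblock gets at least
# one basic unit, and the remaining budget is distributed round-robin to the
# macroblocks with the highest activity. Assumes budget >= number of MBs.
# The activity measure (e.g. previous-frame residual energy) is illustrative.

def allocate_units(activity, budget):
    n = len(activity)
    units = [1] * n                      # baseline: one unit per MB
    spare = budget - n
    order = sorted(range(n), key=lambda i: activity[i], reverse=True)
    i = 0
    while spare > 0:
        units[order[i % n]] += 1         # favour high-activity MBs first
        spare -= 1
        i += 1
    return units

alloc = allocate_units([5.0, 1.0, 3.0, 0.5], budget=7)
```

Lowering the budget shrinks the per-MB unit counts gracefully, which is the sense in which such a scheme scales its complexity.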
P2-16
Title | An Improved Wyner-Ziv Video Coding With Feedback Channel |
Author | Feng Ye, Aidong Men, Bo Yang, Manman Fan, Kan Chang (School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, China) |
Page | pp. 218 - 221 |
Keyword | Wyner-Ziv video coding, motion activity, 3DRS motion estimation, side information |
Abstract | This paper presents an improved feedback-assisted low-complexity WZVC scheme. The performance of this scheme is improved by two enhancements: an improved mode-based key frame encoding and a 3DRS-assisted (three-dimensional recursive search assisted) motion estimation algorithm for WZ encoding. Experimental results show that our coding scheme achieves significant gains compared with a state-of-the-art TDWZ codec while maintaining low encoding complexity. |
P2-17
Title | Background Aided Surveillance-Oriented Distributed Video Coding |
Author | Hongbin Liu (Harbin Institute of Technology, China), Siwei Ma (Peking University, China), Debin Zhao (Harbin Institute of Technology, China), Wen Gao (Peking University, China), Xiaopeng Fan (Harbin Institute of Technology, China) |
Page | pp. 222 - 225 |
Keyword | surveillance, background, distributed video coding |
Abstract | This paper presents a background-aided surveillance-oriented distributed video coding system. A high-quality background frame is encoded for each group of pictures (GOP), which provides high-quality SI for the background parts of the Wyner-Ziv (WZ) frames. Consequently, the bit rate for the WZ frames can be reduced. Experimental results demonstrate that the proposed system decreases the bit rate by up to 67.4% compared with a traditional DVC codec. |
P2-18
Title | Content-Adaptive Spatial Scalability for Scalable Video Coding |
Author | Yongzhe Wang (Shanghai Jiao Tong University, China), Nikolce Stefanoski (Disney Research Zurich, Switzerland), Xiangzhong Fang (Shanghai Jiao Tong University, China), Aljoscha Smolic (Disney Research Zurich, Switzerland) |
Page | pp. 226 - 229 |
Keyword | H.264/AVC, scalable video coding, spatial scalability, content-adaptation, non-linear image warping |
Abstract | This paper presents an enhancement of the SVC extension of the H.264/AVC standard by content-adaptive spatial scalability (CASS). The video streams (spatial layers), which are used as input to the encoder, are created by content-adaptive and art-directable retargeting of existing high resolution video. The non-linear dependencies between such video streams are efficiently exploited by CASS for scalable coding. This is achieved by integrating warping-based non-linear texture prediction and warp coding into the SVC framework. |
P2-19
Title | Colorization-Based Coding by Focusing on Characteristics of Colorization Bases |
Author | Shunsuke Ono, Takamichi Miyata, Yoshinori Sakai (Tokyo Institute of Technology, Japan) |
Page | pp. 230 - 233 |
Keyword | Colorization, Colorization-based coding, Representative pixels, Redundancy, Correct color |
Abstract | A novel approach to image compression called colorization-based coding has recently been proposed. It automatically extracts representative pixels from an original color image at the encoder and restores the full color image by colorization at the decoder. However, previous studies on colorization-based coding extract redundant representative pixels. We propose a new colorization-based coding method that focuses on the colorization basis. Experimental results reveal that our method can drastically reduce the amount of information compared with the conventional method while maintaining objective quality. |
P2-20
Title | Wyner-Ziv Coding of Multispectral Images for Space and Airborne Platforms |
Author | Shantanu Rane, Yige Wang, Petros Boufounos, Anthony Vetro (Mitsubishi Electric Research Laboratories, U.S.A.) |
Page | pp. 234 - 237 |
Keyword | Multispectral, Wyner-Ziv coding, LDPC code |
Abstract | This paper investigates the application of lossy distributed source coding to high resolution multispectral images. The choice of distributed source coding is motivated by the need for very low encoding complexity on space and airborne platforms. The data consists of red, blue, green and infra-red channels and is compressed in an asymmetric Wyner-Ziv setting. One image channel is compressed using traditional JPEG and transmitted to the ground station where it is available as side information for Wyner-Ziv coding of the other channels. Encoding is accomplished by quantizing the image data, applying a Low-Density Parity Check code to the remaining three image channels, and transmitting the resulting syndromes. At the ground station, the image data is recovered from the syndromes by exploiting the correlation in the frequency spectrum of the band being decoded and the JPEG-decoded side information band. In experiments with real uncompressed images obtained by a satellite, the rate-distortion performance is found to be vastly superior to JPEG compression of individual image channels and rivals that of JPEG2000 at much lower encoding complexity. |
P2-21
Title | Reversible Component Transforms by the LU Factorization |
Author | Hisakazu Kikuchi, Junghyeun Hwang, Shogo Muramatsu (Niigata University, Japan), Jaeho Shin (Dongguk University, Republic of Korea) |
Page | pp. 238 - 241 |
Keyword | component transform, image compression, LU factorization, lifting, round-off error |
Abstract | A scaled transform is defined for a given irreversible linear transformation based on the LU factorization of a nonsingular matrix, so that the transformation can be computed in a lifting form and hence made reversible. Round-off errors in the lifting computation and the computational complexity are analyzed. Some reversible component transforms are presented and evaluated, with remarks on image compression applications. Coding gain and actual bit rates are discussed. |
P2-22
Title | MOS-Based Bit Allocation in SNR-Temporal Scalable Coding |
Author | Yuya Yamasaki, Toshiyuki Yoshida (University of Fukui, Japan) |
Page | pp. 242 - 245 |
Keyword | scalable coding, video coding, MOS, bit allocation |
Abstract | In scalable video coding, improvements in motion smoothness (frame rate) and in spatial quality (SNR) conflict within a given bit rate. Since the motion and spatial activities of a target video vary from scene to scene, these activities should be taken into account in order to optimally allocate bits between temporal and quality scalability. This paper proposes a fundamental approach to allocating bit rates between temporal and quality scalability based on maximizing the estimated mean opinion score (MOS) for each scene. A technique for reducing MOS fluctuation in the enhancement layer is discussed as an application of the proposed technique. |
[Image/video processing and related topics]
P2-23
Title | Automatic Moving Object Extraction Using X-means Clustering |
Author | Kousuke Imamura, Naoki Kubo, Hideo Hashimoto (Kanazawa University, Japan) |
Page | pp. 246 - 249 |
Keyword | moving object extraction, x-means clustering, watershed algorithm, voting method |
Abstract | The present paper proposes an automatic moving-object extraction technique using x-means clustering. X-means clustering is an extension of k-means clustering that can determine the optimal number of clusters based on the Bayesian Information Criterion (BIC). In the proposed method, feature points are extracted from the current frame, and x-means clustering classifies them based on their estimated affine motion parameters. A label is assigned to each segmented region, obtained by morphological watershed, by voting over the feature-point clusters in each region. The labeling result constitutes the moving-object extraction. Experimental results reveal that the proposed method provides extraction results with the appropriate number of objects. |
P2-24
Title | Accurate Motion Estimation for Image of Spatial Periodic Pattern |
Author | Jun-ichi Kimura, Naohisa Komatsu (School of Fundamental Science and Engineering, Waseda University, Japan) |
Page | pp. 250 - 253 |
Keyword | motion estimation, motion vector |
Abstract | We investigate the mechanism of motion estimation error for images containing spatially periodic patterns. Using a motion estimation model, we conclude that block-matching distortion caused by motion vector sampling error (DVSE) degrades motion estimation accuracy. We propose a new motion estimation method that uses the maximum DVSE value for each block. Simulations show that the precision of the proposed method is over 98%, which is superior to the full-search method. |
P2-25
Title | Direction-Adaptive Image Upsampling Using Double Interpolation |
Author | Yi-Chun Lin, Yi-Nung Liu, Shao-Yi Chien (National Taiwan University, Taiwan) |
Page | pp. 254 - 257 |
Keyword | double interpolation, upsampling, direction-adaptive, zigzagging, bicubic |
Abstract | Double interpolation quality evaluation can be used as a measure of an interpolation operation. Using this double interpolation framework, a highly efficient direction-adaptive upsampling algorithm is proposed that requires no threshold setting or post-processing. With the proposed upsampling algorithm, zigzagging artifacts on edges no longer occur. Moreover, the proposed algorithm has low computational complexity. Experimental results show that the proposed algorithm produces high-quality images. |
P2-26
Title | Blind GOP Structure Analysis of MPEG-2 and H.264/AVC Decoded Video |
Author | Gilbert Yammine, Eugen Wige, Andre Kaup (Multimedia Communications and Signal Processing - University of Erlangen-Nuremberg, Germany) |
Page | pp. 258 - 261 |
Keyword | GOP Structure, Blind Analysis, Noise Estimation |
Abstract | In this paper, we provide a simple method for analyzing the GOP structure of an MPEG-2 or H.264/AVC decoded video without having access to the bitstream. Noise estimation is applied on the decoded frames and the variance of the noise in the different I-, P-, and B-frames is measured. After the encoding process, the noise variance in the video sequence shows a periodic pattern, which helps in the extraction of the GOP period, as well as the type of frames. This algorithm can be used along with other algorithms to blindly analyze the encoding history of a video sequence. The method has been tested on several MPEG-2 DVB and DVD streams, as well as on H.264/AVC encoded sequences, and shows successful results in both cases. |
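As an illustrative aside (not the authors' implementation): once per-frame noise variances have been estimated, the periodic pattern the abstract describes can be recovered as the lag that maximizes the autocorrelation of the mean-removed variance signal. The variance values below are synthetic.

```python
# Blind GOP-period detection sketch: `var` is a list of per-frame noise
# variance estimates of a decoded sequence. If the variance is periodic
# with the GOP length, the autocorrelation of the mean-removed signal
# peaks at that lag.

def gop_period(var, max_lag):
    m = sum(var) / len(var)
    x = [v - m for v in var]
    def acf(lag):
        return sum(x[i] * x[i + lag] for i in range(len(x) - lag))
    return max(range(1, max_lag + 1), key=acf)

# Synthetic variances: one frame type every 6 frames differs from the rest,
# emulating an I-frame at the start of each GOP.
variances = [2.0, 5.0, 4.0, 5.0, 4.0, 5.0] * 5
period = gop_period(variances, max_lag=10)
```

On real streams the variance estimates are noisy, so the peak is less clean, but the same autocorrelation idea applies.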
P2-27
Title | Distributed Video Coding Based on Adaptive Slice Size Using Received Motion Vectors |
Author | Kyungyeon Min, Seanae Park, Donggyu Sim (Kwangwoon University, Republic of Korea) |
Page | pp. 262 - 265 |
Keyword | DVC, crossover, adaptive slice control |
Abstract | In this paper, we propose a new distributed video coding (DVC) method based on adaptive slice size using received motion vectors (MVs). In the proposed algorithm, the MVs estimated at the DVC decoder are transmitted to the corresponding encoder. In the proposed encoder, predicted side information (PSI) is reconstructed from the transmitted MVs and key frames, so the PSI is identical to the side information (SI) generated at the decoder. We can also calculate the exact crossover rate between the SI and the original input frame using the PSI and the original frame. As a result, the proposed method transmits the minimum parity bits needed to maximize the error-correction ability of the channel decoder, with minimal computational complexity. Experimental results show that the proposed algorithm outperforms several conventional DVC methods. |
P2-28
Title | Improved Watermark Sharing Scheme Using Minimum Error Selection and Shuffling |
Author | Aroba Khan, Yohei Yokoyama, Kiyoshi Tanaka (Shinshu University, Japan) |
Page | pp. 266 - 269 |
Keyword | Watermark Sharing, Halftoning, error diffusion |
Abstract | In this work, we focus on a watermark sharing scheme using error diffusion called DHCED and try to overcome some of its drawbacks. The proposed method simultaneously generates carrier halftone images that share the watermark information by selecting the minimum error caused in the noise function for watermark embedding. The proposed method also shuffles the watermark image before embedding, not only to increase the secrecy of the embedded watermark information but also to improve the watermark detection ratio and the watermark appearance in the detection process. |
[Quality, system, applications, and other topics]
P2-29
Title | On the Duality of Rate Allocation and Quality Indices |
Author | Thomas Richter (University of Stuttgart, Germany) |
Page | pp. 270 - 273 |
Keyword | JPEG 2000, SSIM |
Abstract | In a recent work, the author proposed to study the performance of still-image quality indices such as SSIM by using them as the objective function of rate allocation algorithms. The outcome of that work was not only a multi-scale-SSIM-optimal JPEG 2000 implementation, but also a first-order approximation of the MS-SSIM that is surprisingly similar to more traditional approaches based on contrast sensitivity and visual masking. This work shows that the only difference between those approaches and the MS-SSIM index is the choice of the exponent of the masking term, and furthermore, that a slight modification of the SSIM definition reproducing the traditional exponent improves the performance of the index at or below the visual threshold. It is hence demonstrated that the duality of quality indices and rate allocation helps to improve both the visual performance of the compression codec and the performance of the index. |
P2-30
Title | Image Quality Assessment Based on Local Orientation Distributions |
Author | Yue Wang (Graduate University of Chinese Academy of Sciences, China), Tingting Jiang, Siwei Ma, Wen Gao (Institute of Digital Media, Peking University, China) |
Page | pp. 274 - 277 |
Keyword | image quality assessment (IQA), human visual system (HVS), Histograms of Oriented Gradients (HOG) |
Abstract | Image quality assessment (IQA) is very important for many image and video processing applications, e.g. compression, archiving, restoration and enhancement. An ideal image quality metric should achieve consistency between predicted image distortion and the psychological perception of the human visual system (HVS). Inspired by the fact that the HVS is quite sensitive to local orientation features in images, we propose a new structural-information-based image quality metric that evaluates image distortion by computing the distance between Histograms of Oriented Gradients (HOG) descriptors. Experimental results on the LIVE database show that the proposed IQA metric is competitive with state-of-the-art IQA metrics while keeping relatively low computational complexity. |
P2-31
Title | Distance and Relative Speed Estimation of Binocular Camera Images Based on Defocus and Disparity Information |
Author | Mitsuyasu Ito, Yoshiaki Takada, Takayuki Hamamoto (Tokyo University of Science, Japan) |
Page | pp. 278 - 281 |
Keyword | ITS, focus blur, disparity information, distance estimation, relative speed |
Abstract | In this paper, we discuss a method of distance and relative-speed estimation for ITS. The method uses different focus positions on two cameras to obtain the amount of focus blur. We then propose a distance estimation method based on this amount and on disparity information. Simulation results show that the proposed method performs reasonably well. In addition, we built a prototype system for real-time estimation, and its implementation was validated. |
P2-32
Title | Comparing Two Eye-Tracking Databases: The Effect of Experimental Setup and Image Presentation Time on the Creation of Saliency Maps |
Author | Ulrich Engelke (Blekinge Institute of Technology, Sweden), Hantao Liu (Delft University of Technology, Netherlands), Hans-Jürgen Zepernick (Blekinge Institute of Technology, Sweden), Ingrid Heynderickx (Philips Research Laboratories, Netherlands), Anthony Maeder (University of Western Sydney, Australia) |
Page | pp. 282 - 285 |
Keyword | Eye tracking experiments, Visual saliency, Correlation analysis, Natural images |
Abstract | Visual attention models are typically designed based on human gaze patterns recorded through eye tracking. In this paper, two similar eye tracking experiments from independent laboratories are presented, in which humans observed natural images under a task-free condition. The resulting saliency maps are analysed with respect to two criteria: the consistency between the experiments and the impact of the image presentation time. It is shown that the saliency maps from the two experiments are strongly correlated, independent of presentation time. It is further revealed that the presentation time can be reduced without substantially sacrificing the accuracy of the convergent saliency map. The results provide valuable insight into the similarity of saliency maps from independent laboratories and are highly beneficial for the creation of converging saliency maps at reduced experimental time and cost. |
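As an illustrative aside (not the paper's analysis code): the consistency between two saliency maps can be measured with the Pearson correlation of their flattened values. The tiny maps below are synthetic.

```python
# Pearson linear correlation between two saliency maps, given as flat
# lists of per-pixel saliency values of equal length.

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

map_a = [0.1, 0.9, 0.4, 0.7]   # synthetic saliency values, lab A
map_b = [0.2, 0.8, 0.5, 0.6]   # synthetic saliency values, lab B
r = pearson(map_a, map_b)      # close to 1 for well-agreeing maps
```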
P2-33
Title | Successive Refinement of Overlapped Cell Side Quantizers for Scalable Multiple Description Coding |
Author | Muhammad Majid, Charith Abhayaratne (The University of Sheffield, U.K.) |
Page | pp. 286 - 289 |
Keyword | Multiple description coding, scalability, robustness |
Abstract | Scalable multiple description coding (SMDC) provides reliability and the ability to truncate descriptions according to user rate-distortion requirements. In this paper, we generalize the conditions for successive refinement of the side quantizers of a multiple description scalar quantizer whose overlapped quantizer cells are generated by a modified linear index assignment matrix. We propose that the split (refinement) factor for each refinement side quantizer should be greater than the maximum side-quantizer bin spread, and that the factors should not be integer multiples of each other, in order to satisfy the SMDC distortion conditions; we verify this through simulation results on scalable multiple description image coding. |
P2-34
Title | VQ Based Data Hiding Method for Still Images by Tree-Structured Links |
Author | Hisashi Igarashi, Yuichi Tanaka, Madoka Hasegawa, Shigeo Kato (Utsunomiya University, Japan) |
Page | pp. 290 - 293 |
Keyword | Data Hiding, Vector Quantization, Tree-Structured Links |
Abstract | In this paper, we propose a method for embedding data into still images based on Vector Quantization (VQ). Several VQ-based data embedding methods, such as the 'Mean Gray-Level Embedding method (MGLE)' and the 'Pairwise Nearest-Neighbor Embedding method (PNNE)', have been proposed, but these methods are not sufficiently effective. An efficient adaptive data hiding method called the 'Adaptive Clustering Embedding method (ACE)' has also been proposed, but it is somewhat complicated because the VQ indices have to be adaptively clustered during embedding. In our proposed method, output vectors are linked in a tree structure and information is embedded using some of the linked vectors. Simulation results show that our proposed method achieves higher SNR than the conventional methods for the same amount of embedded data. |
Session T1 Tutorial Session 1
Time: 11:15 - 12:00 Thursday, December 9, 2010
Chair: Kenji Sugiyama (Seikei University, Japan)
T1-1
(Time: 11:15 - 12:00)
Title | (Tutorial) Evolutive Video Coding - From Generic Algorithm towards Content-Specific Algorithm - |
Author | Seishi Takamura (NTT Cyber Space Laboratories, NTT Corporation, Japan) |
Abstract | Evolutive methods based on genetic programming (GP) enable dynamic algorithm generation and have been successfully applied to many areas such as plant control, robot control, and stock market prediction. However, conventional image/video coding methods such as JPEG and H.264 all use fixed (non-dynamic) algorithms without exception. We have proposed a GP-based image predictor that is specifically evolved for each input image. It is a radical departure from the conventional "fixed algorithm", "man-made algorithm" and "hand-made program" towards a new paradigm of "content-specific algorithms" and "machine-generated algorithms/programs". In this tutorial, the predictor generation algorithm, its basic performance, application examples and some techniques for speeding up the evolution process will be presented. |
Session O2 Oral Session 2: Depth Map Coding
Time: 13:15 - 15:15 Thursday, December 9, 2010
Chair: Kiyoharu Aizawa (University of Tokyo, Japan)
O2-1
(Time: 13:15 - 13:45)
Title | Multiscale Recurrent Pattern Matching Approach for Depth Map Coding |
Author | Danillo B. Graziosi (UFRJ, Brazil), Nuno M. M. Rodrigues (IT, Portugal), Carla L. Pagliari (IME, Brazil), Eduardo A. B. da Silva (UFRJ, Brazil), Sérgio M. M. de Faria (IT, Portugal), Marcelo M. Perez (IME, Brazil), Murilo B. de Carvalho (UFF, Brazil) |
Page | pp. 294 - 297 |
Keyword | Depth Maps, 3D Image Coding, Recurrent Pattern Matching |
Abstract | In this article we propose to compress depth maps using a coding scheme based on multiscale recurrent pattern matching and evaluate its impact on depth image based rendering (DIBR).
Depth maps are usually converted into gray-scale images and compressed like a conventional luminance signal. However, using traditional transform-based encoders to compress depth maps may result in undesired artifacts at sharp edges due to the quantization of high-frequency coefficients. The Multiscale Multidimensional Parser (MMP) is a pattern matching-based encoder that is able to preserve and efficiently encode high-frequency patterns, such as edge information. This ability is critical for encoding depth map images.
Experimental results for encoding depth maps show that MMP is much more efficient in a rate-distortion sense than standard image compression techniques such as JPEG2000 or H.264/AVC. In addition, the depth maps compressed with MMP generate reconstructed views with a higher quality than all other tested compression algorithms. |
O2-2
(Time: 13:45 - 14:15)
Title | Sparse Representation of Depth Maps for Efficient Transform Coding |
Author | Gene Cheung (National Institute of Informatics, Japan), Akira Kubota (Chuo University, Japan), Antonio Ortega (University of Southern California, U.S.A.) |
Page | pp. 298 - 301 |
Keyword | Depth-image-based rendering, transform coding, sparse representation |
Abstract | Compression of depth maps is important for the "image plus depth" representation of multiview images, which enables synthesis of novel intermediate views via depth-image-based rendering (DIBR) at the decoder. Previous depth map coding schemes exploit unique depth characteristics to compactly and faithfully reproduce the original signal. In contrast, given that depth maps are not directly viewed but only used for view synthesis, in this paper we manipulate the depth values themselves, without causing severe synthesized-view distortion, in order to maximize sparsity in the transform domain for compression gain. We formulate the sparsity maximization problem as an l0-norm optimization. Since l0-norm optimization is hard in general, we first find a sparse representation by iteratively solving a weighted l1 minimization via linear programming (LP). We then design a heuristic that pushes the resulting LP solution away from the constraint boundaries to avoid quantization errors. Using JPEG as an example transform codec, we show that our approach gains up to 2.5 dB in rate-distortion performance for the interpolated view. |
O2-3
(Time: 14:15 - 14:45)
Title | A Novel Approach for Efficient Multi-View Depth Map Coding |
Author | Jin Young Lee, Hochen Wey, Du-Sik Park (Samsung Advanced Institute of Technology, Samsung Electronics Co., Ltd., Republic of Korea) |
Page | pp. 302 - 305 |
Keyword | Multi-view video plus depth format, Depth map coding, Video coding |
Abstract | The multi-view video plus depth (MVD) format, which consists of texture and depth images, has recently been presented as a video representation that supports depth perception of scenes and efficient view generation at arbitrary positions. In particular, the depth image is significantly important for successful services of highly advanced multimedia video applications such as three-dimensional television (3DTV) and free-viewpoint television (FTV). In this paper, we present a novel approach to efficient multi-view depth map coding. We assume that a texture image in the MVD format is encoded first, followed by the corresponding depth image. Based on an analysis of the inter-view correlation between the previously encoded texture images, the proposed method skips some blocks of the depth image without encoding them. The skipped blocks in the depth map are predicted from the neighboring depth images at the same time instant. Experimental results demonstrate that the proposed method reduces the coding bitrate by up to 74.8% and improves PSNR by up to 3.51 dB in P and B views. |
O2-4
(Time: 14:45 - 15:15)
Title | Diffusion Filtering of Depth Maps in Stereo Video Coding |
Author | Gerhard Tech, Karsten Müller, Thomas Wiegand (Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Germany) |
Page | pp. 306 - 309 |
Keyword | diffusion filtering, noise reduction, stereo video, video plus depth coding |
Abstract | A method for removing irrelevant information from depth maps in video-plus-depth coding is presented. The depth map is filtered over several iterations using a diffusion approach. In each iteration, smoothing is carried out in local sample neighborhoods considering the distortion introduced into a rendered view; smoothing is applied only where the rendered view is not affected. Irrelevant edges and features in the depth map can therefore be damped while the quality of the rendered view is retained. The processed depth maps can be coded at a reduced rate compared to the unaltered data. Coding experiments show gains of up to 0.5 dB for the rendered view at the same bit rate. |
Session O3 Oral Session 3: New Techniques for Video Coding
Time: 15:30 - 17:30 Thursday, December 9, 2010
Chair: Yoshiyuki Yashima (Chiba Institute of Technology, Japan)
O3-1
(Time: 15:30 - 16:00)
Title | Parallel Intra Prediction for Video Coding |
Author | Andrew Segall, Jie Zhao (Sharp Labs of America, U.S.A.), Tomoyuki Yamamoto (Sharp Corporation, Japan) |
Page | pp. 310 - 313 |
Keyword | intra prediction, video coding, parallel |
Abstract | In this paper, we propose an intra-prediction system that is both parallel friendly and with high coding efficiency. This is achieved by combining a novel prediction strategy that reduces serial dependencies and a novel, multi-directional and adaptive prediction system. The resulting technique is compared with state-of-the-art ITU-T H.264/MPEG-4 AVC. We observe a 2x and 8x increase in parallelism for 8x8 and 4x4 partitions, respectively, and an average rate increase of less than 0.08% for predictive coding scenarios. |
O3-2
(Time: 16:00 - 16:30)
Title | Quantization Noise Reduction in Hybrid Video Coding by a System of Three Adaptive Filters |
Author | Matthias Narroschke (Panasonic R&D Center Germany GmbH, Germany) |
Page | pp. 314 - 317 |
Keyword | Video coding, Noise reduction, Quantization noise, Wiener filter |
Abstract | Hybrid video coding algorithms, e.g. H.264/MPEG-4 AVC, apply prediction followed by prediction-error coding, which introduces quantization noise. The quantized prediction error signal and the prediction signal are added for reconstruction. Deblocking filters reduce quantization noise of the reconstructed signal at block boundaries. To further reduce quantization noise, adaptive Wiener filters are applied to the deblocked signal. In this paper, the adaptive Wiener filter is extended to a system of three adaptive filters in order to improve the quantization noise reduction: a first filter is applied to the deblocked signal, a second to the quantized prediction error signal, and a third to the prediction signal. The three filtered signals are added for reconstruction. For a set of thirteen test sequences, the system of three adaptive filters achieves an average bit-rate reduction at the same quality of 1.9% compared to the adaptive Wiener filter and 4.9% compared to no Wiener filter. For particular sequences, bit-rate reductions of 6.1% and 17.1%, respectively, are achieved. |
O3-3
(Time: 16:30 - 17:00)
Title | Spatio-Temporal Prediction in Video Coding by Non-Local Means Refined Motion Compensation |
Author | Jürgen Seiler, Thomas Richter, André Kaup (University of Erlangen-Nuremberg, Germany) |
Page | pp. 318 - 321 |
Keyword | Video Coding, Prediction, Signal Extrapolation |
Abstract | The prediction step is a very important part of hybrid video codecs. In this contribution, a novel spatio-temporal prediction algorithm is introduced. The prediction is carried out in two steps: first, a preliminary temporal prediction is conducted by motion compensation; afterwards, a spatial refinement incorporates spatial redundancies from already decoded neighboring blocks. The spatial refinement is achieved by applying Non-Local Means denoising to the union of the motion-compensated block and the already decoded blocks. Including the spatial refinement in H.264/AVC, a rate reduction of up to 14% or, respectively, a gain of up to 0.7 dB PSNR compared to unrefined motion-compensated prediction can be achieved. |
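The Non-Local Means refinement step can be sketched in 1-D as below. The patch radius and filter strength `h` are hypothetical parameters; the paper applies the same principle to 2-D blocks, using the union of the motion-compensated block and the decoded neighbors as the search set.

```python
import math

def nlm_refine(mc_block, decoded_neighbors, patch_radius=1, h=10.0):
    """Refine each motion-compensated sample by a patch-similarity-weighted
    average over the search set (MC block plus decoded neighbors)."""
    search = mc_block + decoded_neighbors

    def patch(sig, i):
        n = len(sig)
        return [sig[min(max(i + d, 0), n - 1)]
                for d in range(-patch_radius, patch_radius + 1)]

    refined = []
    for i in range(len(mc_block)):
        p = patch(mc_block, i)
        num = den = 0.0
        for j in range(len(search)):
            q = patch(search, j)
            d2 = sum((a - b) ** 2 for a, b in zip(p, q))
            w = math.exp(-d2 / (h * h))   # Non-Local Means weight
            num += w * search[j]
            den += w
        refined.append(num / den)
    return refined
```

Because the weights are a convex combination, every refined sample stays within the value range of the search set, which is what makes the step a denoising refinement rather than a new prediction.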
O3-4
(Time: 17:00 - 17:30)
Title | A Novel Video Coding Scheme for Super Hi-Vision |
Author | Shun-ichi Sekiguchi, Akira Minezawa, Kazuo Sugimoto (Information Technology R&D Center, Mitsubishi Electric Corporation, Japan), Atsuro Ichigaya, Kazuhisa Iguchi, Yoshiaki Shishikui (Science & Technology Research Laboratories, NHK, Japan) |
Page | pp. 322 - 325 |
Keyword | video coding, SHV, motion compensation prediction, intra prediction, transform |
Abstract | We propose a novel video coding scheme targeting Super Hi-Vision (SHV) video sources. While it takes a conventional block-based MC + transform hybrid coding approach that is suitable for hardware implementation of an SHV video codec, the proposed scheme achieves significant coding efficiency improvement by introducing several coding tools such as intra prediction and adaptive transform. According to our experimental analysis, the proposed scheme achieves significant bit-rate savings compared to the state-of-the-art AVC/H.264 High Profile. |
Friday, December 10, 2010 |
Session K3 Keynote Speech 3
Time: 8:45 - 9:30 Friday, December 10, 2010
Chair: Kohtaro Asai (Mitsubishi Electric Corporation, Japan)
K3-1
(Time: 8:45 - 9:30)
Title | (Keynote Speech) Advances in Video Compression |
Author | Thomas Wiegand (Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Germany) |
Abstract | Recent work in the research and standards community has shown that significant advances are possible in the field of video compression. These gains have been demonstrated in particular for high-resolution video content. The talk will present the new techniques and analyze the reasons for the improvements in coding efficiency. Comparisons against the state-of-the-art will be provided and potential new developments will be discussed. |
Session P3 Poster Session 3
Time: 9:30 - 11:00 Friday, December 10, 2010
Chair: Akira Kubota (Chuo University, Japan)
[3DTV/FTV/multi-view-related topics]
P3-1
Title | A Fast Graph Cut Algorithm for Disparity Estimation |
Author | Cheng-Wei Chou, Jang-Jer Tsai, Hsueh-Ming Hang, Hung-Chih Lin (National Chiao Tung University, Taiwan) |
Page | pp. 326 - 329 |
Keyword | FTV, stereo correspondence, disparity estimation, graph cut |
Abstract | In this paper, we propose a fast graph cut (GC) algorithm for disparity estimation. Two accelerating techniques are suggested: one is an early termination rule, and the other is prioritizing the alpha-beta swap pair search order. Our simulations show that the proposed fast GC algorithm achieves a 210% speed-up in average computation time over the original GC scheme, while its disparity estimation quality is nearly identical to that of the original GC. |
P3-2
Title | Parallel Processing Method for Realtime FTV |
Author | Kazuma Suzuki (Graduate School of Engineering, Nagoya University, Japan), Norishige Fukushima (Graduate School of Engineering, Nagoya Institute of Technology, Japan), Tomohiro Yendo, Mehrdad Panahpour Tehrani (Graduate School of Engineering, Nagoya University, Japan), Toshiaki Fujii (Graduate School of Engineering, Tokyo Institute of Technology, Japan), Masayuki Tanimoto (Graduate School of Engineering, Nagoya University, Japan) |
Page | pp. 330 - 333 |
Keyword | FTV, Free Viewpoint Image Generation, Image Based Rendering, Realtime Processing, Parallel Processing |
Abstract | In this paper, we propose a parallel processing method to generate free viewpoint images in real time. Expressing a free viewpoint image would require capturing the scene from innumerable cameras, but arranging cameras at such high density is not realistic. Therefore, images at arbitrary viewpoints must be interpolated from a limited set of captured images. However, this process involves a trade-off between image quality and computing time. The proposed method aims to generate high-quality free viewpoint images in real time by applying parallel processing to the time-consuming interpolation part. |
P3-3
Title | Influence of Wavelet-Based Depth Coding in Multiview Video Systems |
Author | Ismael Daribo, Hideo Saito (Keio University, Japan) |
Page | pp. 334 - 337 |
Keyword | wavelet, depth, coding, 3dtv, mvv |
Abstract | Multiview video representation based on depth data, such as multiview video-plus-depth (MVD), is an emerging basis for 3D video communication services, raising the problem of coding and transmitting depth video in addition to classical texture video. Depth video is key side information for novel view synthesis in multiview video systems, such as three-dimensional television (3DTV) or free viewpoint television (FTV), wherein the influence of depth compression on the synthesized view is still a contentious issue. In this paper, we discuss and investigate the impact of wavelet-based compression of the depth video on the quality of the view synthesis. Experimental results show that significant gains can be obtained by improving depth edge preservation through shorter wavelet-based filtering on depth edges. |
P3-4
Title | An Epipolar Restricted Inter-Mode Selection for Stereoscopic Video Encoding |
Author | Guolei Yang (Peking University, China), Luhong Liang (Institute of Computing Technology, Chinese Academy of Sciences, China), Wen Gao (Peking University, China) |
Page | pp. 338 - 341 |
Keyword | Stereoscopic video, Epipolar restriction, Inter-mode selection |
Abstract | In this paper, we propose a fast inter-prediction mode selection algorithm for stereoscopic video encoding. Different from methods using disparity estimation, candidate modes are generated by sliding a window along the macroblock line restricted by the epipolar line. The motion information is then utilized to rectify the candidate modes. A selection failure handling algorithm is also proposed to preserve coding quality. The proposed algorithm is evaluated using independent H.264/AVC encoders for the left and right views and can be extended to MVC. |
P3-5
Title | Temporal Consistency Enhancement on Depth Sequences |
Author | Deliang Fu, Yin Zhao, Lu Yu (Zhejiang University, China) |
Page | pp. 342 - 345 |
Keyword | 3D video, temporal consistency, temporal depth filtering, view synthesis |
Abstract | Currently, depth sequences generated by automatic depth estimation suffer from the temporal inconsistency problem. Estimated depth values of some objects vary in adjacent frames, whereas the objects actually remain on the same depth planes. These temporal depth errors significantly impair the visual quality of the synthesized virtual view as well as the coding efficiency of the depth sequences. Since depth sequences correspond to texture sequences, some erroneous temporal depth variations can be detected by analyzing temporal variations of the texture sequences. Utilizing this property, we propose a novel solution to enhance the temporal consistency of depth sequences by applying adaptive temporal filtering on them. Experiments demonstrate that the proposed depth filtering algorithm can effectively suppress transient depth errors and generate more stable depth sequences, resulting in notable temporal quality improvement of the synthesized views and higher coding efficiency on the depth sequences. |
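The idea of suppressing transient depth errors by analyzing the texture can be sketched as follows. This is a deliberately simplified per-pixel version: the texture-change threshold `tex_thresh` and blending weight `alpha` are hypothetical parameters, and the paper's filter adapts in a more elaborate way.

```python
def filter_depth(depth_prev, depth_cur, tex_prev, tex_cur,
                 tex_thresh=8, alpha=0.5):
    """Temporal depth filtering gated by texture variation: where the
    texture is temporally stable, a depth change is likely a transient
    estimation error, so blend toward the previous depth; where the
    texture genuinely changed, keep the newly estimated depth."""
    out = []
    for dp, dc, tp, tc in zip(depth_prev, depth_cur, tex_prev, tex_cur):
        if abs(tp - tc) < tex_thresh:            # temporally static region
            out.append(alpha * dp + (1 - alpha) * dc)
        else:                                    # genuine scene change
            out.append(dc)
    return out
```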
P3-6
Title | Real-Time Free Viewpoint Television for Embedded Systems |
Author | Davide Aliprandi, Emiliano Piccinelli (STMicroelectronics, Italy) |
Page | pp. 346 - 349 |
Keyword | Viewpoint, Depth, 3DTV |
Abstract | In this paper we describe an image-based rendering pipeline for interactive real-time Free Viewpoint Television (FTV) on embedded systems. We describe the processing steps and the optimizations implemented to target the hardware acceleration of a commercial programmable Graphics Processing Unit (GPU). As a result, real-time view synthesis at 70 fps in XGA resolution has been achieved. Restrictions and modifications introduced to support the application on OpenGL ES 2.0 based GPUs for embedded systems are also discussed. |
P3-7
Title | Power-Aware Complexity-Scalable Multiview Video Coding for Mobile Devices |
Author | Muhammad Shafique (Karlsruhe Institute of Technology, Germany), Bruno Zatt, Sergio Bampi (Federal University of Rio Grande do Sul, Brazil), Jörg Henkel (Karlsruhe Institute of Technology, Germany) |
Page | pp. 350 - 353 |
Keyword | MVC, Mobile Devices, Complexity reduction, Power-Aware, Adaptivity |
Abstract | We propose a novel power-aware scheme for complexity-scalable multiview video coding on mobile devices. Our scheme exploits asymmetric view quality, which is based on the binocular suppression theory. It employs different quality-complexity classes (QCCs) and adapts at run time depending upon the current battery state, thereby enabling a run-time tradeoff between complexity and video quality. The experimental results show that our scheme is superior to the state of the art, providing up to 87% complexity reduction while keeping the PSNR close to that of exhaustive mode decision. We demonstrate the power-aware adaptivity between different QCCs using a laptop under battery charging and discharging scenarios. |
P3-8
Title | 3D Pose Estimation in High Dimensional Search Spaces with Local Memorization |
Author | Weilan Luo, Toshihiko Yamasaki, Kiyoharu Aizawa (The University of Tokyo, Japan) |
Page | pp. 354 - 357 |
Keyword | tracking, annealing, twist, particle filter |
Abstract | In this paper, a stochastic approach for extracting articulated 3D human postures from synchronized multiple cameras in high-dimensional configuration spaces is presented. Annealed Particle Filtering (APF) seeks the globally optimal solution of the likelihood. We improve and extend APF with local memorization to estimate suitable kinematic postures for a volume sequence directly, instead of projecting a rough simplified body model onto 2D images. Our method guides the particles toward the global optimum on the basis of local constraints. A segmentation algorithm is performed on the volumetric models and the process is repeated. We assign the articulated models 42 degrees of freedom. The matching error is about 6% on average while tracking the posture between two neighboring frames. |
P3-9
Title | Free-Viewpoint Image Generation Using Different Focal Length Camera Array |
Author | Kengo Ando (Graduate School of Engineering, Nagoya University, Japan), Norishige Fukushima (Graduate School of Engineering, Nagoya Institute of Technology, Japan), Tomohiro Yendo, Mehrdad Panahpour Tehrani (Graduate School of Engineering, Nagoya University, Japan), Toshiaki Fujii (Graduate School of Engineering, Tokyo Institute of Technology, Japan), Masayuki Tanimoto (Graduate School of Engineering, Nagoya University, Japan) |
Page | pp. 358 - 361 |
Keyword | Free-viewpoint image generation, Image Based Rendering |
Abstract | The availability of multi-view images enables applications such as Free-Viewpoint TV, where virtual viewpoint images are synthesized by Image-Based Rendering. In this paper, we introduce a depth estimation method for forward virtual viewpoints and a view generation method that uses a zoom camera in our camera setup to improve the image quality at virtual viewpoints. Simulation results confirm reduced error during depth estimation with our proposed method in comparison with a conventional stereo matching scheme. We also demonstrate improved image resolution for a camera virtually moved forward. |
[Beyond H.264/MPEG-4 AVC and related topics]
P3-10
Title | Decoder-Side Hierarchical Motion Estimation for Dense Vector Fields |
Author | Sven Klomp, Marco Munderloh, Jörn Ostermann (Leibniz Universität Hannover, Germany) |
Page | pp. 362 - 365 |
Keyword | video coding, motion compensation, dense vector field, block matching |
Abstract | Recent research revealed that the data rate can be reduced by performing an additional motion estimation at the decoder. This paper presents an improved hierarchical motion estimation algorithm to be used in a decoder-side motion estimation system. A special motion vector latching is used to be more robust for very small block sizes and to better adapt to object borders. A dense motion vector field is estimated, which reduces the rate by 6.9% on average compared to H.264/AVC. |
P3-11
Title | Edge-Based Adaptive Directional Intra Prediction |
Author | Feng Zou, Oscar C. Au, Wen Yang, Chao Pang, Jingjing Dai, Xing Wen (The Hong Kong University of Science and Technology, Hong Kong), Yu Liu (Hong Kong Applied Science and Technology Research Institute, Hong Kong) |
Page | pp. 366 - 369 |
Keyword | H.264/AVC, intra prediction, edge |
Abstract | H.264/AVC employs intra prediction to reduce spatial redundancy between neighboring blocks. Different directional prediction modes are used to cater to diversified video content. Although it achieves quite high coding efficiency, it is worthwhile to analyze its drawbacks in the existing video coding standard, since doing so allows us to design better schemes. Even after intra prediction, the residue still contains a lot of edge or texture information. Unfortunately, these high-frequency components consume a large number of bits and the distortion is usually quite high. Addressing this drawback, an Edge-based Adaptive Directional Intra Prediction (EADIP) is proposed to reduce the residue energy, especially in edge regions. In particular, we establish an edge model in EADIP, which is quite flexible for natural images. Within the model, the edge splits the macroblock into two regions, each being predicted separately. In the implementation, we consider the current trend of mode selection and complexity issues. A mode extension is made on INTRA 16x16 in H.264/AVC. Experimental results show that the proposed algorithm outperforms H.264/AVC, and that the proposed mode is more likely to be chosen in low-bitrate situations. |
P3-12
Title | An Improved Low Delay Inter Frame Coding Using Template Matching Averaging |
Author | Yoshinori Suzuki, Choong Seng Boon (NTT DOCOMO, INC., Japan) |
Page | pp. 370 - 373 |
Keyword | Video coding, Prediction methods, Motion compensation |
Abstract | This paper presents an efficient inter prediction method for video coding. The method applies the idea of template matching averaging to conventional motion-compensated prediction. While one of the candidates is specified by a motion vector, the remaining candidates are obtained by template matching without using explicit motion vectors. Averaging multiple predictors reduces the coding noise residing in each of the predictors. Simulation results show that the proposed scheme improves coding efficiency by up to 4.5%. |
P3-13
Title | Generating Subject Oriented Codec by Evolutionary Approach |
Author | Masaaki Matsumura, Seishi Takamura, Hirohisa Jozawa (NTT Cyber Space Laboratories, NTT Corporation, Japan) |
Page | pp. 374 - 377 |
Keyword | Evolutive image coding, coding tools combination, subject oriented codec, lossless image coding |
Abstract | In this paper, we propose an automatic optimization method for deriving the combination of coding tools that suits categorized pictures. We prepare some categorized pictures and optimize the combination for each category. In the case of optimization for lossless image coding, our method achieves a bit-rate reduction of over 2.8% (maximum) compared to the combination, prepared beforehand, that offers the best average bit-rate. |
P3-14
Title | Improved Context Modeling for Coding Quantized Transform Coefficients in Video Compression |
Author | Tung Nguyen, Heiko Schwarz, Heiner Kirchhoffer, Detlev Marpe (Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Germany), Thomas Wiegand (Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute/Technical University of Berlin, Germany) |
Page | pp. 378 - 381 |
Keyword | context modeling, transform coding |
Abstract | Recent investigations have shown that the support of extended block sizes for motion-compensated prediction and transform coding can significantly increase the coding efficiency for high-resolution video relative to H.264/AVC. In this paper, we present a new context-modeling scheme for the coding of transform coefficient levels that is particularly suitable for transform blocks greater than 8x8. While the basic concept for transform coefficient coding is similar to CABAC, the probability model selection has been optimized for larger block transforms. The proposed context modeling is compared to a straightforward extension of the CABAC context modeling; both schemes have been implemented in a hybrid video codec design that supports block sizes of up to 128x128 samples. In our simulations, we obtained overall bit rate reductions of up to 4%, with an average of 1.7%, with the proposed context modeling scheme. |
[Image/video coding and related topics]
P3-15
Title | Bitwise Prediction Error Correction for Distributed Video Coding |
Author | Axel Becker-Lakus, Ka-Ming Leung, Zhonghua Ma (Canon Information Systems Research Australia (CiSRA), Australia) |
Page | pp. 382 - 385 |
Keyword | Distributed Video Coding, Wyner-Ziv Coding, Side Information Generation |
Abstract | Side information plays a key role in the performance of a Distributed Video Coding (DVC) system. However, the generation of side information often relies on complex motion estimation/interpolation operations. The correlation between the source data and the side information, sometimes referred to as the virtual channel, is also very difficult to model accurately. In this paper, we propose a bitwise prediction error correction method to improve the quality of the side information during Wyner-Ziv decoding. Whenever a bit error is detected in a bit plane, the less significant bits of the corresponding pixel are adjusted to match the initial prediction. The proposed method has been evaluated using a pixel-domain DVC system and delivers better coding performance with improved decoding quality and reduced bitrate. |
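The per-pixel adjustment rule can be sketched as follows. This is one plausible reading of "adjusted to match the initial prediction": after the bit planes down to position `plane` have been decoded, the remaining lower bits are chosen so the reconstructed pixel is as close as possible to the side-information prediction.

```python
def adjust_lower_bits(upper_bits, plane, prediction):
    """Given the decoded upper bits of a pixel (bit positions >= `plane`,
    with `plane` counted from the LSB = 0), fill in the less significant
    bits so the pixel value is as close as possible to `prediction`:
    clamp the prediction into the range spanned by the decoded upper bits."""
    lo = upper_bits << plane          # smallest value with these upper bits
    hi = lo + (1 << plane) - 1        # largest value with these upper bits
    return min(max(prediction, lo), hi)
```

For example, with decoded upper bits `0b1011` and four undecoded lower bits, any prediction outside the interval [176, 191] is clamped to the nearest endpoint, while a prediction inside the interval is kept exactly.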
P3-16
Title | Improved Texture Compression for S3TC |
Author | Yifei Jiang, Dandan Huan (Institute of Computing Technology, Chinese Academy of Sciences, China) |
Page | pp. 386 - 389 |
Keyword | computer graphics, texture compression, clustering algorithms |
Abstract | Texture compression is a specialized form of still image compression employed in computer graphics systems to reduce memory bandwidth consumption. Modern texture compression schemes cannot generate satisfactory quality for both the alpha channel and the color channel of texture images. We propose a novel texture compression scheme, named ImTC, based on insight into the essential difference between transparency and color. ImTC defines new data formats and compresses the two channels flexibly. While keeping the same compression ratio as the de facto standard texture compression scheme, ImTC improves the compression quality of both channels. The average PSNR of the alpha channel is improved by about 0.2 dB, and that of the color channel can be increased by 6.50 dB over a set of test images, making ImTC a better substitute for the standard scheme. |
P3-17
Title | Compression of Pre-Computed Per-Pixel Texture Features Using MDS |
Author | Wai-Man Pang (Spatial Media Group, Computer Arts Lab., University of Aizu, Japan), Hon-Cheng Wong (Faculty of Information Technology, Macau University of Science and Technology, Macau) |
Page | pp. 390 - 393 |
Keyword | Compressed texture features, Gabor wavelet transform, Multidimensional scaling, Compression |
Abstract | There are many successful experiences of employing texture analysis to improve the accuracy and robustness of image segmentation. Usually, per-pixel texture analysis is required, which involves intensive computation, especially for large images. Precomputing and storing the texture features, on the other hand, requires large file space, which is not cost effective. To address this need, we propose in this paper the use of the multidimensional scaling (MDS) technique to reduce the size of the per-pixel texture features of an image while preserving the textural discriminability needed for segmentation. Because per-pixel texture features create a very large dissimilarity matrix, solving the MDS directly is intractable; a sampling-based MDS is therefore introduced to tackle the problem with a divide-and-conquer approach. A compression ratio of 1:24 can be achieved with an average error lower than 7%. Preliminary segmentation experiments using the compressed data show satisfactory results, as good as using the uncompressed features. We foresee that such a method will enable texture features to be stored and transferred more effectively on low-processing-power devices or embedded systems such as mobile phones. |
P3-18
Title | Temporal Signal Energy Correction and Low-Complexity Encoder Feedback for Lossy Scalable Video Coding |
Author | Marijn J.H. Loomans, Cornelis J. Koeleman (VDG Security BV, Netherlands), Peter H.N. de With (Eindhoven University of Technology, Netherlands) |
Page | pp. 394 - 397 |
Keyword | Scalable Video Coding, Wavelets, Embedded systems |
Abstract | We address two problems found in embedded Scalable Video Codec implementations: the temporal signal energy distribution and frame-to-frame quality fluctuations. To solve these problems, we move the temporal energy correction to the leaves of the temporal tree, and feed the decoded first frame of the GOP back into the temporal coding chain. The first modification saves on required memory size, bandwidth and computations while reducing floating/fixed-point conversion errors; the second is achieved without entropy decoding and with an unmodified decoder. |
P3-19
Title | Improving Colorization-Based Coding by Using Local Correlation between Luminance and Chrominance in Texture Component |
Author | Yoshitaka Inoue, Takamichi Miyata, Yoshinori Sakai (Tokyo Institute of Technology, Japan) |
Page | pp. 398 - 401 |
Keyword | image coding, colorization, total variation, correlation between luminance and chrominance |
Abstract | Recently, a novel approach to color image compression based on colorization has been presented. Although the conventional method of colorization-based coding outperforms JPEG in terms of subjective quality, the decoded chrominance components lose the local oscillation that the original images had. We focus on the local correlation that exists between luminance and chrominance in separated texture components, and we present a new colorization-based coding method. Experimental results showed that our coding method can improve the coding efficiency. |
P3-20
Title | Video Encoding with the Original Picture as the Reference Picture |
Author | Taiga Muromoto, Naoya Sagara, Kenji Sugiyama (Seikei University, Japan) |
Page | pp. 402 - 405 |
Keyword | Inter-picture prediction, Reference picture, Quantization error, Group of picture |
Abstract | Inter-picture prediction uses a locally decoded picture as the reference in order to avoid a mismatch between encoding and decoding. However, this scheme does not necessarily result in optimal coding efficiency, since the decoded reference carries quantization error from encoding. Therefore, we study the use of the original picture as the reference. In this case, although the mismatch degrades picture quality, the bit amount is reduced. We therefore propose an adaptive method based on rate-distortion optimization: the original picture is used for a macroblock only if its cost is lower than that of using the locally decoded picture. Experimental results show a 0.1 to 1.0 dB gain in PSNR for each sequence. |
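The per-macroblock adaptive choice can be sketched as below. The SAD cost is a hypothetical stand-in; the paper uses a rate-distortion cost, and the chosen flag would be signaled per macroblock.

```python
def choose_reference(mb, ref_original, ref_decoded, cost):
    """Pick, per macroblock, the reference (original or locally decoded
    picture) with the lower cost; the choice is signaled with one flag."""
    c_orig = cost(mb, ref_original)
    c_dec = cost(mb, ref_decoded)
    use_original = c_orig < c_dec
    return ("original" if use_original else "decoded",
            ref_original if use_original else ref_decoded)

def sad_cost(mb, ref):
    """Sum of absolute differences: a simple stand-in for the RD cost."""
    return sum(abs(a - b) for a, b in zip(mb, ref))
```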
P3-21
Title | A New Hybrid Parallel Intra Coding Method Based on Interpolative Prediction |
Author | Cui Wang (Tokyo Institute of Technology, Japan), Akira Kubota (Chuo University, Japan), Yoshinori Hatori (Tokyo Institute of Technology, Japan) |
Page | pp. 406 - 409 |
Keyword | hybrid parallel coding, interpolative prediction, new shape of block |
Abstract | Hybrid coding, which combines predictive coding with an orthogonal transformation and quantization, is the mainstream approach today. This paper proposes a new hybrid parallel intra coding method based on interpolative prediction, which uses correlations between neighboring pixels. For high performance and parallelism, an optimal quantizing scheme and a new block shape are used. Experimental results show that the proposed technique achieves a 1-4 dB improvement in luminance PSNR, especially for images with more detail. |
P3-22
Title | RBF-Based VBR Controller for Real-Time H.264/SVC Video Coding |
Author | Sergio Sanz-Rodríguez, Fernando Díaz-de-María (Carlos III University of Madrid, Spain) |
Page | pp. 410 - 413 |
Keyword | Rate Control, Variable Bit Rate, Scalable Video Coding, H.264/SVC, streaming |
Abstract | In this paper we propose a novel VBR controller for real-time H.264/SVC video coding. Since consecutive pictures within the same scene often exhibit similar degrees of complexity, the proposed VBR controller allows only an incremental variation of QP with respect to that of the previous picture, thus preventing unnecessary QP fluctuations. For this purpose, an RBF network has been carefully designed to estimate the QP increment at each dependency (spatial or CGS) layer. A mobile live streaming application scenario was simulated to assess the performance of the proposed VBR controller, which was compared to a recently proposed CBR controller for H.264/SVC. The experimental results show remarkably consistent quality, notably outperforming the reference CBR controller. |
P3-23
Title | Scalable Video Compression Framework with Adaptive Multiresolution Directional Filter Bank Design |
Author | Lingchen Zhu, Hongkai Xiong (Shanghai Jiao Tong University, China) |
Page | pp. 414 - 417 |
Keyword | Scalable video coding, directional filter banks, multiscale geometric, sparse coding |
Abstract | Regarding orientation resolution as a variable isolated from scale, this paper introduces a dual (scale and orientation) multiresolution transform into a scalable video coding (SVC) framework. By projecting 2D signals (textures and edges) onto nonuniformly divided orientation subspaces, the dual multiresolution SVC (DMSVC) can capture 2-D curve smoothness with fewer coefficients and provide more flexible spatial decomposition structures than traditional wavelet-based SVC (WSVC). In the spatial decomposition module of DMSVC, the nonuniform directional distribution along scale of each frame is detected by phase congruency in the overcomplete wavelet domain. The corresponding orientational multiresolution is achieved by nonuniform directional filter banks (NUDFB), realized via a non-symmetric binary tree (NSBT) structured frequency division. The wavelet basis function in each scale is converted to an adaptive set of nonuniform directional bases by employing nonuniform directional filter banks. Experimental results validate superior coding performance and visual quality over WSVC, especially on sequences full of directional edges and textures. |
P3-24
Title | A Four-Description MDC for High Loss-Rate Channels |
Author | Meilin Yang, Mary Comer, Edward J. Delp (School of Electrical and Computer Engineering, Purdue University, U.S.A.) |
Page | pp. 418 - 421 |
Keyword | MDC, four-description MDC, high packet loss rate, Gilbert model |
Abstract | One of the most difficult problems in video transmission is communication over error-prone channels, especially when retransmission is unacceptable. To address this problem, Multiple Description Coding (MDC) has been proposed as an effective solution due to its robust error resilience. Considering applications in scalable, multicast and P2P environments, it is advantageous to use more than two descriptions (designated multi-description MDC in this paper). In this paper, we present a new four-description MDC for high loss-rate channels using a hybrid structure of temporal and spatial correlations. A Gilbert model is used as the channel model for burst packet loss simulation. Experimental results demonstrate the efficacy of the proposed method. |
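A hybrid temporal/spatial split into four descriptions can be sketched as follows. The abstract does not specify the exact partition, so this sketch uses even/odd frames (temporal) crossed with even/odd pixel columns (spatial) purely as an illustration of the general idea.

```python
def four_descriptions(frames):
    """Split a video (list of frames; each frame a list of pixel rows)
    into four descriptions by a hybrid temporal/spatial partition:
    even/odd frames x even/odd pixel columns. Losing any one
    description leaves correlated data in the other three from which
    the missing samples can be estimated."""
    desc = {0: [], 1: [], 2: [], 3: []}
    for t, frame in enumerate(frames):
        even_cols = [row[0::2] for row in frame]
        odd_cols = [row[1::2] for row in frame]
        if t % 2 == 0:
            desc[0].append(even_cols)
            desc[1].append(odd_cols)
        else:
            desc[2].append(even_cols)
            desc[3].append(odd_cols)
    return desc
```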
P3-25
Title | Bi-Directional Optical Flow for Improving Motion Compensation |
Author | Alexander Alshin, Elena Alshina, Tammy Lee (Samsung Electronics Co., Ltd., Republic of Korea) |
Page | pp. 422 - 425 |
Keyword | bi-directional prediction, optical flow |
Abstract | A new method improving B-slice prediction is proposed. By combining the optical flow concept with high-accuracy gradient evaluation, we construct an algorithm that allows pixel-wise refinement of motion. This approach does not require any signaling to the decoder. According to tests with WQVGA sequences, bit savings of 2%-6% can be achieved using this tool. |
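The pixel-wise refinement can be sketched with a standard bi-directional optical-flow correction term. This is a 1-D sketch under common optical-flow assumptions; the per-pixel displacement estimator `vx` and the gradient filters in the paper may differ in detail.

```python
def grad(sig, i):
    """Central-difference spatial gradient with edge replication."""
    n = len(sig)
    return (sig[min(i + 1, n - 1)] - sig[max(i - 1, 0)]) / 2.0

def bio_predict(i0, i1):
    """Bi-directional prediction with pixel-wise optical-flow refinement:
    the average of the two reference blocks plus a gradient correction
    driven by a per-pixel displacement estimate vx. No side information
    is needed, since vx is derived from the decoded references."""
    pred = []
    for i in range(len(i0)):
        g0, g1 = grad(i0, i), grad(i1, i)
        denom = (g0 + g1) ** 2 + 1e-6        # guard against flat regions
        vx = (i1[i] - i0[i]) * (g0 + g1) / denom
        pred.append((i0[i] + i1[i]) / 2.0 + vx * (g0 - g1) / 2.0)
    return pred
```

When the two references are identical, `vx` is zero and the prediction degenerates to the plain bi-directional average, which matches the intuition that the refinement only acts where the references disagree.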
[Image/video processing and related topics]
P3-26
Title | Two-Dimensional Chebyshev Polynomials for Image Fusion |
Author | Zaid Omar, Nikolaos Mitianoudis, Tania Stathaki (Imperial College London, U.K.) |
Page | pp. 426 - 429 |
Keyword | Image and data fusion, Chebyshev polynomials, orthogonal moments |
Abstract | This report documents in detail the research carried out by the author throughout his first year. The paper presents a novel method for fusing images in a domain concerning multiple sensors and modalities. Using Chebyshev polynomials as basis functions, the image is decomposed to perform fusion at feature level. Results show favourable performance compared to previous efforts on image fusion, namely ICA and DT-CWT, in noise affected images. The work presented here aims at providing a novel framework for future studies in image analysis and may introduce innovations in the fields of surveillance, medical imaging and remote sensing. |
P3-27
Title | Image Denoising with Hard Color-Shrinkage and Grouplet Transform |
Author | Takahiro Saito, Ken-ichi Ishikawa, Yasutaka Ueda, Takashi Komatsu (Kanagawa University, Japan) |
Page | pp. 430 - 433 |
Keyword | Color-image processing, denoising, wavelet transform, grouplet transform, shrinkage |
Abstract | To remove signal-dependent noise of a digital color camera, we propose a denoising method with our hard color-shrinkage in the tight-frame grouplet transform domain. The classic hard-shrinkage works well for monochrome-image denoising. To utilize inter-channel color cross-correlations, a noisy image undergoes the color transformation from the RGB to the luminance-and-chrominance color space, and the luminance and the chrominance components are separately denoised; but this approach cannot cope with actual signal-dependent noise. To utilize the noise’s signal-dependencies, we construct the hard color-shrinkage where the inter-channel color cross-correlations are directly utilized in the RGB color space. The hard color-shrinkage alleviates denoising artifacts, and improves picture quality of denoised images. |
P3-28
Title | Improved FMO Based H.264 Frame Layer Rate Control for Low Bit Rate Video Transmission |
Author | Rhandley Domingo Cajote (University of the Philippines, Diliman, Philippines), Supavadee Aramvith (Chulalongkorn University, Thailand) |
Page | pp. 434 - 437 |
Keyword | FMO, Rate Control, H.264/AVC, video coding |
Abstract | The use of Flexible Macroblock Ordering (FMO) in H.264/AVC as an error-resilience tool incurs extra overhead bits that reduce coding efficiency at low bit rates. To improve coding efficiency, we present an improved frame-layer H.264/AVC rate control that takes into consideration the effects of using FMO for video transmission. In this paper, we propose a new header-bits model, an enhanced frame complexity measure and a quantization parameter (QP) adjustment scheme. Simulation results show that the proposed method performs better than the existing frame-layer rate control with FMO enabled, for different numbers of slice groups. |
P3-29
Title | Improvement of Spatial Resolution by Integration of High-Speed Sub-Frame Images |
Author | Daisuke Kashiwagura, Kanae Matsuzaki, Takayuki Hamamoto (Tokyo University of Science, Japan) |
Page | pp. 438 - 441 |
Keyword | super resolution, sub-frame image, high frame rate, motion estimation |
Abstract | The super-resolution technique based on the integration of successive frames depends on the accuracy of the motion estimation. However, it requires a large amount of computation and tends to produce estimation errors for some images. In this paper, we propose a super-resolution method whose motion estimation is based on block matching using high-speed sub-frame images. |
P3-30
Title | Improved Autoregressive Image Model Estimation for Directional Image Interpolation |
Author | Ruiqin Xiong (Peking University, China), Wenpeng Ding (Beijing University of Technology, China), Siwei Ma, Wen Gao (Peking University, China) |
Page | pp. 442 - 445 |
Keyword | image interpolation, model estimation, autoregressive model, regularization |
Abstract | For image interpolation algorithms employing autoregressive models, a mechanism is required to estimate the model parameters piecewise and accurately so that local image structures can be exploited efficiently. This paper proposes a new strategy for better estimating the model. Different from conventional schemes, which build the model solely upon the covariance matrix of the low-resolution image, the proposed strategy utilizes the covariance matrix of the high-resolution image itself, with missing pixels properly initialized. To make the estimation robust, we adopt a general solution which exploits the covariance matrices of both scales. Experimental results demonstrate that the proposed strategy improves model estimation and interpolation performance remarkably. |
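The piecewise AR estimation described above can be illustrated with a small least-squares sketch. The 4-tap diagonal-neighbour model, the window size and the helper names (`estimate_ar_params`, `predict_pixel`) are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def estimate_ar_params(patch):
    """Least-squares fit of a 4-tap autoregressive model in which each
    interior pixel is predicted from its four diagonal neighbours
    (illustrative of piecewise AR estimation over a local window)."""
    h, w = patch.shape
    rows, targets = [], []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            rows.append([patch[y - 1, x - 1], patch[y - 1, x + 1],
                         patch[y + 1, x - 1], patch[y + 1, x + 1]])
            targets.append(patch[y, x])
    coeffs, *_ = np.linalg.lstsq(np.asarray(rows, dtype=float),
                                 np.asarray(targets, dtype=float),
                                 rcond=None)
    return coeffs

def predict_pixel(neighbours, coeffs):
    """Interpolate a missing pixel as a weighted sum of its four
    diagonal neighbours using the locally estimated coefficients."""
    return float(np.dot(coeffs, neighbours))
```

On a locally linear patch the fitted weights reproduce the ramp exactly; in the paper's setting, the fit would instead run window-by-window on the initialized high-resolution image.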
[Quality, system, applications, and other topics]
P3-31
Title | Subjective Evaluation of Hierarchical B-Frames Using Video-MUSHRA |
Author | Hussain Mohammed, Nikolaus Färber, Jens Garbas (Fraunhofer IIS, Germany) |
Page | pp. 446 - 449 |
Keyword | H.264/AVC, Hierarchical B-Frames, Subjective Quality, MUSHRA |
Abstract | Hierarchical B-Frames (HBF) have emerged as an efficient video coding tool in recent years. As shown in the literature, this approach results in excellent PSNR gains of >1 dB. However, these PSNR gains have not been sufficiently assessed in a scientific manner by subjective tests. Hence, in this paper, we evaluate the HBF coding pattern subjectively using the MUSHRA test methodology. While MUSHRA is well established in audio coding research, its application to video is a novelty of this paper. We compare HBF with a simple IPP coding pattern at either the same PSNR or the same bit rate. Our results indicate that HBF gains are clearly subjectively perceptible. Hence, it can be shown that PSNR gains also correlate with a subjective gain. Interestingly, even at the same PSNR, HBF is found to be subjectively superior to simple IPP coding. |
P3-32
Title | Intra Prediction Architecture for H.264/AVC QFHD Encoder |
Author | Gang He, Dajiang Zhou, Jinjia Zhou, Satoshi Goto (Waseda University, Japan) |
Page | pp. 450 - 453 |
Keyword | H.264, intra prediction, hardware architecture |
Abstract | This paper proposes a high-performance intra prediction architecture that can support the H.264/AVC high profile. The proposed MB/block co-reordering avoids data dependency and improves pipeline utilization. Therefore, the timing constraint of real-time 4kx2k encoding can be met with negligible quality loss. The 16x16 and 8x8 prediction engines work in parallel for prediction and coefficient generation. A reordering interlaced reconstruction is also designed for a fully pipelined architecture. It takes only 160 cycles to process one macroblock (MB). Hardware utilization of the prediction and reconstruction modules is almost 100%. Furthermore, a PE-reusable 8x8 intra predictor and hybrid SAD & SATD mode decision are proposed to save hardware cost. The design is implemented in 90nm CMOS technology with 113.2k gates and can encode 4kx2k video sequences at 60 fps at an operating frequency of 310MHz. |
P3-33
Title | Compressed Signature for Video Identification |
Author | Nikola Sprljan, Paul Brasnett, Stavros Paschalakis (Mitsubishi Electric R&D Centre Europe, U.K.) |
Page | pp. 454 - 457 |
Keyword | video descriptor, lossless compression |
Abstract | This paper presents a new application-specific lossless compression scheme developed for video identification descriptors, also known as video fingerprints or signatures. In designing such a descriptor, one usually has to balance the descriptor size against discriminating power and temporal localisation performance. The proposed compression scheme alleviates this problem by efficiently exploiting the temporal redundancies present in the video fingerprint, allowing highly accurate fingerprints which also entail low transmission and storage costs. In this paper we provide a detailed description of our compression scheme and a comparative evaluation against well known state-of-the-art generic compression tools. |
P3-34
Title | A Subjective Image Quality Metric for Bit-Inversion-Based Watermarking |
Author | Tadahiko Kimoto, Fumihiko Kosaka (Toyo University, Japan) |
Page | pp. 458 - 461 |
Keyword | image watermark, subjective quality, perceptually adaptive system |
Abstract | An image watermarking scheme using the previously proposed bit embedding method is developed. Based on the properties of the bit embedding method, the perceptual model of two kinds of objective quality measures is assumed. Then, the measurements of human subjective image quality are analyzed from the viewpoint of the correlation with these two measures. Thereby, the estimating function that can yield an estimate of the subjective quality from two objective measurements is determined. By using the estimating function, perceptually adaptive watermarking can be achieved. |
Session T2 Tutorial Session 2
Time: 11:15 - 12:00 Friday, December 10, 2010
Chair: Takayuki Hamamoto (Tokyo University of Science, Japan)
T2-1
(Time: 11:15 - 12:00)
Title | (Tutorial) Quality Assessment for Image Compression Purpose |
Author | Chaker Larabi (University of Poitiers, France) |
Abstract | In recent years, image quality assessment has become a very hot research topic, especially for image and video compression, first because of the wide availability of multimedia applications and content, and then because many scientists and engineers need to select algorithms and tools. This tutorial is designed to cover several aspects of the field of image quality assessment. After a brief introduction to the needs of quality assessment for multimedia applications, a review of the main approaches will be given, describing the metric categories and the subjective paradigms. At this point, it is important to distinguish between full-reference, reduced-reference and no-reference metrics, and also to understand the difference between image and video quality assessment. The focus of this course will be on answering the question: which quality procedure for which application and which content? Another important topic is how to measure the performance of a given metric for a given application. Several practical examples will help attendees better handle the quality assessment problem. |
Session S2 Special Session 2: Beyond H.264/MPEG-4 AVC
Time: 13:15 - 15:15 Friday, December 10, 2010
Chair: Seishi Takamura (NTT Corporation, Japan)
S2-1
(Time: 13:15 - 13:45)
Title | Recent Advances in Video Coding Using Static Background Models |
Author | Andreas Krutz, Alexander Glantz, Thomas Sikora (Technische Universität Berlin, Germany) |
Page | pp. 462 - 465 |
Keyword | Video Coding, Model-based Video Coding, H.264/AVC |
Abstract | Sprite coding, as standardized in MPEG-4 Visual, can result in superior performance compared to common hybrid video codecs both objectively and subjectively. However, state-of-the-art video coding standard H.264/AVC clearly outperforms MPEG-4 Visual sprite coding in broad bit rate ranges. Based on the sprite coding idea, this paper proposes a video coding technique that merges the advantages of H.264/AVC and sprite coding. For that, sophisticated algorithms for global motion estimation, sprite generation and object segmentation – all needed for thorough sprite coding – are incorporated into an H.264/AVC coding environment. The proposed approach outperforms H.264/AVC especially in lower bit rate ranges. Savings up to 21% can be achieved. |
S2-2
(Time: 13:45 - 14:15)
Title | Novel Video Coding Paradigm with Reduction/Restoration Processes |
Author | Toshie Misu, Yasutaka Matsuo, Shinichi Sakaida, Yoshiaki Shishikui, Eisuke Nakasu (Science & Technology Research Laboratories, NHK, Japan) |
Page | pp. 466 - 469 |
Keyword | video coding, super-resolution, image reduction, image restoration, nonuniform sampling |
Abstract | To optimally design distortions in lossy video coding, we propose a novel coding paradigm with adaptive nonlinear transforms as pre-/post-processors of a conventional video codec. The preprocessor decimates less important pixels based on an image analysis. A conventional video encoder such as MPEG-4 AVC/H.264 further eliminates the redundancy of the decimated images. On the decoder side, the postprocessor restores the small decoded images to the original resolution using an inverse mapping that includes a super-resolution technique exploiting a priori knowledge of the decimation performed in preprocessing. Experimental results showed that the proposed coding scheme produces distortion with a more straightforward appearance than that of an image directly encoded and decoded by a conventional H.264 codec alone. |
S2-3
(Time: 14:15 - 14:45)
Title | Towards Efficient Intra Prediction Based on Image Inpainting Methods |
Author | Dimitar Doshkov, Patrick Ndjiki-Nya, Haricharan Lakshman, Martin Koeppel, Thomas Wiegand (Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Germany) |
Page | pp. 470 - 473 |
Keyword | Intra prediction, Texture synthesis, PDEs, Template matching, Inpainting |
Abstract | In this paper, novel intra prediction methods based on image inpainting approaches are proposed. The H.264/AVC intra prediction modes are not well suited for processing complex textures at low bit rates. Our algorithm utilizes an efficient combination of partial differential equations (PDEs) and patch-based texture synthesis in addition to the standard directional predictors. Bit rate savings of up to 3.5% compared to the H.264/AVC standard are shown. |
S2-4
(Time: 14:45 - 15:15)
Title | Low Complexity Video Coding and the Emerging HEVC Standard |
Author | Kemal Ugur (Nokia Research Center, Finland), Kenneth Andersson (Ericsson Research, Sweden), Arild Fuldseth, Gisle Bjontegaard, Lars Peter Enderssen (Tandberg Telecom (Cisco), Norway), Jani Lainema, Antti Hallapuro, Dmytro Rusanovskyy, Cixun Zhang (Nokia Research Center, Finland), Andrey Norkin, Clinton Priddle, Thomas Rusert, Jonatan Samuelsson, Rickard Sjoberg, Zhuangfei Wu (Ericsson Research, Sweden), Justin Ridge (Nokia, U.S.A.) |
Page | pp. 474 - 477 |
Keyword | HEVC, standardization, video coding, H.264/AVC |
Abstract | This paper describes a low complexity video codec with high coding efficiency. It was proposed to the High Efficiency Video Coding (HEVC) standardization effort of MPEG and VCEG, and has been partially adopted into the initial HEVC Test Model under Consideration design. The proposal utilizes a quad-tree structure with support for large macroblocks of size 64x64 and 32x32, in addition to macroblocks of size 16x16. The entropy coding is done using a low complexity variable length coding based scheme with improved context adaptation over the H.264/AVC design. In addition, the proposal includes improved interpolation and deblocking filters, giving better coding efficiency while having low complexity. Finally, an improved intra coding method is presented. The subjective quality of the proposal is evaluated extensively, and the results show that the proposed method achieves similar visual quality to H.264/AVC High Profile anchors with around 50% and 35% bit rate reduction for low-delay and random-access experiments, respectively, on high-definition sequences. This is achieved with less complexity than H.264/AVC Baseline Profile, making the proposal especially suitable for resource-constrained environments. |
Session D2 Panel Discussion 2: Beyond H.264/MPEG-4 AVC
Time: 15:15 - 16:15 Friday, December 10, 2010
Chair: Kohtaro Asai (Mitsubishi Electric Corporation, Japan)
D2-1
(Time: 15:15 - 16:15)
Title | (Panel Discussion) Beyond H.264/MPEG-4 AVC |
Author | Chair: Kohtaro Asai (Mitsubishi Electric Corporation, Japan), Panelists: Thomas Wiegand (Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Germany), Kemal Ugur (Nokia Research Center, Finland), Woo-Jin Han (Samsung Electronics, Republic of Korea), Andrew Segall (Sharp Labs of America, U.S.A.), Teruhiko Suzuki (Sony, Japan) |
Session P4 Poster Session 4
Time: 16:30 - 18:00 Friday, December 10, 2010
Chair: Shinichi Sakaida (Science & Technology Research Laboratories, NHK, Japan)
[3DTV/FTV/multi-view-related topics]
P4-1
Title | On-Line Statistical Analysis Based Fast Mode Decision for Multi-View Video Coding |
Author | Chia-Chi Chan (Dept. Communication Engineering, National Central University, Taiwan), Jheng-Ping Lin (ZyXEL Corp., Taiwan), Chih-Wei Tang (Dept. Communication Engineering, National Central University, Taiwan) |
Page | pp. 478 - 481 |
Keyword | Multi-view video coding, fast mode decision, statistical analysis, RD cost, motion and disparity estimation |
Abstract | The high computational complexity of multi-view video codecs makes speed-up necessary for their realization in consumer electronics. Since fast encoding algorithms are expected to adapt to different video sequences, this paper proposes a fast algorithm that consists of fast mode decision and fast disparity estimation for multi-view video coding. The fast mode decision algorithm applies to both temporal and inter-view predictions. The candidates for mode decision are reduced based on a set of thresholds. Unlike previous fast mode decision algorithms for MVC, this scheme determines the thresholds according to an on-line statistical analysis of the motion and disparity costs of the first GOP in each view. Since the inter-view prediction is time consuming, we also propose a fast disparity estimation algorithm to save encoding time. Experimental results show that the proposed scheme reduces the computational complexity significantly with negligible degradation of coding efficiency. |
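A threshold rule derived on-line from first-GOP statistics might look like the sketch below. The mean-plus-one-deviation rule and both function names are hypothetical stand-ins for the paper's statistical analysis.

```python
import statistics

def derive_threshold(first_gop_costs, k=1.0):
    """Derive an early-termination threshold on-line from the RD costs
    observed in the first GOP of a view (hypothetical mean + k*std rule)."""
    return statistics.mean(first_gop_costs) + k * statistics.pstdev(first_gop_costs)

def candidate_modes(modes, quick_costs, threshold):
    """Keep only modes whose quick cost estimate is below the threshold;
    full RD optimisation then runs on this reduced candidate set."""
    kept = [m for m, c in zip(modes, quick_costs) if c <= threshold]
    # never return an empty set: fall back to the cheapest mode
    return kept or [min(zip(modes, quick_costs), key=lambda t: t[1])[0]]
```

Because the threshold is recomputed per view from observed costs, the candidate reduction adapts to the sequence rather than relying on fixed offline thresholds.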
P4-2
Title | Optimal Rate Allocation for View Synthesis along a Continuous Viewpoint Location in Multiview Imaging |
Author | Vladan Velisavljevic (Deutsche Telekom Laboratories, Germany), Gene Cheung (National Institute of Informatics, Japan), Jacob Chakareski (Ecole Polytechnique Federale de Lausanne, Switzerland) |
Page | pp. 482 - 485 |
Keyword | Multi-view imaging, Rate allocation |
Abstract | We consider the scenario of view synthesis via depth-image based rendering in multi-view imaging. We formulate a resource allocation problem of jointly assigning an optimal number of bits to compressed texture and depth images such that the maximum distortion of a synthesized view over a continuum of viewpoints between two encoded reference views is minimized, for a given bit budget. We construct simple yet accurate image models that characterize the pixel values at similar depths as first-order Gaussian auto-regressive processes. Based on our models, we derive an optimization procedure that numerically solves the formulated min-max problem using Lagrange relaxation. Through simulations we show that, for two captured views scenario, our optimization provides a significant gain (up to 2dB) in quality of the synthesized views for the same overall bit rate over a heuristic quantization that selects only two quantizers - one for the encoded texture images and the other for the depth images. |
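The min-max bit allocation between texture and depth can be illustrated on toy R-D tables. The paper solves this problem with Lagrange relaxation over its Gaussian AR models; the sketch below simply enumerates a tiny search space, and the linear synthesized-distortion model is invented for illustration.

```python
import itertools

def minmax_allocation(texture_rd, depth_rd, budget, synth_distortion):
    """Pick one (rate, distortion) operating point for the texture and
    one for the depth component so that the worst-case synthesized-view
    distortion is minimised under a total bit budget.  Plain enumeration
    is used here only because the toy tables are tiny."""
    best = None
    for (rt, dt), (rr, dd) in itertools.product(texture_rd, depth_rd):
        if rt + rr > budget:
            continue  # allocation exceeds the bit budget
        worst = synth_distortion(dt, dd)  # models max distortion over viewpoints
        if best is None or worst < best[0]:
            best = (worst, rt, rr)
    return best
```

The key point mirrored from the abstract: texture and depth quantizers are chosen jointly, since both distortions propagate into every synthesized viewpoint between the reference views.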
P4-3
Title | Panoramic Scene Generation from Multi-view Images with Close Foreground Objects |
Author | Soon-Young Lee (Seoul National University, Republic of Korea), Jae-Young Sim (Ulsan National Institute of Science and Technology, Republic of Korea), Chang-Su Kim (Korea University, Republic of Korea), Sang-Uk Lee (Seoul National University, Republic of Korea) |
Page | pp. 486 - 489 |
Keyword | 3D scene representation, Image panorama, view expansion, multi-view image processing |
Abstract | An algorithm to generate a panorama from multi-view images, which contain foreground objects with varying depths, is proposed in this work. The proposed algorithm constructs a foreground panorama and a background panorama separately, and then merges them into a complete panorama. First, the foreground panorama is obtained by finding the translational displacements of objects between source images. Second, the background panorama is initialized using warped source images and then optimized to preserve spatial consistency and satisfy visual constraints. Then, the background panorama is extended by inserting seams and merged with the foreground panorama. Experimental results demonstrate that the proposed algorithm provides visually satisfying panoramas with all meaningful foreground objects, but without severe artifacts in the backgrounds. |
P4-4
Title | A Sub-Pixel Virtual View Synthesis Method for Multiple View Synthesis |
Author | Xin Tong, Ping Yang (Tsinghua University, China), Xiaozhen Zheng, Jianhua Zheng (Hisilicon Technologies Co. Ltd., China), Yun He (Tsinghua University, China) |
Page | pp. 490 - 493 |
Keyword | SPVVS, virtual view synthesis |
Abstract | A sub-pixel virtual view synthesis (SPVVS) method is proposed in this paper. In the proposed method, by increasing the sampling rate of the target virtual view, the sub-pixel information of corresponding pixel positions among multiple input views with sub-pixel displacement can be utilized. A directional adaptive image interpolation is applied to generate a high-resolution intermediate image, which is then down-sampled to obtain the target synthesized virtual view. The realization procedure of SPVVS is also presented. Experimental results show significant improvements in subjective quality compared to the traditional integer-pixel synthesis method. Artifacts such as ‘hat’ effects can be significantly reduced. |
P4-5
Title | Improving the Visual Quality of AVC/H.264 by Combining It with Content Adaptive Depth Map Compression |
Author | Christian Keimel, Klaus Diepold (TU Muenchen, Germany), Michel Sarkis (Sony Deutschland GmbH, Germany) |
Page | pp. 494 - 497 |
Keyword | AVC/H.264, 3D scene analysis, 3DTV, Depth map compression, content adaptive meshing |
Abstract | The future of video coding for 3DTV lies in the combination of depth maps and corresponding textures. Most current video coding standards, however, are only optimized for visual quality and are not able to efficiently compress depth maps. We present in this work a content adaptive depth map meshing with tritree and entropy encoding for 3D videos. We show that this approach outperforms the intra frame prediction of AVC/H.264 for the coding of depth maps of still images. We also demonstrate, by combining AVC/H.264 with our algorithm, that we are able to increase the visual quality of the encoded texture on average by 6 dB. This work is currently limited to still images, but an extension to intra coding of 3D video is straightforward. |
P4-6
Title | Error Concealment for MVC and 3D Video Coding |
Author | Olgierd Stankiewicz, Krzysztof Wegner, Marek Domański (Poznan University of Technology, Poland) |
Page | pp. 498 - 501 |
Keyword | Error concealment, MVC, 3D video, depth maps, cross-checking |
Abstract | In this paper we propose a novel approach to error concealment that can be applied to MVC and other 3D video coding technologies. The image content that is lost due to errors is recovered using multiple error-concealment techniques. In our work we have used three techniques: well-known temporal- and intra-based techniques and a novel inter-view technique. The proposed inter-view recovery employs Depth Image Based Rendering (DIBR), which requires neighboring views and corresponding depth maps. Those depth maps can be delivered in the bit-stream or estimated at the receiver. In order to obtain the final reconstruction, the best technique is selected locally. For that, an original recovery quality measurement method, based on cross-checking, has been proposed. The idea has been implemented and assessed experimentally using 3D video test sequences. The objective and subjective results show that the proposed approach provides good quality of the reconstructed video. |
P4-7
Title | Difference Detection Based Early Mode Termination for Depth Map Coding in MVC |
Author | Minghui Wang, Xin Jin, Satoshi Goto (Waseda University, Japan) |
Page | pp. 502 - 505 |
Keyword | MVC, mode decision, depth map, difference detection |
Abstract | Depth map coding is a new topic in multiview video coding (MVC) following the development of depth-image-based rendering (DIBR). Since a depth map is monochromatic and has less texture than the color map, a fast algorithm is necessary and possible to reduce the computational burden of the encoder. This paper proposes a difference-detection-based early mode termination strategy. The difference detection (DD) algorithms are categorized into reconstructed-frame-based (RDD) and original-frame-based (ODD) ones, and a simplified ODD (sODD) is also proposed. Early mode termination based on these three DD algorithms is implemented and evaluated in the JMVC 8.0 reference software. Simulation results indicate that the RDD-based scheme has no performance loss and reduces runtime by 25% on average, while the ODD- and sODD-based schemes save 54.3% and 43.6% of runtime respectively with acceptable R-D performance loss. |
P4-8
Title | Fast Stereo Matching with Predictive Search Range |
Author | Yu-Cheng Tseng, Po-Hsiung Hsu, Tian-Sheuan Chang (Dept. of Electronics Engineering & Institute of Electronics, National Chiao Tung University, Taiwan) |
Page | pp. 506 - 509 |
Keyword | disparity estimation, stereo matching |
Abstract | Local stereo matching can deliver accurate disparity maps with methods such as adaptive support-weight, but suffers from high computational complexity, O(NL), where N is the pixel count in the spatial domain and L is the search range in the disparity domain. This paper proposes a fast algorithm that groups similar pixels into super-pixels for spatial reduction, and predicts their search range by simple matching for disparity reduction. The proposed algorithm can be directly applied to other local stereo matching methods, reducing their computational complexity to only 8.2%-17.4% with a slight 1.5%-3.2% accuracy degradation. |
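The two reductions named in the abstract (grouping pixels, then narrowing the disparity search) might be sketched like this. Fixed blocks stand in for the paper's super-pixels, and the SAD matching and ±2 margin are illustrative choices.

```python
import numpy as np

def block_predicted_ranges(left, right, block=8, max_disp=32, margin=2):
    """Coarse SAD matching on pixel groups (fixed blocks here, standing
    in for super-pixels) predicts a narrow per-block disparity range;
    per-pixel matching then only searches inside it, cutting the
    O(N*L) cost of local stereo matching."""
    h, w = left.shape
    ranges = {}
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            patch = left[by:by + block, bx:bx + block]
            best_d, best_cost = 0, np.inf
            for d in range(min(max_disp, bx) + 1):   # stay inside the image
                cand = right[by:by + block, bx - d:bx - d + block]
                cost = np.abs(patch - cand).sum()    # SAD over the block
                if cost < best_cost:
                    best_d, best_cost = d, cost
            # predicted per-pixel search range around the block disparity
            ranges[(by, bx)] = (max(best_d - margin, 0), best_d + margin)
    return ranges
```

With a range of width 2*margin+1 instead of L, the per-pixel stage touches only a small fraction of the disparity candidates, which is the source of the complexity reduction the abstract reports.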
P4-9
Title | Influences of Frame Delay and Packet Loss between Left and Right Frames in Stereoscopic Video Communications |
Author | Shuliang Lin, Yuichiro Sawa, Norishige Fukushima, Yutaka Ishibashi (Nagoya Institute of Technology, Japan) |
Page | pp. 510 - 513 |
Keyword | stereo video, network delay, packet loss, subjective assessment, inter-media synchronization |
Abstract | This paper analyzes the influences of frame delay and packet loss on stereoscopic vision when stereoscopic video is transferred over an IP network. We employ live-action videos, which are transferred to a head-mounted display (HMD), and assess stereoscopic perception. As a result, we found that the speed and movement direction of the object of attention play a great role in the deterioration when frame delay and packet loss occur. |
[Beyond H.264/MPEG-4 AVC and related topics]
P4-10
Title | A Novel Inloop Filter for Video-Compression Based on Temporal Pixel Trajectories |
Author | Marko Esche, Andreas Krutz, Alexander Glantz, Thomas Sikora (Technische Universität Berlin, Germany) |
Page | pp. 514 - 517 |
Keyword | video-compression, temporal inloop filter, pixel trajectories, deblocking |
Abstract | The objective of this work is to investigate the performance of a new inloop filter for video compression, which uses temporal rather than spatial information to improve the quality of reference frames used for prediction. The new filter has been integrated into the H.264/AVC baseline encoder and tested on a wide range of sequences. Experimental results show that the filter achieves a bit rate reduction of up to 12%, and more than 4% on average, without increasing the complexity of either encoder or decoder significantly. |
P4-11
Title | Fast Rate-Distortion Optimized Transform for Intra Coding |
Author | Xin Zhao (Institute of Computing Technology, Chinese Academy of Sciences; Graduate University of Chinese Academy of Sciences, China), Li Zhang, Siwei Ma, Wen Gao (Institute of Digital Media, Peking University, China) |
Page | pp. 518 - 521 |
Keyword | rate-distortion optimization (RDO), Intra coding, mode-dependent directional transform (MDDT), rate-distortion optimized transform (RDOT) |
Abstract | In our previous work, the rate-distortion optimized transform (RDOT) was introduced. RDOT achieves remarkable coding gain for KTA Intra coding, but drastically increases computational complexity at the encoder. To solve this problem, we propose a fast RDOT scheme using macroblock- and block-level R-D cost thresholding. With the proposed methods, unnecessary mode trials can be efficiently skipped during encoding. Experimental results show that, with negligible performance degradation, about 88.9% of the total encoding time is saved. |
P4-12
Title | A Hierarchical Variable-Sized Block Transform Coding Scheme for Coding Efficiency Improvement on H.264/AVC |
Author | Bumshik Lee, Jaeil Kim, Sangsoo Ahn, Munchurl Kim (Korea Advanced Institute of Science and Technology, Republic of Korea), Hui Yong Kim, Jongho Kim, Jin Soo Choi (Electronic Telecommunications Research Institute (ETRI), Republic of Korea) |
Page | pp. 522 - 525 |
Keyword | Quadtree Transform, Variable Block-size Transform, Discrete Cosine Transform, H.264/AVC |
Abstract | In this paper, a rate-distortion optimized variable block transform coding scheme is proposed, based on a quadtree-structured transform for macroblock (MB) coding, with a set of the order-4 and -8 integer cosine transform (ICT) kernels of H.264/AVC as well as a new order-16 ICT kernel. The order-4, -8 and -16 ICT kernels are applied for inter-predictive coding in square (4x4, 8x8 or 16x16) or non-square (16x8 or 8x16) transforms for each MB in a quadtree-structured manner. The proposed quadtree-structured variable block transform scheme using the order-16 ICT kernel achieves significant bitrate reductions of up to 15% compared to the High profile of H.264/AVC. Even though the number of candidate transform types increases, the encoding time is reduced by 4-6% on average relative to H.264/AVC. |
P4-13
Title | Enhanced Region-Based Adaptive Interpolation Filter |
Author | Shohei Matsuo, Yukihiro Bandoh, Seishi Takamura, Hirohisa Jozawa (NTT Cyber Space Laboratories, NTT Corporation, Japan) |
Page | pp. 526 - 529 |
Keyword | motion compensation, adaptive interpolation filter, region-division, edge, image locality |
Abstract | The adaptive interpolation filter (AIF) was proposed to improve motion compensation. The conventional AIF optimizes the filter coefficients on a frame-by-frame basis. However, when the image is divided into multiple regions, each of which has different characteristics, the coding efficiency can be improved by performing the optimization on a region-by-region basis. In this paper, we propose a region-based AIF (RBAIF). Simulation results show that RBAIF offers about 0.43% and 5.05% higher coding gain than the conventional AIF and the H.264/AVC filter, respectively. |
P4-14
Title | Fractional-Sample Motion Compensation Using Generalized Interpolation |
Author | Haricharan Lakshman, Benjamin Bross, Heiko Schwarz, Thomas Wiegand (Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Germany) |
Page | pp. 530 - 533 |
Keyword | video coding, motion-compensated prediction, reference picture upsampling, B-splines |
Abstract | Typical interpolation methods in video coding perform filtering of reference picture samples using FIR filters for motion-compensated prediction. This process can be viewed as a signal decomposition using basis functions which are restricted by the interpolating constraint. Using the concept of generalized interpolation provides a greater degree of freedom for selecting basis functions. We implemented generalized interpolation using a combination of IIR and FIR filters. The complexity of the proposed scheme is comparable to that of an 8-tap FIR filter. Bit rate savings up to 20% compared to the H.264/AVC 6-tap filter are shown. |
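The IIR+FIR combination has a classic instance in cubic B-spline interpolation (Unser's recursive prefilter with pole sqrt(3)-2). The 1D sketch below illustrates the principle of generalized interpolation only; it is not the motion-compensation filter evaluated in the paper, and the truncated initialisation is a simplification.

```python
import math

def bspline3_kernel(t):
    """Cubic B-spline basis function (support |t| < 2)."""
    t = abs(t)
    if t < 1.0:
        return (4.0 - 6.0 * t * t + 3.0 * t ** 3) / 6.0
    if t < 2.0:
        return (2.0 - t) ** 3 / 6.0
    return 0.0

def bspline3_coeffs(s):
    """IIR prefiltering (causal + anti-causal first-order recursions,
    pole z1 = sqrt(3) - 2) that turns samples into cubic B-spline
    coefficients -- the 'generalized interpolation' step."""
    z1 = math.sqrt(3.0) - 2.0
    n = len(s)
    c = [6.0 * v for v in s]          # overall gain (1-z1)(1-1/z1) = 6
    c[0] = sum(c[k] * z1 ** k for k in range(n))  # truncated causal init
    for k in range(1, n):             # causal pass
        c[k] += z1 * c[k - 1]
    c[n - 1] = (z1 / (z1 * z1 - 1.0)) * (z1 * c[n - 2] + c[n - 1])
    for k in range(n - 2, -1, -1):    # anti-causal pass
        c[k] = z1 * (c[k + 1] - c[k])
    return c

def interpolate(c, x):
    """FIR evaluation: weighted sum of the 4 coefficients around x."""
    i0 = int(math.floor(x))
    val = 0.0
    for i in range(i0 - 1, i0 + 3):
        idx = min(max(i, 0), len(c) - 1)   # clamp at the borders
        val += c[idx] * bspline3_kernel(x - i)
    return val
```

Because the B-spline kernel is not interpolating by itself, the IIR prefilter is what makes the expansion pass through the original samples, while keeping the FIR evaluation as short as a conventional interpolation filter.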
[Image/video coding and related topics]
P4-15
Title | Image Coding Approach Based On Image Decomposition |
Author | Yunhui Shi, Yanli Hou, Baocai Yin, Wenpeng Ding (Beijing University of Technology, China) |
Page | pp. 534 - 537 |
Keyword | image decomposition, texture synthesis, region selection, image coding |
Abstract | Textures in many images or video scenes are difficult to code because of the large amount of visible detail. This paper proposes an image coding approach to solve this problem, in which we incorporate image decomposition and texture synthesis technology into the image coding framework. The key idea of our approach is to first decompose the original image into cartoon component u and texture component v with different basic characteristics, and then to synthesize the selected texture regions in texture component v. The cartoon component u and the non-synthetic regions in texture component v are compressed by JPEG. Experimental results show bit-rate savings of over 30% compared with JPEG at similar visual quality levels. |
P4-16
Title | A Real-Time System of Distributed Video Coding |
Author | Kazuhito Sakomizu, Takahiro Yamasaki, Satoshi Nakagawa, Takashi Nishi (Oki Electric Industry Co., Ltd., Japan) |
Page | pp. 538 - 541 |
Keyword | distributed video coding, real-time system, Slepian-Wolf theorem, Wyner-Ziv theorem |
Abstract | This paper presents a real-time system for distributed video coding (DVC). The decoding process of DVC is normally complex, which makes real-time implementation difficult. To address this problem, we propose a new DVC configuration with three methods, and implement the system with parallelization techniques. Experimental results show that the encoder on a 400 MHz i.MX31 operates at about 13 fps for CIF, and the decoder on a 2.83 GHz Core 2 Quad operates at more than 30 fps for CIF. |
P4-17
Title | Block-Based Second Order Prediction on AVS-Part 2 |
Author | Binbin Yu, Shangwen Li, Lu Yu (Institute of Information and Communication Engineering, Zhejiang University, China) |
Page | pp. 542 - 545 |
Keyword | Block-based Second Order Prediction, Motion-compensated prediction, Mode prediction, Directional operators, AVS |
Abstract | AVS-Part 2 is a mainstream video coding standard with high compression efficiency similar to that of H.264/AVC. A technique named Second Order Prediction (SOP) was presented for H.264/AVC to decrease the signal correlation remaining after motion-compensated prediction. To achieve better coding performance, this paper presents a method named Block-based Second Order Prediction (BSOP), which adapts SOP to the features of motion compensation in AVS-P2, with detailed analysis and demonstration. Experimental results show that the proposed BSOP outperforms AVS-P2 P-picture coding by 3.99% bit-rate saving (0.126 dB BD-PSNR gain) on average, and performs better than SOP implemented on AVS by 1.81% bit-rate saving. |
P4-18
Title | Improved Local PDF Estimation in the Wavelet Domain for Generalized Lifting |
Author | Julio C. Rolon (National Polytechnic Institute, Mexico), Philippe Salembier (Technical University of Catalonia, Spain) |
Page | pp. 546 - 549 |
Keyword | Generalized lifting, wavelets, pdf estimation, lossy image coding |
Abstract | Generalized Lifting (GL) has been studied for lossy image compression in [2,3]. It has been demonstrated that the method achieves a significant reduction of the energy and entropy of the wavelet coefficients. The definition of GL relies on an estimation of the pdf of the pixel to encode, conditioned on a surrounding context. The objective of this paper is to present an improved method for estimating this pdf at the local level. We follow the idea of self-similarity proposed in [1] for denoising, and propose to estimate the pdf using all the causal contexts within a window. Experimental results show a significant increase in the energy and entropy gains when compared to previous strategies [2,3]. |
P4-19
Title | Image Coding by Using Non-Linear Texture Decomposition and Image Summarization |
Author | Chihiro Suzuki, Takamichi Miyata, Yoshinori Sakai (Tokyo Institute of Technology, Japan) |
Page | pp. 550 - 553 |
Keyword | Texture Synthesis, TV-regularization, Bidirectional Similarity, Image Coding |
Abstract | TV-regularization can be used to decompose any natural image into a structure image (S) and a texture image (T). We propose a novel image coding method that codes these images separately. We create a compaction image that contains all the texture patterns of the input, and this compaction image is in turn divided into its own S and T. The encoder then sends the compaction image's S and T, together with the input's S, to the decoder. At the decoder, the original-size T is synthesized from the compaction image's T by matching the compaction image's S against the original-size S. |
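The TV-regularized structure/texture split that this abstract builds on can be sketched generically. The code below uses Chambolle's dual projection algorithm to split an image into a low-total-variation structure part and a texture residual; it is a minimal illustration, not the authors' coder, and the regularization weight, step size, and iteration count are assumptions.

```python
import numpy as np

def gradient(u):
    """Forward-difference gradient with zero boundary."""
    gx = np.zeros_like(u); gy = np.zeros_like(u)
    gx[:-1, :] = u[1:, :] - u[:-1, :]
    gy[:, :-1] = u[:, 1:] - u[:, :-1]
    return gx, gy

def divergence(px, py):
    """Negative adjoint of the forward-difference gradient."""
    dx = np.zeros_like(px); dy = np.zeros_like(py)
    dx[0, :] = px[0, :]; dx[1:-1, :] = px[1:-1, :] - px[:-2, :]; dx[-1, :] = -px[-2, :]
    dy[:, 0] = py[:, 0]; dy[:, 1:-1] = py[:, 1:-1] - py[:, :-2]; dy[:, -1] = -py[:, -2]
    return dx + dy

def tv_decompose(f, lam=0.2, n_iter=100, tau=0.125):
    """Split image f into structure u (low total variation) and
    texture v = f - u via Chambolle's projection algorithm (2004).
    tau <= 1/8 ensures convergence of the dual iteration."""
    px = np.zeros_like(f); py = np.zeros_like(f)
    for _ in range(n_iter):
        d = divergence(px, py) - f / lam
        gx, gy = gradient(d)
        norm = np.sqrt(gx ** 2 + gy ** 2)
        px = (px + tau * gx) / (1.0 + tau * norm)
        py = (py + tau * gy) / (1.0 + tau * norm)
    u = f - lam * divergence(px, py)   # structure (cartoon) part
    return u, f - u                    # texture part is the residual
```

By construction u + v reproduces f exactly, so the two components can be coded separately and recombined at the decoder without loss from the split itself.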
P4-20
Title | Coding Efficiency Improvement Using Inter-Picture Processing of Quantization Error |
Author | Kenji Sugiyama, Naoya Sagara, Masao Arizumi (Seikei University, Japan) |
Page | pp. 554 - 557 |
Keyword | Compatibility, I-picture coding, B-picture coding, Motion compensation, Quantization error |
Abstract | As standard video encoder techniques have matured, their rate of improvement has slowed. As an alternative, a new coding concept with semi-compatibility has been proposed, and an enhancement to I-picture efficiency has been discussed. The applied method reduces quantization error using motion-compensated inter-picture processing. In this report, we apply this method to P-pictures to improve the efficiency of B-pictures. The quantization error component of the prediction signal is canceled by averaging the bi-directional prediction. Experiments using MPEG-4 show a significant improvement in coding efficiency with the proposed method. The maximum PSNR gain reaches 2.3 dB in a static sequence, and at least 0.5 dB is achieved in a high-motion sequence. |
P4-21
Title | Multiple Description Video Transcoding with Temporal Drift Control |
Author | Pedro Correia, Pedro Assunção, Vitor Silva (Instituto de Telecomunicações, Portugal) |
Page | pp. 558 - 561 |
Keyword | Multiple Description Transcoding, Multiple Description Scalar Quantisation, Drift Distortion |
Abstract | This paper proposes a multiple description (MD) transcoding scheme capable of preventing drift caused by distortion accumulation in temporally predicted, motion-compensated slices. Drift compensation is achieved by generating a controlled amount of side information to be used for decoding whenever a description fails to reach the end-user terminal. The side information is generated by re-encoding the transcoding residue with an independent quantisation parameter, which also controls redundancy. A simplified architecture is devised to reduce transcoding complexity with respect to the number of processing functions and buffer requirements. The experimental results show that temporally predicted frames do not suffer from drift and that their quality is significantly improved at a reduced redundancy cost in comparison with a classic MD transcoding scheme. |
P4-22
Title | H.264/AVC to Wavelet-Based Scalable Video Transcoding Supporting Multiple Coding Configurations |
Author | Eduardo Peixoto, Toni Zgaljic, Ebroul Izquierdo (Queen Mary, University of London, U.K.) |
Page | pp. 562 - 565 |
Keyword | Transcoding, Scalable Video Coding |
Abstract | Scalable Video Coding (SVC) enables low-complexity adaptation of compressed video, providing an efficient solution for video content delivery through heterogeneous networks and to different displays. However, legacy video and most commercially available content capturing devices use conventional non-scalable coding, e.g., H.264/AVC. This paper proposes an efficient transcoder from H.264/AVC to a wavelet-based SVC to exploit the advantages offered by SVC technology. The proposed transcoder is able to cope with different coding configurations in H.264/AVC, such as IPP or IBBP with multiple reference frames. To reduce the transcoder's complexity, motion information and the presence of residual data extracted from the decoded H.264/AVC video are exploited. Experimental results show good performance of the proposed transcoder in terms of decoded video quality and system complexity. |
P4-23
Title | Edge-Adaptive Transforms for Efficient Depth Map Coding |
Author | Godwin Shen, Woo-shik Kim, Sunil Kumar Narang, Antonio Ortega (University of Southern California, U.S.A.), Jaejoon Lee, HoCheon Wey (Samsung Advanced Institute of Technology, Republic of Korea) |
Page | pp. 566 - 569 |
Keyword | Multiview plus depth (MVD), Depth coding, Rate-distortion optimization |
Abstract | In this work, a new set of edge-adaptive transforms (EATs) is presented as an alternative to the standard DCTs used in image and video coding applications. These transforms avoid filtering across edges in each image block and thus avoid creating large high-frequency coefficients. They are then combined with the DCT in H.264/AVC, and a transform mode selection algorithm is used to choose between DCT and EAT in an RD-optimized manner. The transforms are applied to coding the depth maps used for view synthesis in a multi-view video coding system, and provide up to 29% bit-rate reduction at a fixed quality in the synthesized views. |
P4-24
Title | Direction-Adaptive Hierarchical Decomposition for Image Coding |
Author | Tomokazu Murakami, Keita Takahashi, Takeshi Naemura (The University of Tokyo, Japan) |
Page | pp. 570 - 573 |
Keyword | Image decomposition, directional prediction, L1 norm, directional transform, wavelet |
Abstract | A new model for hierarchically decomposing an image into direction-adaptive subbands using pixel-wise direction estimation is presented. In each decomposition operation, the input image is divided into two parts: a base image subsampled from the input image, and subband components. The subband components consist of the residuals of estimating the pixels skipped by the subsampling, which ensures the invertibility of the decomposition. The estimation is performed in a direction-adaptive way, with the optimal direction determined by an L1-norm criterion for each pixel, aiming to achieve the good energy compaction that is suitable for image coding. Furthermore, since the L1 norms are obtained from the base image alone, the directional information does not need to be retained explicitly, which is another advantage of our model. Experimental results show that the proposed model achieves lower entropy than the conventional Haar or D5/3 discrete wavelet transforms in the case of lossless coding. |
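The invertibility argument in the abstract above (subsample to a base signal, predict the skipped samples, keep the residuals) is the standard lifting construction. Below is a minimal 1-D sketch with a plain Haar-style predictor — not the paper's direction-adaptive, L1-driven predictor — assuming an even-length signal:

```python
import numpy as np

def lift_forward(x):
    """One lifting level: even samples form the base; each odd sample
    is predicted from its left (even) neighbour and only the residual
    is kept. Assumes an even-length input."""
    base = x[0::2].copy()
    detail = x[1::2] - x[0::2]      # prediction residual
    return base, detail

def lift_inverse(base, detail):
    """Exact inverse: re-apply the same prediction and add residuals."""
    x = np.empty(base.size + detail.size, dtype=base.dtype)
    x[0::2] = base
    x[1::2] = base + detail         # undo the prediction exactly
    return x
```

Because the predictor only ever reads samples that the decoder also has (the base), reconstruction is exact regardless of how good the prediction is; a better (e.g. direction-adaptive) predictor only shrinks the residuals.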
[Image/video processing and related topics]
P4-25
Title | A Robust Video Super-Resolution Algorithm |
Author | Xinfeng Zhang (Institute of Computing Technology, Chinese Academy of Sciences, China), Ruiqin Xiong, Siwei Ma, Wen Gao (School of Electronic Engineering and Computer Science, Peking University, China) |
Page | pp. 574 - 577 |
Keyword | super-resolution, kernel regression, irregular interpolation |
Abstract | In this paper, we propose a robust video super-resolution reconstruction method based on spatial-temporal orientation-adaptive kernel regression. First, we propose a robust registration efficiency model to reflect the reliability of temporal information. Second, we propose a spatial-temporal steering kernel that considers motion between frames and the structures in each low-resolution frame. Simulation results demonstrate that our new super-resolution method substantially improves both subjective and objective quality compared with other resolution enhancement methods. |
P4-26
Title | An Efficient Method for the Detection of Ringing Artifacts and De-Ringing in JPEG Image |
Author | Shen-Chuan Tai, Bo-Jhih Chen, Mankit Choi (Institute of Computer and Communication Engineering, Department of Electrical Engineering, National Cheng Kung University, Taiwan) |
Page | pp. 578 - 581 |
Keyword | JPEG, ringing artifacts, image compression |
Abstract | The JPEG standard is commonly used for still image compression. However, the DCT-based coding of JPEG is a lossy compression tool and introduces artifacts into the decompressed image, such as blocking and ringing artifacts. In this paper, our proposed method focuses on efficiently detecting the blocks that cause ringing artifacts. These ringing blocks are then filtered, while texture regions are preserved as well as smooth regions. Simulation results show that our proposed method outperforms related algorithms both subjectively and objectively. Compared with the JPEG-decompressed image, the image decompressed using our algorithm achieves better PSNR as well as better visual performance, especially at lower-quality coding (higher compression rates). |
P4-27
Title | Low Delay Distributed Video Coding Using Data Hiding |
Author | Krishna Rao Vijayanagar, Bowen Dan, Joohee Kim (Illinois Institute of Technology, U.S.A.) |
Page | pp. 582 - 585 |
Keyword | Distributed Video Coding, Data hiding, Low Delay DVC |
Abstract | Distributed Video Coding (DVC) is a popular topic in the research community, and the past years have seen several different implementations. DVC has been proposed as a solution for applications that have limited battery resources and low hardware complexity, thus necessitating a low-complexity encoder. Ideal applications include remote surveillance/monitoring and live video conferencing. However, current solutions use iteratively decodable channel codes such as LDPCA or Turbo codes, which have large latencies. In order to make real-time communication possible, the proposed architecture makes efficient use of Skip blocks to reduce the bitrate, eliminates the iterative decoding of the Wyner-Ziv (WZ) channel, and uses a simple data-hiding-based compression algorithm. This drastically cuts down the time complexity of the decoding procedure while still maintaining rate-distortion performance better than that of H.264/AVC Intra coding and other current DVC solutions. |
P4-28
Title | FFT-Based Full-Search Block Matching Using Overlap-Add Method |
Author | Hidetake Sasaki, Zhen Li, Hitoshi Kiya (Tokyo Metropolitan University, Japan) |
Page | pp. 586 - 589 |
Keyword | block matching, FFT, overlap-add method, pattern recognition, motion estimation |
Abstract | One category of fast full-search block matching algorithms (BMAs) is based on the fast Fourier transform (FFT). In conventional methods in this category, the macroblock size must be adjusted to the search window size by zero-padding, so the memory consumption and computational complexity depend heavily on the size difference between the macroblock and the search window. We therefore propose a novel FFT-based BMA to solve this problem. The proposed method divides the search window into multiple sub search windows to flexibly control the difference between the macroblock and search window sizes. Simulation results show the effectiveness of the proposed method. |
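The core computation behind any FFT-based full-search BMA can be sketched as follows: the SSD between the macroblock and every candidate patch expands into a local-energy term plus a cross-correlation term, and a single FFT evaluates the correlation for all displacements at once. The sketch below uses one search window without the paper's overlap-add splitting; all sizes are assumptions.

```python
import numpy as np

def fft_full_search(window, block):
    """Return the (dy, dx) offset minimising the SSD between `block`
    and every same-size patch of `window`, using one FFT correlation.
    SSD(d) = sum(patch_d^2) - 2*corr(d) + sum(block^2)."""
    H, W = window.shape
    B, C = block.shape
    # Cross-correlation for all shifts via the FFT correlation theorem.
    padded = np.zeros_like(window, dtype=float)
    padded[:B, :C] = block
    corr = np.fft.ifft2(np.fft.fft2(window) * np.conj(np.fft.fft2(padded))).real
    corr = corr[:H - B + 1, :W - C + 1]          # drop wrap-around shifts
    # Local patch energies via an integral image (summed-area table).
    sq = np.zeros((H + 1, W + 1))
    sq[1:, 1:] = np.cumsum(np.cumsum(window.astype(float) ** 2, 0), 1)
    energy = sq[B:, C:] - sq[:-B, C:] - sq[B:, :-C] + sq[:-B, :-C]
    ssd = energy - 2.0 * corr + np.sum(block.astype(float) ** 2)
    return np.unravel_index(np.argmin(ssd), ssd.shape)
```

The zero-padding of the macroblock to the window size is exactly the cost the paper targets: when the window is much larger than the block, both FFTs operate on window-sized arrays, which sub-window splitting keeps small.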
[Quality, system, applications, and other topics]
P4-29
Title | Temporal Inconsistency Measure for Video Quality Assessment |
Author | Songnan Li, Lin Ma, Fan Zhang, King Ngi Ngan (The Chinese University of Hong Kong, Hong Kong) |
Page | pp. 590 - 593 |
Keyword | video quality assessment, spatial visual quality measure, temporal inconsistency measure |
Abstract | Visual quality assessment plays a crucial role in many vision-related signal processing applications. In the literature, most effort has been spent on spatial visual quality measures; although a large number of video quality metrics have been proposed, the methods that use temporal information for quality assessment are less diversified. In this paper, we propose a novel method to measure temporal impairments. The proposed method can be incorporated into any image quality metric to extend it into a video quality metric. Moreover, the proposed method is easy to apply in a video coding system, where it can be combined with MSE for rate-distortion optimization. |
P4-30
Title | The Dependence of Visual Noise Perception on Background Color and Luminance |
Author | Makoto Shohara, Kazunori Kotani (Japan Advanced Institute of Science and Technology, Japan) |
Page | pp. 594 - 597 |
Keyword | Visual system, Noise measurement, Color measurement, Noise generators, Shot noise |
Abstract | This paper quantitatively describes the dependence of noise perception on background color and luminance. We use luminance and chromatic noise models derived from a shot-noise model, and we conduct subjective, quantitative experiments using a modified gray-scale method. The experimental results show that perceived color noise depends on the background color, whereas perceived luminance noise does not. In addition, the perceived chromatic noise level is about 8 times smaller than the calculated color noise. |
P4-31
Title | An Adaptive Low-Complexity Global Motion Estimation Algorithm |
Author | Md Nazmul Haque, Moyuresh Biswas, Mark R. Pickering, Michael R. Frater (The University of New South Wales, Australia) |
Page | pp. 598 - 601 |
Keyword | global motion estimation, video coding, image registration, gradient-descent optimization |
Abstract | A limitation of current global motion estimation approaches is the additional complexity of the gradient-descent optimization that is typically required to calculate the optimal set of global motion parameters. In this paper, we propose a new low-complexity algorithm for global motion estimation. The complexity of the proposed algorithm is reduced by performing the majority of the operations in the gradient-descent optimization using logic operations rather than full-precision arithmetic operations. This use of logic operations means that the algorithm can be implemented much more easily on hardware platforms such as field-programmable gate arrays (FPGAs). Experimental results show that the execution time of software implementations of the new algorithm is reduced by a factor of almost four when compared to existing fast implementations, without any significant loss in registration accuracy. |
P4-32
Title | Scalable Multiple Description Video Coding Using Successive Refinement of Side Quantizers |
Author | Muhammad Majid, Charith Abhayaratne (The University of Sheffield, U.K.) |
Page | pp. 602 - 605 |
Keyword | Multiple description coding, quality scalability, resilience |
Abstract | In this paper, we present a new method for scalable multiple description video coding based on motion-compensated temporal filtering and a multiple description scalar quantizer with successive refinement. In our method, quality scalability is achieved by successively refining the side quantizers of a multiple description scalar quantizer. The rate of each description is allocated by considering different refinement levels for each spatio-temporal subband. The performance of the proposed scheme under lossless and lossy channel conditions is presented and compared with single-description scalable video coding. |
P4-33
Title | Bit-Plane Compressive Sensing with Bayesian Decoding for Lossy Compression |
Author | Sz-Hsien Wu (Electronic Engineering, National Chiao Tung University, Taiwan), Wen-Hsiao Peng (Computer Science, National Chiao Tung University, Taiwan), Tihao Chiang (Electronic Engineering, National Chiao Tung University, Taiwan) |
Page | pp. 606 - 609 |
Keyword | Compressive Sensing, Bayesian estimation, Bit-plane |
Abstract | This paper addresses the problem of reconstructing a compressively sampled sparse signal from its lossy and possibly insufficient measurements. The process involves estimating the sparsity pattern and the sparse representation, for which we derive a vector estimator based on the Maximum a Posteriori Probability (MAP) rule. By making full use of prior knowledge about the signal, our scheme can achieve perfect reconstruction using a number of measurements close to the sparsity. It also shows a much lower sparsity-pattern error probability than prior work, given insufficient measurements. To better recover the most significant part of the sparse representation, we further introduce the notion of bit-plane separation. When applied to image compression, this technique combined with our MAP estimator shows promising results compared to JPEG: the difference in compression ratio is within a factor of two at the same decoded quality. |
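The underlying reconstruction task — recovering a sparse vector from a small number of linear measurements — can be illustrated with a generic greedy solver. The sketch below uses orthogonal matching pursuit (OMP) rather than the authors' MAP estimator or bit-plane separation; the measurement matrix size and sparsity level are assumptions.

```python
import numpy as np

def omp(A, y, k):
    """Greedy sparse recovery from y = A x with x k-sparse: pick the
    column most correlated with the residual, re-fit the selected
    columns by least squares, and repeat k times."""
    support, r = [], y.copy()
    for _ in range(k):
        support.append(int(np.argmax(np.abs(A.T @ r))))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        r = y - A[:, support] @ coef          # residual after re-fit
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x
```

With noiseless measurements and a random Gaussian matrix, a number of measurements modestly larger than the sparsity typically suffices for exact recovery, which is the regime the abstract's "measurements close to sparsity" claim refers to.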
P4-34
Title | A Reduced-Reference Metric Based on the Interest Points in Color Images |
Author | Michael Nauge, Mohamed-Chaker Larabi, Christine Fernandez (University of Poitiers, France) |
Page | pp. 610 - 613 |
Keyword | Metric, Interest point, Reduced reference, quality, saliency |
Abstract | In the last decade, an important research effort has been dedicated to quality assessment from both the subjective and the objective points of view. The focus has mainly been on Full Reference (FR) metrics because of their ability to compare against an original. Only a few works have been oriented toward Reduced Reference (RR) or No Reference (NR) metrics, which are very useful for applications where the original image is not available, such as transmission or monitoring. In this work, we propose an RR metric based on two concepts: the interest points of the image and object saliency in color images. This metric needs a very small amount of data (less than 8 bytes) to compute the quality scores. The results show a high correlation between the metric scores and human judgement, and a better quality range than well-known metrics such as PSNR or SSIM. Finally, interest points have been shown to be able to predict the quality of color images. |