
28th Picture Coding Symposium

Session P2: Poster Session 2
Time: 9:30 - 11:00 Thursday, December 9, 2010
Chair: Kazunori Kotani (Japan Advanced Institute of Science and Technology, Japan)


[3DTV/FTV/multi-view-related topics]

P2-1
Title: Focus on Visual Rendering Quality through Content-Based Depth Map Coding
Author: Emilie Bosc, Luce Morin, Muriel Pressigout (INSA of Rennes, France)
Page: pp. 158 - 161
Keyword: 3D video coding, adaptive coding, depth coding
Abstract: Multi-view video plus depth (MVD) data is a set of multiple sequences capturing the same scene at different viewpoints, with their associated per-pixel depth values. Coping with this large amount of data requires an effective coding framework. Yet a simple but essential question concerns the means of assessing the proposed coding methods. While the challenge in compression is the optimization of the rate-distortion ratio, a widely used objective metric for evaluating the distortion is the Peak Signal-to-Noise Ratio (PSNR), because of its simplicity and mathematical tractability. This paper points out the reliability problem of this metric when estimating 3D video codec performance. We investigated the visual performance of two methods, namely H.264/MVC and the Locally Adaptive Resolution (LAR) method, by encoding depth maps and reconstructing existing views from those degraded depth images. The experiments revealed that lower coding efficiency, in terms of PSNR, does not imply lower rendering visual quality, and that the LAR method preserves the depth map properties correctly.
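As context for the PSNR reliability question raised in this abstract, the metric itself is straightforward to compute; a minimal sketch (NumPy assumed):

```python
import numpy as np

def psnr(ref, deg, peak=255.0):
    # Peak Signal-to-Noise Ratio between a reference and a degraded image
    mse = np.mean((ref.astype(np.float64) - deg.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

The abstract's point is precisely that this number need not track the visual quality of views rendered from decoded depth maps.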

P2-2
Title: Bit Allocation of Vertices and Colors for Patch-Based Coding in Time-Varying Meshes
Author: Toshihiko Yamasaki, Kiyoharu Aizawa (The University of Tokyo, Japan)
Page: pp. 162 - 165
Keyword: 3DTV, Time-varying mesh (TVM), inter-frame compression, vector quantization (VQ)
Abstract: This paper discusses the optimal bit-rate assignment between vertices and color, and between reference frames (I frames) and target frames (P frames), in the patch-based compression method for Time-Varying Meshes (TVMs). TVMs are non-isomorphic 3D mesh sequences generated from multi-view images. Experimental results demonstrated that the bit rate for vertices strongly affects the visual quality of the rendered 3D model, whereas that for color contributes little to quality improvement. Therefore, as many bits as possible should be assigned to vertices, and 8-10 bits per vertex (bpv) is enough for color. In inter-frame coding, the bit rate for the target frames improves the visual quality proportionally, but it is also demonstrated that fewer bits (5-6 bpv) are enough to achieve the same visual quality as the intra-frames.

P2-3
Title: Motion Activity-Based Block Size Decision for Multi-view Video Coding
Author: Huanqiang Zeng, Kai-Kuang Ma (School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore), Canhui Cai (Institute of Information Science and Technology, Huaqiao University, China)
Page: pp. 166 - 169
Keyword: multi-view video coding, motion estimation, disparity estimation, block size decision, motion activity
Abstract: Motion estimation and disparity estimation using variable block sizes have been exploited in multi-view video coding to effectively improve the coding efficiency, but at the expense of higher computational complexity. In this paper, a fast block size decision algorithm, called motion activity-based block size decision (MABSD), is proposed. In our approach, the various motion estimation and disparity estimation block sizes are classified into four classes, and only one of them is chosen, according to the measured motion activity of the current macroblock, for further identifying the optimal block size within that class. This motion activity is measured by the maximum city-block distance of a set of motion vectors taken from the adjacent macroblocks in the current view and its neighboring view. Experimental results have shown that, compared with exhaustive block size decision, which is the default approach in the JMVM reference software, the proposed MABSD algorithm reduces computational complexity by 42% on average, while incurring only a 0.01 dB loss in peak signal-to-noise ratio (PSNR) and a 1% increase in the total bit rate.
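The motion-activity measure described in the abstract (maximum city-block distance over a set of neighbouring motion vectors) is simple to state in code; the class thresholds are not given, so only the measure itself is sketched:

```python
def motion_activity(mvs):
    """Maximum pairwise city-block distance over a set of motion vectors.

    `mvs` holds (mvx, mvy) pairs gathered from the adjacent macroblocks in
    the current view and its neighbouring view.
    """
    return max(abs(ax - bx) + abs(ay - by)
               for ax, ay in mvs for bx, by in mvs)
```

A large value signals lively, inconsistent motion around the macroblock, steering the decision toward the class of smaller block sizes.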

P2-4
Title: Color Based Depth Up-Sampling for Depth Compression
Author: Meindert Onno Wildeboer, Tomohiro Yendo, Mehrdad Panahpour Tehrani (Nagoya University, Japan), Toshiaki Fujii (Tokyo Institute of Technology, Japan), Masayuki Tanimoto (Nagoya University, Japan)
Page: pp. 170 - 173
Keyword: depth up-sampling, depth coding, depth-map, FTV, 3DTV
Abstract: 3D scene information can be represented in several ways. In applications based on a (N-)view plus (N-)depth representation, both view and depth data are compressed. In this paper we present a depth compression method containing a depth up-sampling filter which uses the color view as a prior. Our method of depth down-/up-sampling is able to maintain clear object boundaries in the reconstructed depth maps. Our experimental results show that the proposed depth re-sampling filter, used in combination with a standard state-of-the-art video encoder, can increase both coding efficiency and rendering quality.
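A generic color-guided (joint-bilateral-style) up-sampling step can illustrate the idea of using the color view as a prior; the 3x3 low-resolution window and Gaussian weights here are illustrative assumptions, not the authors' actual filter:

```python
import numpy as np

def guided_depth_upsample(depth_lr, color_hr, scale, sigma_s=1.0, sigma_c=10.0):
    # Each high-resolution depth sample is a weighted average of nearby
    # low-resolution depth samples; weights combine spatial distance with
    # colour similarity in the high-resolution guide image, so averaging
    # does not leak across strong colour (object) boundaries.
    h, w = color_hr.shape
    out = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            ys, xs = y / scale, x / scale
            acc = wsum = 0.0
            for j in range(int(ys) - 1, int(ys) + 2):
                for i in range(int(xs) - 1, int(xs) + 2):
                    if 0 <= j < depth_lr.shape[0] and 0 <= i < depth_lr.shape[1]:
                        # colour difference between the target pixel and the
                        # guide pixel co-located with the low-res sample
                        dc = float(color_hr[y, x]) - float(
                            color_hr[min(j * scale, h - 1), min(i * scale, w - 1)])
                        ds = (ys - j) ** 2 + (xs - i) ** 2
                        wgt = np.exp(-ds / (2 * sigma_s ** 2)
                                     - dc ** 2 / (2 * sigma_c ** 2))
                        acc += wgt * depth_lr[j, i]
                        wsum += wgt
            out[y, x] = acc / wsum
    return out
```

Because the weights shrink where the guide colour differs, up-sampled depth edges snap to colour edges instead of being smeared by plain interpolation.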

P2-5
Title: Efficient Free Viewpoint Video-On-Demand Scheme Realizing Walk-Through Experience
Author: Akio Ishikawa, Hiroshi Sankoh, Sei Naito, Shigeyuki Sakazawa (KDDI R&D Laboratories Inc., Japan)
Page: pp. 174 - 177
Keyword: Freeviewpoint Television, Multi-view Video, Freeviewpoint Video Transmission, Walk-through, Multi-texturing
Abstract: This paper presents an efficient VOD scheme for FTV, and proposes a data format and its generation method to provide a walk-through experience. We employ a hybrid rendering approach that describes a 3D scene using 3D model data for objects and textures. In this paper we propose an efficient texture data format which removes the redundancy due to occlusion of objects by employing an orthogonal projection image for each object. The advantage of this data format is that it greatly simplifies, at the server, the selection of the transmitted images corresponding to the requested viewpoint.

P2-6
Title: 3-D Video Coding Using Depth Transition Data
Author: Woo-Shik Kim, Antonio Ortega (University of Southern California, U.S.A.), Jaejoon Lee, HoCheon Wey (Samsung Electronics Co., Ltd., Republic of Korea)
Page: pp. 178 - 181
Keyword: 3-D video coding, multiview plus depth, view synthesis
Abstract: The objective is to develop a new 3-D video coding system that provides better coding efficiency with improved subjective quality. We analyzed rendered view distortions in a DIBR system and found that depth map coding distortion leads to "erosion artifacts", which cause significant perceptual quality degradation. To solve this, we propose a solution using depth transition data, which indicates the camera position where the depth changes. Simulation results show significant subjective quality improvement, with maximum PSNR gains of 0.5 dB.

P2-7
Title: Subjective Assessment of Frame Loss Concealment Methods in 3D Video
Author: Joao Carreira, Luis Pinto, Nuno Rodrigues, Sergio Faria, Pedro Assuncao (Institute of Telecommunications / Polytechnic Institute of Leiria - ESTG, Portugal)
Page: pp. 182 - 185
Keyword: 3D video, Frame loss concealment, Subjective Assessment
Abstract: This paper investigates the subjective impact resulting from different concealment methods for coping with lost frames in 3D video communication systems. It is assumed that a high priority channel is assigned to the main view and only the auxiliary view is subject to either transmission errors or packet loss, leading to missing frames at the decoder output. Three methods are used for frame concealment under different loss ratios. The results show that depth is well perceived by users and the subjective impact of frame loss not only depends on the concealment method but also exhibits high correlation with the disparity of the original sequence. It is also shown that under heavy loss conditions it is better to switch from 3D to 2D rather than presenting concealed 3D video to users.

P2-8
Title: A Novel Upsampling Scheme for Depth Map Compression in 3DTV System
Author: Yanjie Li, Lifeng Sun (Tsinghua University, China)
Page: pp. 186 - 189
Keyword: Depth map compression, Resolution reduction, Upsampling, Rate-distortion
Abstract: In this paper, we propose a novel two-step depth map upsampling scheme for 3D video. The first step utilizes the full-resolution 2D color map to help reconstruct a more accurate full-resolution depth map. The second step further flattens the reconstructed depth map to ensure its local uniformity. Test results show that the proposed upsampling scheme achieves up to 2 dB coding gain for the rendering of free-viewpoint video and improves its perceptual quality significantly.


[Beyond H.264/MPEG-4 AVC and related topics]

P2-9
Title: Adaptive Direct Vector Derivation for Video Coding
Author: Yusuke Itani, Shunichi Sekiguchi, Yoshihisa Yamada (Information Technology R&D Center, Mitsubishi Electric Corporation, Japan)
Page: pp. 190 - 193
Keyword: direct mode, motion vector predictor, HEVC, H.264, extended macroblock
Abstract: This paper proposes a new method for improving the direct prediction scheme employed in conventional video coding standards such as AVC/H.264. We extend the direct prediction concept to achieve better adaptation to the local statistics of the video source, assuming the use of larger motion blocks than the conventional macroblock size. Experimental results show the proposed method provides up to 3.3% bitrate savings in low-bitrate coding.

P2-10
Title: Inter Prediction Based on Spatio-Temporal Adaptive Localized Learning Model
Author: Hao Chen, Ruimin Hu, Zhongyuan Wang, Rui Zhong (Wuhan University, China)
Page: pp. 194 - 197
Keyword: Inter prediction, STALL, LSP
Abstract: Inter prediction based on block-matching motion estimation is important for video coding, but it suffers from the additional data-rate overhead of the motion information that must be transmitted to the decoder. To solve this problem, we present an improved implicit-motion-information inter prediction algorithm for P slices in H.264/AVC based on the spatio-temporal adaptive localized learning (STALL) model. Following the 4x4 block transform structure in H.264/AVC, we first adaptively choose nine spatial neighbors and nine temporal neighbors, and a localized 3D causal cube is designed as the training window. Using this information, the model parameters can be adaptively computed with the Least Square Prediction (LSP) method. Finally, we add a new inter prediction mode for P slices to the H.264/AVC standard. The experimental results show that our algorithm improves coding efficiency compared with the H.264/AVC standard, with a relative increase in complexity.
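The least-squares step of such a localized learning model can be sketched generically (the specific neighbour layout and causal-cube window are the paper's design; here they are abstracted into training matrices, a simplifying assumption):

```python
import numpy as np

def lsp_predict(neighbors_train, targets_train, neighbors_cur):
    # Fit linear prediction weights over the causal training window by
    # least squares, then predict the current sample from its own
    # (spatial + temporal) neighbourhood with those weights.
    w, *_ = np.linalg.lstsq(neighbors_train, targets_train, rcond=None)
    return float(neighbors_cur @ w)
```

Because both encoder and decoder can rebuild the same training window from already-decoded data, the weights never need to be transmitted, which is what makes the motion information implicit.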

P2-11
Title: Intra Picture Coding with Planar Representations
Author: Jani Lainema, Kemal Ugur (Nokia Research Center, Finland)
Page: pp. 198 - 201
Keyword: video coding, intra coding, planar coding, H.264/AVC, HEVC
Abstract: In this paper we introduce a novel concept for intra coding of pictures, especially suitable for representing smooth image segments. Traditional block-based transform coding methods cause visually annoying blocking artifacts in image segments with gradually changing smooth content. The proposed solution overcomes this drawback by defining a fully continuous surface of sample values approximating the original image. The gradient of the surface is indicated by transmitting values for selected control points within the image segment, and the surface itself is obtained by interpolating sample values in between the control points. This approach is found to provide up to 30 percent bitrate reduction for natural imagery, and it has also been adopted into the initial HEVC codec design by JCT-VC.

P2-12
Title: Adaptive Global Motion Temporal Prediction for Video Coding
Author: Alexander Glantz, Andreas Krutz, Thomas Sikora (Technische Universität Berlin, Germany)
Page: pp. 202 - 205
Keyword: H.264/AVC, video coding, global motion, temporal filtering, prediction
Abstract: Depending on the content of a video sequence, the amount of bits spent for the transmission of motion vectors can be enormous. A global motion model can be a better representation of movement in these regions than a motion vector. This paper presents a novel prediction technique that is based on global motion compensation and temporal filtering. The new approach is incorporated into H.264/AVC and outperforms the reference by up to 14%.

P2-13
Title: Highly Efficient Video Compression Using Quadtree Structures and Improved Techniques for Motion Representation and Entropy Coding
Author: Detlev Marpe, Heiko Schwarz, Thomas Wiegand, Sebastian Bosse, Benjamin Bross, Philipp Helle, Tobias Hinz, Heiner Kirchhoffer, Haricharan Lakshman, Tung Nguyen, Simon Oudin, Mischa Siekmann, Karsten Suehring, Martin Winken (Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Germany)
Page: pp. 206 - 209
Keyword: Video coding, H.265, HEVC
Abstract: This paper describes a novel video coding scheme that can be considered as a generalization of the block-based hybrid video coding approach of H.264/AVC. While the individual building blocks of our approach are kept simple similarly as in H.264/AVC, the flexibility of the block partitioning for prediction and transform coding has been substantially increased. This is achieved by the use of nested and pre-configurable quadtree structures, such that the block partitioning for temporal and spatial prediction as well as the space-frequency resolution of the corresponding prediction residual can be adapted to the given video signal in a highly flexible way. In addition, techniques for an improved motion representation as well as a novel entropy coding concept are included. The presented video codec was submitted to a Call for Proposals of ITU-T VCEG and ISO/IEC MPEG and was ranked among the five best performing proposals, both in terms of subjective and objective quality.


[Image/video coding and related topics]

P2-14
Title: Dictionary Learning-Based Distributed Compressive Video Sensing
Author: Hung-Wei Chen, Li-Wei Kang, Chun-Shien Lu (Academia Sinica, Taiwan)
Page: pp. 210 - 213
Keyword: compressive sensing, sparse representation, dictionary learning, single-pixel camera, l1-minimization
Abstract: We address the important issue of fully low-cost, low-complexity video compression for use in extremely resource-limited sensors/devices. Conventional motion-estimation-based video compression and distributed video coding (DVC) techniques all rely on a high-cost mechanism in which sensing/sampling and compression are performed separately, resulting in unnecessary consumption of resources; most of the acquired raw video data is discarded in the (possibly) complex compression stage. In this paper, we propose a dictionary learning-based distributed compressive video sensing (DCVS) framework to "directly" acquire compressed video data. Embedded in the compressive sensing (CS)-based single-pixel camera architecture, DCVS can compressively sense each video frame in a distributed manner. At the DCVS decoder, video reconstruction can be formulated as an l1-minimization problem, solving for the sparse coefficients with respect to some basis functions. We investigate adaptive dictionary/basis learning for each frame based on training samples extracted from previously reconstructed neighboring frames, and argue that a much better basis can be obtained to represent the frame, compared with fixed-basis representations and recent popular "CS-based DVC" approaches that do not rely on dictionary learning.

P2-15
Title: Medium-Granularity Computational Complexity Control for H.264/AVC
Author: Xiang Li, Mathias Wien, Jens-Rainer Ohm (Institute of Communications Engineering, RWTH Aachen University, Germany)
Page: pp. 214 - 217
Keyword: Computational complexity control, computational scalability
Abstract: Today, video applications on handheld devices are becoming increasingly popular. Due to the limited computational capability of handheld devices, complexity-constrained video coding draws much attention. In this paper, medium-granularity computational complexity control (MGCC) is proposed for H.264/AVC. First, a large dynamic range in complexity is achieved by taking 16x16 motion estimation in a single reference frame as the basic computational unit. Then, high coding efficiency is obtained by adaptive computation allocation at the MB level. Simulations show that coarse-granularity methods cannot work when the normalized complexity is below 15%. In contrast, the proposed MGCC performs well even when the complexity is reduced to 8.8%. Moreover, an average gain of 0.3 dB over coarse-granularity methods in BD-PSNR is obtained for 11 sequences when the complexity is around 20%.

P2-16
Title: An Improved Wyner-Ziv Video Coding With Feedback Channel
Author: Feng Ye, Aidong Men, Bo Yang, Manman Fan, Kan Chang (School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, China)
Page: pp. 218 - 221
Keyword: Wyner-Ziv video coding, motion activity, 3DRS motion estimation, side information
Abstract: This paper presents an improved feedback-assisted low-complexity WZVC scheme. The performance of this scheme is improved by two enhancements: an improved mode-based key frame encoding and a 3DRS-assisted (three-dimensional recursive search assisted) motion estimation algorithm for WZ encoding. Experimental results show that our coding scheme achieves significant gains over a state-of-the-art TDWZ codec while maintaining low encoding complexity.

P2-17
Title: Background Aided Surveillance-Oriented Distributed Video Coding
Author: Hongbin Liu (Harbin Institute of Technology, China), Siwei Ma (Peking University, China), Debin Zhao (Harbin Institute of Technology, China), Wen Gao (Peking University, China), Xiaopeng Fan (Harbin Institute of Technology, China)
Page: pp. 222 - 225
Keyword: surveillance, background, distributed video coding
Abstract: This paper presents a background-aided surveillance-oriented distributed video coding system. A high-quality background frame is encoded for each group of pictures (GOP), which can provide high-quality SI for the background parts of the Wyner-Ziv (WZ) frames. Consequently, the bit rate for the WZ frames can be reduced. Experimental results demonstrate that the proposed system can decrease the bit rate by up to 67.4% compared with a traditional DVC codec.

P2-18
Title: Content-Adaptive Spatial Scalability for Scalable Video Coding
Author: Yongzhe Wang (Shanghai Jiao Tong University, China), Nikolce Stefanoski (Disney Research Zurich, Switzerland), Xiangzhong Fang (Shanghai Jiao Tong University, China), Aljoscha Smolic (Disney Research Zurich, Switzerland)
Page: pp. 226 - 229
Keyword: H.264/AVC, scalable video coding, spatial scalability, content-adaptation, non-linear image warping
Abstract: This paper presents an enhancement of the SVC extension of the H.264/AVC standard by content-adaptive spatial scalability (CASS). The video streams (spatial layers), which are used as input to the encoder, are created by content-adaptive and art-directable retargeting of existing high resolution video. The non-linear dependencies between such video streams are efficiently exploited by CASS for scalable coding. This is achieved by integrating warping-based non-linear texture prediction and warp coding into the SVC framework.

P2-19
Title: Colorization-Based Coding by Focusing on Characteristics of Colorization Bases
Author: Shunsuke Ono, Takamichi Miyata, Yoshinori Sakai (Tokyo Institute of Technology, Japan)
Page: pp. 230 - 233
Keyword: Colorization, Colorization-based coding, Representative pixels, Redundancy, Correct color
Abstract: A novel approach to image compression called colorization-based coding has recently been proposed. It automatically extracts representative pixels from an original color image at the encoder and restores a full color image by colorization at the decoder. However, previous studies on colorization-based coding extract redundant representative pixels. We propose a new colorization-based coding method that focuses on the colorization basis. Experimental results revealed that our method can drastically reduce the amount of information compared with the conventional method while maintaining objective quality.

P2-20
Title: Wyner-Ziv Coding of Multispectral Images for Space and Airborne Platforms
Author: Shantanu Rane, Yige Wang, Petros Boufounos, Anthony Vetro (Mitsubishi Electric Research Laboratories, U.S.A.)
Page: pp. 234 - 237
Keyword: Multispectral, Wyner-Ziv coding, LDPC code
Abstract: This paper investigates the application of lossy distributed source coding to high resolution multispectral images. The choice of distributed source coding is motivated by the need for very low encoding complexity on space and airborne platforms. The data consists of red, blue, green and infra-red channels and is compressed in an asymmetric Wyner-Ziv setting. One image channel is compressed using traditional JPEG and transmitted to the ground station where it is available as side information for Wyner-Ziv coding of the other channels. Encoding is accomplished by quantizing the image data, applying a Low-Density Parity Check code to the remaining three image channels, and transmitting the resulting syndromes. At the ground station, the image data is recovered from the syndromes by exploiting the correlation in the frequency spectrum of the band being decoded and the JPEG-decoded side information band. In experiments with real uncompressed images obtained by a satellite, the rate-distortion performance is found to be vastly superior to JPEG compression of individual image channels and rivals that of JPEG2000 at much lower encoding complexity.

P2-21
Title: Reversible Component Transforms by the LU Factorization
Author: Hisakazu Kikuchi, Junghyeun Hwang, Shogo Muramatsu (Niigata University, Japan), Jaeho Shin (Dongguk University, Republic of Korea)
Page: pp. 238 - 241
Keyword: component transform, image compression, LU factorization, lifting, round-off error
Abstract: A scaled transform is defined for a given irreversible linear transformation, based on the LU factorization of a nonsingular matrix, so that the transformation may be computed in a lifting form and hence be reversible. Round-off errors in the lifting computation and the computational complexity are analyzed. Some reversible component transforms are presented and evaluated experimentally, with remarks on image compression applications. Discussions cover coding gain and actual bit rates.
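The reversibility that lifting provides can be illustrated with a two-step toy factorization (coefficients a and b are arbitrary illustrative values, not the paper's LU factors):

```python
def lift_forward(x1, x2, a=0.6, b=-0.3):
    # Each step adds a rounded (hence integer) multiple of the other
    # channel, so the transform maps integers to integers.
    y1 = x1 + int(round(a * x2))
    y2 = x2 + int(round(b * y1))
    return y1, y2

def lift_inverse(y1, y2, a=0.6, b=-0.3):
    # Undo the steps in reverse order, recomputing the same rounded
    # terms, which gives exact reconstruction despite the round-off.
    x2 = y2 - int(round(b * y1))
    x1 = y1 - int(round(a * x2))
    return x1, x2
```

Reconstruction is exact because the inverse recomputes each rounded update from values it already has, so the round-off cancels term by term.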

P2-22
Title: MOS-Based Bit Allocation in SNR-Temporal Scalable Coding
Author: Yuya Yamasaki, Toshiyuki Yoshida (University of Fukui, Japan)
Page: pp. 242 - 245
Keyword: scalable coding, video coding, MOS, bit allocation
Abstract: In scalable video coding, improvements in motion smoothness (frame rate) and in spatial quality (SNR) conflict within a given bit rate. Since the motion and spatial activities of a target video vary from scene to scene, these activities should be taken into account to optimally allocate the temporal and quality scalability. This paper proposes a fundamental approach to allocating bit rates between the temporal and quality scalability based on maximizing the estimated mean opinion score (MOS) for each scene. A technique for reducing MOS fluctuation in the enhancement layer is discussed as an application of the proposed technique.


[Image/video processing and related topics]

P2-23
Title: Automatic Moving Object Extraction Using X-means Clustering
Author: Kousuke Imamura, Naoki Kubo, Hideo Hashimoto (Kanazawa University, Japan)
Page: pp. 246 - 249
Keyword: moving object extraction, x-means clustering, watershed algorithm, voting method
Abstract: This paper proposes an automatic extraction technique for moving objects using x-means clustering. The technique extends k-means clustering and can determine the optimal number of clusters based on the Bayesian Information Criterion (BIC). In the proposed method, feature points are extracted from the current frame, and x-means clustering classifies the feature points based on their estimated affine motion parameters. A label is assigned to each segmented region, obtained by morphological watershed, by voting over the feature-point clusters in each region. The labeling result represents the moving-object extraction. Experimental results reveal that the proposed method provides extraction results with a suitable number of objects.
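The BIC that drives x-means' choice of cluster count can be sketched in its generic Gaussian-residual form (this exact parameterization is an assumption; the paper may use a different one):

```python
import math

def bic(rss, n, p):
    # Bayesian Information Criterion for a Gaussian model fit:
    # n samples, p free parameters, rss = residual sum of squares.
    # Lower is better: goodness of fit is traded against model size.
    return n * math.log(rss / n) + p * math.log(n)
```

In x-means, a cluster is split only if the two-cluster model's BIC beats the one-cluster model's, which is how the number of moving objects is selected automatically.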

P2-24
Title: Accurate Motion Estimation for Image of Spatial Periodic Pattern
Author: Jun-ichi Kimura, Naohisa Komatsu (School of Fundamental Science and Engineering, Waseda University, Japan)
Page: pp. 250 - 253
Keyword: motion estimation, motion vector
Abstract: We investigate the mechanism of motion estimation error for images containing spatially periodic patterns. Using a motion estimation model, we conclude that block-matching distortion caused by motion vector sampling error (DVSE) degrades motion estimation accuracy. We propose a new motion estimation method using the maximum value of the DVSE for each block. Simulations show that the precision of the proposed method is over 98%, superior to the full-search method.

P2-25
Title: Direction-Adaptive Image Upsampling Using Double Interpolation
Author: Yi-Chun Lin, Yi-Nung Liu, Shao-Yi Chien (National Taiwan University, Taiwan)
Page: pp. 254 - 257
Keyword: double interpolation, upsampling, direction-adaptive, zigzagging, bicubic
Abstract: Double interpolation quality evaluation can be used as a measure of an interpolation operation. Using this double interpolation framework, a highly efficient direction-adaptive upsampling algorithm is proposed without any threshold setting or post-processing. With the proposed upsampling algorithm, the problem of zigzagging artifacts on edges no longer exists. Moreover, the proposed algorithm has low computational complexity. The experimental results show that the proposed algorithm produces high-quality output images.

P2-26
Title: Blind GOP Structure Analysis of MPEG-2 and H.264/AVC Decoded Video
Author: Gilbert Yammine, Eugen Wige, Andre Kaup (Multimedia Communications and Signal Processing - University of Erlangen-Nuremberg, Germany)
Page: pp. 258 - 261
Keyword: GOP Structure, Blind Analysis, Noise Estimation
Abstract: In this paper, we provide a simple method for analyzing the GOP structure of an MPEG-2 or H.264/AVC decoded video without having access to the bitstream. Noise estimation is applied on the decoded frames and the variance of the noise in the different I-, P-, and B-frames is measured. After the encoding process, the noise variance in the video sequence shows a periodic pattern, which helps in the extraction of the GOP period, as well as the type of frames. This algorithm can be used along with other algorithms to blindly analyze the encoding history of a video sequence. The method has been tested on several MPEG-2 DVB and DVD streams, as well as on H.264/AVC encoded sequences, and shows successful results in both cases.
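One generic way to pull the GOP period out of such a periodic per-frame variance sequence is via autocorrelation (an illustrative approach, not necessarily the paper's exact extraction step):

```python
import numpy as np

def gop_period(var_seq):
    # Dominant period of the per-frame noise-variance sequence: the
    # positive lag with maximum autocorrelation of the mean-removed signal.
    x = np.asarray(var_seq, dtype=float)
    x = x - x.mean()
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]  # lags 0..len(x)-1
    return int(np.argmax(ac[1 : len(x) // 2])) + 1
```

Once the period is known, the frame types follow from the phase of the pattern (e.g. the recurring low-variance frames aligning with I-frames).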

P2-27
Title: Distributed Video Coding Based on Adaptive Slice Size Using Received Motion Vectors
Author: Kyungyeon Min, Seanae Park, Donggyu Sim (Kwangwoon University, Republic of Korea)
Page: pp. 262 - 265
Keyword: DVC, crossover, adaptive slice control
Abstract: In this paper, we propose a new distributed video coding (DVC) method based on adaptive slice size using received motion vectors (MVs). In the proposed algorithm, the MVs estimated at the DVC decoder are transmitted to the corresponding encoder. In the proposed encoder, predicted side information (PSI) is reconstructed from the transmitted MVs and key frames. The PSI can therefore be generated identically to the side information (SI) at the decoder. We can also calculate an exact crossover rate between the SI and the original input frame using the PSI and the original frame. As a result, the proposed method can transmit the minimum number of parity bits needed to maximize the error-correction ability of the channel decoder, with minimal computational complexity. Experimental results show that the proposed algorithm outperforms several conventional DVC methods.

P2-28
Title: Improved Watermark Sharing Scheme Using Minimum Error Selection and Shuffling
Author: Aroba Khan, Yohei Yokoyama, Kiyoshi Tanaka (Shinshu University, Japan)
Page: pp. 266 - 269
Keyword: Watermark Sharing, Halftoning, error diffusion
Abstract: In this work, we focus on a watermark sharing scheme using error diffusion called DHCED, and try to overcome some drawbacks of this method. The proposed method simultaneously generates carrier halftone images that share the watermark information by selecting the minimum error caused in the noise function for watermark embedding. The proposed method also shuffles the watermark image before embedding, not only to increase the secrecy of the embedded watermark information but also to improve the watermark detection ratio and the watermark appearance in the detection process.


[Quality, system, applications, and other topics]

P2-29
Title: On the Duality of Rate Allocation and Quality Indices
Author: Thomas Richter (University of Stuttgart, Germany)
Page: pp. 270 - 273
Keyword: JPEG 2000, SSIM
Abstract: In a recent work, the author proposed to study the performance of still image quality indices such as the SSIM by using them as the objective function of rate allocation algorithms. The outcome of that work was not only a multi-scale SSIM optimal JPEG 2000 implementation, but also a first-order approximation of the MS-SSIM that is surprisingly similar to more traditional contrast-sensitivity and visual masking based approaches. It will be seen in this work that the only difference between the latter works and the MS-SSIM index is the choice of the exponent of the masking term, and furthermore, that a slight modification of the SSIM definition reproducing the traditional exponent is able to improve the performance of the index at or below the visual threshold. It is hence demonstrated that the duality of quality indices and rate allocation helps to improve both the visual performance of the compression codec and the performance of the index.
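For reference, a single-window (whole-image statistics) form of the SSIM index discussed above; the standard index instead averages this over local sliding windows, and the paper works with the multi-scale variant:

```python
import numpy as np

def ssim_global(x, y, peak=255.0, k1=0.01, k2=0.03):
    # Single-window SSIM computed from whole-image statistics:
    # a luminance term from the means and a contrast/structure term
    # from the variances and covariance, each stabilised by a constant.
    c1, c2 = (k1 * peak) ** 2, (k2 * peak) ** 2
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx * mx + my * my + c1) * (vx + vy + c2))
```

The masking-exponent discussion in the paper concerns how terms like the variance sum in the denominator behave as a visual masking model.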

P2-30
Title: Image Quality Assessment Based on Local Orientation Distributions
Author: Yue Wang (Graduate University of Chinese Academy of Sciences, China), Tingting Jiang, Siwei Ma, Wen Gao (Institute of Digital Media, Peking University, China)
Page: pp. 274 - 277
Keyword: image quality assessment (IQA), human visual system (HVS), Histograms of Oriented Gradients (HOG)
Abstract: Image quality assessment (IQA) is very important for many image and video processing applications, e.g. compression, archiving, restoration and enhancement. An ideal image quality metric should achieve consistency between image distortion prediction and the psychological perception of the human visual system (HVS). Inspired by the fact that the HVS is quite sensitive to local orientation features in images, we propose a new structural-information-based image quality metric, which evaluates image distortion by computing the distance between Histograms of Oriented Gradients (HOG) descriptors. Experimental results on the LIVE database show that the proposed IQA metric is competitive with state-of-the-art IQA metrics, while keeping relatively low computational complexity.
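A toy version of the gradient-orientation comparison (plain magnitude-weighted orientation histograms rather than full block-normalized HOG descriptors, so this simplifies the actual metric):

```python
import numpy as np

def orientation_histogram(img, bins=9):
    # Unsigned gradient orientations (0..180 degrees), accumulated into
    # a magnitude-weighted, normalised histogram.
    gy, gx = np.gradient(img.astype(np.float64))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.degrees(np.arctan2(gy, gx)), 180.0)
    hist, _ = np.histogram(ang, bins=bins, range=(0.0, 180.0), weights=mag)
    s = hist.sum()
    return hist / s if s > 0 else hist

def hog_distance(ref, dist):
    # Euclidean distance between the two orientation histograms;
    # larger distance = stronger structural (orientation) distortion.
    return float(np.linalg.norm(orientation_histogram(ref)
                                - orientation_histogram(dist)))
```

Distortions that disturb edge orientations (blocking, ringing) move mass between histogram bins and so raise the distance, which is the intuition behind the proposed metric.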

P2-31
Title: Distance and Relative Speed Estimation of Binocular Camera Images Based on Defocus and Disparity Information
Author: Mitsuyasu Ito, Yoshiaki Takada, Takayuki Hamamoto (Tokyo University of Science, Japan)
Page: pp. 278 - 281
Keyword: ITS, focus blur, disparity information, distance estimation, relative speed
Abstract: In this paper, we discuss a method of distance and relative-speed estimation for ITS. The method uses different focus positions in two cameras to obtain the amount of focus blur. We then propose a distance estimation method based on this amount and on disparity information. Simulation results show that the proposed method performs reasonably well. In addition, we built a prototype system for real-time estimation; the implemented system was properly validated.

P2-32
Title: Comparing Two Eye-Tracking Databases: The Effect of Experimental Setup and Image Presentation Time on the Creation of Saliency Maps
Author: Ulrich Engelke (Blekinge Institute of Technology, Sweden), Hantao Liu (Delft University of Technology, Netherlands), Hans-Jürgen Zepernick (Blekinge Institute of Technology, Sweden), Ingrid Heynderickx (Philips Research Laboratories, Netherlands), Anthony Maeder (University of Western Sydney, Australia)
Page: pp. 282 - 285
Keyword: Eye tracking experiments, Visual saliency, Correlation analysis, Natural images
Abstract: Visual attention models are typically designed based on human gaze patterns recorded through eye tracking. In this paper, two similar eye tracking experiments from independent laboratories are presented, in which humans observed natural images under a task-free condition. The resulting saliency maps are analysed with respect to two criteria: the consistency between the experiments and the impact of the image presentation time. It is shown that the saliency maps from the two experiments are strongly correlated, independent of presentation time. It is further revealed that the presentation time can be reduced without substantially sacrificing the accuracy of the convergent saliency map. The results provide valuable insight into the similarity of saliency maps from independent laboratories and are highly beneficial for the creation of converging saliency maps at reduced experimental time and cost.
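The consistency analysis between the two laboratories' saliency maps rests on a correlation measure; a generic Pearson version (the exact statistic used in the paper is an assumption here):

```python
import numpy as np

def saliency_correlation(map_a, map_b):
    # Pearson linear correlation between two saliency maps, computed
    # over the flattened per-pixel saliency values; 1 = identical
    # spatial attention pattern, 0 = unrelated.
    a = np.asarray(map_a, dtype=float).ravel()
    b = np.asarray(map_b, dtype=float).ravel()
    return float(np.corrcoef(a, b)[0, 1])
```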

P2-33
Title: Successive Refinement of Overlapped Cell Side Quantizers for Scalable Multiple Description Coding
Author: Muhammad Majid, Charith Abhayaratne (The University of Sheffield, U.K.)
Page: pp. 286 - 289
Keyword: Multiple description coding, scalability, robustness
Abstract: Scalable multiple description coding (SMDC) provides reliability and the facility to truncate the descriptions according to the user's rate-distortion requirements. In this paper, we generalize the conditions for successive refinement of the side quantizers of a multiple description scalar quantizer that has overlapped quantizer cells generated by a modified linear index assignment matrix. We propose that the split or refinement factor for each of the refinement side quantizers should be greater than the maximum side-quantizer bin spread, and that the factors should not be integer multiples of each other, in order to satisfy the SMDC distortion conditions; this is verified through simulation results on scalable multiple description image coding.

P2-34
Title: VQ Based Data Hiding Method for Still Images by Tree-Structured Links
Author: Hisashi Igarashi, Yuichi Tanaka, Madoka Hasegawa, Shigeo Kato (Utsunomiya University, Japan)
Page: pp. 290 - 293
Keyword: Data Hiding, Vector Quantization, Tree-Structured Links
Abstract: In this paper, we propose a method for embedding data into still images based on Vector Quantization (VQ). Several VQ-based data embedding methods, such as the Mean Gray-Level Embedding method (MGLE) and the Pairwise Nearest-Neighbor Embedding method (PNNE), have been proposed, but these methods are not sufficiently effective. An efficient adaptive data hiding method called the Adaptive Clustering Embedding method (ACE) has also been proposed, but it is somewhat complicated because the VQ indices have to be adaptively clustered during embedding. In our proposed method, output vectors are linked in a tree structure and information is embedded using some of the linked vectors. Simulation results show that our proposed method achieves higher SNR than the conventional methods for the same amount of embedded data.