Publications catalog - books

Advances in Multimedia Information Processing: 7th Pacific Rim Conference on Multimedia, Hangzhou, China, November 2-4, 2006, Proceedings

Yueting Zhuang ; Shi-Qiang Yang ; Yong Rui ; Qinming He (eds.)

Conference: 7th Pacific-Rim Conference on Multimedia (PCM). Hangzhou, China. November 2, 2006 - November 4, 2006

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Computer Applications; Multimedia Information Systems; Information Storage and Retrieval; Computer Communication Networks; Information Systems Applications (incl. Internet); Image Processing and Computer Vision

Detected institution: Not detected
Publication year: 2006
Browse: SpringerLink


Resource type:


Print ISBN


Electronic ISBN


Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Publication rights information

© Springer-Verlag Berlin Heidelberg 2006

Table of contents

A Motion Vector Predictor Architecture for AVS and MPEG-2 HDTV Decoder

Junhao Zheng; Di Wu; Lei Deng; Don Xie; Wen Gao

In the advanced Audio Video coding Standard (AVS), many efficient coding tools are adopted in motion compensation, such as new motion vector (MV) prediction, direct mode matching, and variable block sizes. However, these features greatly increase the computational complexity and the memory bandwidth requirement, and they make the traditional MV predictor more complicated. This paper proposes an efficient MV predictor architecture for both AVS and MPEG-2 decoders. The proposed architecture exploits parallelism to speed up operations and uses a dedicated design to optimize memory access. In addition, it can reuse the on-chip buffer to support MV error resilience for MPEG-2 decoding. The design has been described in Verilog HDL and synthesized with Design Compiler using a 0.18 μm CMOS cell library. The circuit costs about 62k logic gates at a working frequency of 148.5 MHz. The design supports real-time MV prediction for HDTV 1080i video decoding in both AVS and MPEG-2.

Pp. 424-431

Inter-camera Coding of Multi-view Video Using Layered Depth Image Representation

Seung-Uk Yoon; Eun-Kyung Lee; Sung-Yeol Kim; Yo-Sung Ho; Kugjin Yun; Sukhee Cho; Namho Hur

Multi-view video is a collection of multiple videos capturing the same scene from different viewpoints. If we acquire multi-view videos from multiple cameras, it is possible to generate scenes at arbitrary view positions. Users can therefore change their viewpoints freely and perceive depth through view interaction. As a result, multi-view video can be used in a variety of applications, including three-dimensional TV (3DTV), free viewpoint TV, and immersive broadcasting. However, since the data size of multi-view video increases linearly with the number of cameras, it is necessary to develop an effective framework to represent, process, and display multi-view video data. In this paper, we propose inter-camera coding methods for multi-view video using the layered depth image (LDI) representation. The proposed methods represent the various information contained in multi-view video hierarchically, based on LDI. In addition, we reduce a large amount of multi-view video data to a manageable size by exploiting spatial redundancies among the multiple videos, and we successfully reconstruct the original multiple viewpoints from the constructed LDI.

Pp. 432-441

Optimal Priority Packetization with Multi-layer UEP for Video Streaming over Wireless Network

Huanying Zou; Chuang Lin; Hao Yin; Zhen Chen; Feng Qiu; Xuening Liu

Most current packetization schemes consider either bit errors or packet erasures alone, although both are common in wireless networks. This paper addresses the two problems together and proposes an optimal packetization scheme for video streaming over wireless networks that is independent of the video coding method. To combat packet erasure, priority packetization combined with multi-layer unequal error protection (UEP) is applied to video frames. Multi-layer UEP combines low-complexity duplication of high-priority packets in the application layer with different retransmission limits in the media access control layer. Content-aware rate-distortion optimization is also introduced to counteract the distortion caused by bit errors. Simulations show that our scheme gains 2.17 dB or more compared with the conventional scheme.

Pp. 442-449

Fuzzy Particle Swarm Optimization Clustering and Its Application to Image Clustering

Wensheng Yi; Min Yao; Zhiwei Jiang

Image classification and clustering are challenging problems in computer vision. This paper proposes a particle swarm optimization clustering approach, FPSOC, for the image clustering problem. The approach treats each particle as a candidate cluster center. The particles fly through the solution space to search for suitable cluster centers. The method differs from previous work in that it employs fuzzy concepts in particle swarm optimization clustering and adopts an attribute selection mechanism to avoid the 'curse of dimensionality' problem. The experimental results show that the presented approach can properly handle the image clustering problem.

Pp. 459-467
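The abstract above does not say how the fuzzy concept enters the swarm's fitness evaluation. As a rough illustration only, the sketch below scores a set of candidate cluster centers with the standard fuzzy c-means membership and objective. The function names, the fuzzifier `m`, and the choice of the fuzzy c-means objective are all assumptions made here, not details taken from the paper.

```python
def squared_distance(point, center):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - c) ** 2 for p, c in zip(point, center))


def fuzzy_memberships(point, centers, m=2.0):
    """Fuzzy c-means style membership of `point` in each center (assumed form).

    `m` > 1 is the fuzzifier; the returned memberships sum to 1.
    """
    dists = [squared_distance(point, c) ** 0.5 for c in centers]
    # A point that coincides with a center belongs fully to that center.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    exponent = 2.0 / (m - 1.0)
    inverse = [d ** -exponent for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]


def fuzzy_fitness(points, centers, m=2.0):
    """Sum of u^m * d^2 over all points and centers; lower is better."""
    total = 0.0
    for point in points:
        memberships = fuzzy_memberships(point, centers, m)
        for u, center in zip(memberships, centers):
            total += (u ** m) * squared_distance(point, center)
    return total
```

In a swarm setting, a function like `fuzzy_fitness` could serve as the quantity each candidate solution tries to minimize while particle positions are updated; the update rule itself is not sketched here because the abstract gives no details on it.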

A New Fast Motion Estimation for H.264 Based on Motion Continuity Hypothesis

Juhua Pu; Zhang Xiong; Lionel M. Ni

The H.264 video standard, in spite of its high quality, is too time-consuming for widespread acceptance in video applications, mainly because of its computationally complex motion estimation (ME). To reduce this complexity, we propose the motion continuity hypothesis, which states that all motion vectors (MVs) of a block are usually located in a small area. This area is formalized as the modified valid region (MVR), an improved version of the valid region proposed by the present authors in a previous paper. This paper then develops a new fast ME algorithm for H.264, called MVR-based fast ME (MVRF), which searches a much smaller area in the reference frames (RFs) than full-search H.264 does, reducing the number of searched pixels by up to 43%. MVRF is designed so that, on average, up to 98% of the MVs it determines coincide with those found by full-search H.264, which keeps the recovery quality and bit rate almost the same as those of full-search H.264.

Pp. 468-476
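The idea of searching only a small region of the reference frame can be illustrated with a plain block-matching routine whose candidate displacements are limited to a given rectangle. This is a minimal sketch under stated assumptions: the abstract does not say how the modified valid region is derived, so `region` is simply a hypothetical input here, and the sum of absolute differences (SAD) is used as a common matching cost rather than as the paper's confirmed choice.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block and its displaced match."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def search_region(cur, ref, bx, by, size, region):
    """Block matching restricted to `region`.

    `region` is (min_dx, max_dx, min_dy, max_dy), inclusive, and is a
    hypothetical stand-in for a restricted search area.
    Returns (best_mv, best_cost, candidates_tested).
    """
    height, width = len(ref), len(ref[0])
    min_dx, max_dx, min_dy, max_dy = region
    best_mv, best_cost, tested = None, None, 0
    for dy in range(min_dy, max_dy + 1):
        for dx in range(min_dx, max_dx + 1):
            # Skip candidates whose block would fall outside the reference frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + size > width or by + dy + size > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            tested += 1
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost, tested
```

Calling `search_region` once with a wide rectangle and once with a narrow one that still contains the true displacement returns the same motion vector while testing fewer candidates, which is the kind of saving the abstract describes.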

Statistical Robustness in Multiplicative Watermark Detection

Xingliang Huang; Bo Zhang

Robustness is a fundamental requirement for all watermarking schemes across application scenarios. Watermark robustness usually means that receiver performance degrades smoothly with the attack power. Here we look from another angle, namely robustness in the statistical sense. We present a new detector structure that is robust to small uncertainties in host signal modeling for multiplicative watermarking in the discrete Fourier transform (DFT) domain. Relying on robust statistics theory, we apply an ε-contamination model to describe the magnitudes of the DFT spectrum, from which we derive a minimax detector that is most robust in a well-defined sense. Experiments on real images demonstrate that the new watermark detector performs more stably than classical ones.

Pp. 477-484
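For reference, the contamination model named in this abstract is usually written in robust statistics as a mixture of a nominal density and an unknown contaminating density. The form below is the standard textbook one (Huber's gross-error model); the symbols $f_0$, $h$, and $\mathcal{H}$ are notation chosen here, and the paper's exact formulation for DFT magnitudes may differ.

```latex
\[
  \mathcal{F}_\epsilon
  = \bigl\{\, f : f(x) = (1-\epsilon)\, f_0(x) + \epsilon\, h(x),\; h \in \mathcal{H} \,\bigr\},
  \qquad 0 \le \epsilon < 1 .
\]
```

Here $f_0$ is the nominal density assumed for the host signal, $h$ ranges over a class $\mathcal{H}$ of unknown contaminating densities, and $\epsilon$ bounds the fraction of contamination. A minimax detector is then one that optimizes the worst-case performance over all $f \in \mathcal{F}_\epsilon$.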

Adaptive Visual Regions Categorization with Sets of Points of Interest

Hichem Houissa; Nozha Boujemaa; Hichem Frigui

The Query By Visual Thesaurus (QBVT) paradigm has contributed strongly to visual information retrieval when no starting example is available. The Visual Thesaurus is a representative summary of all the visual patches in the database. Its reliable construction helps the user express a "mental image" by composing the visual patches according to the details the user has in mind. In this paper, we introduce a relational clustering algorithm (CARD) to build the Visual Thesaurus from regions finely described by signatures of variable dimension. The resulting visual categories depict the variability of regions based on local color points of interest. To this end, we first extend the notion of image matching to regions, using non-traditional metrics suited to the multi-dimensional variables. We also introduce an appropriate relational clustering for region categorization that uses the similarity matrix induced by these metrics. Moreover, we propose an efficient method, based on adaptive clustering, to speed up distance computation and reduce the number of feature representatives. Our approach was tested on generic images and gives perceptually relevant visual categories.

Pp. 485-493

A Publishing Framework for Digitally Augmented Paper Documents: Towards Cross-Media Information Integration

Xiaoqing Lu; Zhiwu Lu

Paper remains a key information medium, and this has motivated the development of new technologies for digitally augmented paper (DAP) that enable printed content to be linked with multimedia information. Among those technologies, the simplest approach is to print visible patterns on paper (e.g., barcodes in the margin) as cross-media links. Thanks to recent progress in the printing industry, more sophisticated methods have been developed in which patterns printed at high resolution on the background of a page are almost invisible and barely disturb reading. For all these pattern-embedding approaches to integrating printed and multimedia information, we aim to present a unified publishing framework that is independent of the particular patterns and readers (e.g., cameras to capture patterns) used to realize DAP. The presented framework manages semantic information about printed documents, multimedia resources, and the patterns that link them, and it provides users with a platform for publishing DAP documents.

Pp. 494-501

Web-Based Semantic Analysis of Chinese News Video

Huamin Feng; Zongqiang Pang; Kun Qiu; Guosen Song

We propose a method for the semantic analysis of Chinese news video with the help of the World Wide Web. First, we segment the news video into a series of story units. Second, we extract key phrases from the corresponding ASR transcript of each news story and optimize them by computing both the correlation among key phrases and the correlation between key phrases and the event. Finally, we retrieve the news Web page corresponding to the event from the World Wide Web via a search engine and obtain information about the news video by analyzing that Web page. To extract search key phrases effectively from ASR transcripts that contain errors, this paper also presents a novel method for optimizing key phrases for searching. Experimental results on the Chinese news video set (CCTV4_NEWS) from TRECVID2005 show that our approach is effective.

Pp. 502-509

Texture Synthesis Based on Minimum Energy Cut and Its Applications

Shuchang Xu; Xiuzi Ye; Yin Zhang; Sanyuan Zhang

In this paper, a simple but efficient texture synthesis algorithm is presented. A new image is synthesized with a patch-based approach. Motivated by an energy equation, the method can handle the overlap region perfectly. After the most reasonable cut path through the overlap regions is found, satisfactory result images of a user-specified size can be produced. As a general method, our algorithm is also applied to image composition and to texture transfer, that is, rendering a target image with a given source texture image. Experiments show that our algorithm is very efficient and easy to implement....

Pp. 518-526
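The abstract above does not define its energy or say how the cut path is found. As an illustration only, the sketch below uses the classic dynamic-programming boundary cut known from patch-based texture synthesis: the energy of each overlap pixel is the squared difference between the two patches, and the cut is the top-to-bottom path of least total energy. Both function names and both choices are assumptions made here, not the authors' confirmed method.

```python
def overlap_energy(left, right):
    """Per-pixel squared difference between two overlapping grayscale strips."""
    return [[(a - b) ** 2 for a, b in zip(row_l, row_r)]
            for row_l, row_r in zip(left, right)]


def min_energy_cut(energy):
    """Return one column index per row for the cheapest top-to-bottom path.

    Consecutive rows may move at most one column left or right.
    """
    rows, cols = len(energy), len(energy[0])
    # cost[r][c] = cheapest total energy of any path ending at (r, c).
    cost = [list(energy[0])]
    for r in range(1, rows):
        prev = cost[-1]
        row = []
        for c in range(cols):
            lo, hi = max(0, c - 1), min(cols - 1, c + 1)
            row.append(energy[r][c] + min(prev[lo:hi + 1]))
        cost.append(row)
    # Backtrack from the cheapest cell in the last row.
    c = min(range(cols), key=lambda j: cost[-1][j])
    path = [c]
    for r in range(rows - 1, 0, -1):
        lo, hi = max(0, c - 1), min(cols - 1, c + 1)
        c = min(range(lo, hi + 1), key=lambda j: cost[r - 1][j])
        path.append(c)
    path.reverse()
    return path
```

Pixels to the left of the returned path would be taken from one patch and pixels to the right from the other, so the seam follows the places where the two patches already agree most closely.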