Publications catalog - books


Advances in Multimedia Information Processing: 7th Pacific Rim Conference on Multimedia, Hangzhou, China, November 2-4, 2006, Proceedings

Yueting Zhuang ; Shi-Qiang Yang ; Yong Rui ; Qinming He (eds.)

Conference: 7th Pacific-Rim Conference on Multimedia (PCM). Hangzhou, China. November 2-4, 2006

Abstract/Description - provided by the publisher

Not available.

Keywords - provided by the publisher

Computer Applications; Multimedia Information Systems; Information Storage and Retrieval; Computer Communication Networks; Information Systems Applications (incl. Internet); Image Processing and Computer Vision

Availability
Detected institution: not detected
Year of publication: 2006
Browse: SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-48766-1

Electronic ISBN

978-3-540-48769-2

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Table of contents

Robust Recognition of Noisy and Partially Occluded Faces Using Iteratively Reweighted Fitting of Eigenfaces

Wangmeng Zuo; Kuanquan Wang; David Zhang

Robust recognition of noisy and partially occluded faces is essential for an automated face recognition system, but most appearance-based methods (e.g., Eigenfaces) are sensitive to these factors. In this paper, we address this problem with an iteratively reweighted fitting of the Eigenfaces method (IRF-Eigenfaces). Unlike Eigenfaces fitting, which extracts the feature vector with a simple linear projection, IRF-Eigenfaces first defines a generalized objective function and then uses the iteratively reweighted least-squares (IRLS) fitting algorithm to extract the feature vector by minimizing that function. Our simulated and experimental results on the AR database show that IRF-Eigenfaces is far superior to both Eigenfaces and the local probabilistic method in recognizing noisy and partially occluded faces.

Pp. 844-851
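
The contrast drawn in the IRF-Eigenfaces abstract, between a plain linear projection and an IRLS fit, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the generalized objective function, so the sketch assumes an L1 (absolute residual) loss, which leads to weights of 1/|residual|. The names `solve`, `irls_fit`, `U`, `proj`, and `robust` are hypothetical, and the 6-pixel "image" is toy data.

```python
def solve(A, b):
    """Solve the small linear system A x = b by Gaussian elimination."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def irls_fit(U, x, iters=20, eps=1e-6):
    """Fit coefficients y so that sum_a y[a] * U[a] approximates x.

    U: list of k basis vectors (each of length d), x: mean-subtracted image.
    Assumed loss: sum of absolute residuals, minimized by IRLS.
    """
    k, d = len(U), len(x)
    w = [1.0] * d  # first pass is ordinary least squares
    y = [0.0] * k
    for _ in range(iters):
        # Weighted normal equations: (U W U^T) y = U W x
        A = [[sum(w[i] * U[a][i] * U[b][i] for i in range(d))
              for b in range(k)] for a in range(k)]
        rhs = [sum(w[i] * U[a][i] * x[i] for i in range(d)) for a in range(k)]
        y = solve(A, rhs)
        # Down-weight pixels with large residuals (e.g. occluded ones)
        r = [x[i] - sum(y[a] * U[a][i] for a in range(k)) for i in range(d)]
        w = [1.0 / max(abs(ri), eps) for ri in r]
    return y


# Toy data: two orthonormal basis vectors, true coefficients (2, 3),
# and one "occluded" pixel corrupted by a large offset.
s = 6 ** -0.5
U = [[s] * 6, [s, -s, s, -s, s, -s]]
x = [2.0 * U[0][i] + 3.0 * U[1][i] for i in range(6)]
x[0] += 10.0

proj = [sum(U[a][i] * x[i] for i in range(6)) for a in range(2)]  # plain projection
robust = irls_fit(U, x)                                           # IRLS fit
```

In this toy case the plain projection is pulled well away from the true coefficients by the single corrupted pixel, while the IRLS fit stays close to them.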

Pitching Shot Detection Based on Multiple Feature Analysis and Fuzzy Classification

Wen-Nung Lie; Guo-Shiang Lin; Sheng-Lung Cheng

The pitching shot is known to be a root shot for subsequent baseball video content analysis, e.g., event or highlight detection and video structure parsing. In this paper, we integrate multiple feature analysis and fuzzy classification techniques to detect pitching shots in commercial baseball video. The adopted features include color (e.g., field color percentage and dominant color), temporal motion, and spatial activity distribution. Domain knowledge of the baseball game forms the basis for the fuzzy inference rules. Experimental results show that the detection rate reaches 95.76%.

Pp. 852-860
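
As a rough illustration of how fuzzy classification can combine features such as those named in the pitching-shot abstract, the sketch below evaluates a single fuzzy rule. Everything specific here is an assumption made for illustration: the abstract does not give the membership functions, thresholds, or rules, so `tri`, `pitching_score`, and all numeric breakpoints are hypothetical.

```python
def tri(x, a, b, c):
    """Triangular membership function: 0 outside (a, c), 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def pitching_score(field_pct, motion):
    """Degree to which a shot matches one hypothetical rule:
    IF field color percentage is MEDIUM AND motion is LOW THEN pitching shot.
    field_pct and motion are assumed to be normalized to [0, 1]."""
    field_medium = tri(field_pct, 0.1, 0.35, 0.6)  # assumed breakpoints
    motion_low = tri(motion, -0.2, 0.0, 0.4)       # assumed breakpoints
    return min(field_medium, motion_low)           # fuzzy AND as minimum


likely = pitching_score(0.35, 0.0)    # moderate field area, almost static
unlikely = pitching_score(0.9, 0.0)   # field fills the frame (e.g. outfield view)
```

A full system would evaluate several such rules and defuzzify the combined result; the minimum operator is only one common choice for the fuzzy AND.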

A Novel Spatial-Temporal Position Prediction Motion-Compensated Interpolation for Frame Rate Up-Conversion

Jianning Zhang; Lifeng Sun; Yuzhuo Zhong

In this paper, a novel spatial-temporal position prediction motion-compensated interpolation (MCI) method for frame rate up-conversion is proposed that uses the transmitted motion vectors (MVs). Building on our previously proposed GMPP algorithm, the new method first applies motion vector correction (MVC). A joint spatial-temporal position prediction algorithm is then applied to the transmitted MVs to predict more accurately the positions to which the interpolated blocks actually move, which brings the MVs used for interpolation closer to the true motion. Finally, a weighted-adaptive spatial-temporal MCI algorithm completes the interpolation. Applied to the H.264 decoder, the proposed method achieves a significant increase in PSNR and an obvious decrease in block artifacts, and it can be widely used in video streaming and distributed video coding applications.

Pp. 870-879
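
For readers unfamiliar with motion-compensated interpolation, the sketch below shows the basic bidirectional form on a single row of pixels. It is not the method of the paper above (no motion vector correction, position prediction, or adaptive weighting); it assumes one integer, even-valued motion vector for the whole row, and the names `mci_midframe`, `prev_row`, and `next_row` are hypothetical.

```python
def mci_midframe(prev, nxt, mv):
    """Interpolate the frame halfway between prev and nxt (1-D pixel rows).

    mv is the displacement in pixels from prev to nxt, assumed even so that
    the halfway position falls on the pixel grid.
    """
    n = len(prev)
    half = mv // 2
    out = []
    for i in range(n):
        a = min(max(i - half, 0), n - 1)  # where this pixel came from in prev
        b = min(max(i + half, 0), n - 1)  # where this pixel goes to in nxt
        out.append(0.5 * (prev[a] + nxt[b]))
    return out


# A bright 2-pixel object moves 2 pixels to the right between frames.
prev_row = [0, 0, 9, 9, 0, 0, 0, 0]
next_row = [0, 0, 0, 0, 9, 9, 0, 0]

mid = mci_midframe(prev_row, next_row, mv=2)                 # object shifted by 1
blended = [0.5 * (a + b) for a, b in zip(prev_row, next_row)]  # no motion compensation
```

Without motion compensation the object is smeared at half intensity over both positions; following the motion vector places it once, at the halfway position. This is why the accuracy of the vectors matters so much for the interpolated frame.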

Web Image Clustering with Reduced Keywords and Weighted Bipartite Spectral Graph Partitioning

Su Ming Koh; Liang-Tien Chia

There has been recent work in the area of search result organization for image retrieval. The main aim is to cluster the search results into semantically meaningful groups. A number of works have benefited from the bipartite spectral graph partitioning method [3][4]. However, these works use a set of keywords for each corresponding image. This gives the bipartite spectral graph a large number of vertices and thus high complexity. There is also a lack of understanding of the weights used in this method. In this paper, we propose a two-level reduced-keywords approach to lower the complexity of the bipartite spectral graph. We also propose weights for the bipartite spectral graph based on hierarchical term frequency-inverse document frequency. Experimental data show that this weighted bipartite spectral graph performs better than the bipartite spectral graph with unity weights. We further exploit the weights in merging the clusters.

Pp. 880-889
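
To make the idea of weighted image-keyword edges concrete, the sketch below computes ordinary tf-idf weights for the edges of a bipartite image-keyword graph. It uses the standard, flat tf-idf formula as a stand-in, since the abstract above does not define its hierarchical variant. The function name `tfidf_edge_weights` and the sample keywords are hypothetical.

```python
import math
from collections import Counter


def tfidf_edge_weights(image_keywords):
    """Map each (image, keyword) edge to a tf-idf weight.

    image_keywords: dict of image id -> list of keywords found near the image.
    tf is the keyword's share of the image's keyword list; idf is
    log(number of images / number of images containing the keyword).
    """
    n = len(image_keywords)
    df = Counter(kw for kws in image_keywords.values() for kw in set(kws))
    weights = {}
    for img, kws in image_keywords.items():
        for kw, count in Counter(kws).items():
            weights[(img, kw)] = (count / len(kws)) * math.log(n / df[kw])
    return weights


images = {
    "img1": ["tiger", "photo", "grass"],
    "img2": ["tiger", "photo", "zoo"],
    "img3": ["car", "photo", "road"],
}
weights = tfidf_edge_weights(images)
```

A keyword shared by every image ("photo" here) gets weight zero and so cannot influence the partition, while rarer keywords get heavier edges than common ones. A unity-weight graph would treat all of these edges alike.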

An Architecture to Connect Disjoint Multimedia Networks Based on Node’s Capacity

Jaime Lloret; Juan R. Diaz; Jose M. Jimenez; Fernando Boronat

The TCP/IP protocol suite allows multimedia networks of nodes to be built according to the nodes’ content sharing. Some of these networks use different types of protocols (examples among unstructured P2P file-sharing networks include Gnutella 2, FastTrack, OpenNap, and eDonkey). This paper proposes a new protocol that connects disjoint multimedia networks sharing the same resource or content in order to allow multimedia content distribution. We show how nodes connect with nodes from other multimedia networks based on the nodes’ capacity. The system is scalable and fault-tolerant. The designed protocol, its mathematical model, the messages developed, and their bandwidth cost are described. The architecture has been developed to be applied to multiple types of multimedia networks (P2P file-sharing, CDNs, and so on). We have developed a general-purpose application tool with all the designed features. Results show the number of octets, the number of messages, and the number of broadcasts sent through the network when the protocol is running.

Pp. 890-899

Quantitative Measure of Inlier Distributions and Contour Matching for Omnidirectional Camera Calibration

Yongho Hwang; Hyunki Hong

This paper presents a novel approach to both the calibration of an omnidirectional camera and contour matching in architectural scenes. The proposed algorithm divides the entire image into several sub-regions and then examines the number of inliers in each sub-region and the area of each region. In our method, standard deviations are used as a quantitative measure to select a proper inlier set. Since the line segments of man-made objects are projected to contours in omnidirectional images, the contour matching problem is important for more precise camera recovery. We propose a novel contour matching method that uses geometrical information of the omnidirectional camera.

Pp. 900-908
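
The inlier-distribution measure described in the abstract above can be sketched roughly as follows: divide the image into sub-regions, compute the inlier density (count divided by area) in each, and use the standard deviation of those densities as the score. This is a guess at the general shape of such a measure, not the paper's definition. In particular, it assumes a regular rectangular grid, whereas an omnidirectional image may call for differently shaped sub-regions. The name `distribution_spread` and the sample points are hypothetical.

```python
from statistics import pstdev


def distribution_spread(points, width, height, grid=2):
    """Standard deviation of inlier density over a grid x grid partition.

    points: (x, y) inlier positions in a width x height image.
    A lower value means the inliers are spread more evenly over the image.
    """
    cell_w, cell_h = width / grid, height / grid
    counts = [[0] * grid for _ in range(grid)]
    for x, y in points:
        col = min(int(x / cell_w), grid - 1)
        row = min(int(y / cell_h), grid - 1)
        counts[row][col] += 1
    area = cell_w * cell_h  # equal-area cells in this simplified grid
    densities = [c / area for row in counts for c in row]
    return pstdev(densities)


even_set = [(25, 25), (75, 25), (25, 75), (75, 75)]       # one inlier per cell
clustered_set = [(10, 10), (20, 20), (30, 30), (40, 40)]  # all in one cell

even_score = distribution_spread(even_set, 100, 100)
clustered_score = distribution_spread(clustered_set, 100, 100)
```

Under this measure, a candidate inlier set with the lower score would be preferred, since evenly spread correspondences generally constrain the camera estimate better than a tight cluster.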

High-Speed All-in-Focus Image Reconstruction by Merging Multiple Differently Focused Images

Kazuya Kodama; Hiroshi Mo; Akira Kubota

This paper deals with high-speed all-in-focus image reconstruction by merging multiple differently focused images. Previously, we proposed a method of generating an all-in-focus image from multi-focus imaging sequences based on spatial frequency analysis using a three-dimensional FFT. In this paper, we first combine the sequence into a two-dimensional image with a fine quantization step size. Then, simply by applying a certain convolution to that image using a two-dimensional FFT, we achieve robust, high-speed reconstruction of all-in-focus images. Simulations using synthetic images are shown, and the conditions for achieving good quality in the reconstructed images are discussed. We also show experimental results on real images, comparing high-speed all-in-focus reconstruction with the previous method.

Pp. 909-918

A Real-Time Video Deinterlacing Scheme for MPEG-2 to AVS Transcoding

Qian Huang; Wen Gao; Debin Zhao; Cliff Reader

Real-time motion-compensated (MC) deinterlacing is defined as deinterlacing at the decoder in real time and at low cost using the transmitted motion vectors. Although its feasibility was shown ten years ago, few such studies have been reported so far. The major difficulty is that motion vectors derived from video decoders, which generally describe average motion over several field periods rather than motion between adjacent fields, are far from perfect. In this paper, a real-time MC deinterlacing scheme is proposed for transcoding from MPEG-2 to AVS, the Audio Video coding Standard of China, which targets higher coding efficiency and lower complexity than existing standards for high-definition video coding. Experimental results show that the presented scheme is less sensitive to incorrect motion vectors than conventional algorithms.

Pp. 919-926

Persian Text Watermarking

Ali Asghar Khodami; Khashayar Yaghmaie

Digital watermarking applies to a variety of media, including image, video, audio, and text. Because of the nature of digital text, its watermarking methods are specialized. Moreover, these methods depend fundamentally on the script used in the text. This paper reviews the application of digital watermarking to Farsi (Persian) and similar scripts (such as Arabic, Urdu, and Pashto), which differ substantially from English and other Western scripts, especially in their use of connected letters. Focusing on the special characteristics of these scripts, the application of common text watermarking methods is studied. By comparing the results, the methods that yield the highest payload are presented.

Pp. 927-934

Three Dimensional Reconstruction of Structured Scenes Based on Vanishing Points

Guanghui Wang; Shewei Wang; Xiang Gao; Yubing Li

The paper focuses on the problem of 3D reconstruction of structured scenes from uncalibrated images based on vanishing points. Assuming a three-parameter camera model, we prove that, with a certain preselected world coordinate system, the camera projection matrix can be uniquely determined from three mutually orthogonal vanishing points obtained from the images. We also prove that globally consistent projection matrices can be recovered if an additional set of correspondences across multiple images is available. Compared with previous stereovision techniques, the proposed method avoids the bottleneck of image matching and is easy to implement, so more accurate and robust results are expected. Extensive experiments on synthetic and real images validate the effectiveness of the proposed method.

Pp. 935-942
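
One well-known way in which three mutually orthogonal vanishing points constrain a camera is sketched below. It assumes that the three parameters of the camera model are the focal length and the two principal point coordinates (zero skew, unit aspect ratio), which the abstract above does not state explicitly. Under that assumption the principal point is the orthocenter of the triangle formed by the vanishing points, and the focal length follows from any pair of them. The function name and the sample coordinates are hypothetical, and the sketch recovers only the intrinsic parameters, not the full projection matrix discussed in the paper.

```python
import math


def calibrate_from_vanishing_points(v1, v2, v3):
    """Recover (principal point, focal length) from three orthogonal
    vanishing points given as (x, y) pixel coordinates.

    The principal point p is the orthocenter of the triangle v1 v2 v3:
        (p - v1) . (v2 - v3) = 0  and  (p - v2) . (v1 - v3) = 0
    and the focal length satisfies f^2 = -(v1 - p) . (v2 - p).
    """
    a11, a12 = v2[0] - v3[0], v2[1] - v3[1]
    a21, a22 = v1[0] - v3[0], v1[1] - v3[1]
    b1 = v1[0] * a11 + v1[1] * a12
    b2 = v2[0] * a21 + v2[1] * a22
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("vanishing points are collinear")
    px = (b1 * a22 - a12 * b2) / det
    py = (a11 * b2 - a21 * b1) / det
    f2 = -((v1[0] - px) * (v2[0] - px) + (v1[1] - py) * (v2[1] - py))
    if f2 <= 0:
        raise ValueError("points are not consistent with orthogonal directions")
    return (px, py), math.sqrt(f2)


# Synthetic vanishing points generated from a camera with focal length 800
# and principal point (320, 240), viewing three orthogonal directions.
principal_point, focal = calibrate_from_vanishing_points(
    (-1280.0, -1360.0), (-80.0, 1040.0), (1120.0, -160.0)
)
```

The construction breaks down when one of the directions is parallel to the image plane, because its vanishing point then lies at infinity; a practical implementation would need to handle that case separately.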