Publications catalog - books



Advances in Multimedia Information Processing: 8th Pacific Rim Conference on Multimedia, Hong Kong, China, December 11-14, 2007. Proceedings

Horace H.-S. Ip ; Oscar C. Au ; Howard Leung ; Ming-Ting Sun ; Wei-Ying Ma ; Shi-Min Hu (eds.)

Conference: 8th Pacific Rim Conference on Multimedia (PCM). Hong Kong, China. December 11-14, 2007

Abstract/Description - provided by the publisher

Not available.

Keywords - provided by the publisher

Computer Applications; Multimedia Information Systems; Information Storage and Retrieval; Computer Communication Networks; Information Systems Applications (incl. Internet); Image Processing and Computer Vision

Availability
Detected institution | Publication year | Browse | Download | Request
Not detected | 2007 | SpringerLink | |

Information

Resource type:

books

Print ISBN

978-3-540-77254-5

Electronic ISBN

978-3-540-77255-2

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Publication rights information

© Springer-Verlag Berlin Heidelberg 2007

Table of contents

Random Subspace Two-Dimensional PCA for Face Recognition

Nam Nguyen; Wanquan Liu; Svetha Venkatesh

Two-dimensional Principal Component Analysis (2DPCA) is a robust method for face recognition. Much recent research shows that 2DPCA is more reliable than the well-known PCA method in recognising human faces. However, in many cases this method tends to overfit the sample data. In this paper, we propose a novel method named random subspace two-dimensional PCA (RS-2DPCA), which combines the 2DPCA method with the random subspace (RS) technique. RS-2DPCA inherits the advantages of both 2DPCA and the RS technique, so it can avoid the overfitting problem and achieve high recognition accuracy. Experimental results on three benchmark face data sets (the ORL database, the Yale face database and the extended Yale face database B) confirm our hypothesis that RS-2DPCA is superior to 2DPCA itself.
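
A minimal sketch of the random-subspace 2DPCA idea, assuming a nearest-neighbour classifier over 2DPCA feature matrices and majority voting across ensemble members; the function names and the voting scheme are illustrative, not the authors' implementation:

```python
# Hedged sketch of random-subspace 2DPCA (RS-2DPCA) for face recognition.
import numpy as np

def image_scatter(images):
    """Image covariance matrix used by 2DPCA: mean of (A - mean)^T (A - mean)."""
    mean = images.mean(axis=0)
    centered = images - mean
    return sum(a.T @ a for a in centered) / len(images), mean

def fit_2dpca(images, k):
    """Return the top-k projection axes (columns) of the image scatter matrix."""
    G, mean = image_scatter(images)
    eigvals, eigvecs = np.linalg.eigh(G)
    return eigvecs[:, np.argsort(eigvals)[::-1][:k]], mean

def rs_2dpca_predict(train_imgs, train_labels, test_img, k=20, n_members=10, d=10, seed=0):
    """Ensemble of 2DPCA nearest-neighbour classifiers, each built on a random
    subset of d projection axes; the prediction is a majority vote."""
    rng = np.random.default_rng(seed)
    axes, mean = fit_2dpca(train_imgs, k)
    votes = []
    for _ in range(n_members):
        idx = rng.choice(k, size=d, replace=False)      # random subspace of axes
        W = axes[:, idx]
        feats = [(a - mean) @ W for a in train_imgs]     # 2DPCA feature matrices
        q = (test_img - mean) @ W
        dists = [np.linalg.norm(f - q) for f in feats]   # nearest neighbour
        votes.append(train_labels[int(np.argmin(dists))])
    return max(set(votes), key=votes.count)
```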

- Session-11: Face and 3D Model Analysis | Pp. 655-664

Robust Speaking Face Identification for Video Analysis

Yi Wu; Wei Hu; Tao Wang; Yimin Zhang; Jian Cheng; Hanqing Lu

We investigate the problem of automatically identifying speaking faces for video analysis using only visual information. Intuitively, the mouth should first be accurately located in each face, but this is extremely challenging due to the complicated conditions in video, such as irregular lighting, changing face poses and low resolution. Even with an accurate mouth location, it is still very hard to align corresponding mouths. However, we demonstrate that high precision can be achieved by aligning mouths through face matching, which requires no accurate mouth location. The principal novelties that we introduce are: (i) a framework for speaking face identification in video analysis; (ii) detection of changes in the aligned mouth through face matching; (iii) a novel descriptor that describes the change of the mouth. Experimental results on videos demonstrate that the proposed approach is efficient and robust for speaking face identification.

- Session-11: Face and 3D Model Analysis | Pp. 665-674

Incremental AAM Using Synthesized Illumination Images

Hyung-Soo Lee; Jaewon Sung; Daijin Kim

The Active Appearance Model (AAM) is a well-known model that can represent a non-rigid object effectively. However, since it uses a fixed appearance model, the fitting results are often unsatisfactory when the imaging conditions of the target image differ from those of the training images. To alleviate this problem, the incremental AAM was proposed, which updates its appearance bases in an on-line manner. However, it cannot deal with sudden changes of illumination. To overcome this, we propose a novel scheme for updating the appearance bases. When a new person appears in the input image, we synthesize illuminated images of that person and update the appearance bases of the AAM with them. Since we update the appearance bases with synthesized illuminated images in advance, the AAM can fit a target image well even when the illumination changes drastically. The experimental results show that the proposed algorithm improves the fitting performance over both the incremental AAM and the original AAM.

- Session-11: Face and 3D Model Analysis | Pp. 675-684

Content-Based 3D Model Retrieval Based on the Spatial Geometric Descriptor

Dingwen Wang; Jiqi Zhang; Hau-San Wong; Yuanxiang Li

In this paper, we propose a novel shape descriptor for 3D objects, called the spatial geometric descriptor (SGD), which represents the spatial geometric information of a 3D model by mapping its furthest-distance, normal and area distributions onto spherical grids in a sequence of concentric shells. These spherical distribution functions are then transformed into spherical harmonic coefficients, which not only save storage space but also provide a multi-resolution shape description for any 3D model by adopting different dimensions for the coefficients. The feature vector extraction time is reduced by adopting a single-scan scheme over the mesh surface of a given 3D model. The retrieval performance is evaluated on the public Princeton Shape Benchmark (PSB) dataset, and the experimental results show that our method not only outperforms the Light Field Descriptor, which is regarded as the best shape descriptor so far, but also retains the advantage of a fast feature vector extraction procedure.
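
A rough sketch of the shell-plus-spherical-harmonics idea, assuming the model is given as surface point samples (already centred and scaled to unit radius) and keeping only rotation-invariant per-degree energies; the shell count, maximum degree and the use of scipy's sph_harm are assumptions rather than the paper's exact construction:

```python
# Hedged sketch of a shell-based spherical-harmonic shape signature.
import numpy as np
from scipy.special import sph_harm

def shell_sh_signature(points, n_shells=4, l_max=8):
    """points: (N, 3) surface samples, translated to the centroid and scaled so
    the furthest point has radius 1. Returns one energy per (shell, degree)."""
    r = np.linalg.norm(points, axis=1) + 1e-12
    theta = np.arctan2(points[:, 1], points[:, 0]) % (2 * np.pi)   # azimuth
    phi = np.arccos(np.clip(points[:, 2] / r, -1.0, 1.0))          # polar angle
    shell = np.minimum((r * n_shells).astype(int), n_shells - 1)
    sig = np.zeros((n_shells, l_max + 1))
    for s in range(n_shells):
        mask = shell == s
        if not mask.any():
            continue
        for l in range(l_max + 1):
            energy = 0.0
            for m in range(-l, l + 1):
                # SH coefficient of the point distribution on this shell
                c = sph_harm(m, l, theta[mask], phi[mask]).conj().sum()
                energy += abs(c) ** 2
            sig[s, l] = np.sqrt(energy) / mask.sum()
    return sig.ravel()   # two models can then be compared with an L2 distance
```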

- Session-11: Face and 3D Model Analysis | Pp. 685-694

A Practical Server-Side Transmission Control Method for Multi-channel DTV Streaming System

Yuanhai Zhang; Wei Huangfu

In this paper, we present a practical design and implementation of a multi-channel High Definition (HD) and Standard Definition (SD) MPEG-2 video streaming system that uses server-side video rate adaptation and rate shaping over a digital community network. For video rate adaptation, we employ the Program Clock Reference (PCR) embedded in the MPEG-2 streams to enhance packet timing control precision and regulate the transmission rate in a refined way. For rate shaping, we introduce Traffic Control (TC) to separate the streams of different channels at the server's network card and avoid bandwidth contention between them. Experimental results show that the proposed system can mitigate the quality degradation of video streaming caused by fluctuations of the time-varying channel while simultaneously supporting 33-channel HDTV streams.
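
A hedged sketch of the PCR-based timing idea: the PCR field of an MPEG-2 transport-stream packet is parsed (standard TS layout) and used to pace transmission on the sender side; the pacing loop and function names are illustrative assumptions, not the paper's implementation:

```python
# Hedged sketch of PCR-driven pacing for MPEG-2 TS packets.
import time

TS_PACKET_SIZE = 188
PCR_CLOCK_HZ = 27_000_000   # the PCR runs on a 27 MHz clock

def extract_pcr(packet: bytes):
    """Return the PCR in 27 MHz ticks if this TS packet carries one, else None."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != 0x47:
        return None
    has_adaptation = packet[3] & 0x20
    if not has_adaptation or packet[4] < 7 or not (packet[5] & 0x10):
        return None
    base = ((packet[6] << 25) | (packet[7] << 17) | (packet[8] << 9)
            | (packet[9] << 1) | (packet[10] >> 7))            # 33-bit PCR base
    ext = ((packet[10] & 0x01) << 8) | packet[11]              # 9-bit extension
    return base * 300 + ext

def pace_stream(packets, send):
    """Send packets so that wall-clock spacing follows the embedded PCR timeline."""
    first_pcr = first_wall = None
    for pkt in packets:
        pcr = extract_pcr(pkt)
        if pcr is not None:
            if first_pcr is None:
                first_pcr, first_wall = pcr, time.monotonic()
            else:
                target = first_wall + (pcr - first_pcr) / PCR_CLOCK_HZ
                delay = target - time.monotonic()
                if delay > 0:
                    time.sleep(delay)
        send(pkt)
```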

- Session-12: Multimedia Applications | Pp. 695-703

Using Irradiance Environment Map on GPU for Real-Time Composition

Jonghyub Kim; Yongho Hwang; Hyunki Hong

For the seamless integration of synthetic objects within video images, generating consistent illumination is critical. This paper presents an interactive rendering system using a Graphics Process Unit-based (GPU) irradiance environment map. A camcorder with a fisheye lens captures environmental information and constructs the environment map in real-time. The pre-filtering method, which approximates the irradiance of the scene using 9 parameters, renders diffuse objects within real images. This proposed interactive common illumination system based on the GPU can generate photo-realistic images at 18 ~ 20 frames per second.
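
A small sketch of the nine-parameter irradiance approximation the abstract refers to, assuming a single-channel equirectangular environment map processed on the CPU (a GPU version would evaluate the same nine coefficients per colour channel in a shader); the layout and function names are assumptions:

```python
# Hedged sketch of a nine-coefficient irradiance environment map.
import numpy as np

def sh_basis(x, y, z):
    """Real spherical-harmonic basis for degrees 0-2 at a unit direction."""
    return np.stack([
        0.282095 * np.ones_like(x),
        0.488603 * y, 0.488603 * z, 0.488603 * x,
        1.092548 * x * y, 1.092548 * y * z,
        0.315392 * (3 * z * z - 1),
        1.092548 * x * z, 0.546274 * (x * x - y * y)], axis=-1)

def project_env_map(env):
    """env: (H, W) single-channel equirectangular radiance map.
    Returns the 9 spherical-harmonic lighting coefficients."""
    h, w = env.shape
    theta = (np.arange(h) + 0.5) / h * np.pi           # polar angle per row
    phi = (np.arange(w) + 0.5) / w * 2 * np.pi         # azimuth per column
    phi, theta = np.meshgrid(phi, theta)
    x = np.sin(theta) * np.cos(phi)
    y = np.sin(theta) * np.sin(phi)
    z = np.cos(theta)
    d_omega = (2 * np.pi / w) * (np.pi / h) * np.sin(theta)  # solid angle per texel
    return np.tensordot(env * d_omega, sh_basis(x, y, z), axes=([0, 1], [0, 1]))

def irradiance(L, normal):
    """Diffuse irradiance for a unit surface normal from the 9 coefficients."""
    A = np.array([np.pi,
                  2 * np.pi / 3, 2 * np.pi / 3, 2 * np.pi / 3,
                  np.pi / 4, np.pi / 4, np.pi / 4, np.pi / 4, np.pi / 4])
    x, y, z = normal
    return float(np.dot(A * L, sh_basis(np.array(x), np.array(y), np.array(z))))
```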

- Session-12: Multimedia Applications | Pp. 704-713

Ranking Using Multi-features in Blog Search

Kangmiao Liu; Guang Qiu; Jiajun Bu; Chun Chen

Blogs have received a lot of attention since the rise of Web 2.0 and have attracted millions of users who publish information on them. As time goes by, information seeking in this new medium has become an emerging issue. In this paper, we take multiple features unique to blogs into account and propose a novel algorithm to rank blog posts in blog search. The coherence between the query type and blogger interest, document relevance and freshness are combined linearly to produce the final ranking score of a post. Specifically, we introduce a user modeling method to capture the interests of bloggers. In our experiments, we invited volunteers to complete several tasks, and their time cost on the tasks was taken as the primary criterion to evaluate performance. The experimental results show that our algorithm outperforms traditional ones.
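
A toy sketch of a linearly combined post score (coherence with blogger interest, document relevance, freshness); the weights and the exponential freshness decay are illustrative assumptions, since the abstract does not give them:

```python
# Hedged sketch of a linear multi-feature ranking score for blog posts.
import math

def post_score(coherence, relevance, post_age_days,
               w_coh=0.3, w_rel=0.5, w_fresh=0.2, half_life_days=30.0):
    """All component scores are assumed to lie in [0, 1]."""
    freshness = math.exp(-math.log(2) * post_age_days / half_life_days)
    return w_coh * coherence + w_rel * relevance + w_fresh * freshness

# Example: rank candidate posts for a query by descending score
posts = [{"id": 1, "coherence": 0.8, "relevance": 0.6, "age": 2},
         {"id": 2, "coherence": 0.4, "relevance": 0.9, "age": 45}]
ranked = sorted(posts,
                key=lambda p: post_score(p["coherence"], p["relevance"], p["age"]),
                reverse=True)
```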

- Session-12: Multimedia Applications | Pp. 714-723

Design and Analysis of a Watermarking System for Care Labels

Benjamin Ragan-Kelley; Nicholas Tran

A watermarking system for embedding textile care labels directly onto fabric designs is proposed, and its stochastic properties are analyzed. Under the assumption that pixel values are independently and identically distributed with finite mean and variance, we derive i) the expected mean squared error between the original and watermarked images (transparency); and ii) an upper bound on the average absolute change to DCT coefficients of the watermarked image after one application of simulated fading (robustness). Experimental results demonstrate that the proposed scheme preserves image fidelity well and is very robust under simulated fading.
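
A minimal sketch of the transparency measure discussed above: the empirical mean squared error (and the derived PSNR) between an original and a watermarked image; the embedding and simulated-fading steps themselves are not shown:

```python
# Hedged sketch of measuring watermark transparency as MSE / PSNR.
import numpy as np

def mse(original, watermarked):
    """Mean squared error between two images of equal shape."""
    diff = original.astype(np.float64) - watermarked.astype(np.float64)
    return float(np.mean(diff ** 2))

def psnr(original, watermarked, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means a more transparent mark."""
    m = mse(original, watermarked)
    return float("inf") if m == 0 else 10.0 * np.log10(peak ** 2 / m)
```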

- Session-12: Multimedia Applications | Pp. 724-733

Stroke Correspondence Based on Graph Matching for Detecting Stroke Production Errors in Chinese Character Handwriting

Zhihui Hu; Howard Leung; Yun Xu

People may make mistakes when writing a Chinese character. In this paper, we apply error-tolerant graph matching to find the stroke production errors in people's handwriting of Chinese characters. A set of edit operations for transforming one graph into another is defined for this purpose. The matching procedure is formulated as a search problem of finding the minimum edit distance, and the A* algorithm is used to perform the search. Experiments show that the proposed method outperforms existing algorithms in identifying stroke production errors. The proposed method can help in Chinese handwriting education by providing feedback to correct users who make stroke production errors when writing a Chinese character.
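
A simplified sketch of error-tolerant graph matching as a best-first search over node edit operations (substitute, delete, insert); using a zero heuristic (plain uniform-cost search), unit costs and node labels only is an assumption for brevity, and the paper's stroke-specific edge costs are omitted:

```python
# Hedged sketch of minimum graph edit distance by best-first search.
import heapq
import itertools

def graph_edit_distance(nodes1, nodes2,
                        sub_cost=lambda a, b: 0 if a == b else 1,
                        del_cost=1, ins_cost=1):
    """Minimum total cost of editing the node set nodes1 into nodes2."""
    tie = itertools.count()                      # tie-breaker so the heap never compares sets
    heap = [(0, next(tie), 0, frozenset())]      # (cost, tie, next node of g1, used nodes of g2)
    best = {}
    while heap:
        cost, _, i, used = heapq.heappop(heap)
        if i == len(nodes1):
            if len(used) == len(nodes2):
                return cost                      # every node is accounted for
            # only insertions of the remaining g2 nodes are left
            remaining = len(nodes2) - len(used)
            heapq.heappush(heap, (cost + ins_cost * remaining, next(tie),
                                  i, frozenset(range(len(nodes2)))))
            continue
        if best.get((i, used), float("inf")) <= cost:
            continue
        best[(i, used)] = cost
        # delete node i of graph 1
        heapq.heappush(heap, (cost + del_cost, next(tie), i + 1, used))
        # substitute node i of graph 1 with any unused node j of graph 2
        for j, label in enumerate(nodes2):
            if j not in used:
                heapq.heappush(heap, (cost + sub_cost(nodes1[i], label),
                                      next(tie), i + 1, used | {j}))
    return float("inf")

# Toy example with stroke types as node labels (prints 1: one stroke deleted)
print(graph_edit_distance(["horizontal", "vertical", "dot"], ["horizontal", "vertical"]))
```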

- Session-12: Multimedia Applications | Pp. 734-743

Multi-modal Multi-label Semantic Indexing of Images Based on Hybrid Ensemble Learning

Wei Li; Maosong Sun; Christopher Habel

Automatic image annotation (AIA) refers to the association of words with whole images and is considered a promising and effective approach to bridging the semantic gap between low-level visual features and high-level semantic concepts. In this paper, we formulate the task of image annotation as a multi-label, multi-class semantic image classification problem and propose a simple yet effective method: a hybrid ensemble learning framework in which a multi-label classifier based on uni-modal features and an ensemble classifier based on bi-modal features are integrated into a joint classification model to perform multi-modal multi-label semantic image annotation. We conducted experiments on two commonly used keyframe and image collections, MediaMill and the Scene dataset, comprising about 40,000 examples. The empirical studies demonstrate that the proposed hybrid ensemble learning method can enhance a given weak multi-label classifier to some extent, showing its effectiveness when only a limited amount of multi-labeled training data is available.
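
A hedged sketch of combining a uni-modal multi-label classifier with a classifier trained on concatenated bi-modal features by averaging per-label scores; the scikit-learn one-vs-rest logistic regression learners and the equal weighting are illustrative stand-ins, not the paper's exact classifiers:

```python
# Hedged sketch of a hybrid uni-modal / bi-modal multi-label ensemble.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

def fit_hybrid(X_visual, X_text, Y):
    """Y is an (n_samples, n_labels) binary indicator matrix."""
    uni = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X_visual, Y)
    bi = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(
        np.hstack([X_visual, X_text]), Y)
    return uni, bi

def predict_hybrid(uni, bi, X_visual, X_text, threshold=0.5, w=0.5):
    """Average per-label probabilities from both models, then threshold."""
    scores = (w * uni.predict_proba(X_visual)
              + (1 - w) * bi.predict_proba(np.hstack([X_visual, X_text])))
    return (scores >= threshold).astype(int)    # multi-label prediction matrix
```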

- Session-13: Image Indexing, Identification and Processing | Pp. 744-754