Publications catalog - books

Advances in Multimedia Information Processing: 7th Pacific Rim Conference on Multimedia, Hangzhou, China, November 2-4, 2006, Proceedings

Yueting Zhuang ; Shi-Qiang Yang ; Yong Rui ; Qinming He (eds.)

Conference: 7th Pacific-Rim Conference on Multimedia (PCM). Hangzhou, China. November 2, 2006 - November 4, 2006

Abstract/Description - provided by the publisher

Not available.

Keywords - provided by the publisher

Computer Applications; Multimedia Information Systems; Information Storage and Retrieval; Computer Communication Networks; Information Systems Applications (incl. Internet); Image Processing and Computer Vision

Availability

Detected institution	Publication year	Browse	Download	Request
Not detected	2006	SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-48766-1

Electronic ISBN

978-3-540-48769-2

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

2006

Publication rights information

Subject coverage

Computer and information sciences

Table of contents

Check whether your institution gives you access to download or request the full book or any of its chapters.

doi: 10.1007/11922162_72

Image Annotations Based on Semi-supervised Clustering with Semantic Soft Constraints

Rui Xiaoguang; Yuan Pingbo; Yu Nenghai

The growing volume of image data makes an efficient image annotation and retrieval system highly desirable. Clustering algorithms make it possible to represent images with a finite set of symbols. Building on this, many statistical models that analyze the correspondence between visual features and words have been published for image annotation. However, most of these models cluster using visual features only, ignoring image semantics. In this paper, we propose a novel model based on semi-supervised clustering with semantic soft constraints, which can exploit both visual features and semantic meaning. Our method first measures the semantic distance, using generic knowledge (e.g. WordNet), between regions of manually annotated training images. A semi-supervised clustering algorithm then clusters the regions under semantic soft constraints derived from this semantic distance. Experimental results show that our model improves the performance of image annotation and retrieval systems.

Pp. 624-632

doi: 10.1007/11922162_73

Photo Retrieval from Personal Memories Using Generic Concepts

Rui M. Jesus; Arnaldo J. Abrantes; Nuno Correia

This paper presents techniques for retrieving photos from personal memory collections using generic concepts specified by the user. It is part of a larger project for capturing, storing, and retrieving personal memories in different contexts of use. Semantic concepts are obtained by training binary classifiers using the Regularized Least Squares Classifier (RLSC) and can be combined to express more complex concepts. The results obtained so far are quite good, and better results are possible by adding more low-level features. The paper describes the proposed approach, the classifier and features, and the results obtained.

Pp. 633-640

doi: 10.1007/11922162_74

PanoWalk: A Remote Image-Based Rendering System for Mobile Devices

Zhongding Jiang; Yandong Mao; Qi Jia; Nan Jiang; Junyi Tao; Xiaochun Fang; Hujun Bao

Real-time rendering of complex 3D scenes on mobile devices is a challenging task, mainly because mobile devices have limited computational capabilities and lack powerful 3D graphics hardware. In this paper, we propose a remote image-based rendering system that lets mobile devices interactively visualize real-world and synthetic scenes over a wireless network. Our system uses panoramic video as the building block for representing scene data. The scene data is compressed with an MPEG-like encoding scheme tailored to mobile devices, and the compressed data is stored on a remote server. Our system carefully partitions the rendering task between client and server. The server determines the data required for rendering novel views and streams it to the client in a server-push manner. After receiving the data, the mobile client renders locally using image warping and displays the resulting images on its small screen. Experimental results show that our system achieves real-time rendering speed on mainstream mobile devices. It allows multiple mobile clients to explore the same or different scenes simultaneously.

Pp. 641-649

doi: 10.1007/11922162_75

A High Quality Robust Watermarking Scheme

Yu-Ting Pai; Shanq-Jang Ruan; Jürgen Götze

In recent years, digital watermarking has become a popular technique for hiding information in digital images to help protect against copyright infringement. In this paper we develop a high-quality, robust watermarking algorithm that combines the advantages of block-based permutation with those of neighboring coefficient embedding. The proposed approach uses the relationship between the coefficients of neighboring blocks to hide more information in high-frequency blocks without causing serious distortion to the watermarked image. In addition, an extraction method for improving robustness to mid-frequency filter attacks is proposed. Our experimental results show that the proposed approach is very effective in achieving perceptual invisibility with an increase in the peak signal-to-noise ratio (PSNR). Moreover, the proposed approach is robust to a variety of signal processing operations, such as compression (JPEG), image cropping, sharpening, blurring, and brightness adjustments. In these experiments, the robustness is especially evident under blurring attacks.

Pp. 650-657
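
The abstract above reports image quality as PSNR. As a reminder of what that metric measures, here is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE). It is an illustration only and not the authors' code: the function name `psnr`, the list-of-rows image representation, and the 8-bit peak value of 255 are assumptions made for this example.

```python
import math


def psnr(original, distorted, max_value=255):
    """Return the peak signal-to-noise ratio in dB between two grayscale images.

    Both images are lists of rows of pixel intensities and must have the
    same shape. max_value is the largest possible pixel value (255 is
    assumed here, i.e. 8-bit images).
    """
    if len(original) != len(distorted):
        raise ValueError("images must have the same number of rows")

    squared_error = 0.0
    pixel_count = 0
    for row_a, row_b in zip(original, distorted):
        if len(row_a) != len(row_b):
            raise ValueError("rows must have the same length")
        for a, b in zip(row_a, row_b):
            squared_error += (a - b) ** 2
            pixel_count += 1

    if pixel_count == 0:
        raise ValueError("images must not be empty")

    mse = squared_error / pixel_count  # mean squared error
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the watermarked image is numerically closer to the original, which is why it is commonly used as a proxy for perceptual invisibility.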

doi: 10.1007/11922162_77

AVAS: An Audio-Visual Attendance System

Dongdong Li; Yingchun Yang; Zhenyu Shan; Gang Pan; Zhaohui Wu

Biometric identification technology is being applied to physical and information access control in some workplaces, driven by improvements in the accuracy of biometric devices and declining prices. This paper describes a multimodal biometric identification system for time and attendance applications called AVAS (Audio-Visual Attendance System). The system uses users’ voice and face characteristics as their badge. The motivation for using multimodal biometrics is to improve the availability and accuracy of the system. The score differences between the genuine speaker class and the misidentified speaker class labeled by each classifier are taken into account, and the Score Difference Weighted Sum (SDWS) rule is introduced to fuse the individual experts. We describe the functions of AVAS in detail from three aspects: user interaction, authentication implementation, and data management. Practical tests conducted in a staff working environment show a distinct improvement of about 9.8% with the proposed system.

Pp. 667-675

doi: 10.1007/11922162_78

Improved POCS-Based Deblocking Technique Using Wavelet Transform in Block Coded Image

Goo-Rak Kwon; Hyo-Kak Kim; Chun-Soo Park; Yoon Kim; Sung-Jea Ko

This paper presents an improved deblocking technique based on the theory of projection onto convex sets (POCS) to reduce blocking artifacts in decoded images. We propose a new smoothness constraint set (SCS) and its projection operator in the wavelet transform (WT) domain to remove unnecessary high-frequency components caused by blocking artifacts. To eliminate the blocking-artifact component while preserving the original edge component, we also propose a significant coefficient decision method (SCDM) for fast and efficient performance. Experimental results show that the proposed method not only achieves significantly enhanced subjective quality but also increases the PSNR of the reconstructed image.

Pp. 676-685

doi: 10.1007/11922162_79

Sketch Case Based Spatial Topological Data Retrieval

Yuan Zhen-ming; Zhang Liang; Pan Hong

A large proportion of information can be regarded as spatial data, that is, data related to spatial position. Different query specification techniques have been proposed for accessing spatial databases, but traditional query methods are tedious and cannot support fuzzy queries. We present a content-based spatial data retrieval system that offers users a sketch interface capable of accepting fuzzy queries. First, the retrieval algorithm builds the spatial topological vector by metrically refining the 9-intersection model. Then the independent topological relations are extracted by training ICA-assisted fuzzy SVMs, which remove redundancy among the binary relations and reduce the dimensionality of complex spatial scenes. During query processing the model is referenced, and similarity is calculated with a cosine distance function on the weight vectors of the query scene and of each spatial scene in the database. Experimental results show that recall and precision improve compared with the query method without ICA and SVM.

Pp. 686-694
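
The final matching step described in the abstract above, comparing the weight vector of the query scene with that of each stored scene using a cosine measure, can be sketched as follows. This is a hedged illustration, not the authors' implementation: the function names `cosine_similarity` and `rank_scenes`, the dictionary of scene vectors, and the choice to return 0.0 for zero-length vectors are all assumptions, and the sketch says nothing about how the weight vectors themselves are produced.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumption: a zero vector matches nothing
    return dot / (norm_u * norm_v)


def rank_scenes(query_vector, scene_vectors):
    """Rank stored scenes by similarity to the query, best match first.

    scene_vectors maps a (hypothetical) scene name to its weight vector.
    Returns a list of (name, similarity) pairs.
    """
    scored = [
        (name, cosine_similarity(query_vector, vector))
        for name, vector in scene_vectors.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

For example, `rank_scenes([1.0, 0.0], {"a": [0.0, 1.0], "b": [2.0, 0.0]})` places scene `"b"` first, because its vector points in the same direction as the query regardless of magnitude.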

doi: 10.1007/11922162_81

Adaptive Search Range Scaling for B Pictures Coding

Zhigang Yang; Wen Gao; Yan Liu; Debin Zhao

This paper presents a frame-level adaptive search range scaling strategy for B-picture coding in H.264/AVC from a hardware-oriented viewpoint. After studying the relationship between the search ranges of P and B pictures, we first propose a simple search range scaling strategy that is efficient for normal- or low-motion video. The strategy is then extended to high-motion video by using the intra prediction and motion vector information of each P picture to restrict the search range of adjacent B pictures. This adaptive search range scaling strategy not only reduces the search area of B pictures by approximately 60% but also keeps almost the same coding performance as the reference software.

Pp. 704-713

doi: 10.1007/11922162_82

Video QoS Monitoring and Control Framework over Mobile and IP Networks

Bingjun Zhang; Lifeng Sun; Xiaoyu Cheng

With the development of network technology, multimedia applications in various video forms are widely used in network services. To improve video QoS, monitoring and controlling it during network transmission has become a pressing problem. In this paper, we propose a monitoring and control framework for video QoS over IP and mobile networks. We also develop a more effective, low-complexity video quality assessment (VQA) method based on the human visual system (HVS), the Improved Human Visual Model (I-HVM), and propose an Adaptive and Dynamic Sampling Strategy (ADSS) for video features, to monitor video quality at both ends of our framework. Experimental results show that our framework monitors video QoS well over IP and mobile networks. Consequently, to improve video QoS, dynamic control can be applied to the transmission decisions of a video service according to the framework's monitoring results.

Pp. 714-721

doi: 10.1007/11922162_83

Extracting Moving / Static Objects of Interest in Video

Sojung Park; Minhwan Kim

Extracting objects of interest in video is a challenging task that can improve the performance of video compression and retrieval. Moving objects in video have usually been considered objects of interest, and much research has addressed their extraction. However, some non-moving (static) objects can also be objects of interest. This paper proposes a segmentation method that extracts static as well as moving objects that are likely to attract human interest. An object of interest is defined as a relatively large region that appears frequently over several frames and is not located near the frame boundaries. A static object of interest should also have color and texture characteristics that stand out against its surroundings. We found that the objects of interest extracted by the proposed method matched well with those selected manually.

Pp. 722-729