Publications catalog - books
Advances in Multimedia Information Processing: 7th Pacific Rim Conference on Multimedia, Hangzhou, China, November 2-4, 2006, Proceedings
Yueting Zhuang ; Shi-Qiang Yang ; Yong Rui ; Qinming He (eds.)
Conference: 7th Pacific-Rim Conference on Multimedia (PCM). Hangzhou, China. November 2, 2006 - November 4, 2006
Abstract/Description - provided by the publisher
Not available.
Keywords - provided by the publisher
Computer Applications; Multimedia Information Systems; Information Storage and Retrieval; Computer Communication Networks; Information Systems Applications (incl. Internet); Image Processing and Computer Vision
Availability
Detected institution | Publication year | Browse | Download | Request |
---|---|---|---|---|
Not detected | 2006 | SpringerLink |
Information
Resource type:
books
Print ISBN
978-3-540-48766-1
Electronic ISBN
978-3-540-48769-2
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2006
Copyright information
© Springer-Verlag Berlin Heidelberg 2006
Subject coverage
Table of contents
doi: 10.1007/11922162_84
Building a Personalized Music Emotion Prediction System
Chan-Chang Yeh; Shian-Shyong Tseng; Pei-Chin Tsai; Jui-Feng Weng
With the development of multimedia technology, research on music is becoming increasingly popular. Researchers currently focus on the relationship between music and listeners’ emotions, but they do not consider differences among users. Therefore, we propose a Personalized Music Emotion Prediction (P-MEP) System that helps predict listeners’ emotional responses to music while taking user differences into account. To analyze listeners’ emotional responses to music, the P-MEP rules are generated in an analysis procedure consisting of 5 phases. During the application procedure, the P-MEP System predicts a new listener’s emotional response to music. The experimental results show that the generated P-MEP rules can be used to predict emotional responses to music while accounting for listeners’ differences.
Pp. 730-739
doi: 10.1007/11922162_85
Video Segmentation Using Joint Space-Time-Range Adaptive Mean Shift
Irene Y. H. Gu; Vasile Gui; Zhifei Xu
Video segmentation has drawn increasing interest in multimedia applications. This paper proposes a novel joint space-time-range domain adaptive mean shift filter for video segmentation. In the proposed method, segmentation of moving/static objects/background is obtained through inter-frame mode matching in consecutive frames and motion vector mode estimation. Newly appearing objects/regions in the current frame, due to new foreground objects or uncovered background regions, are segmented by intra-frame mode estimation. Simulations have been conducted on several image sequences, and the results show the effectiveness and robustness of the proposed method. Further study to evaluate the results is ongoing.
Pp. 740-748
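As a rough illustration of the mode-seeking step that mean shift filters rely on, the following is a minimal sketch, not the paper's joint space-time-range adaptive filter. It assumes 1-D samples, a Gaussian kernel, and a fixed bandwidth; the function name `mean_shift_mode` and the sample values are hypothetical.

```python
import math

def mean_shift_mode(points, start, bandwidth=1.0, tol=1e-6, max_iter=200):
    """Shift `start` toward the nearest density mode of the 1-D `points`.

    Assumption: Gaussian kernel with a fixed bandwidth (the paper's filter is
    adaptive and works in a joint space-time-range domain).
    """
    x = float(start)
    for _ in range(max_iter):
        # Kernel weight of every sample relative to the current estimate.
        weights = [math.exp(-0.5 * ((p - x) / bandwidth) ** 2) for p in points]
        # The mean shift update is the kernel-weighted mean of the samples.
        new_x = sum(w * p for w, p in zip(weights, points)) / sum(weights)
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x

# Two well-separated clusters; each start converges to its nearest mode.
samples = [1.0, 1.2, 0.8, 1.1, 5.0, 5.2, 4.9, 5.1]
low_mode = mean_shift_mode(samples, start=0.5, bandwidth=0.5)
high_mode = mean_shift_mode(samples, start=6.0, bandwidth=0.5)
```

Pixels (or space-time-range feature vectors) that converge to the same mode would be assigned to the same segment; matching modes between consecutive frames is the part this sketch leaves out.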
doi: 10.1007/11922162_86
EagleRank: A Novel Ranking Model for Web Image Search Engine
Kangmiao Liu; Wei Chen; Chun Chen; Jiajun Bu; Can Wang; Peng Huang
The explosive growth of the World Wide Web has already made it the biggest image repository. Although some image search engines provide convenient access to web images, they frequently yield unwanted results. Locating needed and relevant images remains a challenging task. This paper proposes a novel ranking model named EagleRank for web image search engines. In EagleRank, multiple sources of evidence related to the images are considered, including image surrounding text passages, terms in special HTML tags, website types of the images, the hyper-textual structure of the web pages, and even user feedback. Meanwhile, the flexibility of EagleRank allows it to combine other potential factors as well. Based on the inference network model, EagleRank also gives sufficient support to Boolean AND and OR operators. Our experimental results indicate that EagleRank performs better than traditional approaches that consider only the text from web pages.
Pp. 749-759
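To make the idea of combining several evidence sources under Boolean operators concrete, here is a hedged sketch using the belief operators commonly associated with inference network retrieval (product for AND, complement-product for OR, normalized weighted sum for merging sources). Whether EagleRank uses exactly these formulas is an assumption; the evidence names, weights, and belief values are hypothetical.

```python
from math import prod

def weighted_sum(beliefs, weights):
    """Merge beliefs from several evidence sources into one value in [0, 1]."""
    return sum(b * w for b, w in zip(beliefs, weights)) / sum(weights)

def belief_and(beliefs):
    """Boolean AND over term beliefs: all terms must be supported."""
    return prod(beliefs)

def belief_or(beliefs):
    """Boolean OR over term beliefs: any supported term raises the score."""
    return 1.0 - prod(1.0 - b for b in beliefs)

# Hypothetical per-source beliefs that one image matches each query term.
evidence = {
    "eagle": {"surrounding_text": 0.6, "html_tags": 0.8},
    "flight": {"surrounding_text": 0.5, "html_tags": 0.5},
}
# Hypothetical source weights: terms in special HTML tags count more.
source_weights = {"surrounding_text": 1.0, "html_tags": 3.0}

def term_belief(term):
    sources = evidence[term]
    names = list(sources)
    return weighted_sum([sources[n] for n in names],
                        [source_weights[n] for n in names])

eagle = term_belief("eagle")
flight = term_belief("flight")
score_and = belief_and([eagle, flight])  # query: eagle AND flight
score_or = belief_or([eagle, flight])    # query: eagle OR flight
```

Adding another factor, such as website type or user feedback, would only require another entry per term and another weight, which is one way to read the flexibility the abstract mentions.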
doi: 10.1007/11922162_88
3D Mesh Construction from Depth Images with Occlusion
Jeung-Chul Park; Seung-Man Kim; Kwan-Heng Lee
Realistic broadcasting is a broadcasting service system that uses multi-modal immersive media to provide clients with realism, including photorealistic and 3D display, 3D sound, multi-view interaction, and haptic interaction. In such a system, a client is able to see stereoscopic views, hear stereo sound, and even touch both the real actor and virtual objects using haptic devices. This paper presents a 3D mesh modeling method that considers self-occlusion in 2.5D depth video, to provide broadcasting applications with multi-modal interactions. Depth video of a real object is generally captured with a depth video camera from a single point of view, so it often includes self-occluded images. This paper presents a series of techniques that can construct a smooth and compact mesh model of an actor that contains self-occluded regions. Although our methods work only for an actor with a simple posture, they can be successfully applied to a studio environment where the body movement of the actor is relatively limited.
Pp. 770-778
doi: 10.1007/11922162_89
An Eigenbackground Subtraction Method Using Recursive Error Compensation
Zhifei Xu; Pengfei Shi; Irene Yu-Hua Gu
Eigenbackground subtraction is a commonly used method for moving object detection. The method uses the difference between an input image and the reconstructed background image for detecting foreground objects based on eigenvalue decomposition. In the method, foreground regions are represented in the reconstructed image using the eigenbackground in the sense of least mean squared error minimisation. This results in errors that are spread over the entire reconstructed reference image. It also degrades the quality of the reconstructed background, leading to inaccurate moving object detection. To compensate for these regions, an efficient method is proposed that uses recursive error compensation and an adaptively computed threshold. Experiments were conducted on a range of image sequences of varying complexity. Performance was evaluated both qualitatively and quantitatively. Comparisons with two existing methods show that the proposed method achieves better approximations of the background images and more accurate detection of foreground objects.
Pp. 779-787
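The baseline eigenbackground step described in the abstract above (project the input onto the background eigenspace, reconstruct, threshold the difference) can be sketched as follows. This is a toy illustration under stated assumptions: only one eigenvector is kept, it is found by power iteration, frames are flat lists of four pixels, and the threshold is fixed. The paper's recursive error compensation and adaptive threshold are not reproduced; all names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_eigenvector(centered, iters=50):
    """Power iteration for the leading eigenvector of X^T X (rows of X = frames)."""
    d = len(centered[0])
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        proj = [dot(row, v) for row in centered]                 # X v
        w = [sum(p * row[j] for p, row in zip(proj, centered))   # X^T (X v)
             for j in range(d)]
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def train(frames):
    """Return the mean background and a rank-1 eigenbackground."""
    n = len(frames)
    mean = [sum(col) / n for col in zip(*frames)]
    centered = [[p - m for p, m in zip(f, mean)] for f in frames]
    return mean, top_eigenvector(centered)

def foreground_mask(image, mean, eigvec, threshold):
    """Reconstruct the background from the eigenspace and threshold the residual."""
    centered = [p - m for p, m in zip(image, mean)]
    coeff = dot(centered, eigvec)
    background = [m + coeff * e for m, e in zip(mean, eigvec)]
    mask = [abs(p - b) > threshold for p, b in zip(image, background)]
    return mask, background

# Training frames: a flat background under three lighting levels.
frames = [[8.0] * 4, [10.0] * 4, [12.0] * 4]
mean, eigvec = train(frames)
# Input frame: background at level 11 with one bright foreground pixel.
mask, background = foreground_mask([11.0, 11.0, 50.0, 11.0], mean, eigvec,
                                   threshold=20.0)
```

In this toy case every reconstructed background pixel comes out at 20.75 rather than near 11, because the single foreground pixel inflates the projection coefficient. That is the error spreading the abstract refers to, and it is why the choice of threshold becomes delicate.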
doi: 10.1007/11922162_90
Attention Information Based Spatial Adaptation Framework for Browsing Videos Via Mobile Devices
Yi Wang; Houqiang Li; Zhengkai Liu; Chang Wen Chen
The limited display size of mobile devices has imposed significant barriers for mobile device users who want to browse high-resolution videos. In this paper, we present a novel video adaptation scheme based on attention area detection to enrich users’ browsing experience on mobile devices. During video compression, the attention information, which refers to attention objects in frames, is detected and embedded into bitstreams using the supplemental enhancement information (SEI) tool. In this research, we design a special SEI structure for embedding the attention information. Furthermore, we develop a scheme to adjust adaptive quantization parameters in order to improve the encoding quality of the attention areas. When the high-resolution bitstream is transmitted to mobile users, a fast transcoding algorithm we developed earlier is applied to generate a new bitstream for the attention areas in frames. The new low-resolution bitstream, containing mostly attention information, is sent to users for display on the mobile devices instead of the high-resolution one. Experimental results show that the proposed spatial adaptation scheme is able to improve both subjective and objective video quality.
Pp. 788-797
doi: 10.1007/11922162_92
Requantization Transcoding of H.264/AVC Bitstreams for Intra 4×4 Prediction Modes
Stijn Notebaert; Jan De Cock; Koen De Wolf; Rik Van de Walle
Efficient bitrate reduction of video content is necessary in order to satisfy the different constraints imposed by decoding devices and transmission networks. Requantization is a fast technique for bitrate reduction, and has been successfully applied to MPEG-2 bitstreams. Because of the newly introduced intra prediction in H.264/AVC, the existing techniques are rendered useless. In this paper we examine requantization transcoding of H.264/AVC bitstreams, focusing on the intra 4×4 prediction modes. Two architectures are proposed, one in the pixel domain and the other in the frequency domain, that compensate for the drift introduced by the requantization of intra 4×4 predicted blocks. Experimental results show that these architectures perform approximately as well as the full decode and recode architecture for low to medium bitrates. Because of the reduced computational complexity of these architectures, in particular the frequency-domain compensation architecture, they are highly suitable for real-time adaptation of video content.
Pp. 808-817
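The core requantization step (dequantize with the incoming step size, then quantize again with a coarser one) can be sketched as below. This is a simplified illustration that assumes a plain uniform scalar quantizer with hypothetical step sizes; it does not model H.264/AVC's integer transform or QP tables, and it omits the drift compensation that the two proposed architectures add.

```python
def quantize(value, step):
    """Uniform scalar quantizer: nearest level, ties rounded away from zero."""
    sign = -1 if value < 0 else 1
    return sign * int(abs(value) / step + 0.5)

def dequantize(level, step):
    """Reconstruct a coefficient value from its quantization level."""
    return level * step

def requantize(levels, step_in, step_out):
    """Map levels coded with `step_in` to levels for a coarser `step_out`."""
    return [quantize(dequantize(level, step_in), step_out) for level in levels]

# Hypothetical residual coefficient levels of one block, coded with step 4.
incoming = [12, -7, 3, 0, 1]
outgoing = requantize(incoming, step_in=4, step_out=10)

# Reconstruction error introduced by the second quantization pass.
errors = [dequantize(a, 4) - dequantize(b, 10)
          for a, b in zip(incoming, outgoing)]
```

With these numbers `outgoing` is `[5, -3, 1, 0, 0]`: smaller levels and more zeros cost fewer bits. The nonzero entries of `errors` are what matter for intra-predicted blocks, since neighboring blocks are predicted from the altered reconstruction and the mismatch propagates as drift.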
doi: 10.1007/11922162_93
Prediction Algorithms in Large Scale VOD Services on Grid Infrastructure
Bo Li; Depei Qian
VOD (Video on Demand) is one of the significant services for next-generation networks. Large-scale VOD services commonly refer to local networks that provide VOD services to communities of about 500 to 1000 simultaneous users. VOD services on grid infrastructure make resource sharing and management easy, which leads to substantial cooperation among systems distributed in many places. This paper presents prediction algorithms that try to reduce the cost of external communications among large VOD nodes in a grid community. Basic algorithms can reduce overall costs by about 30%, and a trained ANN can provide an extra 10% in performance.
Pp. 818-826
doi: 10.1007/11922162_94
A Hierarchical Framework for Fast Macroblock Prediction Mode Decision in H.264
Cheng-dong Shen; Si-kun Li
The latest video compression standard, H.264, supports many intra and inter prediction modes for macroblocks. Using a powerful Lagrangian minimization tool, rate-distortion optimization, the mode with the optimal rate-distortion performance is determined. This achieves the highest possible coding efficiency, but calculating the cost of all candidate modes results in much higher computational complexity. In this paper, we propose a hierarchical framework for fast macroblock prediction mode decision in H.264 encoders. It is based on a hierarchical mode classification method that assists fast mode decision by pre-selecting the class for a macroblock using its extracted spatial and temporal features. Since tests for many modes of non-selected classes are skipped, much of the rate-distortion optimization computation can be saved. Experimental results show that the proposed method can reduce the execution time of mode decision by 85% on average without perceivable loss in coding rate and quality.
Pp. 827-834
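The Lagrangian mode decision and the saving from class pre-selection can be illustrated with a small sketch. The cost J = D + λR is the standard rate-distortion formulation; everything else here is assumed for illustration: the mode names, the distortion and rate numbers, the class grouping, and the idea that a classifier has already picked the class `"inter_large"`. The paper's actual feature-based classifier is not reproduced.

```python
def rd_cost(distortion, rate_bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits

def best_mode(candidates, lam, allowed=None):
    """Return (mode, number of modes evaluated) with the lowest RD cost.

    `candidates` maps mode name -> (distortion, rate_bits). If `allowed` is
    given, only those modes are evaluated (the pre-selected class).
    """
    pool = [m for m in candidates if allowed is None or m in allowed]
    costs = {m: rd_cost(candidates[m][0], candidates[m][1], lam) for m in pool}
    return min(costs, key=costs.get), len(costs)

# Hypothetical (distortion, rate) measurements for one macroblock.
candidates = {
    "SKIP": (900, 1),
    "INTER_16x16": (400, 40),
    "INTER_8x8": (300, 95),
    "INTRA_4x4": (350, 120),
}
# Hypothetical mode classes; a feature-based classifier would choose one.
mode_classes = {
    "inter_large": ["SKIP", "INTER_16x16"],
    "inter_small": ["INTER_8x8"],
    "intra": ["INTRA_4x4"],
}

full_choice, full_tests = best_mode(candidates, lam=5)
fast_choice, fast_tests = best_mode(candidates, lam=5,
                                    allowed=mode_classes["inter_large"])
```

Here the exhaustive search and the pre-selected search agree on the mode while the latter evaluates half as many candidates. In practice the saving depends on how often the classifier picks the class that contains the true optimum.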
doi: 10.1007/11922162_95
Compact Representation for Large-Scale Clustering and Similarity Search
Bin Wang; Yuanhao Chen; Zhiwei Li; Mingjing Li
Although content-based image retrieval has been researched for many years, few content-based methods are implemented in present image search engines. This is partly because of the great difficulty of indexing and searching in a high-dimensional feature space for large-scale image datasets. In this paper, we propose a novel method to represent the content of each image as one or multiple hash codes, which can be considered as special keywords. Based on this compact representation, images can be accessed very quickly by their visual content. Furthermore, two advanced functionalities are implemented. One is content-based image clustering, which is simplified as grouping images with identical or near-identical hash codes. The other is content-based similarity search, which is approximated by finding images with similar hash codes. The hash code extraction process is very simple, and both image clustering and similarity search can be performed in real time. Experiments on over 11 million images collected from the web demonstrate the efficiency and effectiveness of the proposed method.
Pp. 835-843
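The two functionalities named in the abstract above, clustering by identical hash codes and similarity search by similar hash codes, can be sketched as follows. The abstract does not say how the hash codes are extracted, so this sketch assumes one bit per feature dimension obtained by thresholding, and Hamming distance as the similarity measure; the feature vectors and all names are hypothetical.

```python
from collections import defaultdict

def hash_code(features, thresholds):
    """Assumed extraction: one bit per dimension, set when above its threshold."""
    code = 0
    for value, threshold in zip(features, thresholds):
        code = (code << 1) | (1 if value > threshold else 0)
    return code

def cluster_by_code(codes):
    """Group image names that share an identical hash code."""
    groups = defaultdict(list)
    for name, code in codes.items():
        groups[code].append(name)
    return dict(groups)

def hamming(a, b):
    """Number of differing bits between two hash codes."""
    return bin(a ^ b).count("1")

def similar_images(query, codes, max_distance=1):
    """Names whose code is within `max_distance` bits of the query's code."""
    return sorted(name for name, code in codes.items()
                  if hamming(codes[query], code) <= max_distance)

# Hypothetical 4-D visual features for four images.
images = {
    "a": [0.9, 0.1, 0.8, 0.2],
    "b": [0.8, 0.2, 0.7, 0.1],
    "c": [0.9, 0.1, 0.2, 0.2],
    "d": [0.1, 0.9, 0.1, 0.9],
}
thresholds = [0.5, 0.5, 0.5, 0.5]
codes = {name: hash_code(f, thresholds) for name, f in images.items()}
clusters = cluster_by_code(codes)
neighbors = similar_images("a", codes, max_distance=1)
```

Because the codes are small integers, clustering reduces to a dictionary lookup and similarity search to bit operations, which is consistent with the real-time claim even though the real extraction method may differ.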