Publications catalog - books


Advances in Multimedia Information Processing: 8th Pacific Rim Conference on Multimedia, Hong Kong, China, December 11-14, 2007. Proceedings

Horace H.-S. Ip ; Oscar C. Au ; Howard Leung ; Ming-Ting Sun ; Wei-Ying Ma ; Shi-Min Hu (eds.)

In conference: 8th Pacific-Rim Conference on Multimedia (PCM). Hong Kong, China. December 11, 2007 - December 14, 2007

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Computer Applications; Multimedia Information Systems; Information Storage and Retrieval; Computer Communication Networks; Information Systems Applications (incl. Internet); Image Processing and Computer Vision

Availability
Detected institution; Year of publication; Browse; Download; Request
Not detected; 2007; SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-77254-5

Electronic ISBN

978-3-540-77255-2

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Publication rights information

© Springer-Verlag Berlin Heidelberg 2007

Table of contents

A Novel Multiple Description Approach to Predictive Video Coding

Zhiqin Liang; Jiantao Zhou; Liwei Guo; Mengyao Ma; Oscar Au

Multiple description coding (MDC) is a source coding technique that exploits path diversity to combat packet losses over error-prone channels. In this paper, we propose a novel drift-free multi-state MDC method. At the encoder side, the original video is compressed into multiple independently decodable H.263 streams, each with its own coding structure and prediction process, such that if one stream is lost, the other stream can still be used to produce video of acceptable quality. At the decoder side, each description is treated as a noisy observation of the original video. A least-square-error (LSE) based merge algorithm is proposed to combine the descriptions. The experimental results show that the proposed algorithm has coding efficiency similar to that of [1], yet with improved error resilience.

- Best Paper Session | Pp. 286-295
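
The idea of treating each description as a noisy observation and merging them in a least-square-error sense can be illustrated with a minimal sketch. It assumes, purely for illustration, that each description carries independent zero-mean noise of known variance, in which case the linear unbiased merge with the smallest expected squared error is inverse-variance weighting. The function name `lse_merge` and the variance inputs are hypothetical and are not taken from the paper, whose actual merge algorithm may differ.

```python
def lse_merge(observations, noise_variances):
    """Merge noisy observations of the same pixel value.

    Assumption (not from the paper): noise is independent, zero-mean,
    with known variance per description. Under that assumption the
    unbiased linear combination with minimum expected squared error
    weights each observation by the inverse of its noise variance.
    """
    weights = [1.0 / v for v in noise_variances]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, observations)) / total


# Two decoded versions of one pixel; the second is three times noisier.
merged_pixel = lse_merge([10.0, 14.0], [1.0, 3.0])
```

With equal variances the merge reduces to a plain average; a noisier description is pulled toward the more reliable one.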

Video Multicast over Wireless Ad Hoc Networks Using Distributed Optimization

Yifeng He; Ivan Lee; Ling Guan

Video multicast over wireless ad hoc networks is a challenging task. In this paper, we propose an optimized video multicast scheme. First, we apply a prioritized coding scheme and a network coding scheme to eliminate the decoding hierarchy and delivery redundancy. Then, we maximize the aggregate throughput at all the receivers by jointly optimizing the source rate allocation and the routing scheme. The proposed algorithm is fully distributed and thus well suited to wireless ad hoc networks. Simulation results show that the proposed video multicast scheme yields superior video quality compared to the double-tree routing scheme.

- Best Paper Session | Pp. 296-305

Acoustic Features for Estimation of Perceptional Similarity

Yoshihiro Adachi; Shinichi Kawamoto; Shigeo Morishima; Satoshi Nakamura

This paper examines acoustic features for estimating the perceptional similarity between speech samples. We first extract acoustic features that include personality from the speech of 36 persons. Second, we calculate the distance between the extracted features using a Gaussian Mixture Model (GMM) or Dynamic Time Warping (DTW), and then sort the speech samples by physical similarity. Separately, there is a permutation based on perceptional similarity, sorted according to the subjects' judgments. We evaluate the physical features using Spearman's rank correlation coefficient between the two permutations. The results show that the DTW distance with high STRAIGHT Cepstrum is an optimum feature for estimating perceptional similarity.

- Session-6: Audio, Speech and Sound Processing | Pp. 306-314
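
The two generic tools named in this abstract, DTW distance and Spearman's rank correlation, can be sketched as follows. This is a textbook illustration on one-dimensional sequences, not the authors' pipeline: the absolute-difference local cost, the function names, and the no-ties assumption in the rank correlation are all choices made here for brevity.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences,
    using absolute difference as the local cost (an assumption)."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(x, y):
    """Spearman's rank correlation for tie-free data:
    rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(x)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))
```

In a setup like the one described, the distances from a reference sample would give one ordering and listener judgments another; `spearman_rho` then measures how well the two orderings agree.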

Modeling Uncertain Speech Sequences Using Type-2 Fuzzy Hidden Markov Models

Xiao-Qin Cao; Jia Zeng; Hong Yan

The automatic speech recognizer (ASR) based on hidden Markov models (HMMs) is very sensitive to multi-talker, non-stationary babble noise, which consists of a large number of speakers talking simultaneously. One major reason is the mismatch between the training and testing conditions, which makes the precise parameters of the HMM incapable of describing the uncertain distributions of the observations in speech signals. This paper applies an extension of the HMM, referred to as the type-2 fuzzy hidden Markov model (T2 FHMM), to modeling uncertain speech sequences. More specifically, we use the type-2 fuzzy set (T2 FS) to describe uncertain parameters of the HMM that may vary anywhere in an interval with uniform possibilities. As a result, the likelihood of the T2 FHMM becomes an interval rather than a precise real number, which can be processed by the generalized linear model (GLM) for the final classification decision. Experimental results of phoneme classification in babble noise demonstrate a significant improvement over the HMM in terms of robustness and classification rate.

- Session-6: Audio, Speech and Sound Processing | Pp. 315-324

A New Adaptation Method for Speaker-Model Creation in High-Level Speaker Verification

Shi-Xiong Zhang; Man-Wai Mak

Research has shown that speaker verification based on high-level speaker features requires long enrollment utterances to be reliable. However, in practical speaker verification, it is common to model speakers based on a limited amount of enrollment data. To minimize the undesirable effect of insufficient enrollment data on system performance, this paper proposes a new adaptation method for creating speaker models based on high-level features. Unlike conventional methods, the proposed adaptation method adapts not only the phoneme-dependent background model but also the phoneme-independent speaker model. The amount of adaptation in the latter is adjusted by a proportional factor derived from the phoneme-independent background models. The proposed method was compared with traditional MAP adaptation under the NIST2000 SRE framework. Experimental results show that the proposed method can solve the data-sparseness problem effectively and achieves better performance when compared with traditional MAP adaptation.

- Session-6: Audio, Speech and Sound Processing | Pp. 325-335

Dynamic Sound Rendering Based on Ray-Caching

Ken Chan; Rynson W. H. Lau; Jianmin Zhao

Dynamic sound rendering has attracted a lot of attention in recent years due to its applications in computer games and architecture simulation. Although physically based methods can produce realistic outputs, they typically involve recursive tracing of sound rays, which may be computationally too expensive for interactive dynamic environments. In this paper, we propose a ray caching method that exploits ray coherence to accelerate the ray-tracing process. The proposed method is tailored for interactive sound rendering based on two approximation techniques: spatial and angular approximation. The ray cache supports intra-frame, inter-frame and inter-observer sharing of rays. We show the performance of the new method through a number of experiments.

- Session-6: Audio, Speech and Sound Processing | Pp. 336-346

Spread-Spectrum Watermark by Synthesizing Texture

Wenyu Liu; Fan Zhang; Chunxiao Liu

Image watermarking is a mapping from a watermark message to a set of image counterparts, where every version conveys the same meaning as the original image. Since textures that present a single perceptual meaning have a certain diversity, an intuitive idea for watermarking is to replace the texture region of an image with a similar-looking synthetic texture containing the watermark. We propose a spread-spectrum watermarking scheme by integrating existing work on texture extraction, segmentation and synthesis, and obtain suggestive results, including (1) the synthetic watermarks can resist an adaptive Wiener filtering attack because their power spectrum is similar to that of the original image; (2) if the spread-spectrum carrier is carefully designed according to the subspace spanned by the textures, hiding capacity can be improved by 20%, while effective hiding capacity under a Wiener filtering attack can be doubled; (3) the proposed watermarking scheme also suggests a sophisticated strategy for watermark attack.

- Session-7: Digital Watermarking | Pp. 347-356

Design of Secure Watermarking Scheme for Watermarking Protocol

Bin Zhao; Lanjun Dang; Weidong Kou; Jun Zhang; Xuefei Cao

Watermarking techniques make it possible to hide an imperceptible watermark in multimedia content for copyright protection. However, in most conventional watermarking schemes, the watermark is embedded solely by the seller, and both the seller and the buyer know the watermarked copy, which leads to unresolvable disputes at the arbitration phase. To solve this problem, many watermarking protocols have been proposed that use a watermarking scheme in the encrypted domain. In this paper, we first discuss several security aspects of the encrypted domain, and then propose a new method of homomorphism conversion for probabilistic public key cryptosystems with the homomorphic property. Based on our previous work, a new secure watermarking scheme for watermarking protocols is presented using a new embedding strategy in the encrypted domain. We employ an El Gamal variant cryptosystem with the additive homomorphic property to reduce the computational overhead of watermark embedding in the encrypted domain, and an RA code to improve the robustness of the watermarked image against many moderate attacks after decryption. Security analysis and experiments demonstrate that the secure watermarking scheme is more suitable for implementing the existing watermarking protocols.

- Session-7: Digital Watermarking | Pp. 357-366
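
The additive homomorphic property mentioned in this abstract can be illustrated with the well-known "exponential" El Gamal construction, in which the message is placed in the exponent so that multiplying ciphertexts adds plaintexts. The sketch below is an assumption about what such a variant looks like in general; it is not the specific cryptosystem or conversion method of the paper, and the toy parameters `P`, `Q`, `G` are far too small for any real use.

```python
import secrets

# Toy parameters (illustrative only): P = 2*Q + 1 with P and Q prime,
# and G = 4 generates the subgroup of order Q modulo P.
P = 467
Q = 233
G = 4


def keygen():
    """Return (private key x, public key h = G^x mod P)."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)


def encrypt(h, m):
    """Encrypt a small integer m (0 <= m < Q) as (G^r, G^m * h^r)."""
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), (pow(G, m, P) * pow(h, r, P)) % P


def add_ciphertexts(c1, c2):
    """Component-wise product; decrypts to the sum of plaintexts mod Q."""
    return (c1[0] * c2[0]) % P, (c1[1] * c2[1]) % P


def decrypt(x, c):
    """Strip the mask, then recover m from G^m by exhaustive search."""
    a, b = c
    # a has order dividing Q, so a^(Q - x) equals a^(-x) modulo P.
    g_m = (b * pow(a, Q - x, P)) % P
    for m in range(Q):
        if pow(G, m, P) == g_m:
            return m
    raise ValueError("plaintext not found in the subgroup")


private_key, public_key = keygen()
encrypted_sum = add_ciphertexts(encrypt(public_key, 20),
                                encrypt(public_key, 22))
recovered = decrypt(private_key, encrypted_sum)
```

In an encrypted-domain embedding setting, this property is what lets one party add a watermark component to an encrypted coefficient without ever seeing the plaintext; decryption by search is only practical because the plaintext range is small.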

Digital Watermarking Based on Stochastic Resonance Signal Processor

Shuifa Sun; Sam Kwong; Bangjun Lei; Sheng Zheng

A signal processor based on a bi-stable aperiodic stochastic resonance (ASR) is first introduced. The processor can detect the base-band binary pulse amplitude modulation (PAM) signal. A digital image watermarking algorithm in the discrete cosine transform (DCT) domain is implemented based on the processor. In this algorithm, the watermark and the DCT alternating current (ac) coefficients of the image are viewed as the input signal and the channel noise of the processor input, respectively. In conventional watermarking systems, it is difficult to explain why the detection bit error ratio (BER) of a watermarking system under some kinds of attack is lower than that of the system under no attack. In the present watermarking algorithm, this phenomenon is systematically analyzed. It is shown that the DCT ac coefficients of the image, as well as the noise introduced by the attacks, cooperate within the bi-stable ASR system to improve the performance of watermark detection.

- Session-7: Digital Watermarking | Pp. 367-376

A DWT Blind Image Watermarking Strategy with Secret Sharing

Li Zhang; Ping-ping Zhou; Gong-bin Qian; Zhen Ji

A blind image watermarking scheme based on secret sharing in the discrete wavelet transform domain is proposed. The watermark is divided into shadows according to a secret sharing scheme: a threshold number or more of those shadows can reconstruct the watermark, whereas fewer shadows cannot. To achieve an optimum embedding strategy, a closed-loop embedding process is proposed, which is modified iteratively according to the results of performance analysis. The convergence of closed-loop watermarking is proved. Independent component analysis is used so that the detector can not only detect the watermark but also extract it. Before watermark reconstruction, a one-way hash function is used to withstand cheating attacks. The experimental results show that the scheme is robust against a wide range of attacks provided by Stirmark and is more secure than traditional watermarking techniques.

- Session-7: Digital Watermarking | Pp. 377-384
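
The threshold behaviour described in this abstract, where enough shadows reconstruct the secret and fewer do not, matches the general pattern of threshold secret sharing. The sketch below uses Shamir's polynomial scheme over a prime field as an assumed example; the abstract does not say which secret sharing scheme the authors use, and the names `split_secret`, `reconstruct_secret` and the field size `PRIME` are hypothetical.

```python
import secrets

PRIME = 2**61 - 1  # a Mersenne prime; the field size is an assumption


def split_secret(secret, threshold, num_shares):
    """Split `secret` into `num_shares` shadows; any `threshold` of them
    suffice to rebuild it (Shamir's scheme, assumed for illustration)."""
    # Random polynomial of degree threshold - 1 with constant term = secret.
    coeffs = [secret] + [secrets.randbelow(PRIME)
                         for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def reconstruct_secret(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * (-xj)) % PRIME
                den = (den * (xi - xj)) % PRIME
        # Modular inverse via Fermat's little theorem.
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret


shadows = split_secret(123456, threshold=3, num_shares=5)
rebuilt = reconstruct_secret(shadows[:3])
```

Any three of the five shadows interpolate the same degree-2 polynomial and therefore yield the same constant term; two shadows leave the polynomial underdetermined, which is what prevents reconstruction below the threshold.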