Publications catalog - books

Advances in Multimedia Information Processing: 6th Pacific Rim Conference on Multimedia, Jeju Island, Korea, November 13-16, 2005, Proceedings, Part I

Yo-Sung Ho; Hyoung Joong Kim (eds.)

Conference: 6th Pacific-Rim Conference on Multimedia (PCM). Jeju Island, South Korea. November 13-16, 2005

Abstract/Description (provided by the publisher)

Not available.

Keywords (provided by the publisher)

Multimedia Information Systems; Information Storage and Retrieval; Computer Communication Networks; Information Systems Applications (incl. Internet); Computer Graphics; Image Processing and Computer Vision

Availability

Detected institution	Publication year	Browse	Download	Request
None detected	2005	SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-30027-4

Electronic ISBN

978-3-540-32130-9

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

2005

Publication rights information

Subject coverage

Computer and information sciences

Table of contents

doi: 10.1007/11581772_71

Auto-summarization of Multimedia Meeting Records Based on Accessing Log

Weisheng He; Yuanchun Shi; Xin Xiao

Computer techniques have been leveraged to record human experiences in many public spaces, e.g. meeting rooms and classrooms. Given the large volume of such records produced over long-term use, it is imperative to generate automatic summaries of the original content for fast skimming and browsing. In this paper, we present ASBUL, a novel algorithm that produces summaries of multimedia meeting records based on information about viewers’ accessing patterns. The algorithm predicts how interesting each record segment is to viewers by analyzing previous accessing patterns, and produces summaries by picking the segments with the highest predicted interest. We report a user study that compares ASBUL-generated summaries with human-generated summaries and shows that the ASBUL algorithm is generally effective in generating personalized summaries that satisfy different viewers without requiring any a priori knowledge, especially in free-style meetings where information is less structured and viewers’ understandings are more diverse.

Keywords: User Study; Indexing Event; Multimedia Content; Similarity Factor; Access Pattern.

Pp. 809-819
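
As a rough illustration of the idea in the abstract above (score segments from earlier viewers' access logs, then keep the highest-scoring ones), here is a minimal sketch. It is not the ASBUL algorithm itself: the log format, the `summarize` function, and the plain access-count score are assumptions made for illustration, and the paper's prediction of interest is not reproduced.

```python
from collections import Counter


def summarize(access_log, segment_lengths, budget):
    """Pick segments for a summary from a hypothetical access log.

    access_log: iterable of (viewer_id, segment_id) pairs, one per access.
    segment_lengths: dict mapping segment_id -> duration in seconds.
    budget: maximum total duration of the summary in seconds.
    Returns the chosen segment ids in sorted order.
    """
    # Assumed interest score: how many times each segment was accessed.
    scores = Counter(segment for _, segment in access_log)

    chosen, used = [], 0
    # Visit segments from most to least accessed; ties broken by segment id.
    for segment in sorted(segment_lengths, key=lambda s: (-scores[s], s)):
        length = segment_lengths[segment]
        if used + length <= budget:
            chosen.append(segment)
            used += length
    return sorted(chosen)
```

A personalized variant would weight each logged access by how similar that earlier viewer is to the current one; that weighting is left out here because the abstract does not specify it.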

doi: 10.1007/11581772_72

Towards a High-Level Audio Framework for Video Retrieval Combining Conceptual Descriptions and Fully-Automated Processes

Mbarek Charhad; Mohammed Belkhatir

The growing need for ‘intelligent’ video retrieval systems leads to new architectures that combine multiple characterizations of the video content and rely on highly expressive frameworks while providing fully automated indexing and retrieval processes. Combining modalities within expressive frameworks for video indexing and retrieval is of huge importance and the only solution for achieving significant retrieval performance. This paper presents a multi-faceted conceptual framework integrating multiple characterizations of the audio content for automatic video retrieval. It relies on an expressive representation formalism that handles high-level audio descriptions of a video document, and on a full-text query framework, in an attempt to perform video indexing and retrieval on audio features beyond state-of-the-art architectures that operate on low-level features and keyword-annotation frameworks. Experiments on the multimedia topic search task of the TRECVID 2004 evaluation campaign validate our proposal.

Keywords: Automatic Speech Recognition; Video Retrieval; Video Shot; Audio Feature; Query Graph.

Pp. 820-830

doi: 10.1007/11581772_73

A New Concept of Security Camera Monitoring with Privacy Protection by Masking Moving Objects

Kenichi Yabuta; Hitoshi Kitazawa; Toshihisa Tanaka

We present a novel framework for encoding images obtained by a security monitoring camera while protecting the privacy of moving objects in the images. We are motivated by the fact that although security monitoring cameras can deter crimes, they may infringe on the privacy of the people and objects they record. Moving objects, whose privacy should be protected, in an input image (recorded by a monitoring camera) are encrypted and hidden in a JPEG bitstream. Therefore, a normal JPEG viewer generates a masked image, where the moving objects are unrecognizable or completely invisible. Only a special viewer with a password can reconstruct the original recording. Data hiding is achieved by watermarking and by encryption with the Advanced Encryption Standard (AES). We illustrate the concept of our framework and the algorithms of the encoder and the special viewer. Moreover, we show an implementation example.

Keywords: Privacy protection; security camera; watermarking; JPEG encoding.

Pp. 831-842

doi: 10.1007/11581772_74

Feature Fusion-Based Multiple People Tracking

Junhaeng Lee; Sangjin Kim; Daehee Kim; Jeongho Shin; Joonki Paik

This paper presents a feature fusion-based tracking algorithm using optical flow under the non-prior training active feature model (NPT-AFM) framework. The proposed object tracking procedure can be divided into three steps: (i) localization of human objects, (ii) prediction and correction of the object’s location using spatio-temporal information, and (iii) restoration of occlusion using the NPT-AFM [15]. Feature points inside an ellipsoidal shape enclosing each object are estimated instead of the object’s shape boundary, and are updated as an element of the training set for the AFM. Although it uses a greatly reduced number of feature points, the proposed algorithm enables the tracking of occluded people against complicated backgrounds.

Pp. 843-853

doi: 10.1007/11581772_75

Extracting the Movement of Lip and Tongue During Articulation

Hanhoon Park; Seung-Wook Hong; Jong-Il Park; Sung-Kyun Moon; Hyeongseok Ko

A method that extracts the 3-D shape and movement of the lip and tongue and displays them simultaneously is presented. Lip movement is easily observable and thus extractable using a camera. However, it is difficult to extract the real movement of the tongue exactly because the tongue may be occluded by the lip and teeth. In this paper, we use a magnetic resonance imaging (MRI) device to extract the sagittal view of the movement of the tongue during articulation. Since the frame rate of the available MRI device is very low (5 fps), we obtain a smooth video sequence (20 fps) using a new contour-based interpolation method. The overall procedure for extracting the movement of the lip and tongue is as follows. First, fiducial color markers attached to the lip are detected, and the 3-D movement of the lip is computed using a 3-D reconstruction technique. Next, to extract the movement of the tongue, we apply a series of simple image processing algorithms to MRI images of the tongue and then extract the contour of the tongue interactively. Finally, the lip and tongue data are synchronized and temporally interpolated. An OpenGL-based program is implemented to visualize the data interactively. We performed the experiment using the Korean basic syllables, and some of the data are presented. Many experiments using the results are confirmed to support theoretical and empirical observations in linguistics. The acquired data can be used not only as a fundamental database for scientific purposes but also as educational material for language rehabilitation of the hearing-impaired. They can also be used to make high-quality lip-synchronized animation that includes tongue movement.

Keywords: Magnetic Resonance Imaging Image; Magnetic Resonance Imaging Data; Fiducial Marker; Tongue Movement; Facial Animation.

Pp. 854-863
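
Going from 5 fps to 20 fps, as the abstract above describes, means producing three in-between contours for every pair of consecutive MRI frames. The sketch below shows that upsampling with plain linear interpolation, which stands in for the paper's contour-based method (not described in the abstract). The function name and the assumption that both contours are sampled with the same number of corresponding points are illustrative only.

```python
def interpolate_contours(contour_a, contour_b, factor=4):
    """Linearly blend two contours with matching points.

    contour_a, contour_b: lists of (x, y) points of equal length, where
    point i in one contour is assumed to correspond to point i in the other.
    factor: output frames per input frame interval (20 fps / 5 fps = 4).
    Returns contour_a followed by the factor - 1 in-between contours.
    """
    if len(contour_a) != len(contour_b):
        raise ValueError("contours must have the same number of points")
    frames = []
    for step in range(factor):
        t = step / factor  # 0, 1/4, 2/4, 3/4 for factor = 4
        frames.append([
            ((1 - t) * xa + t * xb, (1 - t) * ya + t * yb)
            for (xa, ya), (xb, yb) in zip(contour_a, contour_b)
        ])
    return frames
```

Calling the function on each consecutive pair of frames and concatenating the results yields a sequence with four times as many frames as the input, minus the final frame, which can be appended as is.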

doi: 10.1007/11581772_76

A Scheme for Ball Detection and Tracking in Broadcast Soccer Video

Dawei Liang; Yang Liu; Qingming Huang; Wen Gao

In this paper, we propose a scheme for ball detection and tracking in broadcast soccer video. The scheme alternates between two procedures: ball detection and ball tracking. In the ball detection procedure, ball candidates are first extracted from several consecutive frames using color, shape, and size cues. Then a weighted graph is constructed, with each node representing a candidate and each edge linking two candidates in adjacent frames. Finally, the Viterbi algorithm is employed to extract the optimal path as the ball’s locations. In the ball tracking procedure, Kalman filter-based template matching is used to track the ball in subsequent frames. The Kalman filter and the template are initialized using the detection results. In each tracking step, the ball location is verified to update the template and to guide possible ball re-detection. Experimental results demonstrate that the proposed scheme is promising.

Keywords: Kalman Filter; Optimal Path; Weighted Graph; Viterbi Algorithm; Adjacent Frame.

Pp. 864-875
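
The detection step in the abstract above (a weighted graph over per-frame ball candidates, searched with the Viterbi algorithm) can be sketched as a dynamic program over frames. The weights here are assumptions, not the paper's: each candidate carries a hypothetical `cost` (lower means more ball-like), and each edge is weighted by the Euclidean distance between candidates in adjacent frames.

```python
import math


def best_ball_path(candidates):
    """Return one candidate index per frame forming the cheapest path.

    candidates: list of frames; each frame is a non-empty list of
    (x, y, cost) tuples, where cost is an assumed per-candidate penalty
    (lower means more ball-like).
    """
    # best[i] = cheapest total cost of a path ending at candidate i
    best = [c[2] for c in candidates[0]]
    back = []  # back[t][i] = best predecessor of candidate i in frame t + 1

    for prev, cur in zip(candidates, candidates[1:]):
        new_best, pointers = [], []
        for x, y, cost in cur:
            # Edge weight (assumed): Euclidean distance between positions.
            totals = [
                best[j] + math.hypot(x - px, y - py)
                for j, (px, py, _) in enumerate(prev)
            ]
            j_min = min(range(len(totals)), key=totals.__getitem__)
            new_best.append(totals[j_min] + cost)
            pointers.append(j_min)
        best = new_best
        back.append(pointers)

    # Trace the optimal path backwards from the cheapest final candidate.
    index = min(range(len(best)), key=best.__getitem__)
    path = [index]
    for pointers in reversed(back):
        index = pointers[index]
        path.append(index)
    return path[::-1]
```

Because every candidate in one frame is linked to every candidate in the next, the cost grows with the product of candidate counts in adjacent frames; a real system would likely prune edges between candidates that are too far apart.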

doi: 10.1007/11581772_77

A Shape-Based Retrieval Scheme for Leaf Images

Yunyoung Nam; Eenjun Hwang

Content-based image retrieval (CBIR) usually utilizes image features such as color, shape, and texture. For good retrieval performance, appropriate object features should be selected, well represented, and efficiently evaluated for matching. If images have similar color or texture, as leaf images do, shape-based image retrieval can be more effective than retrieval using color or texture. In this paper, we present an effective and robust leaf image retrieval system based on shape features. For the shape representation, we revise the MPP algorithm in order to reduce the number of points to consider. Moreover, to improve the matching time, we propose a new dynamic matching algorithm based on the Nearest Neighbor search method. We implemented a prototype system and performed various experiments to show its effectiveness. Its performance is compared with other methods including Centroid Contour Distance (CCD), Fourier Descriptor, Curvature Scale Space Descriptor (CSSD), Moment Invariants, and MPP. Experimental results on one thousand leaf images show that our approach achieves better performance than the other methods.

Keywords: Image Retrieval; Query Image; Shape Match; Moment Invariant; Shape Representation.

Pp. 876-887

doi: 10.1007/11581772_78

Lung Detection by Using Geodesic Active Contour Model Based on Characteristics of Lung Parenchyma Region

Chul-Ho Won; Seung-Ik Lee; Dong-Hun Kim; Jin-Ho Cho

In this paper, a curve stopping function based on the CT number of lung parenchyma in CT lung images is proposed to detect the lung region, replacing the conventional edge indication function in the geodesic active contour model. We show numerically, using three kinds of measurement, that the proposed method detects the lung region more effectively than the conventional method. We also verify the effectiveness of our method visually by observing the detection procedure on actual CT images. Because the lung parenchyma region could be precisely detected in actual EBCT lung images, we believe the proposed method could aid early diagnosis of lung disease and local abnormalities of lung function.

Keywords: Lung disease; CT images; geodesic active contour; early diagnosis.

Pp. 888-898

doi: 10.1007/11581772_79

Improved Automatic Liver Segmentation of a Contrast Enhanced CT Image

Kyung-Sik Seo; Jong-An Park

This paper presents an improved automatic liver segmentation method using a left partial histogram threshold (LPHT) algorithm. The LPHT algorithm removes other neighboring abdominal organs regardless of pixel variation in contrast enhanced computed tomography (CE-CT) images. After histogram transformation, an adaptive multi-modal threshold is used to find the range of gray-level values of the liver structure. The LPHT algorithm is then applied to remove other neighboring organs. Next, binary morphological filtering is applied to remove unnecessary objects and smooth the boundary. Forty-eight CE-CT slices from twelve patients were selected to test the proposed automatic liver segmentation. Normalized average area and area error rate were used as evaluation measures. The experimental results show similar performance between the proposed algorithm and manual segmentation by a medical doctor.

Keywords: Liver Region; Contrast Enhanced Compute Tomography; Automatic Segmentation; Liver Structure; Liver Segmentation.

Pp. 899-909
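
Two generic steps named in the abstract above, thresholding to a gray-level range and binary morphological filtering, can be sketched in plain Python. This does not reproduce the LPHT algorithm or the adaptive multi-modal threshold: the fixed `low`/`high` range, the 3x3 square structuring element, and all function names are assumptions chosen for illustration.

```python
def threshold_range(image, low, high):
    """Binary mask of pixels whose gray level lies in [low, high]."""
    return [[1 if low <= v <= high else 0 for v in row] for row in image]


def _morph(mask, keep_if_all):
    """3x3 erosion (keep_if_all=True) or dilation (keep_if_all=False).

    Pixels outside the image are treated as background (0).
    """
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                mask[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0
                for rr in (r - 1, r, r + 1)
                for cc in (c - 1, c, c + 1)
            ]
            out[r][c] = int(all(window) if keep_if_all else any(window))
    return out


def binary_opening(mask):
    """Erosion then dilation: removes regions that cannot hold a 3x3 square."""
    return _morph(_morph(mask, True), False)
```

Opening removes small isolated objects and smooths outward bumps on the boundary; a closing (dilation followed by erosion) would be the matching step for filling small holes.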

doi: 10.1007/11581772_80

Automated Detection of Tumors in Mammograms Using Two Segments for Classification

Mahmoud R. Hejazi; Yo-Sung Ho

The spread pattern of a tumor in medical images is an important factor in classifying the tumor. The spread pattern is generally not considered when only one segment is used for classification. In order to include the spread pattern in tumor analysis, we propose an approach for classifying tumors in mammograms that uses two segments for each mass. The proposed approach is performed in two stages. In the first stage, the system separates segments of the image that may correspond to tumors using a combination of morphological operations and a region growing technique. In the second stage, segmented regions are classified as normal, benign, or malignant tissue based on different measurements. The measurements pertain to shape, intensity variation around the mass, and the spread pattern. Experimental results with mammogram images from the MIAS database show reasonable improvements in the correct detection of possible tumors compared to other approaches.

Keywords: Tumor classification; spread pattern; segmentation; mammogram.

Pp. 910-921
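
The first stage in the abstract above relies on a region growing technique. A minimal sketch of region growing follows, assuming a 4-connected neighborhood and a simple criterion that compares each pixel with the seed's gray level; the function name, the `tolerance` parameter, and the criterion are illustrative assumptions, not the paper's settings.

```python
from collections import deque


def region_grow(image, seed, tolerance):
    """Grow a 4-connected region from seed.

    image: 2-D list of gray levels; seed: (row, col) start pixel.
    A neighbor joins the region when its gray level differs from the
    seed's gray level by at most tolerance (an assumed, simple criterion).
    Returns the set of (row, col) pixels in the region.
    """
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and abs(image[nr][nc] - seed_value) <= tolerance):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region
```

Running the function twice from the same seed with a tight and a loose tolerance is one hypothetical way to obtain an inner and an outer segment for a mass; whether the paper derives its two segments this way is not stated in the abstract.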