Publications catalog - books

Image and Video Retrieval: 4th International Conference, CIVR 2005, Singapore, July 20-22, 2005, Proceedings

Wee-Kheng Leow ; Michael S. Lew ; Tat-Seng Chua ; Wei-Ying Ma ; Lekha Chaisorn ; Erwin M. Bakker (eds.)

Conference: 4th International Conference on Image and Video Retrieval (CIVR). Singapore, Singapore. July 20, 2005 - July 22, 2005

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Computer Graphics; Information Storage and Retrieval; Database Management; Information Systems Applications (incl. Internet); Multimedia Information Systems; Image Processing and Computer Vision

Availability

Detected institution	Publication year	Browse	Download	Request
Not detected	2005	SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-27858-0

Electronic ISBN

978-3-540-31678-7

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

2005

Publication rights information

Subject coverage

Computer and information sciences

Table of contents

Check that your institution gives you access to download or request the full book or any of its chapters.

doi: 10.1007/11526346_11

EMD-Based Video Clip Retrieval by Many-to-Many Matching

Yuxin Peng; Chong-Wah Ngo

This paper presents a new approach for video clip retrieval based on Earth Mover’s Distance (EMD). Instead of imposing a one-to-one matching constraint as in [11, 14], our approach allows many-to-many matching and is capable of tolerating errors due to video partitioning and various video editing effects. We formulate clip-based retrieval as a graph matching problem in two stages. In the first stage, to allow matching between a query and a long video, an online clip segmentation algorithm is employed to rapidly locate candidate clips for similarity measurement. In the second stage, a weighted graph is constructed to model the similarity between two clips. EMD is proposed to compute the minimum cost of the weighted graph as the similarity between two clips. Experimental results show that the proposed approach is better than some existing methods in terms of ranking capability.

- Video Retrieval Techniques | Pp. 71-81
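
As a rough illustration of the many-to-many matching idea in the abstract above, the following Python sketch computes an Earth Mover's Distance between two weighted sets by solving the underlying transportation problem with a small min-cost-flow routine. It is a sketch under stated assumptions, not the authors' implementation: the function name `emd`, the use of integer weights (for example shot lengths in frames), and the toy cost matrix are all hypothetical.

```python
import math

def emd(supply, demand, cost):
    """Earth Mover's Distance between two weighted sets (hypothetical helper).

    supply, demand: positive integer weights, e.g. shot lengths in frames.
    cost[i][j]: non-negative dissimilarity between item i and item j.
    """
    n, m = len(supply), len(demand)
    source, sink = n + m, n + m + 1
    # Each edge is [target, residual capacity, cost, index of reverse edge].
    graph = [[] for _ in range(n + m + 2)]

    def add_edge(u, v, cap, c):
        graph[u].append([v, cap, c, len(graph[v])])
        graph[v].append([u, 0, -c, len(graph[u]) - 1])

    for i in range(n):
        add_edge(source, i, supply[i], 0.0)
        for j in range(m):
            # Any item may send weight to any item: many-to-many matching.
            add_edge(i, n + j, supply[i], cost[i][j])
    for j in range(m):
        add_edge(n + j, sink, demand[j], 0.0)

    total_flow = min(sum(supply), sum(demand))
    flow, total_cost = 0, 0.0
    while flow < total_flow:
        # Bellman-Ford shortest path in the residual graph.
        dist = [math.inf] * len(graph)
        prev = [None] * len(graph)
        dist[source] = 0.0
        for _ in range(len(graph) - 1):
            updated = False
            for u in range(len(graph)):
                if dist[u] == math.inf:
                    continue
                for k, (v, cap, c, _) in enumerate(graph[u]):
                    if cap > 0 and dist[u] + c < dist[v] - 1e-12:
                        dist[v] = dist[u] + c
                        prev[v] = (u, k)
                        updated = True
            if not updated:
                break
        # Find the bottleneck capacity along the path, then push flow.
        push = total_flow - flow
        v = sink
        while v != source:
            u, k = prev[v]
            push = min(push, graph[u][k][1])
            v = u
        v = sink
        while v != source:
            u, k = prev[v]
            graph[u][k][1] -= push
            graph[v][graph[u][k][3]][1] += push
            v = u
        flow += push
        total_cost += push * dist[sink]
    return total_cost / total_flow

# Toy example: two clips with two shots each; weights are shot lengths.
print(emd([2, 1], [1, 2], [[0.0, 1.0], [1.0, 0.0]]))
```

Because weight from one shot can be split across several shots of the other clip, a shot that was over-segmented or merged during partitioning can still be matched in full, which a one-to-one assignment would not allow.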

doi: 10.1007/11526346_12

Visual Cue Cluster Construction via Information Bottleneck Principle and Kernel Density Estimation

Winston H. Hsu; Shih-Fu Chang

Recent research in video analysis has shown a promising direction, in which mid-level features (e.g., people, anchor, indoor) are abstracted from low-level features (e.g., color, texture, motion) and used for discriminative classification of semantic labels. However, in most systems, such mid-level features are selected manually. In this paper, we propose an information-theoretic framework, visual cue cluster construction (VC), to automatically discover adequate mid-level features. The problem is posed as mutual information maximization, through which optimal cue clusters are discovered to preserve the highest information about the semantic labels. We extend the Information Bottleneck framework to high-dimensional continuous features and further propose a projection method to map each video into probabilistic memberships over all the cue clusters. The biggest advantage of the proposed approach is that it removes the dependence on the manual process of choosing the mid-level features and the huge labor cost involved in annotating the training corpus for training the detector of each mid-level feature. The proposed VC framework is general and effective, leading to exciting potential in solving other problems of semantic video analysis. When tested in news video story segmentation, the proposed approach achieves promising performance gain over representations derived from conventional clustering techniques and even the mid-level features selected manually.

- Video Story Segmentation and Event Detection | Pp. 82-91
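
The mutual-information criterion mentioned in the abstract above can be illustrated with a small discrete example. The Python sketch below computes I(C; Y) between cluster assignments C and semantic labels Y and shows how much information is lost when two clusters are merged. It is a simplified illustration only: the paper works with high-dimensional continuous features and kernel density estimation, whereas the joint table, the function names `mutual_information` and `merge_clusters`, and the numbers here are assumptions made for the example.

```python
import math

def mutual_information(joint):
    """I(C; Y) in bits from a joint probability table joint[c][y]."""
    p_c = [sum(row) for row in joint]          # marginal over clusters
    p_y = [sum(col) for col in zip(*joint)]    # marginal over labels
    mi = 0.0
    for c, row in enumerate(joint):
        for y, p in enumerate(row):
            if p > 0.0:
                mi += p * math.log2(p / (p_c[c] * p_y[y]))
    return mi

def merge_clusters(joint, a, b):
    """Return a new joint table in which clusters a and b are merged."""
    merged = [x + y for x, y in zip(joint[a], joint[b])]
    rest = [row for i, row in enumerate(joint) if i not in (a, b)]
    return rest + [merged]

# Toy joint distribution: 3 clusters x 2 labels (hypothetical numbers).
joint = [
    [0.30, 0.05],
    [0.25, 0.05],
    [0.05, 0.30],
]

full = mutual_information(joint)
similar = mutual_information(merge_clusters(joint, 0, 1))
dissimilar = mutual_information(merge_clusters(joint, 0, 2))
print(full, similar, dissimilar)
```

Merging clusters 0 and 1, whose label distributions are similar, keeps almost all of the information about the labels, while merging clusters 0 and 2 discards most of it. A clustering procedure guided by this quantity would therefore prefer the first merge.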

doi: 10.1007/11526346_13

Story Segmentation in News Videos Using Visual and Text Cues

Yun Zhai; Alper Yilmaz; Mubarak Shah

In this paper, we present a framework for segmenting the news programs into different story topics. The proposed method utilizes both visual and text information of the video. We represent the news video by a Shot Connectivity Graph (SCG), where the nodes in the graph represent the shots in the video, and the edges between nodes represent the transitions between shots. The cycles in the graph correspond to the story segments in the news program. We first detect the cycles in the graph by finding the anchor persons in the video. This provides us with the coarse segmentation of the news video. The initial segmentation is later refined by the detections of the weather and sporting news, and the merging of similar stories. For the weather detection, the global color information of the images and the motion of the shots are considered. We have used the text obtained from automatic speech recognition (ASR) for detecting the potential sporting shots to form the sport stories. Adjacent stories with similar semantic meanings are further merged based on the visual and text similarities. The proposed framework has been tested on a widely used data set provided by NIST, which contains the ground truth of the story boundaries, and competitive evaluation results have been obtained.

- Video Story Segmentation and Event Detection | Pp. 92-102

doi: 10.1007/11526346_14

Boundary Error Analysis and Categorization in the TRECVID News Story Segmentation Task

Joaquim Arlandis; Paul Over; Wessel Kraaij

In this paper, an error analysis based on boundary error popularity (frequency) including semantic boundary categorization is applied in the context of the news story segmentation task from TRECVID. Clusters of systems were defined based on the input resources they used including video, audio and automatic speech recognition. A cross-popularity specific index was used to measure boundary error popularity across clusters, which allowed goal-driven selection of boundaries to be categorized. A wide set of boundaries was viewed and a summary of the error types is presented. This framework allowed conclusions about the behavior of resource-based clusters in the context of news story segmentation.

- Video Story Segmentation and Event Detection | Pp. 103-112

doi: 10.1007/11526346_15

Semantic Event Detection in Structured Video Using Hybrid HMM/SVM

Tae Meon Bae; Cheon Seog Kim; Sung Ho Jin; Ki Hyun Kim; Yong Man Ro

In this paper, we propose a new semantic event detection algorithm for structured video. A hybrid method that combines HMM with SVM to detect semantic events in video is proposed. The proposed detection method has two advantages: it suits the temporal structure of events thanks to Hidden Markov Models (HMM), and it guarantees high classification accuracy thanks to Support Vector Machines (SVM). The performance of the proposed method is compared with that of an HMM-based method and shows an increase in both recall and precision of semantic event detection.

- Video Story Segmentation and Event Detection | Pp. 113-122

doi: 10.1007/11526346_16

What Can Expressive Semantics Tell: Retrieval Model for a Flash-Movie Search Engine

Dawei Ding; Jun Yang; Qing Li; Wenyin Liu; Liping Wang

Flash, as a multimedia format, is becoming more and more popular on the Web. However, previous work on Flash is impractical for building a content-based Flash search engine. To address this problem, our paper proposes expressive semantics (ETS model) for bridging the gap between low-level features and user queries. A Flash search engine is built based on the expressive semantics of Flash movies, and our experimental results confirm that expressive semantics is a promising approach to understanding and hence searching Flash movies more efficiently.

- Semantics in Video Retrieval | Pp. 123-133

doi: 10.1007/11526346_17

The Use and Utility of High-Level Semantic Features in Video Retrieval

Michael G. Christel; Alexander G. Hauptmann

This paper investigates the applicability of high-level semantic features for video retrieval using the benchmarked data from TRECVID 2003 and 2004, addressing the contributions of features like outdoor, face, and animal in retrieval, and whether users can correctly decide which features to apply for a given need. Pooled truth data gives evidence that some topics would benefit from features. A study with 12 subjects found that people often disagree on the relevance of a feature to a particular topic, including disagreement within the 8% of positive feature-topic associations strongly supported by truth data. When subjects concur, their judgments are correct, and for those 51 topic-feature pairings identified as significant we conduct an investigation into the best interactive search submissions showing that for 29 pairs, topic performance would have improved had users had access to ideal classifiers for those features. The benefits derive from generic features applied to generic topics (27 pairs), and in one case a specific feature applied to a specific topic. Re-ranking submitted shots based on features shows promise for automatic search runs, but not for interactive runs where a person already took care to rank shots well.

- Semantics in Video Retrieval | Pp. 134-144

doi: 10.1007/11526346_18

Efficient Shape Indexing Using an Information Theoretic Representation

Eric Spellman; Baba C. Vemuri

Efficient retrieval often requires an indexing structure on the database in question. We present an indexing scheme for cases when the dissimilarity measure is the Kullback-Leibler (KL) divergence. Devising such a scheme is difficult because the KL-divergence is not a metric, failing to satisfy the triangle inequality or even finiteness in general. We define an optimal representative of a set of distributions to serve as the basis of such an indexing structure. This representative, dubbed the , minimizes the worst case KL-divergence from it to the elements of its set. This, along with a lower bound on the KL-divergence from the query to the elements of a set, allows us to prune the search, increasing efficiency while guaranteeing that we never discard the nearest neighbors. We present results of querying the Princeton Shape Database which show significant speed-ups over an exhaustive search and over an analogous approach using a more mundane representative.

- Image Indexing and Retrieval | Pp. 145-153
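
The claim in the abstract above that the KL-divergence is not a metric can be checked on small discrete distributions. The following Python sketch is illustrative only; the function name `kl_divergence` and the example distributions are assumptions, not taken from the paper.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions given as equal-length lists."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue             # 0 * log(0 / q) is taken as 0 by convention
        if qi == 0.0:
            return math.inf      # p has mass where q has none: infinite divergence
        total += pi * math.log(pi / qi)
    return total

# Asymmetry and non-finiteness.
a = [0.5, 0.5, 0.0]
b = [0.9, 0.05, 0.05]
print(kl_divergence(a, b), kl_divergence(b, a))

# Triangle inequality check: compare KL(p || r) with KL(p || q) + KL(q || r).
p = [0.5, 0.5]
q = [0.25, 0.75]
r = [0.1, 0.9]
direct = kl_divergence(p, r)
via_q = kl_divergence(p, q) + kl_divergence(q, r)
print(direct, via_q)
```

In this example the divergence from `a` to `b` is finite while the divergence from `b` to `a` is infinite, and the direct divergence from `p` to `r` exceeds the sum of the two legs through `q`. Metric-tree pruning rules that rely on symmetry or the triangle inequality therefore do not apply directly, which is the difficulty the abstract describes.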

doi: 10.1007/11526346_19

Efficient Compressed Domain Target Image Search and Retrieval

Javier Bracamonte; Michael Ansorge; Fausto Pellandini; Pierre-André Farine

In this paper we introduce a low complexity and accurate technique for target image search and retrieval. This method, which operates directly in the compressed JPEG domain, addresses two of the CBIR challenges stated by The Benchathlon Network regarding the search of a specific image: finding out if an exact same image exists in a database, and identifying this occurrence even when the database image has been compressed with a different coding bit-rate. The proposed technique can be applied in feature-containing or featureless image collections, and thus it is also suitable to search for image copies that might exist on the Web for law enforcement of copyrighted material. The reported method exploits the fact that the phase of the Discrete Cosine Transform coefficients contains a significant amount of information of a transformed image. By processing only the phase part of these coefficients, a simple, fast, and accurate target image search and retrieval technique is achieved.

- Image Indexing and Retrieval | Pp. 154-163

doi: 10.1007/11526346_20

An Effective Multi-dimensional Index Strategy for Cluster Architectures

Li Wu; Timo Bretschneider

Most modern image database systems employ content-based image retrieval techniques and various multi-dimensional indexing structures to speed up the query performance. While the first aspect ensures an intuitive retrieval for the user, the latter guarantees an efficient handling of huge data amounts. However, beyond a system inherent threshold only the simultaneous parallelisation of the indexing structure can improve the system’s performance. In such an approach one of the key factors is the de-clustering of the data. To tackle the highlighted issues, this paper proposes an effective multi-dimensional index strategy with de-clustering based on the vantage point tree with suitable similarity measure for content-based retrieval. The conducted experiments show the effective and efficient behaviour for an actual image database.

- Image Indexing and Retrieval | Pp. 164-173
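
For readers unfamiliar with the vantage point tree named in the abstract above, the Python sketch below builds one on a single machine and runs a nearest-neighbour query with triangle-inequality pruning. It does not cover the de-clustering or parallelisation that the paper proposes; the class and function names (`VPNode`, `build`, `nearest`) and the choice of Euclidean distance are assumptions made for illustration.

```python
import random

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

class VPNode:
    def __init__(self, point, radius, inside, outside):
        self.point = point      # vantage point stored at this node
        self.radius = radius    # median distance from the vantage point
        self.inside = inside    # subtree of points closer than radius
        self.outside = outside  # subtree of points at distance >= radius

def build(points, dist=euclidean, rng=None):
    """Recursively build a vantage point tree from a list of points."""
    rng = rng or random.Random(0)
    if not points:
        return None
    points = list(points)
    vantage = points.pop(rng.randrange(len(points)))
    if not points:
        return VPNode(vantage, 0.0, None, None)
    distances = sorted(dist(vantage, p) for p in points)
    radius = distances[len(distances) // 2]
    inside = [p for p in points if dist(vantage, p) < radius]
    outside = [p for p in points if dist(vantage, p) >= radius]
    return VPNode(vantage, radius,
                  build(inside, dist, rng), build(outside, dist, rng))

def nearest(node, query, dist=euclidean, best=None):
    """Return (distance, point) for the nearest neighbour of query."""
    if node is None:
        return best
    d = dist(query, node.point)
    if best is None or d < best[0]:
        best = (d, node.point)
    # Search the side that contains the query first.
    if d < node.radius:
        first, second = node.inside, node.outside
    else:
        first, second = node.outside, node.inside
    best = nearest(first, query, dist, best)
    # By the triangle inequality, the other side can hold a closer point
    # only if the current best ball reaches across the radius boundary.
    if abs(d - node.radius) <= best[0]:
        best = nearest(second, query, dist, best)
    return best

rng = random.Random(1)
points = [(rng.random(), rng.random()) for _ in range(200)]
tree = build(points)
print(nearest(tree, (0.5, 0.5)))
```

The pruning step is only valid when the distance function is a true metric, which is why the abstract stresses a suitable similarity measure.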