Publications catalog - books
Image and Video Retrieval: 5th International Conference, CIVR 2006, Tempe, AZ, USA, July 13-15, 2006, Proceedings
Hari Sundaram ; Milind Naphade ; John R. Smith ; Yong Rui (eds.)
Conference: 5th International Conference on Image and Video Retrieval (CIVR), Tempe, AZ, USA, July 13-15, 2006
Abstract/Description – provided by the publisher
Not available.
Keywords – provided by the publisher
Computer Graphics; Information Storage and Retrieval; Database Management; Information Systems Applications (incl. Internet); Multimedia Information Systems; Image Processing and Computer Vision
Availability
| Detected institution | Publication year | Browse | Download | Request |
|---|---|---|---|---|
| Not detected | 2006 | SpringerLink | | |
Information
Resource type:
books
Printed ISBN
978-3-540-36018-6
Electronic ISBN
978-3-540-36019-3
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2006
Publication rights information
© Springer-Verlag Berlin Heidelberg 2006
Subject coverage
Table of contents
doi: 10.1007/11788034_31
Using Topic Concepts for Semantic Video Shots Classification
Stéphane Ayache; Georges Quénot; Jérôme Gensel; Shin’ichi Satoh
Automatic semantic classification of video databases is very useful for user searching and browsing, but it is also a very challenging research problem. Combining the visual and text modalities is one of the key issues in bridging the semantic gap between signal and semantics. In this paper, we propose to enhance the classification of high-level concepts using intermediate topic concepts, and we study various fusion strategies that combine topic concepts with visual features in order to outperform unimodal classifiers. We have conducted several experiments on the TRECVID’05 collection and show here that several intermediate topic classifiers can bridge parts of the semantic gap and help to detect high-level concepts.
- Session P1: Poster I | Pp. 300-309
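The fusion strategies studied in this entry can be illustrated with a minimal late-fusion sketch. The weighting scheme, parameter `alpha`, and function name below are illustrative assumptions, not the authors' exact method:

```python
def late_fusion(visual_score, topic_scores, alpha=0.5):
    """Fuse a visual classifier score with intermediate topic-concept
    classifier scores via a weighted average. `alpha` weights the
    visual modality; the topic scores are averaged (an assumption)."""
    topic_avg = sum(topic_scores) / len(topic_scores)
    return alpha * visual_score + (1 - alpha) * topic_avg

# e.g. a visual score of 0.8 combined with topic scores 0.4 and 0.6
fused = late_fusion(0.8, [0.4, 0.6], alpha=0.5)
```

Other fusion strategies in the paper's family (early fusion, stacking a classifier on concatenated scores) would replace the weighted average with a learned combiner.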
doi: 10.1007/11788034_32
A Multi-feature Optimization Approach to Object-Based Image Classification
Qianni Zhang; Ebroul Izquierdo
This paper proposes a novel approach to the construction and use of multi-feature spaces in image classification. The proposed technique combines low-level descriptors and defines suitable metrics. It aims at representing and measuring similarity between semantically meaningful objects within the defined multi-feature space. The approach finds the best linear combination of predefined visual descriptor metrics using a Multi-Objective Optimization technique. The obtained metric is then used to fuse multiple non-linear descriptors and is applied in image classification.
Keywords: Semantic Concept; Semantic Object; Single Descriptor; Representative Block; Elementary Building Block.
- Session P1: Poster I | Pp. 310-319
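The core idea of this entry (a linear combination of descriptor metrics tuned for class separation) can be sketched as follows. An exhaustive grid search stands in for the paper's Multi-Objective Optimization technique; the margin criterion and grid are assumptions:

```python
from itertools import product

def combined_distance(weights, metric_values):
    # metric_values: per-descriptor distances between two image blocks
    return sum(w * d for w, d in zip(weights, metric_values))

def best_weights(pairs_same, pairs_diff, grid=(0.0, 0.5, 1.0)):
    """Search for descriptor weights that maximize the gap between the
    largest same-object distance and the smallest different-object
    distance. A toy stand-in for the paper's optimizer."""
    best, best_margin = None, float("-inf")
    for w in product(grid, repeat=len(pairs_same[0])):
        if sum(w) == 0:
            continue  # skip the degenerate all-zero metric
        same = max(combined_distance(w, p) for p in pairs_same)
        diff = min(combined_distance(w, p) for p in pairs_diff)
        margin = diff - same
        if margin > best_margin:
            best, best_margin = w, margin
    return best
```

Here the first descriptor is discriminative and the second is not, so the search should assign all weight to the first.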
doi: 10.1007/11788034_33
Eliciting Perceptual Ground Truth for Image Segmentation
Victoria Hodge; Garry Hollier; John Eakins; Jim Austin
In this paper, we investigate human visual perception and establish a body of ground truth data elicited from human visual studies. We aim to build on the formative work of Ren, Eakins and Briggs, who produced an initial ground truth database. Human participants were asked to draw and rank their perceptions of the parts of a series of figurative images. These rankings were then used to score the perceptions, identify the preferred human breakdowns, and thus allow us to induce perceptual rules for the human decomposition of figurative images. The results suggest that the human breakdowns follow well-known perceptual principles, in particular the Gestalt laws.
Keywords: Image Segmentation; Illusory Contour; Image Component; Perception Change; Human Visual Perception.
- Session P1: Poster I | Pp. 320-329
doi: 10.1007/11788034_34
Asymmetric Learning and Dissimilarity Spaces for Content-Based Retrieval
Eric Bruno; Nicolas Moenne-Loccoz; Stéphane Marchand-Maillet
This paper presents a novel dissimilarity space specially designed for interactive multimedia retrieval. Given queries made of positive and negative examples, the goal is to learn the positive class distribution. This classification problem is known to be asymmetric, i.e. the negative class does not cluster in the original feature spaces. We introduce here the idea of a Query-based Dissimilarity Space (QDS), which copes with the asymmetric setup by converting it into a more classical 2-class problem. The proposed approach is evaluated on both artificial data and a real image database, and compared with state-of-the-art algorithms.
Keywords: Feature Space; Image Retrieval; Average Precision; Relevance Feedback; Positive Class.
- Session P2: Poster II | Pp. 330-339
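The dissimilarity-space construction this entry describes can be sketched simply: each sample is re-represented by its distances to the positive query examples, after which any standard 2-class learner applies. The Euclidean metric below is an assumption; the paper's exact distance may differ:

```python
def euclidean(a, b):
    """Plain Euclidean distance between two feature vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5

def to_dissimilarity_space(x, positives, dist=euclidean):
    """Represent sample x by its dissimilarities to the positive query
    examples -- the core idea of a query-based dissimilarity space."""
    return [dist(x, p) for p in positives]

# a sample close to one positive example and far from another
vec = to_dissimilarity_space((0, 0), [(3, 4), (0, 0)])
```

In this space the (non-clustering) negative class simply occupies the large-distance region, turning the asymmetric problem into an ordinary binary one.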
doi: 10.1007/11788034_35
Video Navigation Based on Self-Organizing Maps
Thomas Bärecke; Ewa Kijak; Andreas Nürnberger; Marcin Detyniecki
Content-based video navigation is an efficient method for browsing video information. A common approach is to cluster shots into groups and then visualize them. In this paper, we present a prototype that broadly follows this approach. The clustering ignores temporal information and is based on a growing self-organizing map (SOM) algorithm. SOMs provide inherent visualization properties; for example, similar elements can easily be found in adjacent cells. We focus on studying the applicability of SOMs for video navigation support. We complement our interface with an original time bar control that provides an integrated view of time- and content-based information at the same time. The aim is to supply the user with as much information as possible on a single screen, without overwhelming them.
Keywords: Video Content; Colour Histogram; Winner Neuron; Temporal Segmentation; Shot Boundary Detection.
- Session P2: Poster II | Pp. 340-349
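A minimal fixed-size SOM conveys the clustering idea behind this entry (the paper uses a *growing* variant, and its features, learning rate, and neighbourhood function here are assumptions):

```python
import random

def winner(grid, x):
    """Winner neuron: the cell whose weight vector is closest to x."""
    return min(grid, key=lambda rc: sum((w - v) ** 2
                                        for w, v in zip(grid[rc], x)))

def train_som(data, rows, cols, iters=200, lr=0.3, seed=0):
    """Train a tiny SOM on shot feature vectors (e.g. colour
    histograms). Neighbouring cells end up holding similar shots,
    which is what makes the map useful for navigation."""
    rng = random.Random(seed)
    dim = len(data[0])
    grid = {(r, c): [rng.random() for _ in range(dim)]
            for r in range(rows) for c in range(cols)}
    for _ in range(iters):
        x = rng.choice(data)
        win = winner(grid, x)
        for rc, w in grid.items():
            # update shrinks with grid (Manhattan) distance to the winner
            d = abs(rc[0] - win[0]) + abs(rc[1] - win[1])
            h = lr * (0.5 ** d)
            grid[rc] = [wi + h * (xi - wi) for wi, xi in zip(w, x)]
    return grid

# two visually distinct "clusters" of shots land in different cells
shots = [[0.0, 0.0], [0.05, 0.0], [1.0, 1.0], [0.95, 1.0]]
som = train_som(shots, rows=1, cols=2)
```

A growing SOM would additionally insert new cells where quantization error is high instead of fixing the grid size in advance.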
doi: 10.1007/11788034_36
Fuzzy SVM Ensembles for Relevance Feedback in Image Retrieval
Yong Rao; Padma Mundur; Yelena Yesha
Relevance feedback has been integrated into content-based retrieval systems to overcome the semantic gap problem. Recently, Support Vector Machines (SVMs) have been widely used to learn users’ semantic query concepts from their feedback. The feedback is either ‘relevant’ or ‘irrelevant’, which forces users to make a binary decision during each retrieval iteration. However, human perception of visual content is quite subjective, and therefore the notion of whether or not an image is relevant is rather vague and hard to define. Part of the small-training-sample problem faced by traditional SVMs can be thought of as the result of this strict binary decision-making. In this paper, we propose a Fuzzy SVM technique to overcome the small-sample problem. Using a Fuzzy SVM, each sample can be assigned a fuzzy membership to model users’ feedback gradually from ‘irrelevant’ to ‘relevant’ instead of strict binary labeling. We also propose using Fuzzy SVM ensembles to further improve the classification results. We conduct extensive experiments to evaluate the performance of the proposed algorithm. Compared to experimental results using traditional SVMs, we demonstrate that our approach can significantly improve the retrieval performance of semantic image retrieval.
Keywords: Image Retrieval; Average Precision; Fuzzy Membership; Relevance Feedback; Fuzzy Membership Function.
- Session P2: Poster II | Pp. 350-359
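The fuzzy-membership idea can be sketched without a full SVM solver. Below, graded feedback is mapped to a membership in (0, 1), and a weighted logistic learner stands in for the Fuzzy SVM (in the real method the membership scales each sample's slack penalty); the sigmoid mapping and learner are assumptions for illustration:

```python
import math

def fuzzy_membership(feedback, k=4.0):
    """Map graded user feedback in [-1, 1] (-1 = irrelevant,
    1 = relevant) to a soft membership instead of a hard binary label."""
    return 1.0 / (1.0 + math.exp(-k * feedback))

def weighted_fit(samples, labels, memberships, lr=0.1, iters=500):
    """Stand-in learner: logistic regression whose per-sample gradient
    is scaled by the fuzzy membership, mimicking how a Fuzzy SVM
    scales each sample's penalty term."""
    dim = len(samples[0])
    w = [0.0] * (dim + 1)  # last entry is the bias
    for _ in range(iters):
        for x, y, m in zip(samples, labels, memberships):
            z = sum(wi * xi for wi, xi in zip(w, x)) + w[-1]
            p = 1.0 / (1.0 + math.exp(-z))
            g = m * (p - y)  # membership-weighted gradient
            w = [wi - lr * g * xi for wi, xi in zip(w, x)] + [w[-1] - lr * g]
    return w
```

An ensemble, as in the paper, would train several such learners on resampled feedback and average their scores.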
doi: 10.1007/11788034_37
Video Mining with Frequent Itemset Configurations
Till Quack; Vittorio Ferrari; Luc Van Gool
We present a method for mining frequently occurring objects and scenes from videos. Object candidates are detected by finding recurring spatial arrangements of affine covariant regions. Our mining method is based on the class of frequent itemset mining algorithms, which have proven their efficiency in other domains but have not been applied to video mining before. In this work we show how to express vector-quantized features and their spatial relations as itemsets. Furthermore, a fast motion segmentation method is introduced as an attention filter for the mining algorithm. Results are shown on real-world data consisting of music video clips.
- Session P2: Poster II | Pp. 360-369
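Plain Apriori, a member of the frequent-itemset family this entry builds on, can be sketched in a few lines. Here each transaction is assumed to be the set of vector-quantized visual words occurring in one region neighbourhood:

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Apriori: find all itemsets occurring in at least `min_support`
    transactions, growing candidates level by level."""
    items = sorted({i for t in transactions for i in t})
    frequent, k = {}, 1
    current = [frozenset([i]) for i in items]
    while current:
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # join frequent k-itemsets into candidate (k+1)-itemsets
        keys = list(level)
        current = list({a | b for a, b in combinations(keys, 2)
                        if len(a | b) == k + 1})
        k += 1
    return frequent

quantized = [{'a', 'b', 'c'}, {'a', 'b'}, {'a', 'c'}, {'b'}]
found = frequent_itemsets(quantized, min_support=2)
```

The paper's contribution is the encoding step (spatial relations of quantized features as items), not the miner itself, so any off-the-shelf frequent itemset algorithm would slot in here.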
doi: 10.1007/11788034_38
Using High-Level Semantic Features in Video Retrieval
Wujie Zheng; Jianmin Li; Zhangzhang Si; Fuzong Lin; Bo Zhang
Extraction and utilization of high-level semantic features are critical for more effective video retrieval. However, the performance of video retrieval has not benefited much despite advances in high-level feature extraction. To make good use of high-level semantic features in video retrieval, we present a method called the pointwise mutual information weighted scheme (PMIWS). The method makes a good judgment of the relevance of all the semantic features to the queries, taking the characteristics of semantic features into account. The method can also be extended to the fusion of multiple modalities. Experimental results based on the TRECVID2005 corpus demonstrate the effectiveness of the method.
Keywords: Semantic Feature; Mean Average Precision; Video Retrieval; Text Retrieval; Multimedia Document.
- Session P2: Poster II | Pp. 370-379
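The pointwise mutual information at the heart of PMIWS can be estimated from co-occurrence counts over shots. The shot representation below (a dict with `features` and `text` sets) is an illustrative assumption:

```python
import math

def pmi(feature, query_term, shots):
    """PMI between a detected semantic feature and a query term:
    log( P(feature, term) / (P(feature) * P(term)) ), estimated from
    shot-level co-occurrence counts. Returns 0.0 when undefined."""
    n = len(shots)
    n_f = sum(1 for s in shots if feature in s['features'])
    n_q = sum(1 for s in shots if query_term in s['text'])
    n_fq = sum(1 for s in shots
               if feature in s['features'] and query_term in s['text'])
    if n_f == 0 or n_q == 0 or n_fq == 0:
        return 0.0
    return math.log((n_fq * n) / (n_f * n_q))
```

A positive PMI then serves as the weight of that semantic feature when scoring shots for the query; features independent of the query get weight near zero.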
doi: 10.1007/11788034_39
Recognizing Objects and Scenes in News Videos
Muhammet Baştan; Pınar Duygulu
We propose a new approach to recognizing objects and scenes in news videos, motivated by the availability of large video collections. This approach considers the recognition problem as the translation of visual elements to words. The correspondences between visual elements and words are learned using methods adapted from statistical machine translation and used to predict words for particular image regions (region naming), for entire images (auto-annotation), or to associate the automatically generated speech transcript text with the correct video frames (video alignment). Experimental results are presented on the TRECVID 2004 data set, which consists of about 150 hours of news videos associated with manual annotations and speech transcript text. The results show that retrieval performance can be improved by associating visual and textual elements. Also, an extensive analysis of features is provided and a method to combine features is proposed.
Keywords: Automatic Speech Recognition; Manual Annotation; Mean Average Precision; News Video; Statistical Machine Translation.
- Session P2: Poster II | Pp. 380-390
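The visual-to-word translation idea can be sketched with a simplified IBM Model 1 EM loop, in the spirit of the statistical machine translation methods the paper adapts (the exact model and data format here are assumptions):

```python
from collections import defaultdict

def train_translation(pairs, iters=10):
    """EM for a word-given-region translation table. `pairs` are
    (visual_words, caption_words) for annotated frames; returns
    t[(word, region)] ~ p(word | region)."""
    words = {w for _, ws in pairs for w in ws}
    t = defaultdict(lambda: 1.0 / len(words))  # uniform init
    for _ in range(iters):
        count = defaultdict(float)
        total = defaultdict(float)
        for regions, caption in pairs:  # E-step: soft alignments
            for w in caption:
                z = sum(t[(w, r)] for r in regions)
                for r in regions:
                    c = t[(w, r)] / z
                    count[(w, r)] += c
                    total[r] += c
        for (w, r), c in count.items():  # M-step: renormalize
            t[(w, r)] = c / total[r]
    return t

# 'tiger' only ever co-occurs with region word R2, so EM aligns them
tt = train_translation([({'R1'}, {'grass'}),
                        ({'R1', 'R2'}, {'grass', 'tiger'})])
```

Region naming then picks, for each region, the words with the highest translation probability; auto-annotation aggregates these over a whole image.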
doi: 10.1007/11788034_40
Face Retrieval in Broadcasting News Video by Fusing Temporal and Intensity Information
Duy-Dinh Le; Shin’ichi Satoh; Michael E. Houle
Human faces play an important role in efficiently indexing and accessing video content, especially broadcast news video. However, face appearance in real environments exhibits many variations, such as pose changes, facial expressions, aging, illumination changes, low resolution and occlusion, making it difficult for current state-of-the-art face recognition techniques to obtain reasonable retrieval results. To handle this problem, this paper proposes an efficient retrieval method that integrates temporal information with facial intensity information. First, representative faces are quickly generated by using facial intensities to organize the face dataset into clusters. Next, temporal information is introduced to reorganize cluster memberships so as to improve overall retrieval performance. For scalability and efficiency, the clustering is based on a recently proposed model involving correlations among relevant sets (neighborhoods) of data items. Neighborhood queries are handled using an approximate search index. Experiments on the 2005 TRECVID dataset show promising results.
Keywords: Face Recognition; Temporal Information; Face Appearance; News Video; Cluster Candidate.
- Session P2: Poster II | Pp. 391-400
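The second stage of this entry (reorganizing cluster memberships with temporal information) can be sketched as a voting rule: a face tends to belong with faces from temporally close shots, since the same person often recurs within a short span of a newscast. The data layout and voting rule below are hedged assumptions, not the paper's exact procedure:

```python
def temporal_reassign(faces, clusters, window=2):
    """Reassign each face to the cluster holding the most faces within
    `window` shots of it. `faces` maps face id -> shot index;
    `clusters` maps face id -> cluster id."""
    new = {}
    cluster_ids = set(clusters.values())
    for f, shot in faces.items():
        votes = {c: 0 for c in cluster_ids}
        for g, c in clusters.items():
            if g != f and abs(faces[g] - shot) <= window:
                votes[c] += 1
        best = max(votes, key=votes.get)
        # keep the original cluster when no temporal neighbour exists
        new[f] = best if votes[best] > 0 else clusters[f]
    return new
```

In the paper, this intensity-then-temporal refinement is made scalable by answering the neighborhood queries through an approximate search index rather than exhaustive comparison.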