Publications catalog - books

Image and Video Retrieval: 5th International Conference, CIVR 2006, Tempe, AZ, USA, July 13-15, 2006, Proceedings

Hari Sundaram ; Milind Naphade ; John R. Smith ; Yong Rui (eds.)

Conference: 5th International Conference on Image and Video Retrieval (CIVR) . Tempe, AZ, USA . July 13-15, 2006

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Computer Graphics; Information Storage and Retrieval; Database Management; Information Systems Applications (incl. Internet); Multimedia Information Systems; Image Processing and Computer Vision

Availability

Institution detected: Not detected
Year of publication: 2006
Browse: SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-36018-6

Electronic ISBN

978-3-540-36019-3

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

2006

Publication rights information

© Springer-Verlag Berlin Heidelberg 2006

Table of contents

Interactive Experiments in Object-Based Retrieval

Sorin Sav; Gareth J. F. Jones; Hyowon Lee; Noel E. O’Connor; Alan F. Smeaton

Object-based retrieval is a modality for video retrieval based on segmenting objects from video and allowing end-users to use these objects as part of querying. In this paper we describe an empirical TRECVid-like evaluation of object-based search and compare it with standard image-based search in an interactive experiment with 24 search topics and 16 users, each performing 12 search tasks on 50 hours of rushes video. This experiment attempts to measure the impact of object-based search on a corpus of video where textual annotation is not available.

Keywords: Search Task; Query Image; Interactive Experiment; Video Object; Video Retrieval.

- Session O1: Interactive Image and Video Retrieval | Pp. 1-10
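
As a rough illustration of the object-based modality evaluated above, the sketch below scores keyframes against a segmented query object by comparing color histograms computed over the object's mask. The function names, the histogram feature, and the intersection score are illustrative assumptions; the paper does not describe its system at this level of detail here.

```python
# Illustrative sketch only: scores keyframes against a segmented query object
# by comparing masked color histograms. Feature choice is hypothetical; the
# actual system likely uses richer object descriptors.
import numpy as np

def masked_histogram(image, mask, bins=8):
    """Color histogram over the pixels belonging to the segmented object."""
    pixels = image[mask]                          # (n_pixels, 3) RGB values
    hist, _ = np.histogramdd(pixels, bins=(bins,) * 3, range=[(0, 256)] * 3)
    hist = hist.flatten()
    return hist / (hist.sum() + 1e-9)             # normalize to a distribution

def rank_keyframes(query_hist, keyframe_hists):
    """Rank keyframes by histogram intersection with the query object."""
    scores = [np.minimum(query_hist, h).sum() for h in keyframe_hists]
    return np.argsort(scores)[::-1]               # best matches first

rng = np.random.default_rng(0)
frame = rng.integers(0, 256, (32, 32, 3))
mask = np.zeros((32, 32), dtype=bool)
mask[8:24, 8:24] = True                           # segmented object region
query = masked_histogram(frame, mask)
keyframes = [masked_histogram(rng.integers(0, 256, (32, 32, 3)),
                              np.ones((32, 32), dtype=bool)) for _ in range(3)]
print(rank_keyframes(query, keyframes))           # indices of best-matching frames
```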

Learned Lexicon-Driven Interactive Video Retrieval

Cees Snoek; Marcel Worring; Dennis Koelma; Arnold Smeulders

In this paper we combine automatic learning of a large lexicon of semantic concepts with traditional video retrieval methods into a novel approach to narrow the semantic gap. The core of the proposed solution is formed by the automatic detection of an unprecedented lexicon of 101 concepts. From there, we explore the combination of query-by-concept, query-by-example, query-by-keyword, and user interaction in the MediaMill semantic video search engine. We evaluate the search engine against the 2005 NIST TRECVID video retrieval benchmark, using an international broadcast news archive of 85 hours. Top-ranking results show that the lexicon-driven search engine is highly effective for interactive video retrieval.

Keywords: Search Engine; Average Precision; Semantic Concept; Video Retrieval; Query Interface.

- Session O1: Interactive Image and Video Retrieval | Pp. 11-20
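
The combination of query-by-concept with the other query methods can be pictured with a minimal late-fusion sketch like the one below, which merges normalized per-shot scores from a concept detector and a keyword index with a weighted sum. The scores, the weighting scheme, and the function names are hypothetical; the abstract does not specify how MediaMill combines the modalities.

```python
# A minimal late-fusion sketch: weighted sum of normalized score lists from
# two query modalities. Scores and fusion weight are illustrative stand-ins.
import numpy as np

def fuse_rankings(concept_scores, keyword_scores, alpha=0.5):
    """Late fusion: weighted sum of min-max-normalized per-shot scores."""
    c = (concept_scores - concept_scores.min()) / (np.ptp(concept_scores) + 1e-9)
    k = (keyword_scores - keyword_scores.min()) / (np.ptp(keyword_scores) + 1e-9)
    fused = alpha * c + (1 - alpha) * k
    return np.argsort(fused)[::-1]            # shot indices, best first

# e.g. scores for 5 shots from a hypothetical concept detector and text index
concepts = np.array([0.9, 0.1, 0.4, 0.8, 0.2])
keywords = np.array([0.2, 0.0, 0.9, 0.7, 0.1])
print(fuse_rankings(concepts, keywords))      # -> [3 2 0 4 1]
```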

Mining Novice User Activity with TRECVID Interactive Retrieval Tasks

Michael G. Christel; Ronald M. Conescu

This paper investigates the applicability of Informedia shot-based interface features for video retrieval in the hands of novice users, who past work has noted rely too heavily on text search. The Informedia interface was redesigned to better promote the availability of additional video access mechanisms and was tested with TRECVID 2005 interactive search tasks. A transaction log analysis of 24 novice users shows a dramatic increase in the use of color search and shot-browsing mechanisms beyond traditional text search. In addition, a within-subjects study examined the use of user-activity mining to suppress shots previously seen. This strategy did not have the expected positive effect on performance. User-activity mining and shot suppression did, however, produce a broader shot space to be explored and resulted in more unique answer shots being discovered. Implications for shot suppression in video retrieval information exploration interfaces are discussed.

Keywords: Image Query; Mean Average Precision; Image Search; Video Retrieval; Text Search.

- Session O1: Interactive Image and Video Retrieval | Pp. 21-30
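
The shot-suppression strategy examined in the study can be sketched as a simple filter over a ranked result list, dropping shots that the mined interaction log says the user has already seen. The log format and action names below are invented for illustration.

```python
# Sketch of the shot-suppression idea: filter out shots the user has already
# viewed (mined from interaction logs) from subsequent result lists.
# The log schema here is a hypothetical stand-in.

def suppress_seen(ranked_shots, interaction_log):
    """Drop shots the user has already viewed in this session."""
    seen = {event["shot_id"] for event in interaction_log
            if event["action"] in ("viewed", "played")}
    return [s for s in ranked_shots if s not in seen]

log = [{"shot_id": "shot12_3", "action": "viewed"},
       {"shot_id": "shot12_7", "action": "played"}]
print(suppress_seen(["shot12_3", "shot12_5", "shot12_7"], log))  # ['shot12_5']
```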

A Linear-Algebraic Technique with an Application in Semantic Image Retrieval

Jonathon S. Hare; Paul H. Lewis; Peter G. B. Enser; Christine J. Sandom

This paper presents a novel technique for learning the underlying structure that links visual observations with semantics. The technique, inspired by a text-retrieval technique known as cross-language latent semantic indexing, uses linear algebra to learn the semantic structure linking image features and keywords from a training set of annotated images. This structure can then be applied to unannotated images, thus providing the ability to search the unannotated images by keyword. This factorisation approach is shown to perform well, even when using only simple global image features.

Keywords: Training Image; Average Precision; Query Term; Mean Average Precision; Factorisation Approach.

- Session O2: Semantic Image Retrieval | Pp. 31-40
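
A minimal sketch of the factorisation idea, assuming (as in cross-language LSI) that image features and keyword indicators for the training images are stacked into one matrix and a truncated SVD recovers the shared semantic space. The matrix sizes and toy data are illustrative, not the paper's setup.

```python
# Rough sketch: stack feature and keyword matrices, take a truncated SVD,
# then project unannotated images into the shared space to score keywords.
import numpy as np

rng = np.random.default_rng(0)
F = rng.random((40, 6))                 # 40 visual feature dims x 6 training images
K = np.eye(5)[:, [0, 1, 2, 0, 1, 2]]    # 5 keywords x 6 images (binary labels)

X = np.vstack([F, K])                   # joint (features + keywords) x images
U, s, Vt = np.linalg.svd(X, full_matrices=False)
rank = 3                                # truncation rank (an assumption)
U_f, U_k = U[:40, :rank], U[40:, :rank] # feature and keyword sub-bases

def keyword_scores(new_image_features):
    """Project an unannotated image; score every keyword in the same space."""
    z = U_f.T @ new_image_features      # position in the semantic space
    return U_k @ z                      # one score per keyword

print(keyword_scores(F[:, 0]))          # image 0 should tend to score its keyword high
```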

Logistic Regression of Generic Codebooks for Semantic Image Retrieval

João Magalhães; Stefan Rüger

This paper is about automatically annotating images with keywords in order to be able to retrieve images with text searches. Our approach is to model keywords such as 'mountain' and 'city' in terms of visual features extracted from images. In contrast to other algorithms, each keyword-specific model considers not only its own training data but also the whole training set, utilizing correlations of visual features to refine its own model. Initially, the algorithm clusters all visual features extracted from the full image set, captures their salient structure (e.g. a mixture of clusters or patterns), and represents this as a generic codebook. Keywords associated with images in the training set are then encoded as linear combinations of patterns from the generic codebook. We evaluate the validity of our approach in an image retrieval scenario with two distinct large datasets of real-world photos and corresponding manual annotations.

Keywords: Gaussian Mixture Model; Image Retrieval; Machine Translation; Average Precision; Retrieval Performance.

- Session O2: Semantic Image Retrieval | Pp. 41-50
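
The two-stage approach can be sketched as follows: cluster all local features into a generic codebook, encode each image as a codeword histogram, and fit one logistic-regression model per keyword on those histograms. KMeans stands in here for the paper's mixture-based codebook, and all data is synthetic.

```python
# Sketch of the codebook + per-keyword logistic regression idea. KMeans is a
# stand-in for the paper's mixture model; features and labels are synthetic.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
features = rng.random((500, 16))          # local features pooled from all images
codebook = KMeans(n_clusters=32, n_init=10, random_state=0).fit(features)

def encode(image_features):
    """Histogram of codebook assignments = generic-codebook representation."""
    words = codebook.predict(image_features)
    hist = np.bincount(words, minlength=32).astype(float)
    return hist / hist.sum()

X = np.array([encode(rng.random((50, 16))) for _ in range(60)])
y = rng.integers(0, 2, size=60)           # toy labels for one keyword
model = LogisticRegression(max_iter=1000).fit(X, y)   # one model per keyword
print(model.predict_proba(X[:3])[:, 1])   # P(keyword | image)
```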

Query by Semantic Example

Nikhil Rasiwasia; Nuno Vasconcelos; Pedro J. Moreno

A solution to the problem of image retrieval based on query-by-semantic-example (QBSE) is presented. QBSE extends the idea of query-by-example to the domain of semantic image representations. A semantic vocabulary is first defined, and a semantic retrieval system is trained to label each image with the posterior probability of appearance of each concept in the vocabulary. The resulting vector is interpreted as the projection of the image onto a semantic probability simplex, where a suitable similarity function is defined. Queries are specified by example images, which are projected onto the probability simplex. The database images whose projections on the simplex are closest to that of the query are declared its nearest neighbors. Experimental evaluation indicates that 1) QBSE significantly outperforms the traditional query-by-visual-example paradigm when the concepts in the query image are known to the retrieval system, and 2) QBSE achieves equivalent performance even in the worst-case scenario of queries composed of unknown concepts.

Keywords: Discrete Cosine Transform; Retrieval System; Image Retrieval; Query Image; Semantic Concept.

- Session O2: Semantic Image Retrieval | Pp. 51-60
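
A minimal sketch of the QBSE retrieval step: each image is represented by its vector of concept posteriors (a point on the probability simplex), and database images are ranked by a dissimilarity on the simplex. KL divergence is used below as one plausible choice; the abstract does not name the paper's similarity function, and the posteriors are made up.

```python
# Sketch of QBSE retrieval on the probability simplex. The 4-concept
# vocabulary and posterior vectors are invented for illustration.
import numpy as np

def kl_divergence(p, q, eps=1e-9):
    p, q = p + eps, q + eps
    return float(np.sum(p * np.log(p / q)))

# posteriors over a toy vocabulary: [sky, water, people, buildings]
query    = np.array([0.60, 0.25, 0.05, 0.10])
database = [np.array([0.55, 0.30, 0.05, 0.10]),   # similar semantics
            np.array([0.05, 0.05, 0.70, 0.20])]   # very different
ranking = sorted(range(len(database)),
                 key=lambda i: kl_divergence(query, database[i]))
print(ranking)   # nearest neighbor on the simplex first -> [0, 1]
```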

Corner Detectors for Affine Invariant Salient Regions: Is Color Important?

Nicu Sebe; Theo Gevers; Joost van de Weijer; Sietse Dijkstra

Recently, much research has been done on the matching of images and their structures. Although the approaches are very different, most methods use some kind of point selection from which descriptors or a hierarchy are derived. We focus here on methods for detecting points and regions in an affine invariant way. Most previous research has concentrated on intensity-based methods. However, we show in this work that color information can make a significant contribution to feature detection and matching. Our color-based detection algorithms detect the most distinctive features, and the experiments suggest that, to obtain optimal performance, a tradeoff should be made between invariance and distinctiveness by an appropriate weighting of the intensity and color information.

Keywords: Color Information; JPEG Compression; Salient Point; Moment Matrix; Corner Detector.

- Session O3: Visual Feature Analysis | Pp. 61-71
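
The intensity/color weighting the authors argue for can be illustrated with a color-aware Harris response in which the second-moment matrix accumulates gradient outer products from the intensity and color channels under a tunable weight. This formulation is an assumption made for illustration, not the paper's exact detector.

```python
# Illustrative color-aware Harris response: the second-moment matrix sums
# weighted gradient products over intensity and color channels. The weighting
# scheme is an assumption, not the paper's formulation.
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def color_harris(image, w_color=0.5, sigma=2.0, k=0.04):
    """image: float array (H, W, 3) in RGB. Returns per-pixel corner response."""
    intensity = image.mean(axis=2)
    channels = [(1.0 - w_color, intensity)] + \
               [(w_color / 3.0, image[..., c]) for c in range(3)]
    Ixx = Iyy = Ixy = 0.0
    for w, ch in channels:
        gx, gy = sobel(ch, axis=1), sobel(ch, axis=0)
        Ixx = Ixx + w * gaussian_filter(gx * gx, sigma)
        Iyy = Iyy + w * gaussian_filter(gy * gy, sigma)
        Ixy = Ixy + w * gaussian_filter(gx * gy, sigma)
    det = Ixx * Iyy - Ixy ** 2
    trace = Ixx + Iyy
    return det - k * trace ** 2               # high values = corner candidates

rng = np.random.default_rng(0)
print(color_harris(rng.random((16, 16, 3))).shape)   # (16, 16) response map
```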

Keyframe Retrieval by Keypoints: Can Point-to-Point Matching Help?

Wanlei Zhao; Yu-Gang Jiang; Chong-Wah Ngo

Bag-of-words representation with visual keypoints has recently emerged as an attractive approach for video search. In this paper, we study the degree of improvement when a point-to-point (P2P) constraint is imposed on the bag-of-words. We investigate two tasks: near-duplicate keyframe (NDK) retrieval and high-level concept classification, covering parts of the TRECVID 2003 and 2005 datasets. For P2P matching, we propose a one-to-one symmetric keypoint matching strategy to diminish the effect of noise during keyframe comparison. In addition, a new multi-dimensional index structure is proposed to speed up the matching process with keypoint filtering. Through experiments, we demonstrate that the P2P constraint can significantly boost the performance of NDK retrieval, while showing competitive accuracy in concept classification in the broadcast domain.

Keywords: Color Histogram; Background Clutter; Locality Sensitive Hash; Maximal Stable Extreme Region; Video Search.

- Session O3: Visual Feature Analysis | Pp. 72-81
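
The one-to-one symmetric matching strategy can be sketched directly: a keypoint pair is retained only when each descriptor is the other's nearest neighbor, which discards many ambiguous matches during keyframe comparison. The toy descriptors below stand in for real keypoint descriptors, and the brute-force distance matrix ignores the paper's index structure.

```python
# Sketch of one-to-one symmetric keypoint matching: keep (i, j) only if
# descriptor i and descriptor j are mutual nearest neighbors.
import numpy as np

def symmetric_matches(desc_a, desc_b):
    """Return index pairs (i, j) that are mutual nearest neighbors."""
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    nn_ab = dists.argmin(axis=1)          # best b for each a
    nn_ba = dists.argmin(axis=0)          # best a for each b
    return [(i, j) for i, j in enumerate(nn_ab) if nn_ba[j] == i]

a = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])   # toy descriptors
b = np.array([[0.1, 0.0], [4.9, 5.1]])
print(symmetric_matches(a, b))            # [(0, 0), (2, 1)]
```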

Local Feature Trajectories for Efficient Event-Based Indexing of Video Sequences

Nicolas Moënne-Loccoz; Eric Bruno; Stéphane Marchand-Maillet

We address the problem of indexing video sequences according to the events they depict. While a number of different approaches have been proposed to describe events, none is sufficiently generic and computationally efficient to be applied to event-based retrieval of video sequences within large databases. In this paper, we propose a novel index of video sequences that aims at describing their dynamic content. This index relies on the local feature trajectories estimated from the spatio-temporal volume of the video sequences. The index is efficient to compute and makes no assumptions about either the represented events or the video sequences. We show, through a batch of experiments on a standard video sequence corpus, that this index classifies complex human activities as effectively as state-of-the-art methods while being far more efficient at retrieving generic classes of events.

Keywords: Video Sequence; Interest Point; Camera Motion; Equal Error Rate; Event Representation.

- Session O3: Visual Feature Analysis | Pp. 82-91
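
A rough sketch of a trajectory-based index, assuming interest points have already been tracked into (x, y) sequences: each trajectory is summarized by simple motion statistics, and a video's index vector aggregates them. The descriptor here is a deliberate simplification of the paper's trajectory features.

```python
# Sketch of an event index built from local feature trajectories. The
# mean/variance descriptor is an illustrative simplification.
import numpy as np

def trajectory_descriptor(track):
    """track: (n_frames, 2) positions of one interest point over time."""
    steps = np.diff(track, axis=0)               # frame-to-frame displacement
    return np.concatenate([steps.mean(axis=0),   # dominant motion direction
                           steps.var(axis=0)])   # motion irregularity

def sequence_index(tracks):
    """Aggregate per-trajectory descriptors into one index vector per video."""
    return np.mean([trajectory_descriptor(t) for t in tracks], axis=0)

walk = np.cumsum(np.ones((30, 2)) * [1.0, 0.0], axis=0)  # steady rightward motion
print(sequence_index([walk]))   # -> [1. 0. 0. 0.]: constant motion, no variance
```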

A Cascade of Unsupervised and Supervised Neural Networks for Natural Image Classification

Julien Ros; Christophe Laurent; Grégoire Lefebvre

This paper presents an architecture well suited for natural image classification and visual object recognition applications. The image content is described by a distribution of local prototype features obtained by projecting local signatures onto a self-organizing map. The local signatures describe singularities around interest points detected by a wavelet-based salient point detector. Finally, images are classified by a multilayer perceptron that receives the local prototype distribution as input. This architecture obtains good results, in terms of both global classification rates and computing times, on several well-known datasets.

Keywords: Support Vector Machine; Radial Basis Function; Area Under Curve; Interest Point; Query Image.

- Session O4: Learning and Classification | Pp. 92-101
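
The cascade can be sketched as a prototype-distribution stage followed by an MLP. The stripped-down, neighborhood-free SOM update below is only a stand-in for a proper self-organizing map, and the signatures and labels are synthetic.

```python
# Sketch of the unsupervised + supervised cascade: project local signatures
# onto learned prototypes, then classify the prototype distribution with an
# MLP. The SOM here omits the neighborhood function for brevity.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(2)

def train_som(data, n_units=16, epochs=5, lr=0.5):
    """Degenerate SOM (no neighborhood term): winner-take-all prototype update."""
    units = data[rng.choice(len(data), n_units, replace=False)].copy()
    for _ in range(epochs):
        for x in data:
            w = np.argmin(np.linalg.norm(units - x, axis=1))  # best-matching unit
            units[w] += lr * (x - units[w])
    return units

signatures = rng.random((400, 8))             # local signatures from all images
som = train_som(signatures)

def prototype_distribution(image_signatures):
    wins = [np.argmin(np.linalg.norm(som - s, axis=1)) for s in image_signatures]
    hist = np.bincount(wins, minlength=len(som)).astype(float)
    return hist / hist.sum()

X = np.array([prototype_distribution(rng.random((20, 8))) for _ in range(40)])
y = rng.integers(0, 3, size=40)               # toy image class labels
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000).fit(X, y)
print(clf.predict(X[:3]))
```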