Publications catalog - books
Image and Video Retrieval: 4th International Conference, CIVR 2005, Singapore, July 20-22, 2005, Proceedings
Wee-Kheng Leow ; Michael S. Lew ; Tat-Seng Chua ; Wei-Ying Ma ; Lekha Chaisorn ; Erwin M. Bakker (eds.)
Conference: 4th International Conference on Image and Video Retrieval (CIVR), Singapore, July 20-22, 2005
Abstract/Description - provided by the publisher
Not available.
Keywords - provided by the publisher
Computer Graphics; Information Storage and Retrieval; Database Management; Information Systems Applications (incl. Internet); Multimedia Information Systems; Image Processing and Computer Vision
Availability
| Detected institution | Publication year | Browse | Download | Request |
|---|---|---|---|---|
| Not detected | 2005 | SpringerLink | | |
Information
Resource type:
books
Print ISBN
978-3-540-27858-0
Electronic ISBN
978-3-540-31678-7
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2005
Publication rights information
© Springer-Verlag Berlin Heidelberg 2005
Subject coverage
Table of contents
doi: 10.1007/11526346_21
Systematic Evaluation of Machine Translation Methods for Image and Video Annotation
Paola Virga; Pınar Duygulu
In this study, we present a systematic evaluation of machine translation methods applied to the image annotation problem. We use the well-studied Corel data set and the TRECVID 2003 broadcast news videos as our data, and experiment with different machine translation models under different parameter settings. The results show that the simplest model produces the best performance. Based on this experience, we also propose a new method, based on cross-lingual information retrieval techniques, and obtain better retrieval performance.
- Image/Video Annotation and Clustering | Pp. 174-183
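The "simplest model" finding above can be illustrated with a minimal translation-style annotator: discretized image regions ("blobs") are translated into keywords via normalized co-occurrence probabilities. All counts, blob names, and keywords below are hypothetical, not the paper's data:

```python
from collections import defaultdict

# Toy co-occurrence counts between visual "blob tokens" and keywords
# (hypothetical; real systems estimate translation tables with EM over a corpus).
cooc = {
    ("sky_blob", "sky"): 8, ("sky_blob", "plane"): 2,
    ("grass_blob", "grass"): 9, ("grass_blob", "tiger"): 3,
    ("stripe_blob", "tiger"): 7, ("stripe_blob", "grass"): 1,
}

def translation_probs(cooc):
    """Normalize counts into p(word | blob), the simplest translation model."""
    totals = defaultdict(float)
    for (b, w), c in cooc.items():
        totals[b] += c
    return {(b, w): c / totals[b] for (b, w), c in cooc.items()}

def annotate(blobs, probs, k=2):
    """Score each keyword by summing p(word | blob) over the image's blobs."""
    scores = defaultdict(float)
    for (b, w), p in probs.items():
        if b in blobs:
            scores[w] += p
    return [w for w, _ in sorted(scores.items(), key=lambda x: -x[1])[:k]]

probs = translation_probs(cooc)
print(annotate({"grass_blob", "stripe_blob"}, probs))  # → ['tiger', 'grass']
```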
doi: 10.1007/11526346_22
Automatic Image Semantic Annotation Based on Image-Keyword Document Model
Xiangdong Zhou; Lian Chen; Jianye Ye; Qi Zhang; Baile Shi
This paper presents a novel method for automatic image semantic annotation. Our approach is based on the Image-Keyword Document Model (IKDM) with image-feature discretization. Under IKDM, image keyword annotation is performed via image similarity measurement based on a language model from the text information retrieval domain. In experiments on a test set of 5,000 annotated images, our approach demonstrates a marked improvement in annotation performance over known discretization-based image annotation models such as CMRM, and also annotates faster than continuous models such as CRM.
- Image/Video Annotation and Clustering | Pp. 184-193
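The IKDM idea of scoring keywords with a text-retrieval language model can be sketched as follows, assuming Jelinek-Mercer smoothing over discretized visual tokens (the token vocabulary and keyword "documents" below are invented for illustration):

```python
import math
from collections import Counter

# Hypothetical training data: each keyword's "document" of discretized
# visual-feature tokens, in the spirit of the Image-Keyword Document Model.
docs = {
    "beach": ["sand", "water", "sky", "water", "sand"],
    "forest": ["tree", "tree", "leaf", "sky"],
}

def score(query_tokens, doc, collection, lam=0.5):
    """Jelinek-Mercer smoothed unigram log-likelihood of the query image's tokens."""
    tf, cf = Counter(doc), Counter(collection)
    dlen, clen = len(doc), len(collection)
    return sum(
        math.log(lam * tf[t] / dlen + (1 - lam) * cf[t] / clen)
        for t in query_tokens
    )

collection = [t for d in docs.values() for t in d]   # background model
query = ["water", "sand", "sky"]                     # tokens of the unlabeled image
best = max(docs, key=lambda w: score(query, docs[w], collection))
print(best)  # → beach
```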
doi: 10.1007/11526346_23
Region-Based Image Clustering and Retrieval Using Multiple Instance Learning
Chengcui Zhang; Xin Chen
Multiple Instance Learning (MIL) is a special kind of supervised learning problem that has been studied actively in recent years. We propose an approach based on the One-Class Support Vector Machine (SVM) to solve the MIL problem in region-based content-based image retrieval (CBIR), an area in which a very large number of image regions is involved. For efficiency, we adopt a Genetic Algorithm-based clustering method to reduce the search space, and we incorporate relevance feedback to provide progressive guidance to the learning process. Performance is evaluated, and comparative studies demonstrate the effectiveness of our retrieval algorithm.
- Image/Video Annotation and Clustering | Pp. 194-204
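A minimal sketch of the one-class-SVM step, using scikit-learn's `OneClassSVM` on synthetic region features; the Genetic Algorithm clustering and relevance-feedback components are omitted, and all data below is made up:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
# Hypothetical region features from positively-labeled images:
# in MIL-based CBIR, regions of relevant images are treated as one class.
positive_regions = rng.normal(loc=0.0, scale=0.3, size=(60, 2))

ocsvm = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1)
ocsvm.fit(positive_regions)

def image_score(regions):
    """Rank a candidate image by the best-scoring region it contains."""
    return float(ocsvm.decision_function(regions).max())

near = np.array([[0.1, -0.1], [0.0, 0.2]])  # regions resembling the query class
far = np.array([[3.0, 3.0], [-3.0, 2.5]])   # irrelevant regions
print(image_score(near) > image_score(far))  # → True
```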
doi: 10.1007/11526346_24
Interactive Video Search Using Multilevel Indexing
John Adcock; Matthew Cooper; Andreas Girgensohn; Lynn Wilcox
Large video collections present a unique set of challenges to the search system designer. Text transcripts do not always provide an accurate index to the visual content, and the performance of visually based semantic extraction techniques is often inadequate for search tasks. The searcher must be relied upon to provide detailed judgment of the relevance of specific video segments. We describe a video search system that facilitates this user task by efficiently presenting search results in semantically meaningful units to simplify exploration of query results and query reformulation. We employ a story segmentation system and supporting user interface elements to effectively present query results at the story level. The system was tested in the 2004 TRECVID interactive search evaluations with very positive results.
- Interactive Video Retrieval and Others | Pp. 205-214
doi: 10.1007/11526346_25
Assessing Effectiveness in Video Retrieval
Alexander Hauptmann; Wei-Hao Lin
This paper examines results from the last two years of the TRECVID video retrieval evaluations. While there is encouraging evidence of progress in video retrieval, there are several major disappointments confirming that the field is still in its infancy. Many publications blithely attribute improvements in retrieval tasks to particular techniques without paying much attention to the statistical reliability of the comparisons. We conduct an analysis of the official TRECVID evaluation results, using both retrieval experiment error rates and ANOVA measures, and demonstrate that the difference between many systems is not statistically significant. We conclude the paper with lessons learned from results both with and without statistically significant differences.
- Interactive Video Retrieval and Others | Pp. 215-225
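The paper's point about statistical reliability can be illustrated with a paired randomization test over per-topic average-precision scores. This is a common alternative to the ANOVA analysis the paper uses, and the scores below are fabricated for illustration only:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical per-topic average-precision scores for two retrieval systems.
sys_a = np.array([0.31, 0.12, 0.45, 0.22, 0.08, 0.39, 0.27, 0.15])
sys_b = np.array([0.29, 0.14, 0.41, 0.25, 0.07, 0.36, 0.30, 0.13])

def paired_randomization_test(a, b, n_iter=10000):
    """Two-sided randomization test: randomly flip the sign of each paired
    per-topic difference and see how often the null mean exceeds the observed."""
    diffs = a - b
    observed = abs(diffs.mean())
    signs = rng.choice([-1.0, 1.0], size=(n_iter, diffs.size))
    null = np.abs((signs * diffs).mean(axis=1))
    return float((null >= observed).mean())

p = paired_randomization_test(sys_a, sys_b)
print(f"p-value: {p:.3f}")  # a large p means the difference is not significant
```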
doi: 10.1007/11526346_26
Person Spotting: Video Shot Retrieval for Face Sets
Josef Sivic; Mark Everingham; Andrew Zisserman
Matching people based on their imaged face is hard because of the well known problems of illumination, pose, size and expression variation. Indeed these variations can exceed those due to identity. Fortunately, videos of people have the happy benefit of containing multiple exemplars of each person in a form that can easily be associated automatically using straightforward visual tracking. We describe progress in harnessing these multiple exemplars in order to retrieve humans automatically in videos, given a query face in a shot. There are three areas of interest: (i) the matching of sets of exemplars provided by “tubes” of the spatial-temporal volume; (ii) the description of the face using a spatial orientation field; and, (iii) the structuring of the problem so that retrieval is immediate at run time.
The result is a person retrieval system, able to retrieve a ranked list of shots containing a particular person in the manner of Google. The method has been implemented and tested on two feature length movies.
- Image/Video Retrieval Applications | Pp. 226-236
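One simple way to match the sets of exemplars provided by two face "tubes" is the minimum pairwise distance between their descriptor sets. The sketch below uses random vectors as stand-in face descriptors; it illustrates set-to-set matching generically, not the paper's actual descriptors or matching function:

```python
import numpy as np

def set_distance(tube_a, tube_b):
    """Min-min distance between two sets of face descriptor vectors."""
    d = np.linalg.norm(tube_a[:, None, :] - tube_b[None, :, :], axis=-1)
    return float(d.min())

rng = np.random.default_rng(1)
# Hypothetical descriptors: two tubes of the same person cluster together,
# while a different person's tube lies far away in descriptor space.
person_x_shot1 = rng.normal(0.0, 0.1, size=(5, 16))
person_x_shot2 = rng.normal(0.0, 0.1, size=(7, 16))
person_y_shot = rng.normal(2.0, 0.1, size=(6, 16))

# Shots are ranked by distance to the query tube: the same person ranks first.
print(set_distance(person_x_shot1, person_x_shot2) <
      set_distance(person_x_shot1, person_y_shot))  # → True
```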
doi: 10.1007/11526346_27
Robust Methods and Representations for Soccer Player Tracking and Collision Resolution
Lluis Barceló; Xavier Binefa; John R. Kender
We present a method of tracking multiple players in a soccer match using video taken from a single fixed camera with pan, tilt and zoom. We extract a single mosaic of the playing field and robustly derive its homography to a playing field model, based on color information, line extraction, and a Hausdorff distance measure. Players are identified by color and shape, and tracked in the image mosaic space using a Kalman filter. The frequent occlusions of multiple players are resolved using a novel representation acted on by a rule-based method, which recognizes differences between removable and intrinsic ambiguities. We test the methods with synthetic and real data.
- Image/Video Retrieval Applications | Pp. 237-246
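The Kalman-filter tracking component can be sketched with a standard constant-velocity model in mosaic coordinates; the noise covariances below are assumed for illustration, not taken from the paper:

```python
import numpy as np

# Constant-velocity Kalman filter; state is (x, y, vx, vy) in mosaic space.
dt = 1.0
F = np.array([[1, 0, dt, 0], [0, 1, 0, dt], [0, 0, 1, 0], [0, 0, 0, 1]], float)
H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)  # we observe position only
Q = np.eye(4) * 1e-2   # process noise (assumed)
R = np.eye(2) * 1.0    # measurement noise (assumed)

def kalman_step(x, P, z):
    """One predict/update cycle given a player-position measurement z."""
    x, P = F @ x, F @ P @ F.T + Q                  # predict
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)                 # Kalman gain
    x = x + K @ (z - H @ x)                        # update with innovation
    P = (np.eye(4) - K @ H) @ P
    return x, P

x, P = np.zeros(4), np.eye(4) * 10.0
for t in range(1, 11):                             # player moving at (2, 1) px/frame
    x, P = kalman_step(x, P, np.array([2.0 * t, 1.0 * t]))
print(x[2:])                                       # velocity estimate, close to (2, 1)
```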
doi: 10.1007/11526346_28
Modeling Multi-object Spatial Relationships for Satellite Image Database Indexing and Retrieval
Grant Scott; Matt Klaric; Chi-Ren Shyu
Geospatial information analysts are interested in spatial configurations of objects in satellite imagery and, more importantly, the ability to search a large-scale database of satellite images using spatial configurations as the query mechanism. In this paper we present a new method to model spatial relationships among sets of three or more objects in satellite images for scene indexing and retrieval by generating discrete spatial signatures. The proposed method is highly insensitive to scaling, rotation, and translation of the spatial configuration. Additionally, the method is efficient for use in real-time applications, such as online satellite image retrievals. Moreover, the number of objects in a spatial configuration has minimal effect on the efficiency of the method.
- Image/Video Retrieval Applications | Pp. 247-256
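One way to build a discrete signature that is insensitive to scaling, rotation, and translation is a histogram of pairwise object distances normalized by their maximum; this is an illustrative stand-in, not the authors' actual signature:

```python
import numpy as np

def spatial_signature(points, bins=8):
    """Discrete signature of an object configuration: histogram of pairwise
    distances divided by the maximum distance (scale-invariant; distances
    are already rotation- and translation-invariant)."""
    pts = np.asarray(points, float)
    i, j = np.triu_indices(len(pts), k=1)
    d = np.linalg.norm(pts[i] - pts[j], axis=1)
    d = d / d.max()
    hist, _ = np.histogram(d, bins=bins, range=(0.0, 1.0))
    return hist / hist.sum()

# A toy 4-object configuration and a scaled, rotated, translated copy.
config = [(0, 0), (4, 0), (2, 3), (1, 1)]
theta = 0.7
Rm = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
transformed = (np.asarray(config, float) * 2.5) @ Rm.T + np.array([10.0, -3.0])
print(np.allclose(spatial_signature(config), spatial_signature(transformed)))  # → True
```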
doi: 10.1007/11526346_29
Hot Event Detection and Summarization by Graph Modeling and Matching
Yuxin Peng; Chong-Wah Ngo
This paper proposes a new approach for hot event detection and summarization of news videos. The approach is mainly based on two graph algorithms: optimal matching (OM) and normalized cut (NC). Initially, OM is employed to measure the visual similarity between all pairs of events under a one-to-one mapping constraint among video shots. Then, news events are represented as a complete weighted graph and NC is carried out to globally and optimally partition the graph into event clusters. Finally, based on cluster size and the globality of events, hot events can be automatically detected and selected as summaries of news videos across TV stations of various channels and languages. Our proposed approach has been tested on 10 hours of news videos and found to be effective.
- Video Processing, Retrieval and Multimedia Systems (Poster) | Pp. 257-266
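The normalized-cut step can be sketched with its standard spectral relaxation: partition the event-similarity graph by the sign of the second eigenvector of the normalized Laplacian. The weight matrix below is a toy stand-in for OM similarity scores, not real data:

```python
import numpy as np

# Toy event-similarity graph: two groups of news events with high
# within-group similarity and weak cross-group links (hypothetical weights).
W = np.array([
    [0.0, 0.9, 0.8, 0.1, 0.0, 0.1],
    [0.9, 0.0, 0.7, 0.0, 0.1, 0.0],
    [0.8, 0.7, 0.0, 0.1, 0.0, 0.1],
    [0.1, 0.0, 0.1, 0.0, 0.9, 0.8],
    [0.0, 0.1, 0.0, 0.9, 0.0, 0.7],
    [0.1, 0.0, 0.1, 0.8, 0.7, 0.0],
])

def normalized_cut(W):
    """Two-way normalized cut via the second-smallest eigenvector of the
    symmetric normalized Laplacian; the sign pattern gives the bipartition."""
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(W)) - D_inv_sqrt @ W @ D_inv_sqrt
    vals, vecs = np.linalg.eigh(L)     # eigenvalues in ascending order
    return vecs[:, 1] > 0

labels = normalized_cut(W)
print(labels.tolist())  # events 0-2 land in one cluster, events 3-5 in the other
```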
doi: 10.1007/11526346_30
Domain Knowledge Ontology Building for Semantic Video Event Description
Dan Song; Hai Tao Liu; Miyoung Cho; Hanil Kim; Pankoo Kim
This paper puts forward a novel method for video event analysis and description based on a domain knowledge ontology. Semantic concepts in the context of a video event are described within one specific domain, enriched with qualitative attributes of the semantic objects, multimedia processing approaches, and domain-independent factors: low-level features (pixel color, motion vectors, and spatio-temporal relationships). In this work, we take one shot (episode) of a billiards game video as the domain to explain how high-level semantics map onto low-level features and how semantically important events are detected.
- Video Processing, Retrieval and Multimedia Systems (Poster) | Pp. 267-275