Publications catalog - books
Image and Video Retrieval: 4th International Conference, CIVR 2005, Singapore, July 20-22, 2005, Proceedings
Wee-Kheng Leow ; Michael S. Lew ; Tat-Seng Chua ; Wei-Ying Ma ; Lekha Chaisorn ; Erwin M. Bakker (eds.)
Conference: 4th International Conference on Image and Video Retrieval (CIVR), Singapore, July 20-22, 2005
Abstract/Description - provided by the publisher
Not available.
Keywords - provided by the publisher
Computer Graphics; Information Storage and Retrieval; Database Management; Information Systems Applications (incl. Internet); Multimedia Information Systems; Image Processing and Computer Vision
Availability
| Detected institution | Publication year | Browse | Download | Request |
|---|---|---|---|---|
| Not detected | 2005 | SpringerLink | | |
Information
Resource type:
books
Print ISBN
978-3-540-27858-0
Electronic ISBN
978-3-540-31678-7
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2005
Publication rights information
© Springer-Verlag Berlin Heidelberg 2005
Subject coverage
Table of contents
doi: 10.1007/11526346_21
Systematic Evaluation of Machine Translation Methods for Image and Video Annotation
Paola Virga; Pınar Duygulu
In this study, we present a systematic evaluation of machine translation methods applied to the image annotation problem. We use the well-studied Corel data set and the TRECVID 2003 broadcast news videos as our data, and experiment with different machine translation models under different parameter settings. The results show that the simplest model produces the best performance. Based on this experience, we also propose a new method, based on cross-lingual information retrieval techniques, and obtain better retrieval performance.
- Image/Video Annotation and Clustering | Pp. 174-183
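The "simplest model" finding above can be illustrated with a minimal translation-style annotator: discretized image regions ("blobs") are translated into keywords via normalized co-occurrence probabilities. All counts, blob names, and keywords below are hypothetical, not the paper's data:

```python
from collections import defaultdict

# Toy co-occurrence counts between visual "blob tokens" and keywords
# (hypothetical; real systems estimate translation tables with EM over a corpus).
cooc = {
    ("sky_blob", "sky"): 8, ("sky_blob", "plane"): 2,
    ("grass_blob", "grass"): 9, ("grass_blob", "tiger"): 3,
    ("stripe_blob", "tiger"): 7, ("stripe_blob", "grass"): 1,
}

def translation_probs(cooc):
    """Normalize counts into p(word | blob), the simplest translation model."""
    totals = defaultdict(float)
    for (b, w), c in cooc.items():
        totals[b] += c
    return {(b, w): c / totals[b] for (b, w), c in cooc.items()}

def annotate(blobs, probs, k=2):
    """Score each keyword by summing p(word | blob) over the image's blobs."""
    scores = defaultdict(float)
    for (b, w), p in probs.items():
        if b in blobs:
            scores[w] += p
    return [w for w, _ in sorted(scores.items(), key=lambda x: -x[1])[:k]]

probs = translation_probs(cooc)
print(annotate({"grass_blob", "stripe_blob"}, probs))  # → ['tiger', 'grass']
```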
doi: 10.1007/11526346_22
Automatic Image Semantic Annotation Based on Image-Keyword Document Model
Xiangdong Zhou; Lian Chen; Jianye Ye; Qi Zhang; Baile Shi
This paper presents a novel method for automatic image semantic annotation. Our approach is based on the Image-Keyword Document Model (IKDM) with image-feature discretization. Under IKDM, image keyword annotation is performed via image similarity measurement based on a language model from the text information retrieval domain. In experiments on a test set of 5,000 annotated images, our approach demonstrates a marked improvement in annotation performance over known discretization-based image annotation models such as CMRM, and also annotates faster than continuous models such as CRM.
- Image/Video Annotation and Clustering | Pp. 184-193
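The IKDM idea of scoring keywords with a text-retrieval language model can be sketched as follows, assuming Jelinek-Mercer smoothing over discretized visual tokens (the token vocabulary and keyword "documents" below are invented for illustration):

```python
import math
from collections import Counter

# Hypothetical training data: each keyword's "document" of discretized
# visual-feature tokens, in the spirit of the Image-Keyword Document Model.
docs = {
    "beach": ["sand", "water", "sky", "water", "sand"],
    "forest": ["tree", "tree", "leaf", "sky"],
}

def score(query_tokens, doc, collection, lam=0.5):
    """Jelinek-Mercer smoothed unigram log-likelihood of the query image's tokens."""
    tf, cf = Counter(doc), Counter(collection)
    dlen, clen = len(doc), len(collection)
    return sum(
        math.log(lam * tf[t] / dlen + (1 - lam) * cf[t] / clen)
        for t in query_tokens
    )

collection = [t for d in docs.values() for t in d]   # background model
query = ["water", "sand", "sky"]                     # tokens of the unlabeled image
best = max(docs, key=lambda w: score(query, docs[w], collection))
print(best)  # → beach
```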
doi: 10.1007/11526346_23
Region-Based Image Clustering and Retrieval Using Multiple Instance Learning
Chengcui Zhang; Xin Chen
Multiple Instance Learning (MIL) is a special kind of supervised learning problem that has been studied actively in recent years. We propose an approach based on the One-Class Support Vector Machine (SVM) to solve the MIL problem in region-based content-based image retrieval (CBIR), an area in which a very large number of image regions is involved. For efficiency, we adopt a Genetic Algorithm-based clustering method to reduce the search space, and we incorporate relevance feedback to provide progressive guidance to the learning process. Performance is evaluated, and comparative studies demonstrate the effectiveness of our retrieval algorithm.
- Image/Video Annotation and Clustering | Pp. 194-204
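A minimal sketch of the one-class-SVM step, using scikit-learn's `OneClassSVM` on synthetic region features; the Genetic Algorithm clustering and relevance-feedback components are omitted, and all data below is made up:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
# Hypothetical region features from positively-labeled images:
# in MIL-based CBIR, regions of relevant images are treated as one class.
positive_regions = rng.normal(loc=0.0, scale=0.3, size=(60, 2))

ocsvm = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1)
ocsvm.fit(positive_regions)

def image_score(regions):
    """Rank a candidate image by the best-scoring region it contains."""
    return float(ocsvm.decision_function(regions).max())

near = np.array([[0.1, -0.1], [0.0, 0.2]])  # regions resembling the query class
far = np.array([[3.0, 3.0], [-3.0, 2.5]])   # irrelevant regions
print(image_score(near) > image_score(far))  # → True
```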
doi: 10.1007/11526346_24
Interactive Video Search Using Multilevel Indexing
John Adcock; Matthew Cooper; Andreas Girgensohn; Lynn Wilcox
Large video collections present a unique set of challenges to the search system designer. Text transcripts do not always provide an accurate index to the visual content, and the performance of visually based semantic extraction techniques is often inadequate for search tasks. The searcher must be relied upon to provide detailed judgment of the relevance of specific video segments. We describe a video search system that facilitates this user task by efficiently presenting search results in semantically meaningful units to simplify exploration of query results and query reformulation. We employ a story segmentation system and supporting user interface elements to effectively present query results at the story level. The system was tested in the 2004 TRECVID interactive search evaluations with very positive results.
- Interactive Video Retrieval and Others | Pp. 205-214
doi: 10.1007/11526346_25
Assessing Effectiveness in Video Retrieval
Alexander Hauptmann; Wei-Hao Lin
This paper examines results from the last two years of the TRECVID video retrieval evaluations. While there is encouraging evidence of progress in video retrieval, there are several major disappointments confirming that the field is still in its infancy. Many publications blithely attribute improvements in retrieval tasks to particular techniques without paying much attention to the statistical reliability of the comparisons. We conduct an analysis of the official TRECVID evaluation results, using both retrieval experiment error rates and ANOVA measures, and demonstrate that the difference between many systems is not statistically significant. We conclude the paper with lessons learned from results both with and without statistically significant differences.
- Interactive Video Retrieval and Others | Pp. 215-225
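The paper's point about statistical reliability can be illustrated with a paired randomization test over per-topic average-precision scores. This is a common alternative to the ANOVA analysis the paper uses, and the scores below are fabricated for illustration only:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical per-topic average-precision scores for two retrieval systems.
sys_a = np.array([0.31, 0.12, 0.45, 0.22, 0.08, 0.39, 0.27, 0.15])
sys_b = np.array([0.29, 0.14, 0.41, 0.25, 0.07, 0.36, 0.30, 0.13])

def paired_randomization_test(a, b, n_iter=10000):
    """Two-sided randomization test: randomly flip the sign of each paired
    per-topic difference and see how often the null mean exceeds the observed."""
    diffs = a - b
    observed = abs(diffs.mean())
    signs = rng.choice([-1.0, 1.0], size=(n_iter, diffs.size))
    null = np.abs((signs * diffs).mean(axis=1))
    return float((null >= observed).mean())

p = paired_randomization_test(sys_a, sys_b)
print(f"p-value: {p:.3f}")  # a large p means the difference is not significant
```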
doi: 10.1007/11526346_26
Person Spotting: Video Shot Retrieval for Face Sets
Josef Sivic; Mark Everingham; Andrew Zisserman
Matching people based on their imaged face is hard because of the well known problems of illumination, pose, size and expression variation. Indeed these variations can exceed those due to identity. Fortunately, videos of people have the happy benefit of containing multiple exemplars of each person in a form that can easily be associated automatically using straightforward visual tracking. We describe progress in harnessing these multiple exemplars in order to retrieve humans automatically in videos, given a query face in a shot. There are three areas of interest: (i) the matching of sets of exemplars provided by “tubes” of the spatial-temporal volume; (ii) the description of the face using a spatial orientation field; and, (iii) the structuring of the problem so that retrieval is immediate at run time.
The result is a person retrieval system, able to retrieve a ranked list of shots containing a particular person in the manner of Google. The method has been implemented and tested on two feature length movies.
- Image/Video Retrieval Applications | Pp. 226-236
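One simple way to match the sets of exemplars provided by two face "tubes" is the minimum pairwise distance between their descriptor sets. The sketch below uses random vectors as stand-in face descriptors; it illustrates set-to-set matching generically, not the paper's actual descriptors or matching function:

```python
import numpy as np

def set_distance(tube_a, tube_b):
    """Min-min distance between two sets of face descriptor vectors."""
    d = np.linalg.norm(tube_a[:, None, :] - tube_b[None, :, :], axis=-1)
    return float(d.min())

rng = np.random.default_rng(1)
# Hypothetical descriptors: two tubes of the same person cluster together,
# while a different person's tube lies far away in descriptor space.
person_x_shot1 = rng.normal(0.0, 0.1, size=(5, 16))
person_x_shot2 = rng.normal(0.0, 0.1, size=(7, 16))
person_y_shot = rng.normal(2.0, 0.1, size=(6, 16))

# Shots are ranked by distance to the query tube: the same person ranks first.
print(set_distance(person_x_shot1, person_x_shot2) <
      set_distance(person_x_shot1, person_y_shot))  # → True
```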
doi: 10.1007/11526346_27
Robust Methods and Representations for Soccer Player Tracking and Collision Resolution
Lluis Barceló; Xavier Binefa; John R. Kender
We present a method of tracking multiple players in a soccer match using video taken from a single fixed camera with pan, tilt and zoom. We extract a single mosaic of the playing field and robustly derive its homography to a playing field model, based on color information, line extraction, and a Hausdorff distance measure. Players are identified by color and shape, and tracked in the image mosaic space using a Kalman filter. The frequent occlusions of multiple players are resolved using a novel representation acted on by a rule-based method, which recognizes differences between removable and intrinsic ambiguities. We test the methods with synthetic and real data.
- Image/Video Retrieval Applications | Pp. 237-246
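The Kalman-filter tracking component can be sketched with a standard constant-velocity model in mosaic coordinates; the noise covariances below are assumed for illustration, not taken from the paper:

```python
import numpy as np

# Constant-velocity Kalman filter; state is (x, y, vx, vy) in mosaic space.
dt = 1.0
F = np.array([[1, 0, dt, 0], [0, 1, 0, dt], [0, 0, 1, 0], [0, 0, 0, 1]], float)
H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)  # we observe position only
Q = np.eye(4) * 1e-2   # process noise (assumed)
R = np.eye(2) * 1.0    # measurement noise (assumed)

def kalman_step(x, P, z):
    """One predict/update cycle given a player-position measurement z."""
    x, P = F @ x, F @ P @ F.T + Q                  # predict
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)                 # Kalman gain
    x = x + K @ (z - H @ x)                        # update with innovation
    P = (np.eye(4) - K @ H) @ P
    return x, P

x, P = np.zeros(4), np.eye(4) * 10.0
for t in range(1, 11):                             # player moving at (2, 1) px/frame
    x, P = kalman_step(x, P, np.array([2.0 * t, 1.0 * t]))
print(x[2:])                                       # velocity estimate, close to (2, 1)
```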
doi: 10.1007/11526346_28
Modeling Multi-object Spatial Relationships for Satellite Image Database Indexing and Retrieval
Grant Scott; Matt Klaric; Chi-Ren Shyu
Geospatial information analysts are interested in spatial configurations of objects in satellite imagery and, more importantly, the ability to search a large-scale database of satellite images using spatial configurations as the query mechanism. In this paper we present a new method to model spatial relationships among sets of three or more objects in satellite images for scene indexing and retrieval by generating discrete spatial signatures. The proposed method is highly insensitive to scaling, rotation, and translation of the spatial configuration. Additionally, the method is efficient for use in real-time applications, such as online satellite image retrievals. Moreover, the number of objects in a spatial configuration has minimal effect on the efficiency of the method.
- Image/Video Retrieval Applications | Pp. 247-256
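One way to build a discrete signature that is insensitive to scaling, rotation, and translation is a histogram of pairwise object distances normalized by their maximum; this is an illustrative stand-in, not the authors' actual signature:

```python
import numpy as np

def spatial_signature(points, bins=8):
    """Discrete signature of an object configuration: histogram of pairwise
    distances divided by the maximum distance (scale-invariant; distances
    are already rotation- and translation-invariant)."""
    pts = np.asarray(points, float)
    i, j = np.triu_indices(len(pts), k=1)
    d = np.linalg.norm(pts[i] - pts[j], axis=1)
    d = d / d.max()
    hist, _ = np.histogram(d, bins=bins, range=(0.0, 1.0))
    return hist / hist.sum()

# A toy 4-object configuration and a scaled, rotated, translated copy.
config = [(0, 0), (4, 0), (2, 3), (1, 1)]
theta = 0.7
Rm = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
transformed = (np.asarray(config, float) * 2.5) @ Rm.T + np.array([10.0, -3.0])
print(np.allclose(spatial_signature(config), spatial_signature(transformed)))  # → True
```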
doi: 10.1007/11526346_29
Hot Event Detection and Summarization by Graph Modeling and Matching
Yuxin Peng; Chong-Wah Ngo
This paper proposes a new approach for hot event detection and summarization of news videos. The approach is mainly based on two graph algorithms: optimal matching (OM) and normalized cut (NC). Initially, OM is employed to measure the visual similarity between all pairs of events under a one-to-one mapping constraint among video shots. Then, news events are represented as a complete weighted graph and NC is carried out to globally and optimally partition the graph into event clusters. Finally, based on cluster size and the globality of events, hot events can be automatically detected and selected as summaries of news videos across TV stations of various channels and languages. Our proposed approach has been tested on 10 hours of news videos and found to be effective.
- Video Processing, Retrieval and Multimedia Systems (Poster) | Pp. 257-266
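The normalized-cut step can be sketched with its standard spectral relaxation: partition the event-similarity graph by the sign of the second eigenvector of the normalized Laplacian. The weight matrix below is a toy stand-in for OM similarity scores, not real data:

```python
import numpy as np

# Toy event-similarity graph: two groups of news events with high
# within-group similarity and weak cross-group links (hypothetical weights).
W = np.array([
    [0.0, 0.9, 0.8, 0.1, 0.0, 0.1],
    [0.9, 0.0, 0.7, 0.0, 0.1, 0.0],
    [0.8, 0.7, 0.0, 0.1, 0.0, 0.1],
    [0.1, 0.0, 0.1, 0.0, 0.9, 0.8],
    [0.0, 0.1, 0.0, 0.9, 0.0, 0.7],
    [0.1, 0.0, 0.1, 0.8, 0.7, 0.0],
])

def normalized_cut(W):
    """Two-way normalized cut via the second-smallest eigenvector of the
    symmetric normalized Laplacian; the sign pattern gives the bipartition."""
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(W)) - D_inv_sqrt @ W @ D_inv_sqrt
    vals, vecs = np.linalg.eigh(L)     # eigenvalues in ascending order
    return vecs[:, 1] > 0

labels = normalized_cut(W)
print(labels.tolist())  # events 0-2 land in one cluster, events 3-5 in the other
```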
doi: 10.1007/11526346_30
Domain Knowledge Ontology Building for Semantic Video Event Description
Dan Song; Hai Tao Liu; Miyoung Cho; Hanil Kim; Pankoo Kim
This paper puts forward a novel method for video event analysis and description based on a domain knowledge ontology. Semantic concepts in the context of a video event are described within one specific domain, enriched with qualitative attributes of the semantic objects, multimedia processing approaches, and domain-independent factors: low-level features (pixel color, motion vectors, and spatio-temporal relationships). In this work, we take one shot (episode) of a billiards game video as the domain to explain how high-level semantics map onto low-level features and how semantically important events are detected.
- Video Processing, Retrieval and Multimedia Systems (Poster) | Pp. 267-275