Publications catalogue - books
Image and Video Retrieval: 5th International Conference, CIVR 2006, Tempe, AZ, USA, July 13-15, 2006, Proceedings
Hari Sundaram; Milind Naphade; John R. Smith; Yong Rui (eds.)
Conference: 5th International Conference on Image and Video Retrieval (CIVR). Tempe, AZ, USA. July 13, 2006 - July 15, 2006
Abstract/Description – provided by the publisher
Not available.
Keywords – provided by the publisher
Computer Graphics; Information Storage and Retrieval; Database Management; Information Systems Applications (incl. Internet); Multimedia Information Systems; Image Processing and Computer Vision
Availability
| Detected institution | Publication year | Browse | Download | Request |
|---|---|---|---|---|
| Not detected | 2006 | SpringerLink | | |
Information
Resource type:
books
Print ISBN
978-3-540-36018-6
Electronic ISBN
978-3-540-36019-3
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2006
Publication rights information
© Springer-Verlag Berlin Heidelberg 2006
Subject coverage
Table of contents
doi: 10.1007/11788034_51
Modular Design of Media Retrieval Workflows Using ARIA
Lina Peng; Gisik Kwon; Yinpeng Chen; K. Selçuk Candan; Hari Sundaram; Karamvir Chatha; Maria Luisa Sapino
In this demo, we present the use of the ARIA platform for modular design of media processing and retrieval applications. ARIA is a middleware for describing and executing media processing workflows to process, filter, and fuse sensory inputs and actuate responses in real-time. ARIA is designed with the goal of maximum modularity and ease of integration of a diverse collection of media processing components and data sources. Moreover, ARIA is cognizant of the fact that various media operators and data structures are adaptable in nature; i.e., the delay, size, and quality/precision characteristics of these operators can be controlled via various parameters. In this demo, we present the ARIA design interface in different image processing and retrieval scenarios.
Keywords: Media retrieval workflows; modular design; image retrieval.
- Session A: ASU Special Session | Pp. 491-494
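The ARIA entry above describes a middleware of composable, parameter-adaptable media operators but gives no implementation details. The sketch below only illustrates the general idea of a modular workflow whose stages expose quality/delay parameters; all class, function, and parameter names are hypothetical and not part of ARIA.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List

@dataclass
class Operator:
    """One media-processing stage with adaptable parameters
    (e.g. quality/precision vs. delay trade-offs)."""
    name: str
    fn: Callable[[Any, dict], Any]
    params: dict = field(default_factory=dict)

    def run(self, data):
        return self.fn(data, self.params)

@dataclass
class Workflow:
    """A linear workflow: each operator consumes the previous one's output."""
    operators: List[Operator] = field(default_factory=list)

    def add(self, op: Operator) -> "Workflow":
        self.operators.append(op)
        return self

    def execute(self, data):
        for op in self.operators:
            data = op.run(data)
        return data

# Hypothetical usage: subsample a signal, then compute a coarse histogram.
def subsample(values, params):
    step = params.get("step", 2)          # larger step: lower quality, lower delay
    return values[::step]

def histogram(values, params):
    bins = params.get("bins", 8)
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1
    return counts

wf = Workflow().add(Operator("subsample", subsample, {"step": 2})) \
               .add(Operator("histogram", histogram, {"bins": 8}))
print(wf.execute([0.1, 0.5, 0.52, 0.9, 0.91, 0.3]))
```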
doi: 10.1007/11788034_52
Image Rectification for Stereoscopic Visualization Without 3D Glasses
Jin Zhou; Baoxin Li
There exist various methods for stereoscopic viewing of images, most requiring that a viewer wears some special glasses. Recent technology developments have resulted in displays that enable 3D viewing without glasses. In this paper, we present results from our proposed approach to automatic rectification of two images of the same scene captured by cameras at general positions, so that the results can be viewed on a 3D display. Both simulated and real data experiments are presented.
Keywords: Image rectification; stereo; 3-D visualization.
- Session A: ASU Special Session | Pp. 495-498
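The rectification entry above does not spell out the authors' procedure. A common uncalibrated approach, which may well differ from theirs, is to match features between the two views, estimate the fundamental matrix, and compute rectifying homographies. A minimal OpenCV sketch of that generic pipeline, assuming two hypothetical input files `left.jpg` and `right.jpg`:

```python
import cv2
import numpy as np

# Load the two views of the same scene (hypothetical file names).
img1 = cv2.imread("left.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("right.jpg", cv2.IMREAD_GRAYSCALE)

# Match local features between the views.
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)
matches = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True).match(des1, des2)
matches = sorted(matches, key=lambda m: m.distance)[:200]

pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# Estimate the fundamental matrix and keep inlier correspondences.
F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC)
pts1, pts2 = pts1[mask.ravel() == 1], pts2[mask.ravel() == 1]

# Compute rectifying homographies so epipolar lines become horizontal,
# then warp both images; the pair can then be shown on a 3D display.
h, w = img1.shape
ok, H1, H2 = cv2.stereoRectifyUncalibrated(pts1, pts2, F, (w, h))
rect1 = cv2.warpPerspective(img1, H1, (w, h))
rect2 = cv2.warpPerspective(img2, H2, (w, h))
cv2.imwrite("left_rect.jpg", rect1)
cv2.imwrite("right_rect.jpg", rect2)
```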
doi: 10.1007/11788034_53
Human Movement Analysis for Interactive Dance
Gang Qian; Jodi James; Todd Ingalls; Thanassis Rikakis; Stjepan Rajko; Yi Wang; Daniel Whiteley; Feng Guo
In this paper, we provide a brief overview of the human movement analysis research at the Arts, Media and Engineering program, Arizona State University, and its applications in interactive dance. A family of robust algorithms has been developed to analyze dancers' movement at multiple temporal and spatial levels and from a number of perspectives, such as marker distributions, joint angles, body silhouettes, and weight distributions, to conduct reliable dancer tracking, posture recognition, and gesture recognition. Multiple movement sensing modalities have been used, and sometimes fused, in our current research, including a marker-based motion capture system, a pressure-sensitive floor, and video cameras. Some of the developed algorithms have been successfully used in real-life dance performances.
Keywords: Gaussian Mixture Model; Motion Capture; Gesture Recognition; Dynamic Time Warping; Motion Capture System.
- Session A: ASU Special Session | Pp. 499-502
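Dynamic time warping appears in the entry's keywords, but no implementation details are given above. The following is only a generic DTW distance between two gesture feature sequences (e.g. joint-angle vectors from motion capture), not the authors' algorithm.

```python
import numpy as np

def dtw_distance(seq_a, seq_b):
    """Classic dynamic time warping between two sequences of feature vectors.
    Each sequence is an array of shape (frames, features), e.g. joint angles
    sampled from a motion-capture stream."""
    n, m = len(seq_a), len(seq_b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])   # local frame distance
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

# Hypothetical usage: compare an observed gesture against a stored template.
template = np.sin(np.linspace(0, np.pi, 30)).reshape(-1, 1)
observed = np.sin(np.linspace(0, np.pi, 45)).reshape(-1, 1)   # same gesture, slower
print(dtw_distance(observed, template))
```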
doi: 10.1007/11788034_54
Exploring the Dynamics of Visual Events in the Multi-dimensional Semantic Concept Space
Shahram Ebadollahi; Lexing Xie; Andres Abreu; Mark Podlaseck; Shih-Fu Chang; John R. Smith
We present a system for visualizing event detection in video and revealing the algorithmic and scientific insights. Visual events are viewed as evolving temporal patterns in the semantic concept space. For video clips of different events, we present their corresponding traces in the semantic concept space as the event evolves. The presentation of the event in the concept space is scored by pre-trained models of the dynamics of each concept in the context of the event, which provides a measure of how well the given event matches the evolution pattern of the target event in the multi-dimensional concept space. Scores obtained for different videos are shown to project them into different parts of the final score space. This presentation walks the user through the entire process of concept-centered event recognition for events such as exiting car, riot, and airplane flying.
- Session D: Demo Session | Pp. 503-505
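The entry above scores a clip's trace through the concept space with pre-trained models of each concept's dynamics, but the models themselves are not described here. As a purely illustrative stand-in, the sketch below scores a concept-score time series against per-concept Gaussian templates of the expected trajectory; it is not the authors' method, and all names and numbers are hypothetical.

```python
import numpy as np

def score_trace(trace, mean_traj, std_traj):
    """Score how well a clip's trace through the concept space matches the
    expected evolution of a target event.

    trace:     (T, C) concept-detector scores over T time steps, C concepts
    mean_traj: (T, C) mean trajectory learned from clips of the target event
    std_traj:  (T, C) per-step, per-concept standard deviation

    Returns an average log-likelihood under independent Gaussians, a simple
    stand-in for the pre-trained dynamics models the demo refers to."""
    var = np.maximum(std_traj, 1e-3) ** 2
    log_lik = -0.5 * ((trace - mean_traj) ** 2 / var + np.log(2 * np.pi * var))
    return log_lik.mean()

# Hypothetical example with T=10 time steps and C=3 concepts
# (say "car", "road", "person") for an "exiting car" event model.
rng = np.random.default_rng(0)
mean_traj = np.linspace([0.9, 0.8, 0.1], [0.7, 0.6, 0.9], 10)
std_traj = np.full((10, 3), 0.1)
clip_trace = mean_traj + rng.normal(0, 0.05, size=(10, 3))
print(score_trace(clip_trace, mean_traj, std_traj))
```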
doi: 10.1007/11788034_55
VideoSOM: A SOM-Based Interface for Video Browsing
Thomas Bärecke; Ewa Kijak; Andreas Nürnberger; Marcin Detyniecki
The VideoSOM system is a tool for content-based video navigation based on a growing self-organizing map. Our interface allows the user to browse video content from several perspectives simultaneously, using temporal as well as content-based representations of the video. Combined with the interaction possibilities between these views, this allows for efficient searching of relevant information in video content.
Keywords: Video Content; Video Codec; Shot Boundary; Shot Boundary Detection; Construct High Level.
- Session D: Demo Session | Pp. 506-509
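VideoSOM is built on a growing self-organizing map whose code is not reproduced here. As a rough illustration of how shots end up on a 2-D browsing grid, the sketch below trains a plain fixed-size SOM on shot feature vectors (e.g. keyframe colour histograms); function names and parameters are hypothetical.

```python
import numpy as np

def train_som(features, grid=(8, 8), epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    """Train a plain (non-growing) self-organizing map on shot feature vectors.
    features: (n_shots, dim) array, e.g. keyframe colour histograms.
    Returns the (rows, cols, dim) codebook; each shot is then placed on the
    grid cell of its best-matching unit for browsing."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    dim = features.shape[1]
    codebook = rng.random((rows, cols, dim))
    coords = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                  indexing="ij"), axis=-1)
    n_steps = epochs * len(features)
    step = 0
    for _ in range(epochs):
        for x in features[rng.permutation(len(features))]:
            t = step / n_steps
            lr = lr0 * (1 - t)                       # decaying learning rate
            sigma = sigma0 * (1 - t) + 0.5           # shrinking neighbourhood
            # Best-matching unit for this sample.
            dists = np.linalg.norm(codebook - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dists), dists.shape)
            # Pull neighbouring units toward the sample.
            grid_dist = np.linalg.norm(coords - np.array(bmu), axis=-1)
            h = np.exp(-(grid_dist ** 2) / (2 * sigma ** 2))[..., None]
            codebook += lr * h * (x - codebook)
            step += 1
    return codebook

def best_matching_unit(codebook, x):
    dists = np.linalg.norm(codebook - x, axis=-1)
    return np.unravel_index(np.argmin(dists), dists.shape)

# Hypothetical usage with random 32-bin "histograms" for 100 shots.
feats = np.random.default_rng(1).random((100, 32))
som = train_som(feats)
print(best_matching_unit(som, feats[0]))
```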
doi: 10.1007/11788034_56
iBase: Navigating Digital Library Collections
Paul Browne; Stefan Rüger; Li-Qun Xu; Daniel Heesch
The growth of digital image collections in many areas of science and commercial environments has over the last decade spawned great interest in content-based image retrieval. A great variety of methods have been developed to retrieve images based on example queries and techniques to elicit and utilize relevance feedback (e.g. [4, 5]). Often the systems provide a simple framework that permits evaluation of the method and are not intended as full-fledged systems ready to be used in a realistic setting. Also, comparatively little effort has been expended on devising efficient techniques to browse image collections effectively. The few notable exceptions [1, 3, 6] treat browsing internally as a sequence of queries and thus do not leverage the performance benefits associated with pre-computed browsing structures.
- Session D: Demo Session | Pp. 510-513
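The iBase entry above contrasts query-driven browsing with pre-computed browsing structures, without detailing its own structure. As a generic illustration of the idea, the sketch below precomputes each image's nearest neighbours once so that browsing steps never pay the cost of an online query; the function and variable names are hypothetical and unrelated to iBase's actual design.

```python
import numpy as np

def build_browsing_graph(features, k=10):
    """Precompute, for every image, its k visually most similar images.
    features: (n_images, dim) array of descriptors.
    Returns an (n_images, k) index array: row i lists the neighbours shown
    when the user browses from image i, with no online query needed."""
    # Pairwise Euclidean distances (fine for small collections; an ANN index
    # would be used for large ones).
    d = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                  # exclude the image itself
    return np.argsort(d, axis=1)[:, :k]

# Hypothetical usage: follow a browsing step from image 0 to its top neighbour.
feats = np.random.default_rng(2).random((50, 16))
graph = build_browsing_graph(feats, k=5)
current = 0
print("neighbours of image 0:", graph[current])
current = graph[current][0]                      # user clicks the first neighbour
print("now browsing from image", current)
```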
doi: 10.1007/11788034_57
Exploring the Synergy of Humans and Machines in Extreme Video Retrieval
Alexander G. Hauptmann; Wei-Hao Lin; Rong Yan; Jun Yang; Robert V. Baron; Ming-Yu Chen; Sean Gilroy; Michael D. Gordon
We introduce an interface for efficient video search that exploits the human ability to quickly scan visual content after automatic retrieval has arranged the images in the expected order of relevance. While extreme video retrieval is taxing for the human, it is also extremely effective. Two variants of extreme retrieval are demonstrated: 1) RSVP, which automatically pages through images with user control of the paging speed while the user marks relevant shots, and 2) MBRP, where the user manually controls paging and adjusts the number of images per page depending on the density of relevant shots.
Keywords: Average Precision; Rapid Serial Visual Presentation; Mean Average Precision; Relevant Image; Video Retrieval.
- Session D: Demo Session | Pp. 514-517
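The keywords above reference (mean) average precision, the standard measure for evaluating this kind of interactive retrieval. For reference only, and not tied to the authors' experiments, a minimal computation of average precision over a ranked shot list:

```python
def average_precision(ranked_shots, relevant):
    """Average precision of a ranked list: the mean of the precision values at
    the ranks where relevant shots occur. `relevant` is the set of all relevant
    shot ids; averaging AP over many queries gives MAP."""
    hits, precision_sum = 0, 0.0
    for rank, shot in enumerate(ranked_shots, start=1):
        if shot in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0

# Hypothetical example: 3 of 4 relevant shots retrieved near the top.
print(average_precision(["s3", "s7", "s1", "s9", "s2"], {"s3", "s1", "s2", "s8"}))
```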
doi: 10.1007/11788034_58
Efficient Summarizing of Multimedia Archives Using Cluster Labeling
Jelena Tešić; John R. Smith
In this demo, we present a novel approach for labeling clusters in minimally annotated data archives. We propose to build on clustering by aggregating the automatically tagged semantics. We propose and compare four techniques for labeling the clusters and evaluate their performance against human-labeled ground truth. We define error measures to quantify the results, and present examples of cluster labeling results obtained on BBC stock shots and broadcast news videos from the TRECVID-2005 video data set.
- Session D: Demo Session | Pp. 518-520
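The four labeling techniques compared in the demo are not described in this abstract. The sketch below shows only the simplest plausible baseline consistent with the idea of aggregating automatically tagged semantics: label each cluster with the tag its members carry most often. All names are hypothetical.

```python
from collections import Counter

def label_clusters(cluster_assignments, auto_tags):
    """Label each cluster with the concept tag its members carry most often.
    cluster_assignments: dict item_id -> cluster_id
    auto_tags:           dict item_id -> list of automatically detected concepts
    Returns dict cluster_id -> (label, fraction of members supporting it)."""
    tags_per_cluster = {}
    sizes = Counter()
    for item, cluster in cluster_assignments.items():
        sizes[cluster] += 1
        tags_per_cluster.setdefault(cluster, Counter()).update(auto_tags.get(item, []))
    labels = {}
    for cluster, counts in tags_per_cluster.items():
        tag, votes = counts.most_common(1)[0]
        labels[cluster] = (tag, votes / sizes[cluster])
    return labels

# Hypothetical example with two clusters of stock shots.
assignments = {"shot1": 0, "shot2": 0, "shot3": 1, "shot4": 1, "shot5": 1}
tags = {"shot1": ["crowd", "outdoor"], "shot2": ["crowd"],
        "shot3": ["studio", "person"], "shot4": ["studio"], "shot5": ["person"]}
print(label_clusters(assignments, tags))
```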
doi: 10.1007/11788034_59
Collaborative Concept Tagging for Images Based on Ontological Thinking
Alireza Kashian; Robert Kheng Leng Gay; Abdul Halim Abdul Karim
Without textual descriptions or label information, searching for semantic concepts in image databases is a very challenging task. Automatic image annotation techniques aim to detect objects that appear visually inside images, like a tiger in grass. One challenge that remains to be solved is "understanding": image understanding is something beyond the scope of automatic annotation. The second issue is manual annotation of images, where user contribution is important. In this demo, we have developed an online tool that simulates a collaborative environment to help users generate several facts for selected images. Ontological thinking led us to devise a method for a simple user interface. We are also studying the construction of synergies out of the generated facts for our future work.
- Session D: Demo Session | Pp. 521-524
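The demo's interface and ontology are not described in detail above. As a loose illustration of what "generating facts" about an image might look like, the sketch below stores collaborative annotations as subject-predicate-object triples, one common way to encode ontology-flavoured statements; the schema and names are entirely hypothetical.

```python
from collections import defaultdict

class FactStore:
    """Collects (subject, predicate, object, user) statements contributed by
    users about images, e.g. ("image42", "depicts", "tiger") or
    ("tiger", "locatedIn", "grass")."""
    def __init__(self):
        self.facts = defaultdict(set)          # image_id -> set of fact tuples

    def add_fact(self, image_id, subject, predicate, obj, user):
        self.facts[image_id].add((subject, predicate, obj, user))

    def facts_for(self, image_id):
        return sorted(self.facts[image_id])

# Hypothetical collaborative session: two users describe the same image.
store = FactStore()
store.add_fact("image42", "image42", "depicts", "tiger", user="alice")
store.add_fact("image42", "tiger", "locatedIn", "grass", user="bob")
for fact in store.facts_for("image42"):
    print(fact)
```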
doi: 10.1007/11788034_60
Multimodal Search for Effective Video Retrieval
Apostol (Paul) Natsev
Semantic search and retrieval of multimedia content is a challenging research field that has drawn significant attention in the multimedia research community. With the dramatic growth of digital media at home, in enterprises, and on the web, methods for effective indexing and search of visual content are vital in unlocking the value of this content. Conventional database search and text search over large textual corpora are both well-understood problems with ubiquitous applications. However, search in non-textual unstructured content, such as image and video data, is not nearly as mature or effective. A common approach for video retrieval, for example, is to apply conventional text search techniques to the associated closed caption or speech transcript. This approach works fairly well for retrieving named entities, such as specific people, objects, or places. However, it does not work well for generic topics related to general settings, events, or people actions, as the speech track rarely describes the background setting or the visual appearance of the subject. Text-based search is not even applicable to scenarios that do not have speech transcripts or other textual metadata for indexing purposes (e.g., consumer photo collections). In addition, speech-based video retrieval frequently leads to false matches of segments that talk about but do not depict the entity of interest. Because of these and other limitations, it is now apparent that conventional text search techniques on their own are not sufficient for effective image and video retrieval, and they need to be combined with techniques that consider the visual semantics of the content. The most substantial work in this field is presented in the TREC Video Retrieval Evaluation (TRECVID) community, which focuses its efforts on evaluating video retrieval approaches by providing common video datasets and a standard set of queries.
- Session D: Demo Session | Pp. 525-528
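The multimodal search entry above argues that text-based retrieval must be combined with techniques that consider the visual semantics of the content, but it does not state how the demo fuses the modalities. A minimal, hypothetical late-fusion sketch that normalizes and linearly combines per-shot scores from a text run and a visual run:

```python
def normalize(scores):
    """Min-max normalize a dict of shot_id -> score to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {k: (v - lo) / span for k, v in scores.items()}

def fuse(text_scores, visual_scores, w_text=0.5):
    """Late fusion: weighted sum of normalized text and visual retrieval scores.
    Shots missing from one modality simply contribute 0 for that modality."""
    t, v = normalize(text_scores), normalize(visual_scores)
    shots = set(t) | set(v)
    fused = {s: w_text * t.get(s, 0.0) + (1 - w_text) * v.get(s, 0.0) for s in shots}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical example: speech-transcript matches vs. visual-concept matches.
text_run = {"shot1": 2.1, "shot2": 0.3, "shot3": 1.7}
visual_run = {"shot2": 0.9, "shot3": 0.8, "shot4": 0.6}
print(fuse(text_run, visual_run, w_text=0.4))
```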