Publications catalogue - books



Image and Video Retrieval: 5th International Conference, CIVR 2006, Tempe, AZ, USA, July 13-15, 2006, Proceedings

Hari Sundaram; Milind Naphade; John R. Smith; Yong Rui (eds.)

Conference: 5th International Conference on Image and Video Retrieval (CIVR). Tempe, AZ, USA. July 13, 2006 - July 15, 2006

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Computer Graphics; Information Storage and Retrieval; Database Management; Information Systems Applications (incl. Internet); Multimedia Information Systems; Image Processing and Computer Vision

Availability
Detected institution | Publication year | Browse | Download | Request
Not detected | 2006 | SpringerLink | – | –

Information

Resource type:

books

Print ISBN

978-3-540-36018-6

Electronic ISBN

978-3-540-36019-3

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Publication rights information

© Springer-Verlag Berlin Heidelberg 2006

Table of contents

Modular Design of Media Retrieval Workflows Using ARIA

Lina Peng; Gisik Kwon; Yinpeng Chen; K. Selçuk Candan; Hari Sundaram; Karamvir Chatha; Maria Luisa Sapino

In this demo, we present the use of the ARIA platform for modular design of media processing and retrieval applications. ARIA is a middleware for describing and executing media processing workflows to process, filter, and fuse sensory inputs and actuate responses in real time. ARIA is designed with the goal of maximum modularity and ease of integration of a diverse collection of media processing components and data sources. Moreover, ARIA is cognizant of the fact that various media operators and data structures are adaptable in nature; i.e., the delay, size, and quality/precision characteristics of these operators can be controlled via various parameters. In this demo, we present the ARIA design interface in different image processing and retrieval scenarios.

Keywords: Media retrieval workflows; modular design; image retrieval.

- Session A: ASU Special Session | Pp. 491-494
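ARIA's actual programming interface is not described in this catalogue entry. The following Python sketch only illustrates the general idea of a modular media-processing workflow whose operators expose tunable size/quality parameters; all class, function, and parameter names are hypothetical.

```python
# Illustrative sketch only - not ARIA's API. A workflow is a chain of
# operators, each with parameters that trade quality against size/delay.
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Operator:
    name: str
    fn: Callable[[object, dict], object]
    params: dict = field(default_factory=dict)   # e.g. {"scale": 0.5}

    def run(self, item):
        return self.fn(item, self.params)

@dataclass
class Workflow:
    operators: List[Operator]

    def process(self, item):
        for op in self.operators:                # operators run in declared order
            item = op.run(item)
        return item

# Hypothetical operators: downscale an "image" (here just a size value),
# then enforce a minimum size; scale controls the quality/size trade-off.
downscale = Operator("downscale", lambda img, p: int(img * p["scale"]), {"scale": 0.5})
floor_op = Operator("floor", lambda img, p: max(img, p["min_size"]), {"min_size": 64})

wf = Workflow([downscale, floor_op])
print(wf.process(512))   # -> 256
```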

Image Rectification for Stereoscopic Visualization Without 3D Glasses

Jin Zhou; Baoxin Li

There exist various methods for stereoscopic viewing of images, most requiring that the viewer wear special glasses. Recent technology developments have resulted in displays that enable 3D viewing without glasses. In this paper, we present results from our proposed approach to automatic rectification of two images of the same scene captured by cameras at general positions, so that the results can be viewed on a 3D display. Experiments on both simulated and real data are presented.

Keywords: Image rectification; stereo; 3-D visualization.

- Session A: ASU Special Session | Pp. 495-498
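The abstract does not give the authors' rectification algorithm. As one common way to realize the step it describes, the sketch below runs a standard uncalibrated rectification pipeline with OpenCV: estimate the fundamental matrix from matched keypoints, compute rectifying homographies, and warp both views so epipolar lines become horizontal. File names are placeholders.

```python
# A generic uncalibrated rectification sketch with OpenCV (not the paper's method).
import cv2
import numpy as np

left = cv2.imread("left.jpg", cv2.IMREAD_GRAYSCALE)    # hypothetical file names
right = cv2.imread("right.jpg", cv2.IMREAD_GRAYSCALE)

# Match keypoints between the two views.
orb = cv2.ORB_create(2000)
k1, d1 = orb.detectAndCompute(left, None)
k2, d2 = orb.detectAndCompute(right, None)
matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)

pts1 = np.float32([k1[m.queryIdx].pt for m in matches])
pts2 = np.float32([k2[m.trainIdx].pt for m in matches])

# Fundamental matrix with RANSAC, then rectifying homographies H1, H2.
F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC)
inliers = mask.ravel() == 1
h, w = left.shape
ok, H1, H2 = cv2.stereoRectifyUncalibrated(pts1[inliers], pts2[inliers], F, (w, h))

rect_left = cv2.warpPerspective(left, H1, (w, h))     # row-aligned pair,
rect_right = cv2.warpPerspective(right, H2, (w, h))   # ready for a 3D display
```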

Human Movement Analysis for Interactive Dance

Gang Qian; Jodi James; Todd Ingalls; Thanassis Rikakis; Stjepan Rajko; Yi Wang; Daniel Whiteley; Feng Guo

In this paper, we provide a brief overview of the human movement analysis research at the Arts, Media and Engineering program, Arizona State University, and its applications in interactive dance. A family of robust algorithms has been developed to analyze dancers’ movement at multiple temporal and spatial levels from a number of perspectives, such as marker distributions, joint angles, body silhouettes, and weight distributions, to conduct reliable dancer tracking, posture and gesture recognition. Multiple movement sensing modalities have been used, and sometimes fused, in our current research, including a marker-based motion capture system, a pressure-sensitive floor, and video cameras. Some of the developed algorithms have been successfully used in real-life dance performances.

Keywords: Gaussian Mixture Model; Motion Capture; Gesture Recognition; Dynamic Time Warping; Motion Capture System.

- Session A: ASU Special Session | Pp. 499-502
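Dynamic Time Warping appears in the keyword list of this entry. The following generic DTW distance (not the authors' code) shows how two gesture feature sequences of different lengths can be compared; the toy data is invented.

```python
# Generic dynamic time warping between two feature sequences.
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """a: (n, d) and b: (m, d) sequences of pose/feature vectors."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])       # local distance
            cost[i, j] = d + min(cost[i - 1, j],          # insertion
                                 cost[i, j - 1],          # deletion
                                 cost[i - 1, j - 1])      # match
    return cost[n, m]

# Toy usage: compare a recorded gesture against a stored template.
template = np.sin(np.linspace(0, 3, 40)).reshape(-1, 1)
gesture = np.sin(np.linspace(0, 3, 55)).reshape(-1, 1)
print(dtw_distance(gesture, template))
```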

Exploring the Dynamics of Visual Events in the Multi-dimensional Semantic Concept Space

Shahram Ebadollahi; Lexing Xie; Andres Abreu; Mark Podlaseck; Shih-Fu Chang; John R. Smith

We present a system for visualizing event detection in video and revealing the algorithmic and scientific insights. Visual events are viewed as evolving temporal patterns in the semantic concept space. For video clips of different events, we present their corresponding traces in the semantic concept space as the event evolves. The presentation of the event in the concept space is scored by pre-trained models of the dynamics of each concept in the context of the event, which provides a measure of how well the given event matches the evolution pattern of the target event in the multi-dimensional concept space. Scores obtained for different videos are shown to project them into different parts of the final score space. This presentation walks the user through the entire process of concept-centered event recognition for events such as exiting car, riot, and airplane flying.

- Session D: Demo Session | Pp. 503-505
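The pre-trained concept-dynamics models are not specified in the abstract. As a purely illustrative stand-in, the sketch below treats each clip as a time series of concept-detector scores, resamples it to a fixed length, and scores it against a mean trajectory learned from example clips of the target event; all data and dimensions are invented.

```python
# Illustrative scoring of concept-score traces against a learned trajectory.
import numpy as np

def resample(trace: np.ndarray, length: int = 20) -> np.ndarray:
    """Linearly resample a (T, K) concept-score trace to a fixed length."""
    t_old = np.linspace(0, 1, len(trace))
    t_new = np.linspace(0, 1, length)
    return np.stack([np.interp(t_new, t_old, trace[:, k])
                     for k in range(trace.shape[1])], axis=1)

def event_score(trace, mean_traj, std_traj):
    """Higher is better: negative squared z-distance to the event trajectory."""
    r = resample(trace, len(mean_traj))
    return -np.sum(((r - mean_traj) / std_traj) ** 2)

# Toy data: 3 concepts, training clips of different lengths for one event.
rng = np.random.default_rng(0)
train = [resample(rng.random((t, 3))) for t in (30, 45, 60)]
mean_traj, std_traj = np.mean(train, axis=0), np.std(train, axis=0) + 1e-6
print(event_score(rng.random((25, 3)), mean_traj, std_traj))
```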

VideoSOM: A SOM-Based Interface for Video Browsing

Thomas Bärecke; Ewa Kijak; Andreas Nürnberger; Marcin Detyniecki

The VideoSOM system is a tool for content-based video navigation based on a growing self-organizing map. Our interface allows the user to browse the video content using several perspectives simultaneously, with temporal as well as content-based representations of the video. Combined with the interaction possibilities between them, this allows for efficient searching of relevant information in video content.

Keywords: Video Content; Video Codec; Shot Boundary; Shot Boundary Detection; Construct High Level.

- Session D: Demo Session | Pp. 506-509
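VideoSOM is built on a growing self-organizing map; the sketch below trains only a plain, fixed-size SOM over keyframe features, which conveys the basic mechanism of mapping visually similar shots onto nearby grid cells. Map size, feature dimension, and data are illustrative.

```python
# Minimal fixed-size SOM training loop (not the growing SOM used by VideoSOM).
import numpy as np

rng = np.random.default_rng(1)
grid_h, grid_w, dim = 8, 8, 64                 # 8x8 map, 64-d keyframe features
weights = rng.random((grid_h, grid_w, dim))
coords = np.dstack(np.meshgrid(np.arange(grid_h), np.arange(grid_w), indexing="ij"))

def train_step(x, lr=0.3, sigma=2.0):
    """Move the best-matching unit and its neighbours toward sample x."""
    dists = np.linalg.norm(weights - x, axis=2)
    bmu = np.unravel_index(np.argmin(dists), dists.shape)
    grid_d2 = np.sum((coords - np.array(bmu)) ** 2, axis=2)
    h = np.exp(-grid_d2 / (2 * sigma ** 2))[..., None]   # neighbourhood kernel
    weights[...] = weights + lr * h * (x - weights)

features = rng.random((500, dim))              # e.g. colour histograms of keyframes
for epoch in range(10):
    for x in features:
        train_step(x)
```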

iBase: Navigating Digital Library Collections

Paul Browne; Stefan Rüger; Li-Qun Xu; Daniel Heesch

The growth of digital image collections in many areas of science and commercial environments has over the last decade spawned great interest in content-based image retrieval. A great variety of methods have been developed to retrieve images based on example queries and techniques to elicit and utilize relevance feedback (e.g. [4, 5]). Often the systems provide a simple framework that permits evaluation of the method and are not intended as full-fledged systems ready to be used in a realistic setting. Also, comparatively little effort has been expended on devising efficient techniques to browse image collections effectively. The few notable exceptions [1, 3, 6] treat browsing internally as a sequence of queries and thus do not leverage the performance benefits associated with pre-computed browsing structures.

- Session D: Demo Session | Pp. 510-513
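The entry contrasts query-based browsing with pre-computed browsing structures. A simple instance of the latter (a sketch, not the iBase data structure) is an offline k-nearest-neighbour graph over image descriptors, built here with scikit-learn so that browsing at interaction time is pure graph traversal with no query evaluation.

```python
# Pre-computed k-NN browsing graph over image features (illustrative only).
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(2)
features = rng.random((10_000, 128))            # e.g. one descriptor per image

nn = NearestNeighbors(n_neighbors=9).fit(features)
_, idx = nn.kneighbors(features)                # idx[i, 0] is image i itself
browse_graph = {i: list(row[1:]) for i, row in enumerate(idx)}

# Browsing is then just following edges; no query runs at interaction time.
current = 42
print("neighbours of image 42:", browse_graph[current])
```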

Exploring the Synergy of Humans and Machines in Extreme Video Retrieval

Alexander G. Hauptmann; Wei-Hao Lin; Rong Yan; Jun Yang; Robert V. Baron; Ming-Yu Chen; Sean Gilroy; Michael D. Gordon

We introduce an interface for efficient video search that exploits the human ability to quickly scan visual content after automatic retrieval has arranged the images in expected order of relevance. While extreme video retrieval is taxing to the human, it is also extremely effective. Two variants of extreme retrieval are demonstrated: 1) RSVP, which automatically pages through images with user control of the paging speed while the user marks relevant shots, and 2) MBRP, where the user manually controls paging and adjusts the number of images per page depending on the density of relevant shots.

Keywords: Average Precision; Rapid Serial Visual Presentation; Mean Average Precision; Relevant Image; Video Retrieval.

- Session D: Demo Session | Pp. 514-517
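The RSVP variant can be pictured as an auto-paging loop over shots sorted by predicted relevance, with the paging rate under user control while the user marks hits. The schematic below uses hypothetical names, timings, and a stand-in marking callback, not the authors' interface.

```python
# Schematic RSVP-style paging loop (illustrative, not the demo's code).
import time

def rsvp(shots_by_relevance, per_page=4, seconds_per_page=0.1, mark=lambda page: []):
    """Auto-page through shots; `mark` returns the shots the user flagged."""
    relevant = []
    for start in range(0, len(shots_by_relevance), per_page):
        page = shots_by_relevance[start:start + per_page]
        relevant.extend(mark(page))           # user marks while the page is shown
        time.sleep(seconds_per_page)          # paging speed set by the user
    return relevant

# Toy run: shots already sorted by predicted relevance; pretend the user
# marks the even-numbered ones.
hits = rsvp(list(range(20)), mark=lambda page: [s for s in page if s % 2 == 0])
print(hits)
```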

Efficient Summarizing of Multimedia Archives Using Cluster Labeling

Jelena Tešić; John R. Smith

In this demo we present a novel approach for labeling clusters in minimally annotated data archives. We propose to build on clustering by aggregating the automatically tagged semantics. We propose and compare four techniques for labeling the clusters and evaluate their performance against human-labeled ground truth. We define error measures to quantify the results, and present examples of the cluster labeling results obtained on the BBC stock shots and broadcast news videos from the TRECVID-2005 video data set.

- Session D: Demo Session | Pp. 518-520
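The four labeling techniques are not detailed in the abstract. One plausible baseline reading of "aggregating the automatically tagged semantics" is majority voting of member tags within each cluster, sketched below on invented toy data.

```python
# Majority-vote cluster labeling from automatically tagged items (a sketch).
from collections import Counter, defaultdict

def label_clusters(item_tags, cluster_of, top_k=3):
    """item_tags: {item: [concept, ...]}, cluster_of: {item: cluster_id}."""
    votes = defaultdict(Counter)
    for item, tags in item_tags.items():
        votes[cluster_of[item]].update(tags)
    return {c: [t for t, _ in cnt.most_common(top_k)] for c, cnt in votes.items()}

item_tags = {"shot1": ["outdoor", "crowd"], "shot2": ["outdoor", "road"],
             "shot3": ["studio", "anchor"], "shot4": ["studio", "anchor", "graphics"]}
cluster_of = {"shot1": 0, "shot2": 0, "shot3": 1, "shot4": 1}
print(label_clusters(item_tags, cluster_of))
```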

Collaborative Concept Tagging for Images Based on Ontological Thinking

Alireza Kashian; Robert Kheng Leng Gay; Abdul Halim Abdul Karim

Without textual descriptions or label information for images, searching for semantic concepts in image databases is a very challenging task. Automatic annotation techniques for images aim to detect objects that are visually present inside images, like a tiger in grass. One challenge that remains to be solved is “understanding”; image understanding is beyond the scope of automatic annotation. The second issue is manual annotation of images, where user contribution is important. In this demo, we have developed an online tool that simulates a collaborative environment to help users generate facts for selected images. Ontological thinking led us to devise a method for a simple user interface. We are also studying the construction of synergies out of the generated facts as future work.

- Session D: Demo Session | Pp. 521-524
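As a purely illustrative aside, the user-generated "facts" mentioned in the entry could be stored as subject-predicate-object triples per image, with consensus defined as agreement among several users; the data and threshold below are invented.

```python
# Hypothetical storage of collaborative image facts as triples, with consensus.
from collections import Counter

facts = [  # (image, user, subject, predicate, object) - invented data
    ("img1", "alice", "tiger", "locatedIn", "grass"),
    ("img1", "bob",   "tiger", "locatedIn", "grass"),
    ("img1", "carol", "sky",   "hasColor",  "blue"),
]

agreement = Counter((img, s, p, o) for img, _, s, p, o in facts)
consensus = [f for f, n in agreement.items() if n >= 2]   # at least 2 users agree
print(consensus)   # [('img1', 'tiger', 'locatedIn', 'grass')]
```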

Multimodal Search for Effective Video Retrieval

Apostol (Paul) Natsev

Semantic search and retrieval of multimedia content is a challenging research field that has drawn significant attention in the multimedia research community. With the dramatic growth of digital media at home, in enterprises, and on the web, methods for effective indexing and search of visual content are vital in unlocking the value of this content. Conventional database search and text search over large textual corpora are both well-understood problems with ubiquitous applications. However, search in non-textual unstructured content, such as image and video data, is not nearly as mature or effective. A common approach for video retrieval, for example, is to apply conventional text search techniques to the associated closed caption or speech transcript. This approach works fairly well for retrieving named entities, such as specific people, objects, or places. However, it does not work well for generic topics related to general settings, events, or people actions, as the speech track rarely describes the background setting or the visual appearance of the subject. Text-based search is not even applicable to scenarios that do not have speech transcripts or other textual metadata for indexing purposes (e.g., consumer photo collections). In addition, speech-based video retrieval frequently leads to false matches of segments that talk about but do not depict the entity of interest. Because of these and other limitations, it is now apparent that conventional text search techniques on their own are not sufficient for effective image and video retrieval, and they need to be combined with techniques that consider the visual semantics of the content. The most substantial work in this field is presented in the TREC Video Retrieval Evaluation (TRECVID) community, which focuses its efforts on evaluating video retrieval approaches by providing common video datasets and a standard set of queries.

- Session D: Demo Session | Pp. 525-528
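A common baseline for the combination the abstract calls for, though not necessarily the author's system, is weighted late fusion of normalized text-based and visual retrieval scores for the same query; the sketch below uses invented shot ids and scores.

```python
# Weighted late fusion of text and visual retrieval scores (a baseline sketch).
def min_max(scores):
    lo, hi = min(scores.values()), max(scores.values())
    return {k: (v - lo) / (hi - lo + 1e-9) for k, v in scores.items()}

def fuse(text_scores, visual_scores, w_text=0.5):
    """Both inputs map shot id -> relevance score for the same query."""
    t, v = min_max(text_scores), min_max(visual_scores)
    shots = set(t) | set(v)
    return sorted(shots, key=lambda s: -(w_text * t.get(s, 0.0)
                                         + (1 - w_text) * v.get(s, 0.0)))

text = {"shot_a": 2.1, "shot_b": 0.4, "shot_c": 1.3}
visual = {"shot_a": 0.2, "shot_b": 0.9, "shot_c": 0.8}
print(fuse(text, visual, w_text=0.4))   # shots ranked by the fused score
```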