Publications catalogue - books

Computer Vision: ECCV 2002: 7th European Conference on Computer Vision Copenhagen, Denmark, May 28-31, 2002 Proceedings, Part IV

Anders Heyden; Gunnar Sparr; Mads Nielsen; Peter Johansen (eds.)

Conference: 7th European Conference on Computer Vision (ECCV), Copenhagen, Denmark, May 28-31, 2002

Abstract/Description - provided by the publisher

Not available.

Keywords - provided by the publisher

Image Processing and Computer Vision; Computer Graphics; Pattern Recognition; Artificial Intelligence

Availability
Institution detected: not detected · Year of publication: 2002 · Browse at: SpringerLink

Information

Resource type:

books

Printed ISBN

978-3-540-43748-2

Electronic ISBN

978-3-540-47979-6

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

2002

Publication rights information

© Springer-Verlag Berlin Heidelberg 2002

Table of contents

Visual Data Fusion for Objects Localization by Active Vision

Grégory Flandin; François Chaumette

Visual sensors provide exclusively uncertain and partial knowledge of a scene. In this article, we present a suitable scene knowledge representation that makes integration and fusion of new, uncertain and partial sensor measures possible. It is based on a mixture of stochastic and set-membership models. We consider that, for a large class of applications, an approximate representation is sufficient to build a preliminary map of the scene. Our approximation mainly results in ellipsoidal calculus, by means of a normal assumption for stochastic laws and ellipsoidal outer or inner bounding for uniform laws. These approximations allow us to build an efficient estimation process that integrates visual data on-line. Based on this estimation scheme, optimal exploratory motions of the camera can be automatically determined. Real-time experimental results validating our approach are finally given.

- Calibration / Active and Real-Time and Robot Vision / Image and Video Indexing / Medical Image Understanding / Vision Systems / Engineering and Evaluations / Statistical Learning | Pp. 312-326
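
Under the normal assumption the abstract mentions, fusing two ellipsoidal (Gaussian) estimates of the same 3D point reduces to a product of Gaussians, i.e. an information-filter update. A minimal sketch of that fusion step, assuming Gaussian noise throughout; the function and variable names are ours, not the authors':

```python
import numpy as np

def fuse_gaussian(m1, P1, m2, P2):
    """Fuse two Gaussian estimates (mean, covariance) of the same 3D point:
    the covariance-weighted combination, i.e. the renormalised product of
    the two densities (information-filter form)."""
    I1, I2 = np.linalg.inv(P1), np.linalg.inv(P2)
    P = np.linalg.inv(I1 + I2)      # fused covariance: a smaller ellipsoid
    m = P @ (I1 @ m1 + I2 @ m2)     # fused mean, pulled toward the more certain estimate
    return m, P

# Two uncertain measures of one point, each an ellipsoid (mean + covariance):
m, P = fuse_gaussian(np.array([1.0, 0.0, 2.0]), np.diag([0.10, 0.40, 0.20]),
                     np.array([1.1, 0.1, 1.9]), np.diag([0.30, 0.10, 0.25]))
```

The fused covariance is never larger than either input, which is why integrating new partial measures on-line steadily shrinks the scene's uncertainty ellipsoids.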

Towards Real-Time Cue Integration by Using Partial Results

Doug DeCarlo

Typical cue integration techniques work by combining estimates produced by computations associated with each visual cue. Most of these computations are iterative, leading to partial results that are available upon each iteration, culminating in complete results when the algorithm finally terminates. Combining partial results upon each iteration would be the preferred strategy for cue integration, as early cue integration strategies are inherently more stable and more efficient. Surprisingly, existing cue integration techniques cannot correctly use partial results, but must wait for all of the cue computations to finish. This is because the intrinsic error in partial results, which arises entirely from the fact that the algorithm has not yet terminated, is not represented. While cue integration methods do exist which attempt to use partial results (such as one based on an iterated extended Kalman filter), they make critical errors.

I address this limitation with the development of a probabilistic model of errors in estimates from partial results, which represents the error that remains in iterative algorithms prior to their completion. This enables existing cue integration frameworks to draw upon partial results correctly. Results are presented on using such a model for tracking faces using feature alignment, contours, and optical flow. They indicate that this framework improves accuracy, efficiency, and robustness over one that uses complete results.

The eventual goal of this line of research is the creation of a decision-theoretic meta-reasoning framework for cue integration—a vital mechanism for any system with real-time deadlines and variable computational demands. This framework will provide a means to decide how to best spend computational resources on each cue, based on how much it reduces the uncertainty of the combined result.

- Calibration / Active and Real-Time and Robot Vision / Image and Video Indexing / Medical Image Understanding / Vision Systems / Engineering and Evaluations / Statistical Learning | Pp. 327-342
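
The principle can be illustrated with a one-dimensional variance-weighted (BLUE) combination, where a partial result carries an extra variance term that decays as its algorithm iterates. This is only a toy stand-in for the paper's error model; `partial_variance` and its geometric decay are our own assumptions:

```python
import numpy as np

def combine_cues(estimates, variances):
    """Variance-weighted (BLUE) combination of scalar per-cue estimates."""
    w = 1.0 / np.asarray(variances, dtype=float)
    return float(np.dot(w, estimates) / w.sum())

def partial_variance(converged_var, decay, iteration):
    # Hypothetical model: the error due to "not having terminated yet"
    # shrinks geometrically with each completed iteration.
    return converged_var + decay ** iteration

# Cue A has terminated (small variance); cue B is mid-iteration, so it is
# down-weighted now and trusted more as it converges.
fused = combine_cues([2.0, 2.6], [0.05, partial_variance(0.05, 0.5, iteration=3)])
```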

Tracking and Object Classification for Automated Surveillance

Omar Javed; Mubarak Shah

In this paper we discuss the issues that need to be resolved before fully automated outdoor surveillance systems can be developed, and present solutions to some of these problems. Any outdoor surveillance system must be able to track objects moving in its field of view, classify these objects and detect some of their activities. We have developed a method to track and classify these objects in realistic scenarios. Object tracking in a single camera is performed using background subtraction, followed by region correspondence. This takes into account multiple cues, including velocities, sizes and distances of bounding boxes. Objects can be classified based on the type of their motion. This property may be used to label objects as a single person, vehicle or group of persons. Our proposed method to classify objects is based upon detecting recurrent motion for each tracked object. We develop a specific feature vector called a ‘Recurrent Motion Image’ (RMI) to calculate repeated motion of objects. Different types of objects yield very different RMIs and can therefore easily be classified into different categories on the basis of their RMI. The proposed approach is very efficient in terms of both computation and storage. RMIs are further used to detect carried objects. We present results on a large number of real-world sequences, including the PETS 2001 sequences. Our surveillance system works in real time at approximately 15 Hz for 320×240 resolution color images on a 1.7 GHz Pentium 4 PC.

- Calibration / Active and Real-Time and Robot Vision / Image and Video Indexing / Medical Image Understanding / Vision Systems / Engineering and Evaluations / Statistical Learning | Pp. 343-357
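
A toy version of the Recurrent Motion Image, assuming binary silhouettes have already been extracted by background subtraction and aligned to a common bounding box; the paper's exact definition and normalisation may differ:

```python
import numpy as np

def recurrent_motion_image(silhouettes):
    """Accumulate pixel-wise change between consecutive aligned binary
    silhouettes. Articulated movers (a walking person's arms and legs)
    change repeatedly and score high; rigid movers (vehicles) stay low."""
    rmi = np.zeros_like(silhouettes[0], dtype=np.int32)
    for prev, curr in zip(silhouettes, silhouettes[1:]):
        rmi += np.logical_xor(prev, curr)   # XOR marks pixels that changed
    return rmi
```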

Very Fast Template Matching

H. Schweitzer; J. W. Bell; F. Wu

Template matching by normalized correlations is a common technique for determining the existence and computing the location of a shape within an image. In many cases the run time of computer vision applications is dominated by repeated computation of template matching, applied to locate multiple templates in varying scale and orientation. A straightforward implementation of template matching for an image of size n and a template of size k requires on the order of kn operations. There are fast algorithms that require on the order of n log n operations. We describe a new approximation scheme that requires on the order of n operations. It is based on the idea of “Integral-Images”, recently introduced by Viola and Jones.

- Calibration / Active and Real-Time and Robot Vision / Image and Video Indexing / Medical Image Understanding / Vision Systems / Engineering and Evaluations / Statistical Learning | Pp. 358-372
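
The “Integral-Images” idea of Viola and Jones, on which the scheme builds, is compact in code: after one linear-time pass over the image, the sum over any rectangular window costs four table lookups. A minimal sketch (the paper's approximation scheme itself goes further):

```python
import numpy as np

def integral_image(img):
    """S[i, j] holds the sum of img[:i, :j]; the extra zero row and column
    let box sums skip boundary checks."""
    S = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.float64)
    S[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return S

def box_sum(S, top, left, h, w):
    """Sum of the h-by-w window at (top, left) in O(1): four lookups."""
    return (S[top + h, left + w] - S[top, left + w]
            - S[top + h, left] + S[top, left])
```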

Fusion of Multiple Tracking Algorithms for Robust People Tracking

Nils T. Siebel; Steve Maybank

This paper shows how the output of a number of detection and tracking algorithms can be fused to achieve robust tracking of people in an indoor environment. The new tracking system contains three co-operating parts: (i) an Active Shape Tracker using a PCA-generated model of pedestrian outline shapes, (ii) a Region Tracker, featuring region splitting and merging for multiple hypothesis matching, and (iii) a Head Detector to aid in the initialisation of tracks. Data from the three parts are fused together to select the best tracking hypotheses.

The new method is validated using sequences from surveillance cameras in an underground station. It is demonstrated that robust real-time tracking of people can be achieved with the new tracking system using standard PC hardware.

- Calibration / Active and Real-Time and Robot Vision / Image and Video Indexing / Medical Image Understanding / Vision Systems / Engineering and Evaluations / Statistical Learning | Pp. 373-387
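
A much-simplified stand-in for the hypothesis-selection step, assuming each of the three parts emits scored bounding-box hypotheses; the greedy overlap-based selection is our illustration, not the paper's actual fusion logic:

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    source: str     # 'shape', 'region' or 'head'
    bbox: tuple     # (x, y, w, h)
    score: float    # tracker confidence

def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ix = max(0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)

def select_best(hypotheses, overlap=0.5):
    """Greedily keep the most confident hypothesis of each overlapping group."""
    kept, rest = [], sorted(hypotheses, key=lambda h: -h.score)
    while rest:
        best = rest.pop(0)
        kept.append(best)
        rest = [h for h in rest if iou(h.bbox, best.bbox) < overlap]
    return kept
```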

Video Summaries through Mosaic-Based Shot and Scene Clustering

Aya Aner; John R. Kender

We present an approach for compact video summaries that allows fast and direct access to video data. The video is segmented into shots and, in appropriate video genres, into scenes, using previously proposed methods. A new concept that supports the hierarchical representation of video is presented, based on physical settings and camera locations. We use mosaics to represent and cluster shots, and detect appropriate mosaics to represent scenes. In contrast to approaches to video indexing which are based on key-frames, our efficient mosaic-based scene representation allows fast clustering of scenes into physical settings, as well as further comparison of physical settings across videos. This enables us to detect recurring settings across different episodes in situation comedies and serves as a basis for indexing whole video sequences. In sports videos, where settings are not as well defined, our approach allows classifying shots for characteristic event detection. We use a novel method for mosaic comparison and create a highly compact non-temporal representation of video. This representation allows accurate comparison of scenes across different videos and serves as a basis for indexing video libraries.

- Calibration / Active and Real-Time and Robot Vision / Image and Video Indexing / Medical Image Understanding / Vision Systems / Engineering and Evaluations / Statistical Learning | Pp. 388-402
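
As an illustration of clustering shots by their mosaics, a greedy threshold clustering; `similarity` stands in for the paper's own mosaic-comparison method, which is one of its contributions:

```python
def cluster_shots(mosaics, similarity, threshold):
    """Each shot joins the first cluster whose representative mosaic it
    resembles, else starts a new cluster (one cluster per physical setting)."""
    clusters = []                        # each cluster: list of shot indices
    for i, m in enumerate(mosaics):
        for c in clusters:
            if similarity(m, mosaics[c[0]]) >= threshold:
                c.append(i)
                break
        else:
            clusters.append([i])
    return clusters
```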

Optimization Algorithms for the Selection of Key Frame Sequences of Variable Length

Tiecheng Liu; John R. Kender

This paper presents a novel optimization-based approach for video key frame selection. We define key frames to be a temporally ordered subsequence of the original video sequence, and the optimal key frames are the subsequence of length k that optimizes an energy function we define on all subsequences. These optimal key subsequences form a hierarchy, with one such subsequence for every k less than the length n of the video, and this hierarchy can be retrieved all at once using a dynamic programming process with computation time polynomial in n. To further reduce computation, an approximate solution based on a greedy algorithm can compute the key frame hierarchy in O(n·log(n)). We also present a hybrid method, which flexibly captures the virtues of both approaches. Our empirical comparisons between the optimal and greedy solutions indicate their results are very close. We show that the greedy algorithm is more appropriate for video streaming and network applications where compression ratios may change dynamically, and provide a method to compute the appropriate times to advance through key frames during video playback of the compressed stream. Additionally, we exploit the results of the greedy algorithm to devise an interactive video content browser. To quantify our algorithms’ effectiveness, we propose a new evaluation measure based on “well-distributed” key frames. Our experimental results on several videos show that both the optimal and the greedy algorithms outperform several popular existing algorithms in terms of summarization quality, computational time, and guaranteed convergence.

- Calibration / Active and Real-Time and Robot Vision / Image and Video Indexing / Medical Image Understanding / Vision Systems / Engineering and Evaluations / Statistical Learning | Pp. 403-417
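
The optimal selection is a classic dynamic program over (last selected frame, number of frames selected so far). The sketch below uses a stand-in energy, the summed dissimilarity between consecutive picks, since the paper defines its own; note that the same table yields the best subsequence for every length up to k, which is the hierarchy the abstract mentions:

```python
import numpy as np

def key_frames(dist, k):
    """Pick k temporally ordered key frames out of n, maximizing the sum of
    dist[i][j] over consecutive picks i < j. O(n^2 k) dynamic program."""
    n = len(dist)
    best = np.full((n, k + 1), -np.inf)   # best[j, m]: best energy with m picks ending at j
    back = np.zeros((n, k + 1), dtype=int)
    best[:, 1] = 0.0                      # a single pick has zero energy
    for m in range(2, k + 1):
        for j in range(m - 1, n):
            scores = best[:j, m - 1] + np.array([dist[i][j] for i in range(j)])
            back[j, m] = int(np.argmax(scores))
            best[j, m] = scores[back[j, m]]
    j = int(np.argmax(best[:, k]))        # backtrack from the best final pick
    picks = [j]
    for m in range(k, 1, -1):
        j = back[j, m]
        picks.append(j)
    return picks[::-1]
```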

Multi-scale EM-ICP: A Fast and Robust Approach for Surface Registration

Sébastien Granger; Xavier Pennec

We investigate in this article the rigid registration of large sets of points, generally sampled from surfaces. We formulate this problem as a general Maximum-Likelihood (ML) estimation of the transformation and the matches. We show that, in the specific case of a Gaussian noise, it corresponds to the Iterative Closest Point algorithm (ICP) with the Mahalanobis distance.

Then, considering matches as a hidden variable, we obtain a slightly more complex criterion that can be efficiently solved using Expectation-Maximization (EM) principles. In the case of a Gaussian noise, this new method corresponds to an ICP with multiple matches weighted by normalized Gaussian weights, whence the acronym EM-ICP.

The variance of the Gaussian noise is a new parameter that can be viewed as a “scale or blurring factor” on our point clouds. We show that EM-ICP robustly aligns the barycenters and inertia moments with a high variance, while it tends toward the accurate ICP for a small variance. Thus, the idea is a multi-scale approach with an annealing scheme on this parameter to combine robustness and accuracy. Moreover, we show that at each “scale”, the criterion can be efficiently approximated using a simple decimation of one point set, which drastically speeds up the algorithm.

Experiments on real data demonstrate a spectacular improvement in the performance of EM-ICP with respect to the standard ICP algorithm in terms of robustness (a factor of 3 to 4) and speed (a factor of 10 to 20), with similar precision. Though the multi-scale scheme is only justified for EM, it can also be used to improve ICP, whose performance then reaches that of EM-ICP when the data are not too noisy.

- Calibration / Active and Real-Time and Robot Vision / Image and Video Indexing / Medical Image Understanding / Vision Systems / Engineering and Evaluations / Statistical Learning | Pp. 418-432
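
The E- and M-steps are compact to state. Below, each moving point is softly matched to every fixed point with normalised Gaussian weights (E-step), then a rigid transform is fit to the resulting soft correspondences by SVD (M-step); annealing means shrinking sigma between iterations. This is a bare-bones reading of the method, without the decimation speed-up:

```python
import numpy as np

def e_step(moving, fixed, sigma):
    """Normalised Gaussian match weights between all point pairs at scale sigma."""
    d2 = ((moving[:, None, :] - fixed[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    return w / w.sum(axis=1, keepdims=True)

def m_step(moving, w, fixed):
    """Least-squares rigid transform onto each point's weighted barycentre."""
    tgt = w @ fixed                               # soft correspondence targets
    mu_m, mu_t = moving.mean(0), tgt.mean(0)
    U, _, Vt = np.linalg.svd((moving - mu_m).T @ (tgt - mu_t))
    R = (U @ Vt).T
    if np.linalg.det(R) < 0:                      # enforce a proper rotation
        Vt[-1] *= -1
        R = (U @ Vt).T
    return R, mu_t - R @ mu_m

# Annealing loop: start with a large sigma (robust, barycentre-level
# alignment), apply (R, t), then shrink sigma toward ICP-like accuracy.
```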

A Unified Approach to Model-Based and Model-Free Visual Servoing

Ezio Malis

Standard vision-based control techniques can be classified into two groups: model-based and model-free visual servoing. Model-based visual servoing is used when a 3D model of the observed object is available. If the 3D model is completely unknown, robot positioning can still be achieved using a teaching-by-showing approach. This model-free technique needs a preliminary learning step during which a reference image of the scene is stored. The objective of this paper is to propose a unified approach to vision-based control which can be used with a zooming camera whether the model of the object is known or not. The key idea of the unified approach is to build a reference in a projective space invariant to camera intrinsic parameters, which can be computed whether the model is known or only an image of the object is available. Thus, only one low-level visual servoing technique needs to be implemented.

- Calibration / Active and Real-Time and Robot Vision / Image and Video Indexing / Medical Image Understanding / Vision Systems / Engineering and Evaluations / Statistical Learning | Pp. 433-447
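
A much-reduced sketch of the “invariant space” idea for the case where the current intrinsic matrix K is available: mapping pixels through the inverse calibration gives coordinates independent of the intrinsics, so a reference stored in this space survives zooming. The paper's construction is more general and also covers the model-free, teaching-by-showing case:

```python
import numpy as np

def to_invariant(points_px, K):
    """Map homogeneous pixel coordinates (N x 3) through q = K^{-1} p,
    removing the dependence on focal length and principal point."""
    q = (np.linalg.inv(K) @ points_px.T).T
    return q / q[:, 2:3]                 # renormalise the homogeneous scale
```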

Comparing Intensity Transformations and Their Invariants in the Context of Color Pattern Recognition

Florica Mindru; Theo Moons; Luc Van Gool

In this paper we compare different ways of representing the photometric changes in image intensities caused by changes in illumination and viewpoint, aiming at a balance between goodness-of-fit and low complexity. We derive invariant features, based on generalized color moment invariants that can deal with geometric and photometric changes of a planar pattern, corresponding to the chosen photometric models. The geometric changes correspond to a perspective skew. We also compare the photometric models in terms of the invariants’ discriminative power and classification performance in a pattern recognition system.

- Calibration / Active and Real-Time and Robot Vision / Image and Video Indexing / Medical Image Understanding / Vision Systems / Engineering and Evaluations / Statistical Learning | Pp. 448-460
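
Generalized color moments combine spatial powers with color-band powers; the paper's invariants are then assembled from ratios and polynomial combinations of such moments. A direct, unoptimised implementation of a single moment over an RGB patch (indexing conventions are ours):

```python
import numpy as np

def color_moment(img, p, q, a, b, c):
    """M_pq^{abc} = sum_x sum_y  x^p * y^q * R^a * G^b * B^c
    over an H x W x 3 float patch."""
    H, W, _ = img.shape
    y, x = np.mgrid[0:H, 0:W].astype(float)
    R, G, B = img[..., 0], img[..., 1], img[..., 2]
    return float((x**p * y**q * R**a * G**b * B**c).sum())
```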