Publications catalog – books



Computer Vision: ECCV 2002: 7th European Conference on Computer Vision Copenhagen, Denmark, May 28-31, 2002 Proceedings, Part IV

Anders Heyden; Gunnar Sparr; Mads Nielsen; Peter Johansen (eds.)

In conference: 7th European Conference on Computer Vision (ECCV), Copenhagen, Denmark, May 28, 2002 - May 31, 2002

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Image Processing and Computer Vision; Computer Graphics; Pattern Recognition; Artificial Intelligence

Availability
Detected institution | Year of publication | Browse | Download | Request
Not detected | 2002 | SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-43748-2

Electronic ISBN

978-3-540-47979-6

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Publication rights information

© Springer-Verlag Berlin Heidelberg 2002

Table of contents

Face Identification by Fitting a 3D Morphable Model Using Linear Shape and Texture Error Functions

Sami Romdhani; Volker Blanz; Thomas Vetter

This paper presents a novel algorithm for the analysis and identification of faces viewed from different poses and under different illumination conditions. Face analysis from a single image is performed by recovering the shape and texture parameters of a 3D Morphable Model in an analysis-by-synthesis fashion. The shape parameters are computed from a shape error estimated by optical flow, and the texture parameters are obtained from a texture error. The algorithm uses linear equations to recover the shape and texture parameters irrespective of the pose and lighting conditions of the face image. Identification experiments are reported on more than 5000 images from the publicly available CMU-PIE database, which includes faces viewed from 13 different poses and under 22 different illuminations. Extensive identification results are available on our web page for future comparison with novel algorithms.

- Object Recognition / Vision Systems Engineering and Evaluation | Pp. 3-19
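
The abstract's central claim, that both parameter sets can be recovered by solving linear equations, can be sketched as a single least-squares solve. The basis matrix, coefficients, and noise level below are synthetic placeholders, not the paper's morphable model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear model: shape offsets are a linear combination of
# basis deformations, shape_error = S @ alpha (as in a PCA morphable model).
n_vertices, n_params = 300, 5
S = rng.normal(size=(n_vertices, n_params))   # shape basis (assumed)
alpha_true = rng.normal(size=n_params)        # ground-truth coefficients

# Observed shape error (e.g. from optical flow), with a little noise.
shape_error = S @ alpha_true + 0.01 * rng.normal(size=n_vertices)

# Because the error is linear in the parameters, one least-squares solve
# recovers them -- no iterative nonlinear optimization is needed.
alpha_hat, *_ = np.linalg.lstsq(S, shape_error, rcond=None)
print(alpha_hat)
```

The same one-shot recovery applies to the texture parameters from the texture error, which is what makes the fitting pose- and lighting-independent in the linear formulation.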

Hausdorff Kernel for 3D Object Acquisition and Detection

Annalisa Barla; Francesca Odone; Alessandro Verri

Learning one class at a time can be seen as an effective solution to classification problems in which only the positive examples are easily identifiable. A kernel method to accomplish this goal consists of a representation stage - which computes the smallest sphere in feature space enclosing the positive examples - and a classification stage - which uses the obtained sphere as a decision surface to determine the positivity of new examples. In this paper we describe a kernel well suited to represent, identify, and recognize 3D objects from unconstrained images. The kernel we introduce, based on Hausdorff distance, is tailored to deal with grey-level image matching. The effectiveness of the proposed method is demonstrated on several data sets of faces and objects of artistic relevance, like statues.

- Object Recognition / Vision Systems Engineering and Evaluation | Pp. 20-33

Evaluating Image Segmentation Algorithms Using the Pareto Front

Mark Everingham; Henk Muller; Barry Thomas

Image segmentation is the first stage of processing in many practical computer vision systems. While development of particular segmentation algorithms has attracted considerable research interest, relatively little work has been published on the subject of their evaluation. In this paper we propose the use of the Pareto front to allow evaluation and comparison of image segmentation algorithms in multi-dimensional fitness spaces, in a manner somewhat analogous to the use of receiver operating characteristic curves in binary classification problems. The principal advantage of this approach is that it avoids the need to aggregate metrics capturing multiple objectives into a single metric, and thus allows trade-offs between multiple aspects of algorithm behavior to be assessed. This is in contrast to previous approaches, which have tended to use a single measure of “goodness” or discrepancy to ground truth data. We define the Pareto front in the context of algorithm evaluation, propose several fitness measures for image segmentation, and use a genetic algorithm for multi-objective optimization to explore the set of algorithms, parameters, and corresponding points in fitness space which lie on the front. Experimental results are presented for six general-purpose image segmentation algorithms, including several which may be considered state-of-the-art.

- Object Recognition / Vision Systems Engineering and Evaluation | Pp. 34-48
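
The Pareto-front idea is simple to state in code: keep exactly the configurations not dominated on every axis by some other configuration. The (accuracy, speed) pairs below are invented for illustration, not values from the paper:

```python
def pareto_front(points):
    """Return the points not dominated by any other point. Each point is a
    tuple of fitness values, with higher taken as better on every axis (an
    assumed convention; negate error-style metrics first)."""
    return [p for p in points
            if not any(q != p and all(qi >= pi for qi, pi in zip(q, p))
                       for q in points)]

# Hypothetical (accuracy, speed) fitness pairs for five segmentation
# algorithm configurations.
scores = [(0.9, 0.2), (0.7, 0.7), (0.5, 0.9), (0.6, 0.6), (0.4, 0.4)]
print(pareto_front(scores))   # the three non-dominated trade-offs survive
```

Each surviving point is a distinct trade-off a system designer might legitimately choose, which is exactly why aggregating the axes into one score would lose information.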

On Performance Characterization and Optimization for Image Retrieval

J. Vogel; B. Schiele

In content-based image retrieval (CBIR), performance characterization is easily neglected. A major difficulty lies in the fact that ground truth and the definition of benchmarks are extremely user and application dependent. This paper proposes a two-stage CBIR framework which makes it possible to predict the behavior of the retrieval system as well as to optimize its performance. In particular, it is possible to maximize precision, recall, or precision and recall jointly. The framework is based on the detection of high-level concepts in images. These concepts correspond to a vocabulary with which users can query the database. Performance optimization is carried out on the basis of the user query, the performance of the concept detectors, and an estimated distribution of the concepts in the database. The optimization is transparent to the user and leads to a set of internal parameters that optimize the succeeding retrieval. Depending only on the query and the desired concept, precision and recall of the retrieval can be increased by up to 40%. The paper discusses the theoretical and empirical results of the optimization as well as its dependency on the estimate of the concept distribution.

- Object Recognition / Vision Systems Engineering and Evaluation | Pp. 49-63
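
The kind of prediction this framework makes, expected precision and recall from a detector's operating point and a concept's database frequency, reduces to a small Bayes computation. The rates below are invented for illustration, and this is the textbook form of the calculation rather than the paper's exact formulation:

```python
def expected_precision_recall(prior, tpr, fpr):
    """Expected retrieval precision and recall when querying for a single
    concept, given its estimated database frequency (prior) and its
    detector's true/false positive rates."""
    retrieved = prior * tpr + (1 - prior) * fpr  # fraction of images returned
    precision = prior * tpr / retrieved if retrieved > 0 else 0.0
    recall = tpr                                 # fraction of relevant found
    return precision, recall

# A concept in ~10% of the database, detector at 80% TPR / 5% FPR.
p, r = expected_precision_recall(prior=0.10, tpr=0.80, fpr=0.05)
print(round(p, 3), round(r, 3))
```

Sweeping the detector threshold moves (tpr, fpr) along its ROC curve, and choosing the threshold that maximizes this predicted precision (or recall, or both) is the sense in which the retrieval can be optimized before any query is run.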

Statistical Learning of Multi-view Face Detection

Stan Z. Li; Long Zhu; ZhenQiu Zhang; Andrew Blake; HongJiang Zhang; Harry Shum

A new boosting algorithm, called FloatBoost, is proposed to overcome the monotonicity problem of sequential AdaBoost learning. AdaBoost is a sequential forward search procedure using a greedy selection strategy. The premise of the sequential procedure breaks down when the monotonicity assumption (that adding a new feature to the current set never decreases the value of the performance criterion) is violated. FloatBoost incorporates the idea of Floating Search into AdaBoost to solve the non-monotonicity problem encountered in AdaBoost's sequential search.

We then present a system which learns to detect multi-view faces using FloatBoost. The system uses a coarse-to-fine, simple-to-complex architecture called a detector-pyramid. FloatBoost learns the component detectors in the pyramid and yields similar or higher classification accuracy than AdaBoost with a smaller number of weak classifiers. This work leads to the first real-time multi-view face detection system in the world. It runs at 200 ms per image of size 320×240 pixels on a 700 MHz Pentium III CPU. A live demo will be shown at the conference.

- Statistical Learning | Pp. 67-81
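
The backtracking idea FloatBoost borrows from Floating Search can be sketched generically: after each greedy addition, conditionally remove the member whose removal most improves the score, but only when that strictly beats the best subset of that size seen so far (which also guarantees termination). This is a generic sequential forward floating selection over an arbitrary set-scoring function, standing in for AdaBoost's weak-classifier selection; the additive toy score at the end is an assumption for the demo:

```python
def floating_search(features, score, k):
    """Sequential forward floating selection: greedy forward steps with
    conditional backward steps that undo earlier non-monotone choices."""
    selected, best_by_size = [], {}
    while len(selected) < k:
        # Forward step: greedily add the best remaining feature.
        best = max((f for f in features if f not in selected),
                   key=lambda f: score(selected + [f]))
        selected.append(best)
        n = len(selected)
        best_by_size[n] = max(best_by_size.get(n, float("-inf")),
                              score(selected))
        # Backward (floating) step: drop a member only if the smaller set
        # strictly beats the best set of that size found so far.
        while len(selected) > 2:
            worst = max(selected,
                        key=lambda f: score([g for g in selected if g != f]))
            trimmed = [g for g in selected if g != worst]
            if score(trimmed) > best_by_size.get(len(trimmed), float("-inf")):
                selected = trimmed
                best_by_size[len(trimmed)] = score(trimmed)
            else:
                break
    return selected

# Toy run with hypothetical additive feature weights.
weights = {"a": 3, "b": 2, "c": 1, "d": 1}
chosen = floating_search(list("abcd"),
                         lambda s: sum(weights[f] for f in s), k=3)
print(chosen)
```

With a monotone score like this the backward step never fires and the result matches plain greedy selection; the backtracking pays off precisely when, as the abstract notes, adding a weak classifier makes an earlier choice redundant.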

Dynamic Trees: Learning to Model Outdoor Scenes

Nicholas J. Adams; Christopher K. I. Williams

This paper considers the dynamic tree (DT) model, first introduced in earlier work. A dynamic tree specifies a prior over structures of trees, each of which is a forest of one or more tree-structured belief networks (TSBNs). In the literature, standard tree-structured belief network models have been found to produce “blocky” segmentations when naturally occurring boundaries within an image do not coincide with those of the subtrees in the fixed structure of the network. Dynamic trees have a flexible architecture which allows the structure to vary, creating configurations where the subtree and image boundaries align, and experimentation with the model has shown significant improvements.

Here we derive an EM-style update based upon mean field inference for learning the parameters of the dynamic tree model and apply it to a database of images of outdoor scenes where all of its parameters are learned. DTs are seen to offer significant improvement in performance over the fixed-architecture TSBN and in a coding comparison the DT achieves 0.294 bits per pixel (bpp) compression compared to 0.378 bpp for lossless JPEG on images of 7 colours.

- Statistical Learning | Pp. 82-96

Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary

P. Duygulu; K. Barnard; J. F. G. de Freitas; D. A. Forsyth

We describe a model of object recognition as machine translation. In this model, recognition is a process of annotating image regions with words. First, images are segmented into regions, which are classified into region types using a variety of features. A mapping between region types and the keywords supplied with the images is then learned using a method based around EM. This process is analogous to learning a lexicon from an aligned bitext. For the implementation we describe, these words are nouns taken from a large vocabulary. On a large test set, the method can predict numerous words with high accuracy. Simple methods identify words that cannot be predicted well. We show how to cluster words that are individually difficult to predict into clusters that can be predicted well; for example, we cannot predict the distinction between train and locomotive using the current set of features, but we can predict the underlying concept. The method is trained on a substantial collection of images. Extensive experimental results illustrate the strengths and weaknesses of the approach.

- Statistical Learning | Pp. 97-112
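
The bitext analogy can be made concrete with an EM loop in the spirit of IBM Model 1 word alignment: soft-assign each caption word across the regions of its image, then renormalize the soft counts into a lexicon p(word | region type). The toy "captioned images" below are invented, and this is a simplified stand-in for the paper's model rather than its exact algorithm:

```python
from collections import defaultdict

def learn_lexicon(pairs, iterations=20):
    """EM for p(word | region_type). Each training pair is
    (region_types, caption_words) for one image."""
    regions = {r for rs, _ in pairs for r in rs}
    words = sorted({w for _, ws in pairs for w in ws})
    # Uniform initialization of the translation probabilities.
    t = {r: {w: 1.0 / len(words) for w in words} for r in regions}
    for _ in range(iterations):
        counts = {r: defaultdict(float) for r in regions}
        for rs, ws in pairs:
            for w in ws:
                # E-step: soft-assign word w across this image's regions.
                z = sum(t[r][w] for r in rs)
                for r in rs:
                    counts[r][w] += t[r][w] / z
        # M-step: renormalize soft counts into probabilities.
        for r in regions:
            total = sum(counts[r].values())
            t[r] = {w: counts[r][w] / total for w in words}
    return t

# Toy captioned "images": region types paired with caption words.
data = [(["sky", "grass"], ["sky", "grass"]),
        (["sky", "water"], ["sky", "water"]),
        (["grass", "water"], ["grass", "water"])]
lex = learn_lexicon(data)
print(max(lex["sky"], key=lex["sky"].get))
```

No single image disambiguates which region goes with which word, but EM resolves the ambiguity across images, which is exactly the lexicon-from-aligned-bitext effect the abstract describes.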

Learning a Sparse Representation for Object Detection

Shivani Agarwal; Dan Roth

We present an approach for learning to detect objects in still gray images that is based on a sparse, part-based representation of objects. A vocabulary of information-rich object parts is automatically constructed from a set of sample images of the object class of interest. Images are then represented using parts from this vocabulary, along with spatial relations observed among them. Based on this representation, a feature-efficient learning algorithm is used to learn to detect instances of the object class. The framework developed can be applied to any object with distinguishable parts in a relatively fixed spatial configuration. We report experiments on images of side views of cars. Our experiments show that the method achieves high detection accuracy on a difficult test set of real-world images, and is highly robust to partial occlusion and background variation.

In addition, we discuss and offer solutions to several methodological issues that are significant for the research community to be able to evaluate object detection approaches.

- Statistical Learning | Pp. 113-127

Stratified Self Calibration from Screw-Transform Manifolds

Russell Manning; Charles Dyer

This paper introduces a new, stratified approach for the metric self-calibration of a camera with fixed internal parameters. The method works by intersecting modulus-constraint manifolds, which are a specific type of screw-transform manifold. Through the addition of a single scalar parameter, a 2-dimensional modulus-constraint manifold can become a 3-dimensional Kruppa-constraint manifold, allowing for direct self-calibration from disjoint pairs of views. In this way, we demonstrate that screw-transform manifolds represent a single, unified approach to performing both stratified and direct self-calibration. This paper also shows how to generate the screw-transform manifold arising from turntable (i.e., pairwise-planar) motion and discusses some important considerations for creating a working algorithm from these ideas.

- Calibration / Active and Real-Time and Robot Vision / Image and Video Indexing / Medical Image Understanding / Vision Systems / Engineering and Evaluations / Statistical Learning | Pp. 131-145

Self-Organization of Randomly Placed Sensors

Robert B. Fisher

This paper investigates one problem arising from ubiquitous sensing: can the positions of a set of randomly placed sensors be automatically determined even if they do not have overlapping fields of view? (If the views overlapped, then standard stereo auto-calibration could be used.) This paper shows that the problem is solvable. Distant moving features allow accurate orientation registration. Given the sensor orientations, nearby linearly moving features allow full pose registration, up to a scale factor.

- Calibration / Active and Real-Time and Robot Vision / Image and Video Indexing / Medical Image Understanding / Vision Systems / Engineering and Evaluations / Statistical Learning | Pp. 146-160
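
The orientation-registration step, aligning two sensors' bearing directions to the same distant features, amounts to finding the rotation that best maps one set of unit vectors onto another, which has a closed-form SVD solution (the Kabsch/orthogonal Procrustes method). The bearing vectors and ground-truth rotation below are synthetic; this illustrates the geometric subproblem, not the paper's full pipeline:

```python
import numpy as np

def best_rotation(A, B):
    """Rotation R minimizing ||R @ A - B||_F (Kabsch / orthogonal
    Procrustes). Columns of A and B are corresponding unit direction
    vectors, e.g. bearings to the same distant features from two sensors."""
    U, _, Vt = np.linalg.svd(B @ A.T)
    d = np.sign(np.linalg.det(U @ Vt))        # guard against reflections
    return U @ np.diag([1.0, 1.0, d]) @ Vt

rng = np.random.default_rng(2)
# Ground-truth relative orientation between two hypothetical sensors.
angle = 0.4
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
dirs = rng.normal(size=(3, 10))
dirs /= np.linalg.norm(dirs, axis=0)          # unit bearing vectors
R_est = best_rotation(dirs, R_true @ dirs)
print(np.round(R_est, 3))
```

Because distant features give direction-only constraints, only the rotation is recoverable here; the subsequent pose (translation) registration needs the nearby linearly moving features the abstract mentions.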