Publications catalogue - books



Computer Vision: ACCV 2007: 8th Asian Conference on Computer Vision, Tokyo, Japan, November 18-22, 2007, Proceedings, Part I

Yasushi Yagi; Sing Bing Kang; In So Kweon; Hongbin Zha (eds.)

In conference: 8th Asian Conference on Computer Vision (ACCV), Tokyo, Japan, November 18-22, 2007

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Not available.

Availability

Detected institution: not detected
Year of publication: 2007
Browse: SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-76385-7

Electronic ISBN

978-3-540-76386-4

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

2007

Publication rights information

© Springer-Verlag Berlin Heidelberg 2007

Table of contents

Gesture Recognition Under Small Sample Size

Tae-Kyun Kim; Roberto Cipolla

This paper addresses gesture recognition under small sample size, where the direct use of traditional classifiers is difficult due to the high dimensionality of the input space. We propose a pairwise feature extraction method over video volumes for classification. Canonical Correlation Analysis is combined with discriminant functions and the Scale-Invariant Feature Transform (SIFT) to obtain discriminative spatiotemporal features for robust gesture recognition. The proposed method is practically attractive as it works well with a small number of training samples, involves few parameters, and is computationally efficient. In experiments on 900 videos of 9 hand gesture classes, the proposed method notably outperformed classifiers such as the Support Vector Machine and the Relevance Vector Machine, achieving 85% accuracy.

- Face and Gesture | Pp. 335-344
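
As a rough illustration of the idea behind this paper (not the authors' code), the sketch below measures similarity between two video volumes as canonical correlations between the subspaces spanned by their frame-level feature vectors; the feature dimensionality, subspace dimension, and nearest-neighbour classification rule are all assumptions of this example.

    import numpy as np

    def subspace(feats, dim=10):
        # Orthonormal basis of the principal subspace of a feature matrix
        # (rows = per-frame feature vectors, e.g. SIFT-based descriptors).
        feats = feats - feats.mean(axis=0)
        U, _, _ = np.linalg.svd(feats.T, full_matrices=False)
        return U[:, :dim]

    def canonical_correlations(X, Y, dim=10):
        # Cosines of the principal angles between the two subspaces;
        # values near 1 mean the two video volumes are similar.
        P, Q = subspace(X, dim), subspace(Y, dim)
        return np.clip(np.linalg.svd(P.T @ Q, compute_uv=False), 0.0, 1.0)

    def classify(query, gallery):
        # Nearest-neighbour gesture label by mean canonical correlation;
        # gallery is a list of (feature_matrix, label) pairs.
        return max((canonical_correlations(query, X).mean(), label)
                   for X, label in gallery)[1]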

Motion Observability Analysis of the Simplified Color Correlogram for Visual Tracking

Qi Zhao; Hai Tao

Compared with the color histogram, where the position information of each pixel is ignored, a simplified color correlogram (SCC) representation encodes spatial information explicitly and enables an estimation algorithm to recover the object orientation. This paper analyzes the capability of the SCC (in a kernel-based framework) to detect and estimate object motion, and presents a principled way to obtain motion-observable SCCs as object representations for more reliable tracking. Extensive experimental results demonstrate the reliability of the tracking procedure using the proposed algorithm.

- Tracking | Pp. 345-354
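
A minimal sketch of what a simplified color correlogram computes, assuming an 8-bin quantized single-channel patch and one fixed displacement (both choices are illustrative, not the paper's): unlike a histogram, the descriptor counts how often a colour pair (c1, c2) appears at a given offset, so spatial layout, and hence orientation, becomes observable.

    import numpy as np

    def simplified_color_correlogram(patch, bins=8, dy=0, dx=3):
        # patch: 2-D uint8 array (one colour channel for simplicity).
        # Returns P[c1, c2] ~ Pr(pixel at p + (dy, dx) has colour c2
        # | pixel at p has colour c1); dy and dx assumed non-negative.
        q = (patch.astype(np.int32) * bins) // 256       # quantize colours
        h, w = q.shape
        a = q[:h - dy, :w - dx]                          # reference pixels
        b = q[dy:, dx:]                                  # displaced pixels
        cc = np.zeros((bins, bins))
        np.add.at(cc, (a.ravel(), b.ravel()), 1.0)       # co-occurrence counts
        return cc / np.maximum(cc.sum(axis=1, keepdims=True), 1.0)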

On-Line Ensemble SVM for Robust Object Tracking

Min Tian; Weiwei Zhang; Fuqiang Liu

In this paper, we present a novel visual object tracking algorithm based on an ensemble of linear SVM classifiers. There are two main contributions. First, we propose a simple yet effective way of updating a linear SVM classifier on-line, where useful “Key Frames” of the target are automatically selected as support vectors. Second, we propose an on-line ensemble SVM tracker that can effectively handle target appearance variation. The proposed algorithm makes better use of history information, which leads to better discrimination between the target and the surrounding background. The algorithm is tested on many video clips, including some publicly available ones. Experimental results show the robustness of the proposed algorithm, especially under large appearance change during tracking.

- Tracking | Pp. 355-364
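
A minimal sketch, not the paper's exact update rule, of the key-frame idea: a new frame is kept as a training sample only when the current linear SVM is unsure about it, and the classifier is refitted on the kept frames. The buffer size, margin threshold, and the use of scikit-learn's LinearSVC are assumptions of this example.

    import numpy as np
    from sklearn.svm import LinearSVC

    class KeyFrameSVM:
        # Maintains a small buffer of "key frames" and refits a linear
        # SVM whenever the buffer changes and contains both classes.
        def __init__(self, max_keyframes=50, margin=1.0):
            self.X, self.y = [], []
            self.max_keyframes, self.margin = max_keyframes, margin
            self.svm, self.fitted = LinearSVC(C=1.0), False

        def update(self, feat, label):
            # Skip frames the current model already classifies confidently.
            if self.fitted and abs(self.svm.decision_function([feat])[0]) > self.margin:
                return
            self.X.append(feat)
            self.y.append(label)
            self.X = self.X[-self.max_keyframes:]
            self.y = self.y[-self.max_keyframes:]
            if len(set(self.y)) == 2:
                self.svm.fit(np.asarray(self.X), np.asarray(self.y))
                self.fitted = True

        def score(self, feat):
            # Signed distance to the decision boundary (target vs background).
            return self.svm.decision_function([feat])[0]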

Multi-camera People Tracking by Collaborative Particle Filters and Principal Axis-Based Integration

Wei Du; Justus Piater

This paper presents a novel approach to tracking people in multiple cameras. A target is tracked not only in each camera but also in the ground plane by individual particle filters. These particle filters collaborate in two different ways. First, the particle filters in each camera pass messages to those in the ground plane, where the multi-camera information is integrated by intersecting the targets’ principal axes. This largely relaxes the dependence on precise foot positions when mapping targets from images to the ground plane using homographies. Second, the fusion results in the ground plane are then incorporated by each camera as boosted proposal functions. A mixture proposal function is composed for each tracker in a camera by combining an independent transition kernel and the boosted proposal function. Experiments show that our approach achieves more reliable results using less computational resources than conventional methods.

- Tracking | Pp. 365-374
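
A minimal sketch of the mixture proposal described above, with hypothetical Gaussian kernels and a mixing weight alpha (both assumptions, not the paper's settings): each particle either moves by the camera's independent transition kernel or is drawn around the fused ground-plane estimate projected back into the camera.

    import numpy as np

    rng = np.random.default_rng(0)

    def propose(particles, fused_estimate, alpha=0.7, trans_std=5.0, boost_std=2.0):
        # particles: (N, 2) image positions; fused_estimate: (2,) position
        # projected back from the ground plane into this camera's image.
        coin = rng.random(len(particles)) < alpha        # pick a mixture component
        moved = particles + rng.normal(0.0, trans_std, particles.shape)
        boosted = fused_estimate + rng.normal(0.0, boost_std, particles.shape)
        return np.where(coin[:, None], moved, boosted)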

Finding Camera Overlap in Large Surveillance Networks

Anton van den Hengel; Anthony Dick; Henry Detmold; Alex Cichowski; Rhys Hill

Recent research on video surveillance across multiple cameras has typically focused on camera networks of the order of 10 cameras. In this paper we argue that existing systems do not scale to a network of hundreds, or thousands, of cameras. We describe the design and deployment of an algorithm, called exclusion, that is specifically aimed at finding correspondences between camera regions in large camera networks. The information recovered by exclusion can be used as the basis for other surveillance tasks, such as tracking people through the network, or as an aid to human inspection. We have run this algorithm on a campus network of over 100 cameras, and report on its performance and accuracy over this network.

- Poster Session 2: Camera Networks | Pp. 375-384
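
The paper's exclusion algorithm is not reproduced here, but its underlying principle can be sketched as follows, assuming boolean per-region occupancy observations from two cameras (an assumption of this example): a region pair is excluded from correspondence whenever one region is occupied while the other is not, and the overlap candidates are the pairs never excluded.

    import numpy as np

    def exclusion(occ_a, occ_b):
        # occ_a: (T, Ra), occ_b: (T, Rb) boolean occupancy per region and
        # frame. Pair (i, j) is excluded if, at any time, region i is
        # occupied while region j is not; a symmetric test would also
        # exclude on the reverse condition.
        a = occ_a.astype(int)
        not_b = 1 - occ_b.astype(int)
        excluded = np.einsum('ti,tj->ij', a, not_b) > 0
        return ~excluded                      # surviving overlap candidates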

Information Fusion for Multi-camera and Multi-body Structure and Motion

Alexander Andreopoulos; John K. Tsotsos

Information fusion algorithms have been successful in many vision tasks such as stereo, motion estimation, registration and robot localization. Stereo and motion image analysis are intimately connected and can provide complementary information to obtain robust estimates of scene structure and motion. We present an information fusion based approach for multi-camera and multi-body structure and motion that combines bottom-up and top-down knowledge on scene structure and motion. The only assumption we make is that all scene motion consists of rigid motion. We present experimental results on synthetic and non-synthetic data sets, demonstrating excellent performance compared to state-of-the-art binocular approaches to structure and motion.

- Poster Session 2: Camera Networks | Pp. 385-396
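
As a hedged illustration of the information-fusion principle the paper builds on (not its actual algorithm), the sketch below fuses independent Gaussian estimates of the same quantity, for example a 3D point seen from several cameras, by adding their information (inverse-covariance) matrices.

    import numpy as np

    def fuse_gaussians(means, covs):
        # means: list of (d,) estimates; covs: matching (d, d) covariances.
        # Independent Gaussian estimates fuse additively in information form.
        infos = [np.linalg.inv(c) for c in covs]
        fused_cov = np.linalg.inv(sum(infos))
        fused_mean = fused_cov @ sum(i @ m for i, m in zip(infos, means))
        return fused_mean, fused_cov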

Task Scheduling in Large Camera Networks

Ser-Nam Lim; Larry Davis; Anurag Mittal

Camera networks are increasingly being deployed for security. In most of these camera networks, video sequences are captured, transmitted and archived continuously from all cameras, creating enormous stress on available transmission bandwidth, storage space and computing facilities. We describe an intelligent control system for scheduling Pan-Tilt-Zoom cameras to capture video only when task-specific requirements can be satisfied. These videos are collected during predicted temporal “windows of opportunity”. We present a scalable algorithm that constructs schedules in which multiple tasks can possibly be satisfied simultaneously by a given camera. We describe two scheduling algorithms: a greedy algorithm and one based on Dynamic Programming (DP). We analyze their approximation factors and present simulations showing that the DP method is advantageous for large camera networks in terms of task coverage. Results from a prototype real-time active camera system, however, reveal that the greedy algorithm runs faster than the DP algorithm, making it more suitable for a real-time system. The prototype system, built using existing low-level vision algorithms, also illustrates the applicability of our algorithms.

- Poster Session 2: Camera Networks | Pp. 397-407
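
A minimal sketch of the greedy variant, under simplifying assumptions not taken from the paper (a single camera, unit-length captures, earliest-deadline ordering): among tasks whose "window of opportunity" is still open, repeatedly schedule the one whose window closes first.

    def greedy_schedule(tasks, slot=1):
        # tasks: list of (name, window_start, window_end); each accepted
        # task occupies the camera for `slot` time units.
        schedule, t = [], 0
        for name, start, end in sorted(tasks, key=lambda x: x[2]):  # by deadline
            begin = max(t, start)
            if begin + slot <= end:          # window still open
                schedule.append((begin, name))
                t = begin + slot
        return schedule

    print(greedy_schedule([("A", 0, 3), ("B", 1, 2), ("C", 0, 10)]))
    # -> [(1, 'B'), (2, 'A'), (3, 'C')]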

Constrained Optimization for Human Pose Estimation from Depth Sequences

Youding Zhu; Kikuo Fujimura

A new 2-step method is presented for human upper-body pose estimation from depth sequences, in which coarse human part labeling takes place first, followed by more precise joint position estimation as the second phase. In the first step, a number of constraints are extracted from notable image features such as the head and torso. The problem of pose estimation is cast as that of label assignment with these constraints. Major parts of the human upper body are labeled by this process. The second step estimates joint positions optimally based on kinematic constraints using dense correspondences between depth profile and human model parts. The proposed framework is shown to overcome some issues of existing approaches for human pose tracking using similar types of data streams. Performance comparison with motion capture data is presented to demonstrate the accuracy of our approach.

- Poster Session 2: Face/Gesture/Action Detection and Recognition | Pp. 408-418
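
A toy version of the second step, simplified to two joints in 2D with a penalty method (all assumptions of this sketch, not the paper's formulation): joint positions are fitted to the depth points labelled as belonging to each part, subject to a fixed limb length between them.

    import numpy as np
    from scipy.optimize import minimize

    def estimate_joints(shoulder_pts, elbow_pts, limb_len, w=100.0):
        # *_pts: (N, 2) depth points labelled as belonging to each part;
        # w weights the kinematic (fixed limb length) penalty.
        def cost(x):
            s, e = x[:2], x[2:]
            fit = np.sum((shoulder_pts - s) ** 2) + np.sum((elbow_pts - e) ** 2)
            kin = (np.linalg.norm(s - e) - limb_len) ** 2
            return fit + w * kin
        x0 = np.r_[shoulder_pts.mean(axis=0), elbow_pts.mean(axis=0)]
        return minimize(cost, x0).x.reshape(2, 2)   # rows: shoulder, elbow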

Generative Estimation of 3D Human Pose Using Shape Contexts Matching

Xu Zhao; Yuncai Liu

We present a method for 3D pose estimation of human motion in a generative framework. To generalize the application scenario, the observation information we use comes from monocular silhouettes. We distill prior information of human motion by performing conventional PCA on a single motion capture data sequence; in doing so, the aims of reducing dimensionality and extracting prior knowledge of human motion are achieved simultaneously. We adopt the shape contexts descriptor to construct the matching function, which ensures the validity and robustness of the matching between image features and synthesized model features. To explore the solution space efficiently, we design an Annealed Genetic Algorithm (AGA) and a Hierarchical Annealed Genetic Algorithm (HAGA) that search for optimal solutions effectively by exploiting the characteristics of the state space. Results of pose estimation on different motion sequences demonstrate that the novel generative method achieves viewpoint-invariant 3D pose estimation.

- Poster Session 2: Face/Gesture/Action Detection and Recognition | Pp. 419-429
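
A minimal sketch of the PCA motion prior, with random stand-in data (the dimensions and number of retained components are assumptions): a single motion-capture sequence (rows = frames, columns = pose parameters) is reduced to a few principal coefficients, and candidate poses for the search (AGA/HAGA in the paper) can then be generated from this low-dimensional space.

    import numpy as np

    def fit_pca_prior(mocap, k=5):
        # mocap: (frames, pose_dim) joint-parameter sequence.
        mean = mocap.mean(axis=0)
        _, _, Vt = np.linalg.svd(mocap - mean, full_matrices=False)
        return mean, Vt[:k]                  # mean pose + k principal axes

    def synthesize_pose(mean, basis, coeffs):
        # Map a low-dimensional search point back to a full pose vector.
        return mean + coeffs @ basis

    mocap = np.random.rand(200, 40)          # stand-in for real mocap data
    mean, basis = fit_pca_prior(mocap)
    pose = synthesize_pose(mean, basis, np.zeros(5))   # reconstructs the mean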

An Active Multi-camera Motion Capture for Face, Fingers and Whole Body

Eng Hui Loke; Masanobu Yamamoto

This paper explores a novel endeavor of deploying only four active-tracking cameras and fundamental vision-based technologies for 3D motion capture of a full human body figure, including facial expression, the motion of the fingers of both hands, and the whole body. The proposed methods offer alternatives for extracting motion parameters of these body parts from four single-view image sequences. The proposed ellipsoidal-model- and flow-based facial expression motion capture solution tackles both 3D head pose and non-rigid facial motion effectively, and we observe that a set of 22 self-defined feature points suffices to represent the expression. Body figure and finger motion capture is solved with a combination of articulated-model- and flow-based methods.

- Poster Session 2: Face/Gesture/Action Detection and Recognition | Pp. 430-441