Publications catalog - books



Computer Vision in Human-Computer Interaction: ICCV 2005 Workshop on HCI, Beijing, China, October 21, 2005, Proceedings

Nicu Sebe; Michael Lew; Thomas S. Huang (eds.)

Conference: International Workshop on Human-Computer Interaction (HCI), Beijing, China, October 21, 2005

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

User Interfaces and Human Computer Interaction; Image Processing and Computer Vision; Computer Graphics; Pattern Recognition

Availability

Detected institution: none detected
Year of publication: 2005
Available at: SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-29620-1

Electronic ISBN

978-3-540-32129-3

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

2005

Publication rights information

© Springer-Verlag Berlin Heidelberg 2005

Table of contents

Action Recognition with Global Features

Arash Mokhber; Catherine Achard; Xingtai Qu; Maurice Milgram

In this study, a new method for recognizing and segmenting everyday-life actions is proposed. Only one camera is used, without calibration. Viewpoint invariance is obtained through several acquisitions of the same action. To enhance robustness, each sequence is characterized globally: moving areas are first detected in each image. These binary points form a volume in the three-dimensional (3D) space (x, y, t), which is characterized by its geometric 3D moments. Action recognition is then carried out by computing the Mahalanobis distance between the feature vector of the action to be recognized and those of the reference database. Results validating the suggested approach are presented on a database of 1662 sequences performed by several persons and categorized into eight actions. An extension of the method to the segmentation of sequences containing several actions is also proposed.
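
As a concrete illustration of the pipeline described above, the following minimal sketch computes geometric 3D moments of a binary (x, y, t) motion volume and classifies by Mahalanobis distance against per-action reference statistics. Function names, the moment orders, and the centering step are illustrative choices, not taken from the paper.

```python
# A minimal sketch of the global-feature pipeline, assuming binary motion
# masks are already stacked into a (t, y, x) volume; names are illustrative.
import numpy as np

def geometric_moments_3d(volume, max_order=3):
    """Centered geometric 3D moments of a binary (t, y, x) volume."""
    t, y, x = np.nonzero(volume)                       # coordinates of moving pixels
    coords = np.stack([x, y, t], axis=1).astype(float)
    coords -= coords.mean(axis=0)                      # center for translation invariance
    feats = []
    for p in range(max_order + 1):
        for q in range(max_order + 1 - p):
            for r in range(max_order + 1 - p - q):
                if p + q + r >= 2:                     # skip trivial orders 0 and 1
                    feats.append(np.mean(coords[:, 0]**p *
                                         coords[:, 1]**q *
                                         coords[:, 2]**r))
    return np.array(feats)

def mahalanobis(f, class_mean, class_cov_inv):
    d = f - class_mean
    return float(np.sqrt(d @ class_cov_inv @ d))

def recognize(volume, reference):
    """reference: {action_name: (mean_vector, inverse_covariance)} built offline."""
    f = geometric_moments_3d(volume)
    return min(reference, key=lambda a: mahalanobis(f, *reference[a]))
```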

Keywords: Hidden Markov Model; Recognition Rate; Action Recognition; Gesture Recognition; Infinite Impulse Response.

- Event Detection | Pp. 110-119

3D Human Action Recognition Using Spatio-temporal Motion Templates

Fengjun Lv; Ramakant Nevatia; Mun Wai Lee

Our goal is automatic recognition of basic human actions, such as standing, sitting, and waving hands, to aid natural communication between a human and a computer. Human actions are inferred from human body joint motions, but such data has high dimensionality, and large spatial and temporal variations may occur in executing the same action. We present a learning-based approach for the representation and recognition of 3D human action. Each action is represented by a template consisting of a set of weighted channels. Each channel corresponds to the evolution of one 3D joint coordinate, and its weight is learned according to the Neyman-Pearson criterion. We use the learned templates to recognize actions based on a χ² error measurement. Results of recognizing 22 actions on a large set of motion capture sequences, as well as several annotated and automatically tracked sequences, show the effectiveness of the proposed algorithm.
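
A hedged sketch of the template-matching step: it assumes each action template stores one expected trajectory per 3D joint coordinate ("channel") plus a learned per-channel weight, and scores a candidate segment with a weighted χ² error. The exact normalisation and all names are assumptions, not the paper's formulation.

```python
# Weighted-channel template matching, under the assumption that a template
# stores one expected trajectory per channel plus a learned weight vector.
import numpy as np

def chi2_error(observed, template, weights, eps=1e-8):
    """Weighted chi-squared error between an observed segment and a template.

    observed, template: arrays of shape (n_channels, n_frames)
    weights: per-channel weights (e.g. learned via a Neyman-Pearson criterion)
    """
    diff2 = (observed - template) ** 2
    denom = np.abs(observed) + np.abs(template) + eps   # chi-squared style normaliser
    return float(np.sum(weights[:, None] * diff2 / denom))

def classify(segment, templates):
    """templates: {action: (template_array, weight_vector)}"""
    return min(templates,
               key=lambda a: chi2_error(segment, templates[a][0], templates[a][1]))
```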

Keywords: Training Sample; False Alarm Rate; Action Recognition; Template Match; Human Action Recognition.

- Event Detection | Pp. 120-130

Interactive Point-and-Click Segmentation for Object Removal in Digital Images

Frank Nielsen; Richard Nock

In this paper, we explore the problem of deleting objects from still pictures. We present an interactive system based on a novel, intuitive, user-friendly interface for removing undesirable objects from digital pictures. To erase an object in an image, the user indicates which object is to be removed by simply pinpointing it with the mouse cursor. As the mouse cursor rolls over the image, the border of the currently (implicitly) selected object is highlighted, providing visual feedback. If the computer-segmented area does not match the user's perception of the object, the user can provide a few inside/outside cues by clicking on a small number of object or non-object pixels. Experimentally, a small number of such cues is generally enough to reach a correct matching, even for complex textured images. Afterwards, the user removes the object by clicking the left mouse button, and a hole-filling technique is initiated to generate a seamless background portion. Our image manipulation system consists of two components: (i) fully automatic or partially user-steered image segmentation based on an improved fast statistical region-growing segmentation, and (ii) texture synthesis or image inpainting of irregularly shaped hole regions. Experiments on a variety of photographs demonstrate the ability of the system to handle complex scenes with highly textured objects.
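
The segmentation component lends itself to a short sketch. The paper uses an improved fast statistical region-growing method; the simplified version below grows a region from the clicked seed pixel, merging a neighbour whenever its colour stays within a threshold of the region's running mean. The threshold and all names are illustrative.

```python
# Seed-based statistical region growing in the spirit of the segmentation
# component above; a deliberately simplified stand-in, not the paper's method.
import numpy as np
from collections import deque

def grow_region(image, seed, threshold=20.0):
    """Grow a region from `seed` (row, col) over 4-connected pixels whose
    colour stays within `threshold` of the region's running mean colour."""
    h, w = image.shape[:2]
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    mean = image[seed].astype(float)
    count = 1
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < h and 0 <= nc < w and not mask[nr, nc]:
                if np.linalg.norm(image[nr, nc] - mean) < threshold:
                    mask[nr, nc] = True
                    mean = (mean * count + image[nr, nc]) / (count + 1)
                    count += 1
                    queue.append((nr, nc))
    return mask
```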

Keywords: Object Boundary; Texture Synthesis; Image Inpainting; Mouse Cursor; Alpha Matte.

- Augmented Reality | Pp. 131-140

Information Layout and Interaction Techniques on an Augmented Round Table

Shintaro Kajiwara; Hideki Koike; Kentaro Fukuchi; Kenji Oka; Yoichi Sato

Round tabletop display systems are currently being promoted, but the optimal ways to use these systems to display large amounts of information, and how to interact with them, have not been fully considered. This paper describes information presentation and interaction techniques for a large number of files on a round tabletop display system. Three layouts are explored on our augmented table system: sequential layout, classification layout, and spiral layout. Users can search for and find files by virtually rotating the circular display using a "hands-on" technique.
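
As a rough illustration of the spiral layout idea, the sketch below places n items along an Archimedean spiral filling a disc, ordered by whatever key (e.g. creation date) the caller chooses beforehand. All parameters and names are hypothetical, not from the paper.

```python
# Hypothetical spiral layout for file icons on a circular display.
import math

def spiral_layout(n_items, radius, turns=4):
    """Return (x, y, angle) positions along an Archimedean spiral that
    fills a disc of the given radius."""
    positions = []
    for i in range(n_items):
        t = i / max(n_items - 1, 1)           # 0 .. 1 along the spiral
        theta = 2 * math.pi * turns * t
        r = radius * t                        # radius grows linearly: Archimedean
        positions.append((r * math.cos(theta), r * math.sin(theta), theta))
    return positions

# Rotating the whole table by delta then just adds delta to every angle.
```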

Keywords: Interaction Technique; Round Table; Rotation Table; Tangible Interface; Creation Date.

- Augmented Reality | Pp. 141-149

On-Line Novel View Synthesis Capable of Handling Multiple Moving Objects

Indra Geys; Luc Van Gool

This paper presents a new interactive teleconferencing system. It adds a 'virtual' camera to the scene which can move freely in between multiple real cameras. The viewpoint can be selected automatically using basic cinematographic rules, based on the position and the actions of the instructor. This produces a clearer and more engaging view for the remote audience, without the need for a human editor. Creating the novel views generated by such a 'virtual' camera requires segmentation and depth calculations. The system is semi-automatic, in that the user is asked to indicate a few corresponding points or edges to generate an initial rough background model. In addition to the static background and moving foreground, multiple independently moving objects are catered for. The initial foreground contour is tracked over time using a new active contour. If a second object appears, the contour prediction allows the system to recognize this situation and take appropriate measures. The 3D models are continuously validated based on a Birchfield dissimilarity measure. The foreground model is updated every frame; the background is refined if necessary. The current implementation reaches approximately 4 fps on a single desktop.
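
The abstract only names "a Birchfield dissimilarity measure" for model validation; the sketch below shows the standard Birchfield-Tomasi sampling-insensitive pixel dissimilarity, which is the usual reading of that term, written for two 1D grayscale scanlines. This is an assumption about which measure is meant.

```python
# Birchfield-Tomasi sampling-insensitive pixel dissimilarity (1998 form),
# shown for two 1D scanlines; an assumed reading of "Birchfield measure".
import numpy as np

def bt_dissimilarity(left, right, xl, xr):
    """Sampling-insensitive dissimilarity between left[xl] and right[xr]."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)

    def one_sided(a, xa, b, xb):
        # Linearly interpolated half-pixel neighbours around b[xb].
        b_minus = 0.5 * (b[xb] + b[max(xb - 1, 0)])
        b_plus = 0.5 * (b[xb] + b[min(xb + 1, len(b) - 1)])
        b_lo = min(b_minus, b[xb], b_plus)
        b_hi = max(b_minus, b[xb], b_plus)
        return max(0.0, a[xa] - b_hi, b_lo - a[xa])

    return min(one_sided(left, xl, right, xr), one_sided(right, xr, left, xl))
```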

Keywords: Active Contour; Background Model; Delaunay Triangulation; Dissimilarity Measure; Foreground Object.

- Augmented Reality | Pp. 150-159

Resolving Hand over Face Occlusion

Paul Smith; Niels da Vitoria Lobo; Mubarak Shah

This paper presents a method to segment the hand over complex backgrounds, such as the face. The similar colors and texture of the hand and face make the problem particularly challenging. Our method is based on the concept of an image force field. In this representation, each image location holds a vector value which is a nonlinear combination of the remaining pixels in the image. We introduce and develop a novel physics-based feature that is able to measure regional structure in the image, thus avoiding the problem of local pixel-based analysis, which breaks down under our conditions. The regional image structure changes in the occluded region during occlusion; elsewhere, the regional structure remains relatively constant. We model the regional image structure at all image locations over time using a Mixture of Gaussians (MoG) to detect the occluded region in the image. We have tested the method on a number of sequences, demonstrating the versatility of the proposed approach.
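
A sketch of the underlying force-field representation: each location receives a vector aggregating the pull of every other pixel, with magnitude proportional to intensity over squared distance, directed along the connecting line. This is the classic force-field transform; the paper's regional feature and per-location Mixture-of-Gaussians modelling are built on top of such a representation and are not reproduced here.

```python
# Classic image force-field transform; O(N^2) in the pixel count, so this
# brute-force version is illustrative and only practical for small images.
import numpy as np

def force_field(image):
    """Return a (H, W, 2) array of force vectors for a grayscale image."""
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)  # (N, 2)
    intensity = image.ravel().astype(float)                          # (N,)
    forces = np.zeros_like(pts)
    for i in range(len(pts)):
        d = pts - pts[i]                       # vectors to every other pixel
        dist = np.linalg.norm(d, axis=1)
        dist[i] = np.inf                       # exclude the pixel itself
        forces[i] = np.sum(intensity[:, None] * d / dist[:, None] ** 3, axis=0)
    return forces.reshape(h, w, 2)
```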

Keywords: Regional Structure; Gesture Recognition; Complex Background; Hand Shape; Hand Gesture Recognition.

- Hand and Gesture | Pp. 160-169

Real-Time Adaptive Hand Motion Recognition Using a Sparse Bayesian Classifier

Shu-Fai Wong; Roberto Cipolla

An approach is proposed to increase the adaptability of a recognition system that can recognise 10 elementary gestures and be extended to sign language recognition. In this work, recognition is done by first extracting a motion gradient orientation image from raw video input and then classifying a feature vector generated from this image into one of the 10 gestures with a sparse Bayesian classifier. The classifier is designed so that it supports online incremental learning, and it can thus be re-trained to increase its adaptability to input captured under new conditions. Experiments show that the accuracy of the classifier can be boosted from less than 40% to over 80% by re-training it using 5 newly captured samples from each gesture class. Apart from having better adaptability, the system works reliably in real time and gives a probabilistic output that is useful in complex motion analysis.
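
A hedged sketch of the feature-extraction front end: it builds a motion history image from thresholded frame differences and then takes its gradient orientation, one common way to obtain a motion gradient orientation image. Thresholds, the fading scheme, and all names are assumptions; the resulting orientation map would still need to be summarised (e.g. histogrammed) before feeding the sparse Bayesian classifier.

```python
# Motion-gradient-orientation style front end built from a motion history
# image; parameters are illustrative, not the paper's.
import numpy as np

def motion_history(frames, diff_thresh=25, duration=1.0, fps=25.0):
    """Motion history image over 2D grayscale frames: recent motion is
    bright, older motion fades linearly toward zero."""
    mhi = np.zeros(frames[0].shape, dtype=float)
    step = 1.0 / fps
    for prev, curr in zip(frames, frames[1:]):
        moving = np.abs(curr.astype(float) - prev.astype(float)) > diff_thresh
        mhi = np.where(moving, duration, np.maximum(mhi - step, 0.0))
    return mhi

def gradient_orientation(mhi):
    gy, gx = np.gradient(mhi)
    return np.arctan2(gy, gx)      # per-pixel motion gradient orientation
```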

- Hand and Gesture | Pp. 170-179

Topographic Feature Mapping for Head Pose Estimation with Application to Facial Gesture Interfaces

Bisser Raytchev; Ikushi Yoda; Katsuhiko Sakaue

We propose a new general approach to the problem of head pose estimation, based on semi-supervised low-dimensional topographic feature mapping. We show how several recently proposed nonlinear manifold learning methods can be applied in this general framework, and additionally, we present a new algorithm, IsoScale, which combines the best aspects of some of the other methods. The efficacy of the proposed approach is illustrated both on a view- and illumination-varied face database and in a real-world human-computer interface application, as a head-pose-based facial-gesture interface for automatic wheelchair navigation.
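
The details of the IsoScale algorithm are not given here, but the general semi-supervised recipe it belongs to can be sketched: embed all face vectors with a nonlinear manifold learner (Isomap below), then fit a regressor from embedding coordinates to pose angles on the labelled subset. Library choices and parameters are illustrative.

```python
# Semi-supervised manifold embedding plus pose regression; a generic recipe,
# not the paper's IsoScale algorithm.
import numpy as np
from sklearn.manifold import Isomap
from sklearn.kernel_ridge import KernelRidge

def fit_pose_estimator(face_vectors, pose_angles, labelled_idx, n_components=3):
    """face_vectors: (n_samples, n_features) array of vectorised face images.
    pose_angles: angles for the labelled subset, aligned with labelled_idx."""
    # Embed ALL samples (labelled and unlabelled) into the manifold.
    embedding = Isomap(n_components=n_components).fit_transform(face_vectors)
    # RBF regression from manifold coordinates to pose, trained on labels only.
    reg = KernelRidge(kernel="rbf", gamma=0.1)
    reg.fit(embedding[labelled_idx], pose_angles)
    return embedding, reg
```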

Keywords: Training Sample; Singular Value Decomposition; Geodesic Distance; Locality Preserve Projection; Radial Basis Function.

- Hand and Gesture | Pp. 180-188

Accurate and Efficient Gesture Spotting via Pruning and Subgesture Reasoning

Jonathan Alon; Vassilis Athitsos; Stan Sclaroff

Gesture spotting is the challenging task of locating the start and end frames of the video stream that correspond to a gesture of interest, while at the same time rejecting non-gesture motion patterns. This paper proposes a new gesture spotting and recognition algorithm that is based on the continuous dynamic programming (CDP) algorithm and runs in real time. To make gesture spotting efficient, a pruning method is proposed that allows the system to evaluate a relatively small number of hypotheses compared to CDP. Pruning is implemented by a set of model-dependent classifiers that are learned from training examples. To make gesture spotting more accurate, a subgesture reasoning process is proposed that models the fact that some gesture models can falsely match parts of other, longer gestures. In our experiments, the proposed method with pruning and subgesture modeling is an order of magnitude faster and 18% more accurate than the original CDP algorithm.
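
A simplified sketch of continuous dynamic programming (CDP) for spotting: per model frame, keep the best cumulative cost of matching the model against any input subsequence ending at the current input frame, and report a detection whenever the length-normalised final cost dips below a threshold. The paper's pruning classifiers and subgesture reasoning are omitted; the distance function and threshold are placeholders.

```python
# Baseline CDP-style spotting without the paper's pruning or subgesture
# reasoning; `stream` and `model` are sequences of feature vectors.
import numpy as np

def cdp_spot(stream, model, threshold):
    """Yield (end_frame, cost) whenever the model matches a subsequence of
    `stream` ending at that frame with normalised cost below `threshold`."""
    m = len(model)
    dist = lambda a, b: float(np.linalg.norm(a - b))
    prev = np.full(m, np.inf)
    for t, x in enumerate(stream):
        curr = np.empty(m)
        curr[0] = dist(x, model[0])          # a match may start at any frame
        for j in range(1, m):
            curr[j] = dist(x, model[j]) + min(prev[j - 1], prev[j], curr[j - 1])
        if curr[-1] / m < threshold:         # length-normalised match cost
            yield t, curr[-1] / m
        prev = curr
```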

Keywords: Gesture Recognition; Hand Gesture; Hand Gesture Recognition; False Match; Input Frame.

- Hand and Gesture | Pp. 189-198

A Study of Detecting Social Interaction with Sensors in a Nursing Home Environment

Datong Chen; Jie Yang; Howard Wactlar

Social interaction plays an important role in our daily lives. It is one of the most important indicators of physical or mental disease in aging patients. In this paper, we present a Wizard of Oz study on the feasibility of detecting social interaction with sensors in skilled nursing facilities. Our study explores statistical models that can be constructed to monitor and analyze social interactions among aging patients and nurses. We are also interested in identifying the sensors that might be most useful for interaction detection, and in determining how robustly detection can be performed with noisy sensors. We simulate a wide range of plausible sensors using human labeling of audio and visual data. Based on these simulated sensors, we build statistical models for both individual sensors and combinations of multiple sensors using various machine learning methods. Comparison experiments are conducted to demonstrate the effectiveness and robustness of the sensors and statistical models for detecting interactions.
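
As a toy illustration of the modelling step, the sketch below trains a support vector machine on simulated per-window sensor outputs and evaluates it on held-out windows. The data, feature layout, and split are entirely synthetic stand-ins for the study's labelled sensor streams.

```python
# Toy sensor-fusion classifier: concatenated per-window sensor outputs -> SVM.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_windows, n_sensors = 400, 6
X = rng.normal(size=(n_windows, n_sensors))       # simulated sensor outputs
# Synthetic ground truth: interaction depends on two of the sensors plus noise.
y = (X[:, 0] + 0.8 * X[:, 2] + rng.normal(scale=0.5, size=n_windows)) > 0.5

clf = SVC(kernel="rbf").fit(X[:300], y[:300])     # train on the first windows
print("held-out accuracy:", clf.score(X[300:], y[300:]))
```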

Keywords: Nursing Home; Information Gain; Support Vector Machine Model; Hand Gesture; Sensor Output.

- Applications | Pp. 199-210