Publications catalog - books



Feature Extraction: Foundations and Applications

Isabelle Guyon ; Masoud Nikravesh ; Steve Gunn ; Lotfi A. Zadeh (eds.)

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Not available.

Availability

Year of publication: 2006. Available on SpringerLink.

Information

Resource type:

books

Print ISBN

978-3-540-35487-1

Electronic ISBN

978-3-540-35488-8

Publisher

Springer Nature

Country of publication

United Kingdom

Date of publication

2006

Publication rights information

© Springer Berlin Heidelberg 2006

Table of contents

An Introduction to Feature Extraction

Isabelle Guyon; André Elisseeff

This chapter introduces the reader to the various aspects of feature extraction covered in this book. Section 1 reviews definitions and notations and proposes a unified view of the feature extraction problem. Section 2 is an overview of the methods and results presented in the book, emphasizing novel contributions. Section 3 provides the reader with an entry point into the field of feature extraction by showing small revealing examples and describing simple but effective algorithms. Finally, Section 4 introduces a more theoretical formalism and points to directions of research and open problems.

An Introduction to Feature Extraction | Pp. 1-25

Learning Machines

Norbert Jankowski; Krzysztof Grabczewski

Learning from data may be a very complex task. To satisfactorily solve a variety of problems, many different types of algorithms may need to be combined. Feature extraction algorithms are valuable tools that prepare data for other learning methods. To estimate their usefulness, one must examine the complex processes of which they are a part.

Part I - Feature Extraction Fundamentals | Pp. 29-64

Assessment Methods

Gérard Dreyfus; Isabelle Guyon

This chapter aims at providing the reader with the tools required for a statistically significant assessment of feature relevance and of the outcome of feature selection. The methods presented in this chapter can be integrated in feature selection wrappers and can serve to select the number of features for filters or feature ranking methods. They can also serve for hyper-parameter selection or model selection, and they can be helpful for assessing the confidence in predictions made by learning machines on fresh data. The concept of model complexity is ubiquitous in this chapter. Readers with little or outdated knowledge of basic statistics should first delve into Appendix A before they start reading the chapter; for others, it may serve as a quick reference guide for useful definitions and properties.

The first section of the present chapter is devoted to the basic statistical tools for feature selection. It puts the task of feature selection into the appropriate statistical perspective, and describes important tools such as hypothesis tests, which are of general use, and random probes, which are more specifically dedicated to feature selection. The use of hypothesis tests is exemplified, and caveats about the reliability of the results of multiple tests are given, leading to the Bonferroni correction and to the definition of the false discovery rate. The use of random probes is also exemplified, in conjunction with forward selection. The second section of the chapter is devoted to validation and cross-validation, general tools for assessing the ability of models to generalize; we show how they can be used specifically in the context of feature selection, and attention is drawn to the limitations of those methods.
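By way of illustration only (this sketch is not from the book), the two multiple-testing corrections named above fit in a few lines of Python; the function names and the significance level alpha are my own choices:

    import numpy as np

    def bonferroni(pvals, alpha=0.05):
        # Family-wise error control: reject a feature's null hypothesis
        # only if its p-value clears alpha divided by the number of tests.
        pvals = np.asarray(pvals)
        return pvals < alpha / len(pvals)

    def benjamini_hochberg(pvals, alpha=0.05):
        # False discovery rate control: sort the m p-values, find the
        # largest k with p_(k) <= alpha * k / m, reject the k smallest.
        pvals = np.asarray(pvals)
        m = len(pvals)
        order = np.argsort(pvals)
        below = pvals[order] <= alpha * np.arange(1, m + 1) / m
        rejected = np.zeros(m, dtype=bool)
        if below.any():
            k = below.nonzero()[0].max()
            rejected[order[:k + 1]] = True
        return rejected

The Bonferroni bound is the more conservative of the two, which is why false discovery rate control is often preferred when thousands of candidate features are tested at once.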

Part I - Feature Extraction Fundamentals | Pp. 65-88

Filter Methods

Włodzisław Duch

Feature ranking and feature selection algorithms may roughly be divided into three types. The first type encompasses algorithms that are built into adaptive systems for data analysis (predictors), for example the feature selection that is part of embedded methods (such as neural training algorithms). Algorithms of the second type are wrapped around predictors, providing them with subsets of features and receiving their feedback (usually accuracy). These wrapper approaches are aimed at improving the results of the specific predictors they work with. The third type includes feature selection algorithms that are independent of any predictor, filtering out features that have little chance of being useful in the analysis of data. These filter methods are based on performance evaluation metrics calculated directly from the data, without direct feedback from the predictors that will finally be used on the data with a reduced number of features. Such algorithms are usually computationally less expensive than those from the first or second group. This chapter is devoted to filter methods.
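As a minimal sketch of the filter idea (mine, not the chapter's), a Pearson-correlation ranking scores every feature directly from the data, with no predictor in the loop:

    import numpy as np

    def pearson_ranking(X, y):
        # Score each column of X by |Pearson correlation| with the target
        # y; return feature indices sorted from most to least relevant.
        Xc = X - X.mean(axis=0)
        yc = y - y.mean()
        scores = np.abs(Xc.T @ yc) / (
            np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc) + 1e-12
        )
        return np.argsort(scores)[::-1]

Any other relevance index (mutual information, Fisher score, and so on) can be dropped in place of the correlation without changing the overall scheme.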

Part I - Feature Extraction Fundamentals | Pp. 89-117

Search Strategies

Juha Reunanen

In order to search for good variable subsets, one has to know which subsets are good and which are not. In other words, an evaluation mechanism for an individual variable subset needs to be defined first.
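For concreteness (this sketch is mine, under the assumption that evaluate returns, say, a cross-validated accuracy), greedy forward selection is one of the simplest such search strategies:

    def forward_selection(evaluate, n_features, k):
        # evaluate(subset) -> score, higher is better (e.g. CV accuracy).
        # Greedily add the single variable that helps most at each step.
        selected = []
        remaining = set(range(n_features))
        while remaining and len(selected) < k:
            best = max(remaining, key=lambda f: evaluate(selected + [f]))
            selected.append(best)
            remaining.remove(best)
        return selected

Backward elimination, floating search, and randomized methods follow the same pattern, differing only in how candidate subsets are generated from the current one.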

Part I - Feature Extraction Fundamentals | Pp. 119-136

Embedded Methods

Thomas Navin Lal; Olivier Chapelle; Jason Weston; André Elisseeff

Although many embedded feature selection methods have been introduced during the last few years, a unifying theoretical framework has not been developed to date. We start this chapter by defining such a framework, which we think is general enough to cover many embedded methods. We then discuss embedded methods based on how they solve the feature selection problem.
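A familiar concrete instance (my example, not the chapter's framework) is L1-regularized logistic regression, where the penalty drives irrelevant weights to exactly zero during training; the synthetic data below assume scikit-learn is available:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 50))                # 50 candidate features
    y = (X[:, 0] - 2 * X[:, 1] > 0).astype(int)   # only 2 are relevant

    # Selection happens inside training rather than around it:
    # the nonzero coefficients are the selected features.
    clf = LogisticRegression(penalty="l1", C=0.1, solver="liblinear").fit(X, y)
    selected = np.flatnonzero(clf.coef_.ravel())
    print(selected)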

Part I - Feature Extraction Fundamentals | Pp. 137-165

Information-Theoretic Methods

Kari Torkkola

Shannon’s seminal work on information theory provided the conceptual framework for communication through noisy channels. This work, quantifying the information content of coded messages, established the basis for all current systems aiming to transmit information through any medium.
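One quantity central to information-theoretic feature selection, the mutual information between a feature and the class label, can be estimated from co-occurrence counts of two discrete variables; the following sketch is mine, not the chapter's:

    import numpy as np

    def mutual_information(x, y):
        # I(X;Y) = sum over (x,y) of p(x,y) * log(p(x,y) / (p(x) * p(y))),
        # estimated from the empirical joint distribution of two
        # integer-coded discrete variables x and y.
        joint = np.zeros((x.max() + 1, y.max() + 1))
        for xi, yi in zip(x, y):
            joint[xi, yi] += 1
        p = joint / joint.sum()
        px = p.sum(axis=1, keepdims=True)
        py = p.sum(axis=0, keepdims=True)
        nz = p > 0
        return float((p[nz] * np.log(p[nz] / (px @ py)[nz])).sum())

Ranking features by their mutual information with the label is the most direct information-theoretic filter; the chapter goes considerably further than this.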

Part I - Feature Extraction Fundamentals | Pp. 167-185

Ensemble Learning

Eugene Tuv

Supervised ensemble methods construct a set of base learners (experts) and use their weighted outcome to predict new data. Numerous empirical studies confirm that ensemble methods often outperform any single base learner. The improvement is intuitively clear when the base algorithm is unstable, that is, when small changes in the training data lead to large changes in the resulting base learner (as with decision trees, neural networks, etc.). Recently, a series of theoretical developments also confirmed the fundamental role of stability for the generalization (the ability to perform well on unseen data) of any learning engine. Given a multivariate learning algorithm, model selection and feature selection are closely related problems (the latter is a special case of the former). Thus, it is sensible that model-based feature selection methods (wrappers, embedded methods) would benefit from the regularization effect provided by ensemble aggregation. This is especially true for the fast, greedy, and unstable learners often used for feature evaluation.
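As a hedged illustration of ensemble-based feature evaluation (my sketch, using scikit-learn and synthetic data), averaging split-based importances over many randomized trees stabilizes the greedy single-tree estimate:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 20))
    y = (X[:, 3] + X[:, 7] ** 2 > 1).astype(int)  # features 3 and 7 matter

    # Each tree is unstable on its own; the aggregated importances are
    # far less sensitive to small changes in the training data.
    forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
    ranking = np.argsort(forest.feature_importances_)[::-1]
    print(ranking[:5])  # features 3 and 7 should surface near the top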

Part I - Feature Extraction Fundamentals | Pp. 187-204

Fuzzy Neural Networks

Madan M. Gupta; Noriyasu Homma; Zeng-Guang Hou

The theory of fuzzy sets, founded by Zadeh (1965), deals with the linguistic notion of graded membership, unlike the computational functions of the digital computer with bivalent propositions. Since mentation and the cognitive functions of the brain are based on graded information acquired by the natural (biological) sensory systems, fuzzy logic has been used as a powerful tool for modeling human thinking and cognition. The cognitive process thus acts on graded information associated with fuzzy concepts, fuzzy judgment, fuzzy reasoning, and cognition. The most successful domain of fuzzy logic has been the field of feedback control of various physical and chemical processes such as temperature, electric current, flow of liquid/gas, and the motion of machines. Fuzzy logic principles can also be applied to other areas. For example, they have been used in fuzzy knowledge-based systems that use fuzzy IF-THEN rules, in systems that may incorporate fuzziness in data and programs, and in fuzzy database systems in the fields of medicine, economics, and management. It is exciting to note that some consumer electronics and automotive products currently on the market use technology based on fuzzy logic, and the performance of these products has significantly improved.
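To make the notion of graded membership concrete (a toy example of my own, not from the chapter), here is a two-rule fuzzy controller in Python:

    def triangular(x, a, b, c):
        # Graded membership: rises linearly from a to b, falls from b to c.
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

    def fan_speed(temp_c):
        # Two fuzzy IF-THEN rules, defuzzified by a weighted average:
        #   IF temperature is WARM THEN speed is 40
        #   IF temperature is HOT  THEN speed is 90
        warm = triangular(temp_c, 15, 25, 35)
        hot = triangular(temp_c, 25, 40, 55)
        if warm + hot == 0:
            return 0.0
        return (40 * warm + 90 * hot) / (warm + hot)

    print(fan_speed(30))  # partly WARM, partly HOT -> a blended speed

At 30 degrees the input is both somewhat WARM and somewhat HOT, so the output interpolates between the two rule conclusions instead of switching abruptly.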

Part I - Feature Extraction Fundamentals | Pp. 205-233

Design and Analysis of the NIPS2003 Challenge

Isabelle Guyon; Steve Gunn; Asa Ben Hur; Gideon Dror

In 2003 we organized a benchmark of feature selection methods, whose results are summarized and analyzed in this chapter. The top-ranking entrants of the competition describe their methods and results in more detail in the following chapters. We provided participants with five datasets from different application domains and called for classification results using a minimal number of features. Participants were asked to make on-line submissions on two test sets: a validation set and a “final” test set, with performance on the validation set presented immediately to the participant and performance on the final test set presented at the end of the competition. The competition took place over a period of 13 weeks and attracted 78 research groups. In total, 1863 entries were made on the validation sets during the development period and 135 entries on all test sets for the final competition. The winners used a combination of Bayesian neural networks with ARD priors and Dirichlet diffusion trees. Other top entries used a variety of methods for feature selection, combining filters and/or wrapper or embedded methods, with Random Forests, kernel methods, or neural networks as the classification engine. The classification engines most often used after feature selection are regularized kernel methods, including SVMs. The results of the benchmark (including the predictions made by the participants and the features they selected) and the scoring software are publicly available, and the benchmark remains open for post-challenge submissions to stimulate further research.

Part II - Feature Selection Challenge | Pp. 237-263