Publications catalog - books
Feature Extraction: Foundations and Applications
Isabelle Guyon ; Masoud Nikravesh ; Steve Gunn ; Lotfi A. Zadeh (eds.)
Abstract/Description – provided by the publisher
Not available.
Keywords – provided by the publisher
Not available.
Availability
| Institution detected | Publication year | Browse | Download | Request |
|---|---|---|---|---|
| Not detected | 2006 | SpringerLink | | |
Information
Resource type:
books
Print ISBN
978-3-540-35487-1
Electronic ISBN
978-3-540-35488-8
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2006
Publication rights information
© Springer Berlin Heidelberg 2006
Table of contents
Combining a Filter Method with SVMs
Thomas Navin Lal; Olivier Chapelle; Bernhard Schölkopf
Our goal for the competition was to evaluate the usefulness of simple machine learning techniques. We decided to use the Fisher criterion (see Chapter 2) as a feature selection method and Support Vector Machines (see Chapter 1) for the classification part. Here we explain how we chose the regularization parameter of the SVM, how we determined the kernel parameter and how we estimated the number of features used for each data set. All analyses were carried out on the training sets of the competition data. We use one of the data sets as an example to explain the approach step by step.
In our view the point of this competition was the construction of a well performing classifier rather than the systematic analysis of a specific approach. This is why our search for the best classifier was guided only by the described methods and why we deviated from the road map on several occasions. All calculations were done with the software ().
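As a rough illustration of this kind of filter-then-classify setup, the sketch below ranks features with a two-class Fisher score and trains an SVM on the top-ranked ones. The synthetic data, the value of k, and the SVM settings are placeholders chosen for the example, not the authors' tuned choices.

```python
# Minimal sketch (not the authors' exact pipeline): Fisher-criterion ranking
# followed by an SVM on the selected features. All parameters are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=500, n_informative=20,
                           random_state=0)

# Two-class Fisher score per feature: (mu_+ - mu_-)^2 / (var_+ + var_-)
pos, neg = X[y == 1], X[y == 0]
fisher = (pos.mean(0) - neg.mean(0)) ** 2 / (pos.var(0) + neg.var(0) + 1e-12)

k = 50                                  # number of kept features (assumed)
top_k = np.argsort(fisher)[::-1][:k]

# C and the kernel parameter would be chosen on the training data,
# e.g. by cross-validation, as the abstract describes.
clf = SVC(C=1.0, kernel="rbf", gamma="scale")
print(cross_val_score(clf, X[:, top_k], y, cv=5).mean())
```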
Part II - Feature Selection Challenge | Pp. 439-445
Feature Selection via Sensitivity Analysis with Direct Kernel PLS
Mark J. Embrechts; Robert A. Bress; Robert H. Kewley
This chapter introduces Direct Kernel Partial Least Squares (DK-PLS) and feature selection via sensitivity analysis for DK-PLS. The overall feature selection strategy for the five data sets used in the NIPS competition is outlined as well.
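The sketch below is only a loose reading of the idea: ordinary PLS regression is fitted on an RBF kernel matrix of the inputs (a "direct kernel" design), and the relevance of each input feature is approximated by the average change of the model output when that feature is perturbed. The kernel width, the perturbation, and the number of components are assumptions, not the chapter's DK-PLS procedure.

```python
# Illustrative sketch of kernel-matrix PLS plus perturbation-based sensitivity.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import rbf_kernel

X, y = make_classification(n_samples=200, n_features=30, random_state=0)

gamma = 1.0 / X.shape[1]                  # assumed kernel width
K = rbf_kernel(X, X, gamma=gamma)         # use the kernel matrix as the design matrix
pls = PLSRegression(n_components=5).fit(K, y.astype(float))

def predict(X_new):
    return pls.predict(rbf_kernel(X_new, X, gamma=gamma)).ravel()

base = predict(X)
sensitivity = np.zeros(X.shape[1])
for j in range(X.shape[1]):
    X_pert = X.copy()
    X_pert[:, j] += X[:, j].std()         # perturb feature j by one standard deviation
    sensitivity[j] = np.abs(predict(X_pert) - base).mean()

print(np.argsort(sensitivity)[::-1][:5])  # features the model is most sensitive to
```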
Part II - Feature Selection Challenge | Pp. 447-462
Information Gain, Correlation and Support Vector Machines
Danny Roobaert; Grigoris Karakoulas; Nitesh V. Chawla
We report on our approach, CBAmethod3E, which was submitted to the NIPS 2003 Feature Selection Challenge on Dec. 8, 2003. Our approach consists of combining filtering techniques for variable selection, information gain and feature correlation, with Support Vector Machines for induction. We ranked 13th overall and 6th as a group. It is worth pointing out that our feature selection method was very successful in selecting the second smallest set of features among the top-20 submissions, and in identifying almost all probes in the datasets, resulting in the challenge's best performance on the latter benchmark.
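A hedged sketch of this kind of filter: features are ranked by mutual information (an information-gain estimate), strongly correlated candidates are discarded, and an SVM is trained on the survivors. The correlation threshold, the number of candidates, and the SVM settings are illustrative and do not reproduce CBAmethod3E.

```python
# Sketch: information-gain ranking + correlation pruning + SVM (parameters assumed).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=100, n_informative=15,
                           random_state=0)

mi = mutual_info_classif(X, y, random_state=0)
order = np.argsort(mi)[::-1]

selected = []
for j in order[:40]:                      # consider the top-ranked candidates
    if all(abs(np.corrcoef(X[:, j], X[:, s])[0, 1]) < 0.9 for s in selected):
        selected.append(j)                # keep only weakly correlated features

clf = SVC(C=1.0, kernel="linear")
print(len(selected), cross_val_score(clf, X[:, selected], y, cv=5).mean())
```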
Part II - Feature Selection Challenge | Pp. 463-470
Mining for Complex Models Comprising Feature Selection and Classification
Krzysztof Grabczewski; Norbert Jankowski
Different classification tasks require different learning schemes to be satisfactorily solved. Most real-world datasets can be modeled only by complex structures resulting from deep data exploration with a number of different classification and data transformation methods. The search through the space of complex structures must be augmented with reliable validation strategies. All these techniques were necessary to build accurate models for the five high-dimensional datasets of the NIPS 2003 Feature Selection Challenge. Several feature selection algorithms (e.g. based on variance, correlation coefficient, decision trees) and several classification schemes (e.g. nearest neighbors, Normalized RBF, Support Vector Machines) were used to build complex models which transform the data and then classify. Committees of feature selection models and ensemble classifiers were also very helpful in constructing models with high generalization ability.
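A minimal sketch of such a composite model built from off-the-shelf parts: a feature-selection transform feeding a small committee of classifiers, validated as a whole. The particular selector, committee members, and settings are assumptions for illustration, not the models built in the chapter.

```python
# Sketch: data transformation (feature selection) + committee of classifiers.
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=200, n_informative=20,
                           random_state=0)

committee = VotingClassifier([
    ("knn", KNeighborsClassifier(n_neighbors=5)),
    ("svm", SVC(kernel="rbf", gamma="scale", probability=True)),
], voting="soft")

model = make_pipeline(SelectKBest(f_classif, k=30), committee)
print(cross_val_score(model, X, y, cv=5).mean())   # the whole structure is validated
```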
Part II - Feature Selection Challenge | Pp. 471-488
Combining Information-Based Supervised and Unsupervised Feature Selection
Sang-Kyun Lee; Seung-Joon Yi; Byoung-Tak Zhang
The filter is a simple and practical method for feature selection, but it can introduce biases that decrease prediction performance. We propose an enhanced filter method that exploits features from two information-based filtering steps: one supervised and one unsupervised. By combining the features from these two steps we attempt to reduce biases caused by misleading causal relations induced in the supervised selection procedure. When tested on the five datasets given at the NIPS 2003 Feature Extraction Workshop, our approach attained significant performance considering its simplicity. We expect the combined information-based method to be a promising substitute for classical filter methods.
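A rough sketch of the combination idea, assuming the supervised filter is mutual information with the label and the unsupervised filter is a simple marginal-entropy score; both choices and the cut-offs are placeholders, and the information measures actually used in the chapter may differ.

```python
# Sketch: union of a supervised and an unsupervised information-based filter.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

X, y = make_classification(n_samples=300, n_features=100, n_informative=15,
                           random_state=0)

def marginal_entropy(col, bins=10):
    counts, _ = np.histogram(col, bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]
    return -(p * np.log(p)).sum()

supervised = mutual_info_classif(X, y, random_state=0)        # uses the labels
unsupervised = np.array([marginal_entropy(X[:, j]) for j in range(X.shape[1])])

k = 20                                    # per-filter cut-off (assumed)
combined = set(np.argsort(supervised)[::-1][:k]) | set(np.argsort(unsupervised)[::-1][:k])
print(sorted(combined))
```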
Part II - Feature Selection Challenge | Pp. 489-498
An Enhanced Selective Naïve Bayes Method with Optimal Discretization
Marc Boullé
In this chapter, we present an extension of the wrapper approach applied to the naïve Bayes predictor. The originality is to use the area under the training lift curve as a criterion of feature set optimality and to preprocess the numeric variables with a new optimal discretization method. The method is evaluated on the NIPS 2003 datasets, both as a wrapper and as a filter for a multi-layer perceptron.
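An illustrative sketch only: numeric inputs are discretized with a stock quantile discretizer (not the chapter's optimal discretization), and features are added greedily to a naive Bayes model as long as a training-set ranking criterion improves, with ROC AUC standing in for the area under the training lift curve.

```python
# Sketch: discretization + greedy wrapper around naive Bayes (criterion is a proxy).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.naive_bayes import CategoricalNB
from sklearn.preprocessing import KBinsDiscretizer

X, y = make_classification(n_samples=300, n_features=30, n_informative=8,
                           random_state=0)
Xd = KBinsDiscretizer(n_bins=5, encode="ordinal",
                      strategy="quantile").fit_transform(X).astype(int)

selected, best = [], 0.0
improved = True
while improved:
    improved = False
    for j in range(Xd.shape[1]):
        if j in selected:
            continue
        cols = selected + [j]
        proba = CategoricalNB().fit(Xd[:, cols], y).predict_proba(Xd[:, cols])[:, 1]
        score = roc_auc_score(y, proba)            # proxy for the lift-curve area
        if score > best:
            best, best_j, improved = score, j, True
    if improved:
        selected.append(best_j)                    # keep the best new feature

print(selected, round(best, 3))
```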
Part II - Feature Selection Challenge | Pp. 499-507
An Input Variable Importance Definition based on Empirical Data Probability Distribution
V. Lemaire; F. Clérot
We propose in this chapter a new method to score subsets of variables according to their usefulness for a given model. It can be qualified as a variable ranking method ‘in the context of other variables’. The method consists in replacing a variable’s value with another value chosen at random among the other values of that variable in the training set. The impact of this change on the output is measured and averaged over all training examples and over the changes made to that variable for a given training example. As a search strategy, backward elimination is used. The method is applicable to any kind of model and to both classification and regression tasks. We assess the efficiency of the method with our results on the NIPS 2003 feature selection challenge.
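A minimal sketch of the scoring idea described above, which is close in spirit to permutation importance: each variable is replaced by values drawn at random from its own empirical distribution in the training set, and the average absolute change of the model output is recorded. The model, the data, and the number of random draws are placeholders.

```python
# Sketch: variable importance by random value replacement, averaged over
# training examples and over several replacements per variable.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)
base = model.predict_proba(X)[:, 1]

n_draws = 10
importance = np.zeros(X.shape[1])
for j in range(X.shape[1]):
    deltas = []
    for _ in range(n_draws):
        X_rep = X.copy()
        # replace variable j by values picked at random from its own column
        X_rep[:, j] = rng.choice(X[:, j], size=X.shape[0], replace=True)
        deltas.append(np.abs(model.predict_proba(X_rep)[:, 1] - base).mean())
    importance[j] = np.mean(deltas)

print(np.argsort(importance)[::-1][:5])   # most useful variables first
```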
Part II - Feature Selection Challenge | Pp. 509-516
Spectral Dimensionality Reduction
Yoshua Bengio; Olivier Delalleau; Nicolas Le Roux; Jean-François Paiement; Pascal Vincent; Marie Ouimet
In this chapter, we study and put under a common framework a number of non-linear dimensionality reduction methods, such as Locally Linear Embedding, Isomap, Laplacian eigenmaps and kernel PCA, which are based on performing an eigen-decomposition (hence the name “spectral”). That framework also includes classical methods such as PCA and metric multidimensional scaling (MDS). It also includes the data transformation step used in spectral clustering. We show that in all of these cases the learning algorithm estimates the principal eigenfunctions of an operator that depends on the unknown data density and on a kernel that is not necessarily positive semi-definite. This helps generalize some of these algorithms so as to predict an embedding for out-of-sample examples without having to retrain the model. It also makes it more transparent what these algorithms are minimizing on the empirical data and gives a corresponding notion of generalization error.
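As a small illustration of the out-of-sample point, the sketch below fits kernel PCA (one of the spectral methods covered) on a training sample and embeds new points through the learned eigenvectors without refitting; the data set, kernel, and parameters are assumptions made for the example.

```python
# Sketch: a spectral embedding fitted once, then applied to unseen points.
from sklearn.datasets import make_swiss_roll
from sklearn.decomposition import KernelPCA

X_train, _ = make_swiss_roll(n_samples=500, random_state=0)
X_new, _ = make_swiss_roll(n_samples=10, random_state=1)

kpca = KernelPCA(n_components=2, kernel="rbf", gamma=0.05).fit(X_train)
emb_train = kpca.transform(X_train)   # embedding of the training points
emb_new = kpca.transform(X_new)       # out-of-sample embedding, no retraining
print(emb_train.shape, emb_new.shape)
```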
Part III - New Perspectives in Feature Extraction | Pp. 519-550
Constructing Orthogonal Latent Features for Arbitrary Loss
Michinari Momma; Kristin P. Bennett
A boosting framework for constructing orthogonal features targeted to a given loss function is developed. Combined with techniques from spectral methods such as PCA and PLS, an orthogonal boosting algorithm for linear hypotheses is used to efficiently construct orthogonal latent features selected to optimize the given loss function. The method is generalized to construct orthogonal nonlinear features using the kernel trick. The resulting method, Boosted Latent Features (BLF), is demonstrated both to construct valuable orthogonal features and to be a competitive inference method for a variety of loss functions. For the least squares loss, BLF reduces to the PLS algorithm and preserves all the attractive properties of that algorithm. As in PCA and PLS, the resulting nonlinear features are valuable for visualization, dimensionality reduction, improving generalization by regularization, and use in other learning algorithms, but now these features can be targeted to a specific inference task/loss function. The data matrix is factorized by the extracted features. The low-rank approximation of the data matrix provides efficiency and stability in computation, an attractive characteristic of PLS-type methods. Computational results demonstrate the effectiveness of the approach on a wide range of classification and regression problems.
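A sketch of the least-squares special case mentioned above: PLS extracts mutually orthogonal latent features (scores) whose loadings give a low-rank approximation of the centered data matrix. The data and the number of components are placeholders, and this does not implement BLF for general loss functions.

```python
# Sketch: orthogonal latent features and low-rank reconstruction with PLS.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=200, n_features=30, n_informative=10,
                       random_state=0)

pls = PLSRegression(n_components=5, scale=False).fit(X, y)
T = pls.transform(X)                     # latent features (scores) of the training data
P = pls.x_loadings_

G = T.T @ T
print(np.allclose(G, np.diag(np.diag(G)), atol=1e-6))   # the scores are orthogonal

X_lowrank = T @ P.T + X.mean(axis=0)     # rank-5 approximation of X
print(np.linalg.norm(X - X_lowrank) / np.linalg.norm(X - X.mean(axis=0)))
```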
Part III - New Perspectives in Feature Extraction | Pp. 551-583
Large Margin Principles for Feature Selection
Ran Gilad-Bachrach; Amir Navot; Naftali Tishby
In this paper we introduce a margin-based feature selection criterion and apply it to measure the quality of sets of features. Using margins we devise novel selection algorithms for multi-class categorization problems and provide a theoretical generalization bound. We also study a well-known algorithm and show that it resembles a gradient ascent over our margin criterion. We report promising results on various datasets.
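A hedged sketch of what a margin-based feature score can look like, using a simplified nearest hit/miss rule in the style of Relief-type methods; it is an assumption made for illustration, not one of the selection algorithms proposed in the chapter.

```python
# Sketch: per-feature margin score from nearest same-class (hit) and
# different-class (miss) neighbours, accumulated over all examples.
import numpy as np
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=200, n_features=20, n_informative=5,
                           random_state=0)
n, d = X.shape
score = np.zeros(d)

for i in range(n):
    dist = np.abs(X - X[i]).sum(axis=1)     # L1 distances to every point
    dist[i] = np.inf                        # exclude the point itself
    same, other = (y == y[i]), (y != y[i])
    hit = np.argmin(np.where(same, dist, np.inf))
    miss = np.argmin(np.where(other, dist, np.inf))
    # features that push the miss farther away than the hit get credit
    score += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])

print(np.argsort(score)[::-1][:5])          # highest-margin features
```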
Part III - New Perspectives in Feature Extraction | Pp. 585-606