Knowledge Discovery in Databases: PKDD 2005: 9th European Conference on Principles and Practice of Knowledge Discovery in Databases, Porto, Portugal, October 3-7, 2005, Proceedings
Alípio Mário Jorge; Luís Torgo; Pavel Brazdil; Rui Camacho; João Gama (eds.)
Conference: 9th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD), Porto, Portugal, October 3-7, 2005
Abstract/Description – provided by the publisher
Not available.
Keywords – provided by the publisher
Not available.
Availability
| Institution detected | Publication year | Source |
|---|---|---|
| Not detected | 2005 | SpringerLink |
Information
Resource type:
books
Print ISBN
978-3-540-29244-9
Electronic ISBN
978-3-540-31665-7
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2005
Copyright information
© Springer-Verlag Berlin Heidelberg 2005
Table of contents
doi: 10.1007/11564126_71
STochFS: A Framework for Combining Feature Selection Outcomes Through a Stochastic Process
Jerffeson Teixeira de Souza; Nathalie Japkowicz; Stan Matwin
The Feature Selection problem involves discovering a subset of features such that a classifier built only with this subset would have better predictive accuracy than a classifier built from the entire set of features. Ensemble methods, such as Bagging and Boosting, have been shown to increase the performance of classifiers to remarkable levels but surprisingly have not been tried in other parts of the classification process. In this paper, we apply the ensemble approach to feature selection by proposing a systematic way of combining various outcomes of a feature selection algorithm. The proposed framework, named STochFS, has been shown empirically to improve the performance of well-known feature selection algorithms.
Keywords: Feature Selection; Feature Subset; Ensemble Method; Deterministic Algorithm; Feature Selection Algorithm.
- Short Papers | Pp. 667-674
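The combination scheme the abstract describes can be sketched as a vote count over repeated stochastic runs. The snippet below is a minimal illustration only, not the authors' STochFS framework: a toy variance-ranking selector stands in for the underlying feature selection algorithm, and the bootstrap resampling and 50% vote threshold are assumptions of this sketch.

```python
import random

def select_top_features(sample, k):
    # Toy per-run selector: rank features by variance on the given
    # sample and keep the top-k (a stand-in for any stochastic
    # feature selection algorithm).
    n_features = len(sample[0])
    def variance(j):
        col = [row[j] for row in sample]
        mean = sum(col) / len(col)
        return sum((x - mean) ** 2 for x in col) / len(col)
    return set(sorted(range(n_features), key=variance, reverse=True)[:k])

def ensemble_feature_selection(data, k=2, runs=30, threshold=0.5, seed=0):
    # Combine the outcomes of many stochastic runs by vote counting:
    # a feature survives if it was selected in at least `threshold`
    # of the runs.
    rng = random.Random(seed)
    n_features = len(data[0])
    votes = [0] * n_features
    for _ in range(runs):
        sample = [rng.choice(data) for _ in data]  # bootstrap resample
        for j in select_top_features(sample, k):
            votes[j] += 1
    return [j for j in range(n_features) if votes[j] / runs >= threshold]

data = [[i % 10, i % 2, 0.0] for i in range(50)]
print(ensemble_feature_selection(data))  # → [0, 1]; the constant feature is dropped
```

Averaging over resampled runs makes the combined selection less sensitive to any single run's noise, which is the ensemble intuition the abstract carries over from Bagging and Boosting.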
doi: 10.1007/11564126_72
Speeding Up Logistic Model Tree Induction
Marc Sumner; Eibe Frank; Mark Hall
Logistic Model Trees have been shown to be very accurate and compact classifiers [8]. Their greatest disadvantage is the computational complexity of inducing the logistic regression models in the tree. We address this issue by using the AIC criterion [1] instead of cross-validation to prevent overfitting these models. In addition, a weight trimming heuristic is used which produces a significant speedup. We compare the training time and accuracy of the new induction process with the original one on various datasets and show that the training time often decreases while the classification accuracy diminishes only slightly.
Keywords: Training Time; Training Instance; Linear Logistic Regression; Simple Linear Regression Model; Model Selection Method.
- Short Papers | Pp. 675-683
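A minimal sketch of model selection by AIC, the criterion the abstract substitutes for cross-validation. The candidate names and log-likelihood values below are made up for illustration; the paper's actual use inside logistic model tree induction is not reproduced here.

```python
def aic(log_likelihood, n_params):
    # Akaike Information Criterion: AIC = 2k - 2 ln L (lower is better).
    # Unlike cross-validation, this needs only one fit per candidate.
    return 2 * n_params - 2 * log_likelihood

def select_by_aic(candidates):
    # candidates: iterable of (name, log_likelihood, n_params) tuples.
    # Returns the name of the candidate with the lowest AIC.
    return min(candidates, key=lambda c: aic(c[1], c[2]))[0]

models = [("5 iterations", -120.0, 5),
          ("20 iterations", -118.5, 20),
          ("50 iterations", -118.0, 50)]
print(select_by_aic(models))  # → 5 iterations
```

The penalty term 2k makes the marginal likelihood gains of the larger hypothetical models not worth their extra parameters, which is how AIC-style selection guards against overfitting without the repeated refits cross-validation requires.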
doi: 10.1007/11564126_73
A Random Method for Quantifying Changing Distributions in Data Streams
Haixun Wang; Jian Pei
In applications such as fraud and intrusion detection, it is of great interest to measure the evolving trends in the data. We consider the problem of quantifying changes between two datasets with class labels. Traditionally, changes are often measured by first estimating the probability distributions of the given data, and then computing the distance, for instance, the K-L divergence, between the estimated distributions. However, this approach is computationally infeasible for large, high dimensional datasets. The problem becomes more challenging in the streaming data environment, as the high speed makes it difficult for the learning process to keep up with the concept drifts in the data. To tackle this problem, we propose a method to quantify concept drifts using a universal model that incurs minimal learning cost. In addition, our model also provides the ability of performing classification.
Keywords: Decision Tree; Random Forest; Data Stream; Leaf Node; Training Dataset.
- Short Papers | Pp. 684-691
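One way to quantify change without density estimation, in the spirit of the random model the abstract describes (though not the authors' method), is to compare the histograms two datasets induce over a randomly generated partition of the input space. The stump-based partition and total-variation distance below are assumptions of this sketch.

```python
import random
from collections import Counter

def random_stumps(n_features, n_stumps, lo=0.0, hi=1.5, seed=0):
    # Each stump is a (feature, threshold) pair chosen at random;
    # together they induce a random partition of the input space
    # that is built once, with no learning cost per dataset.
    rng = random.Random(seed)
    return [(rng.randrange(n_features), rng.uniform(lo, hi))
            for _ in range(n_stumps)]

def signature(point, stumps):
    # Which side of each random threshold the point falls on.
    return tuple(point[f] > t for f, t in stumps)

def distribution_distance(data_a, data_b, stumps):
    # Total-variation distance between the histograms the two datasets
    # induce over the random partition -- no density estimation needed.
    ha = Counter(signature(p, stumps) for p in data_a)
    hb = Counter(signature(p, stumps) for p in data_b)
    return 0.5 * sum(abs(ha[k] / len(data_a) - hb[k] / len(data_b))
                     for k in set(ha) | set(hb))
```

Because the partition is fixed up front, new windows of a stream can be histogrammed in a single pass, which is the kind of low-cost universal model the abstract argues for.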
doi: 10.1007/11564126_74
Deriving Class Association Rules Based on Levelwise Subspace Clustering
Takashi Washio; Koutarou Nakanishi; Hiroshi Motoda
Most approaches to Class Association Rule (CAR) based classification have not thoroughly addressed the classification of instances containing numeric attributes. In this paper, a levelwise subspace clustering method that derives hyper-rectangular clusters is proposed to efficiently provide quantitative, interpretable and accurate CARs.
Keywords: Association Rule; Numeric Attribute; Dense Cluster; Subspace Cluster; Categorical Item.
- Short Papers | Pp. 692-700
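A levelwise search for dense axis-parallel (hyper-rectangular) cells can be sketched in the Apriori style. The grid binning, density threshold, and join rule below are assumptions of this illustration, not the paper's algorithm.

```python
from collections import Counter
from itertools import combinations

def dense_units(data, n_bins=4, min_support=0.2):
    # Levelwise (Apriori-style) search for dense axis-parallel cells:
    # a level-k unit fixes one bin on each of k distinct attributes.
    # Candidates at level k+1 are joins of two dense level-k units;
    # non-dense candidates are pruned, as in frequent-itemset mining.
    n = len(data)
    def bin_of(x):  # assumes attribute values lie in [0, 1]
        return min(int(x * n_bins), n_bins - 1)
    counts = Counter()
    for row in data:
        for j, x in enumerate(row):
            counts[((j, bin_of(x)),)] += 1
    level = {u for u, c in counts.items() if c / n >= min_support}
    result = list(level)
    while level:
        candidates = set()
        for a, b in combinations(sorted(level), 2):
            merged = tuple(sorted(set(a) | set(b)))
            attrs = [j for j, _ in merged]
            if len(merged) == len(a) + 1 and len(set(attrs)) == len(merged):
                candidates.add(merged)
        counts = Counter()
        for row in data:
            cell = {(j, bin_of(x)) for j, x in enumerate(row)}
            for cand in candidates:
                if set(cand) <= cell:
                    counts[cand] += 1
        level = {u for u, c in counts.items() if c / n >= min_support}
        result.extend(level)
    return result
```

Each dense multi-attribute unit is a hyper-rectangle whose bin ranges could serve as the quantitative antecedent of a class association rule, which is the connection the abstract draws between subspace clustering and CARs.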
doi: 10.1007/11564126_75
An Incremental Algorithm for Mining Generators Representation
Lijun Xu; Kanglin Xie
This paper presents an efficient algorithm for maintaining the generators representation in dynamic datasets. The generators representation is a kind of lossless, concise representation of the set of frequent itemsets. Furthermore, the algorithm utilizes, for the first time in the literature, a novel optimization based on generator borders: the borderline between frequent generators and other itemsets. New frequent generators can be found by monitoring these borders. Experiments show that our algorithm is more efficient than previous solutions.
Keywords: Association Rule; Frequent Generator; Frequent Itemsets; Concise Representation; Support Threshold.
- Short Papers | Pp. 701-708
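For orientation, the snippet below computes frequent generators with the standard batch definition (an itemset whose support is strictly below that of every proper subset); the paper's incremental maintenance and generator-borders optimization are not reproduced here.

```python
from itertools import combinations

def frequent_generators(transactions, min_support):
    # An itemset is a generator (key pattern) if its support is strictly
    # lower than the support of each of its immediate proper subsets;
    # by anti-monotonicity of support this extends to all proper subsets.
    items = sorted(set().union(*transactions))
    cache = {}
    def support(s):
        if s not in cache:
            cache[s] = sum(1 for t in transactions if s <= t)
        return cache[s]
    generators = []
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            s = frozenset(combo)
            if support(s) < min_support:
                continue  # infrequent, so not a frequent generator
            if all(support(s - {i}) > support(s) for i in s):
                generators.append(s)
    return generators
```

Brute-force enumeration like this is what an incremental algorithm avoids: rather than rescanning all itemsets when transactions arrive, it watches the border between generators and non-generators, as the abstract describes.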
doi: 10.1007/11564126_76
Hybrid Technique for Artificial Neural Network Architecture and Weight Optimization
Cleber Zanchettin; Teresa Bernarda Ludermir
This work presents a technique that integrates the tabu search, simulated annealing, genetic algorithm and backpropagation heuristics. This approach obtained promising results in the simultaneous optimization of artificial neural network architectures and weights.
Keywords: Genetic Algorithm; Cost Function; Simulated Annealing; Tabu Search; Current Solution.
- Short Papers | Pp. 709-716
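As an illustration of one ingredient of such hybrids, here is a bare simulated-annealing loop over a one-dimensional cost function standing in for the network's architecture-plus-weights search space; the geometric cooling schedule and the neighbour move are assumptions of this sketch, not the authors' design.

```python
import math
import random

def simulated_annealing(cost, initial, neighbor, t0=1.0, cooling=0.95,
                        steps=500, seed=0):
    # Bare simulated-annealing loop: always accept improvements,
    # accept worse neighbours with probability exp(-delta / T),
    # and cool the temperature T geometrically each step.
    rng = random.Random(seed)
    current = best = initial
    t = t0
    for _ in range(steps):
        cand = neighbor(current, rng)
        delta = cost(cand) - cost(current)
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current = cand
        if cost(current) < cost(best):
            best = current
        t *= cooling
    return best

# Toy usage: minimize (w - 3)^2 with small random perturbations.
best = simulated_annealing(lambda w: (w - 3.0) ** 2, 0.0,
                           lambda w, rng: w + rng.uniform(-0.5, 0.5))
```

In a hybrid of the kind the abstract describes, a loop like this would explore architectures and weight vectors, with tabu lists, genetic recombination, and backpropagation fine-tuning layered on top.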