Publications catalogue - books
Knowledge Discovery in Databases: PKDD 2007: 11th European Conference on Principles and Practice of Knowledge Discovery in Databases, Warsaw, Poland, September 17-21, 2007. Proceedings
Joost N. Kok ; Jacek Koronacki ; Ramon Lopez de Mantaras ; Stan Matwin ; Dunja Mladenić ; Andrzej Skowron (eds.)
Conference: 11th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD). Warsaw, Poland. September 17-21, 2007
Abstract/Description - provided by the publisher
Not available.
Keywords - provided by the publisher
Not available.
Availability
Detected institution | Publication year | Browse | Download | Request |
---|---|---|---|---|
Not detected | 2007 | SpringerLink | | |
Information
Resource type:
books
Print ISBN
978-3-540-74975-2
Electronic ISBN
978-3-540-74976-9
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2007
Publication rights information
© Springer-Verlag Berlin Heidelberg 2007
Table of contents
Efficient Weight Learning for Markov Logic Networks
Daniel Lowd; Pedro Domingos
Markov logic networks (MLNs) combine Markov networks and first-order logic, and are a powerful and increasingly popular representation for statistical relational learning. The state-of-the-art method for discriminative learning of MLN weights is the voted perceptron algorithm, which is essentially gradient descent with an MPE approximation to the expected sufficient statistics (true clause counts). Unfortunately, these can vary widely between clauses, causing the learning problem to be highly ill-conditioned, and making gradient descent very slow. In this paper, we explore several alternatives, from per-weight learning rates to second-order methods. In particular, we focus on two approaches that avoid computing the partition function: diagonal Newton and scaled conjugate gradient. In experiments on standard SRL datasets, we obtain order-of-magnitude speedups, or more accurate models given comparable learning times.
- Long Papers | Pp. 200-211
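The conditioning problem the abstract describes can be illustrated with a toy quadratic: when per-coordinate curvatures differ by orders of magnitude, plain gradient descent crawls along the flat direction, while a diagonal-Newton step (each gradient component divided by its own curvature) converges immediately. This is a hedged sketch of the optimization idea only, not an actual MLN weight learner; the function, step sizes, and curvatures are made up for illustration.

```python
# Toy illustration: minimizing f(w) = 0.5 * sum(c_i * w_i^2), whose
# Hessian is diagonal with entries c_i. A large spread in c_i makes the
# problem ill-conditioned for plain gradient descent.

def gradient(w, curvatures):
    """Gradient of f(w) = 0.5 * sum(c_i * w_i^2)."""
    return [c * x for c, x in zip(curvatures, w)]

def plain_gd(w, curvatures, lr, steps):
    """Fixed-rate gradient descent."""
    for _ in range(steps):
        g = gradient(w, curvatures)
        w = [x - lr * gi for x, gi in zip(w, g)]
    return w

def diagonal_newton(w, curvatures, steps):
    """Scale each gradient component by its inverse diagonal Hessian entry."""
    for _ in range(steps):
        g = gradient(w, curvatures)
        w = [x - gi / c for x, gi, c in zip(w, g, curvatures)]
    return w

curv = [1.0, 1000.0]   # condition number 1000
w0 = [1.0, 1.0]
# The learning rate must stay below 2/max(curv) for stability, so plain
# gradient descent makes almost no progress on the flat direction:
w_gd = plain_gd(w0, curv, lr=0.001, steps=100)
w_dn = diagonal_newton(w0, curv, steps=1)
```

On this quadratic the diagonal-Newton update reaches the optimum in a single step, while gradient descent has barely moved the well-conditioned coordinate after 100 steps.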
Classification in Very High Dimensional Problems with Handfuls of Examples
Mark Palatucci; Tom M. Mitchell
Modern classification techniques perform well when the number of training examples exceeds the number of features. If, however, the number of features greatly exceeds the number of training examples, then these same techniques can fail. To address this problem, we present a hierarchical Bayesian framework that shares information between features by modeling similarities between their parameters. We believe this approach is applicable to many sparse, high-dimensional problems and especially relevant to those with both spatial and temporal components. One such problem is fMRI time series, and we present a case study that shows how we can successfully classify in this domain with 80,000 original features and only 2 training examples per class.
- Long Papers | Pp. 212-223
Domain Adaptation of Conditional Probability Models Via Feature Subsetting
Sandeepkumar Satpal; Sunita Sarawagi
The goal in domain adaptation is to train a model using labeled data sampled from a domain different from the target domain on which the model will be deployed. We exploit unlabeled data from the target domain to train a model that maximizes likelihood over the training sample while minimizing the distance between the training and target distribution. Our focus is conditional probability models used for predicting a label structure y given input x based on features defined jointly over x and y. We propose practical measures of divergence between the two domains based on which we penalize features with large divergence, while improving the effectiveness of other less deviant correlated features. Empirical evaluation on several real-life information extraction tasks using Conditional Random Fields (CRFs) shows that our method of domain adaptation leads to a significant reduction in error.
- Long Papers | Pp. 224-235
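The core idea above can be sketched in a few lines: score each feature by how much its empirical distribution shifts between the labeled source sample and the unlabeled target sample, then penalize high-divergence features more heavily during training. The divergence used here, an absolute difference of per-feature empirical means over binary features, is a simple stand-in for the paper's measures, and the penalty scheme is illustrative only.

```python
# Sketch: per-feature divergence between source and target samples,
# turned into per-weight regularization penalties.

def feature_divergence(source, target, n_features):
    """Per-feature |mean_source - mean_target| over 0/1 feature vectors."""
    def means(rows):
        return [sum(r[j] for r in rows) / len(rows) for j in range(n_features)]
    ms, mt = means(source), means(target)
    return [abs(a - b) for a, b in zip(ms, mt)]

def penalties(divergences, scale=1.0):
    """Larger divergence -> larger regularization penalty on that weight."""
    return [1.0 + scale * d for d in divergences]

src = [[1, 0, 1], [1, 0, 0], [1, 1, 1], [1, 0, 1]]   # feature 0 always on
tgt = [[0, 0, 1], [0, 1, 1], [0, 0, 0], [0, 1, 1]]   # feature 0 always off
div = feature_divergence(src, tgt, 3)
pen = penalties(div)
```

Feature 0 shifts completely between the two samples, so it receives the largest penalty; feature 2 is stable and keeps the baseline penalty.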
Learning to Detect Adverse Traffic Events from Noisily Labeled Data
Tomáš Šingliar; Miloš Hauskrecht
Many deployed traffic incident detection systems use algorithms that require significant manual tuning. We seek machine learning incident detection solutions that reduce the need for manual adjustments by taking advantage of massive databases of traffic sensor network measurements. First, we show that a rather straightforward supervised learner based on the SVM model outperforms a fixed detection model used by state-of-the-art traffic incident detectors. Second, we seek further improvements of learning performance by correcting misaligned incident times in the training data. The misalignment is due to an imperfect incident logging procedure. We propose a label realignment model based on a dynamic Bayesian network to re-estimate the correct position (time) of the incident in the data. Training on the automatically realigned data consistently leads to improved detection performance in the low false positive region.
- Long Papers | Pp. 236-247
IKNN: Informative K-Nearest Neighbor Pattern Classification
Yang Song; Jian Huang; Ding Zhou; Hongyuan Zha; C. Lee Giles
The k-nearest neighbor (KNN) decision rule has been a ubiquitous classification tool with good scalability. Past experience has shown that the optimal choice of k depends upon the data, making it laborious to tune the parameter for different applications. We introduce a new metric that measures the informativeness of objects to be classified. When applied as a query-based distance metric to measure the closeness between objects, two novel KNN procedures, Locally Informative-KNN (LI-KNN) and Globally Informative-KNN (GI-KNN), are proposed. By selecting a subset of most informative objects from neighborhoods, our methods exhibit stability to the change of input parameters, number of neighbors (K) and informative points (I). Experiments on UCI benchmark data and diverse real-world data sets indicate that our approaches are application-independent and can generally outperform several popular KNN extensions, as well as SVM and Boosting methods.
- Long Papers | Pp. 248-264
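A minimal sketch of the two-stage idea above: run plain KNN, then re-rank the K nearest neighbors by an "informativeness" score and vote over only the I most informative of them. The informativeness score used here (close to the query, far from training points of other classes) is a simplified proxy for illustration, not the paper's definition.

```python
import math

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train, query, k):
    """Plain majority-vote KNN over (point, label) pairs."""
    nearest = sorted(train, key=lambda pl: dist(pl[0], query))[:k]
    labels = [lbl for _, lbl in nearest]
    return max(set(labels), key=labels.count)

def informative_predict(train, query, k, i):
    """Keep the i most 'informative' of the k nearest, then vote."""
    nearest = sorted(train, key=lambda pl: dist(pl[0], query))[:k]
    def informativeness(pl):
        p, lbl = pl
        others = [dist(p, q) for q, l2 in train if l2 != lbl]
        # prefer neighbors close to the query and far from the other class
        return min(others) / (dist(p, query) + 1e-9)
    informative = sorted(nearest, key=informativeness, reverse=True)[:i]
    labels = [lbl for _, lbl in informative]
    return max(set(labels), key=labels.count)

train = [((0.0, 0.0), "a"), ((0.1, 0.0), "a"),
         ((1.0, 1.0), "b"), ((0.9, 1.0), "b")]
pred = knn_predict(train, (0.05, 0.0), k=3)
pred_i = informative_predict(train, (0.05, 0.0), k=3, i=2)
```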
Finding Outlying Items in Sets of Partial Rankings
Antti Ukkonen; Heikki Mannila
Partial rankings are totally ordered subsets of a set of items. For example, the sequence in which a user browses through different parts of a website is a partial ranking. We consider the following problem. Given a set D of partial rankings, find items that have strongly different status in different parts of D. To do this, we first compute a clustering of D and then look at items whose average rank in the cluster substantially deviates from its average rank in D. Such items can be seen as those that contribute the most to the differences between the clusters. To test the statistical significance of the found items, we propose a method that is based on an MCMC algorithm for sampling random sets of partial rankings with exactly the same statistics as D. We also demonstrate the method on movie rankings and gene expression data.
- Long Papers | Pp. 265-276
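The deviation test described above can be sketched directly: given partial rankings already grouped into clusters, compare each item's average rank inside a cluster to its average rank over the whole collection, and flag items whose deviation exceeds a threshold. The MCMC significance test from the paper is not reproduced here; the rankings and threshold below are made up for illustration.

```python
def average_ranks(rankings):
    """Mean position of each item over the partial rankings it appears in."""
    totals, counts = {}, {}
    for r in rankings:
        for pos, item in enumerate(r):
            totals[item] = totals.get(item, 0) + pos
            counts[item] = counts.get(item, 0) + 1
    return {item: totals[item] / counts[item] for item in totals}

def outlying_items(all_rankings, cluster, threshold):
    """Items whose in-cluster average rank deviates from the global one."""
    global_avg = average_ranks(all_rankings)
    cluster_avg = average_ranks(cluster)
    return {item for item, avg in cluster_avg.items()
            if abs(avg - global_avg[item]) >= threshold}

cluster1 = [["x", "y", "z"], ["x", "z", "y"]]   # x always ranked first
cluster2 = [["z", "y", "x"], ["y", "z", "x"]]   # x always ranked last
outliers = outlying_items(cluster1 + cluster2, cluster1, threshold=0.8)
```

Item "x" averages rank 0 inside cluster1 but rank 1 globally, so it is the one flagged; "y" and "z" deviate too little.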
Speeding Up Feature Subset Selection Through Mutual Information Relevance Filtering
Gert Van Dijck; Marc M. Van Hulle
A relevance filter is proposed which removes features based on the mutual information between class labels and features. It is proven that both feature independence and class conditional feature independence are required for the filter to be statistically optimal. This could be shown by establishing a relationship with the conditional relative entropy framework for feature selection. Removing features at various significance levels as a preprocessing step to sequential forward search leads to a huge increase in speed, without a decrease in classification accuracy. These results are shown based on experiments with 5 high-dimensional publicly available gene expression data sets.
- Long Papers | Pp. 277-287
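A hedged sketch of the relevance filter described above: estimate the mutual information I(feature; class) for each discrete feature from empirical counts and drop features below a threshold. Thresholding MI directly stands in for the paper's significance-level test, and the toy data is made up for illustration.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    return sum((c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def relevance_filter(features, labels, threshold):
    """Indices of feature columns whose MI with the labels passes the bar."""
    return [j for j, col in enumerate(features)
            if mutual_information(col, labels) >= threshold]

labels = [0, 0, 1, 1]
features = [
    [0, 0, 1, 1],   # perfectly informative column: MI = ln 2
    [0, 1, 0, 1],   # independent of the labels:    MI = 0
]
kept = relevance_filter(features, labels, threshold=0.1)
```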
A Comparison of Two Approaches to Classify with Guaranteed Performance
Stijn Vanderlooy; Ida G. Sprinkhuizen-Kuyper
The recently introduced transductive confidence machine approach and the ROC isometrics approach provide a framework to extend classifiers such that their performance can be set by the user prior to classification. In this paper we use the k-nearest neighbour classifier in order to provide an extensive empirical evaluation and comparison of the approaches. From our results we may conclude that the approaches are competing and promising generally applicable machine learning tools.
- Long Papers | Pp. 288-299
Towards Data Mining Without Information on Knowledge Structure
Alexandre Vautier; Marie-Odile Cordier; René Quiniou
Most knowledge discovery processes are biased since some part of the knowledge structure must be given before extraction. We propose a framework that avoids this bias by supporting all major model structures, e.g. clustering, sequences, etc., as well as specifications of data and DM (Data Mining) algorithms, in the same language. A unification operation is provided to match automatically the data to the relevant DM algorithms in order to extract models and their related structure. The MDL principle is used to evaluate and rank models. This evaluation is based on the covering relation that links the data to the models. The notion of schema, related to category theory, is the key concept of our approach. Intuitively, a schema is an algebraic specification enhanced by the union of types, and the concepts of list and relation. An example based on network alarm mining illustrates the process.
- Long Papers | Pp. 300-311
Relaxation Labeling for Selecting and Exploiting Efficiently Non-local Dependencies in Sequence Labeling
Guillaume Wisniewski; Patrick Gallinari
We consider the problem of sequence labeling and propose a two-step method which combines the scores of local classifiers with a relaxation labeling technique. This framework can account for sparse dynamically changing dependencies, which allows us to efficiently discover relevant non-local dependencies and exploit them. This is in contrast to existing models which incorporate only local relationships between neighboring nodes. Experimental results show that the proposed method gives promising results.
- Long Papers | Pp. 312-323
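The two-step scheme above can be sketched with a toy relaxation-labeling loop: start from per-position label distributions produced by a "local classifier", then iteratively re-weight each position's label probabilities by their compatibility with neighboring labels. The compatibility matrix and the scores below are invented for illustration; the paper's method for selecting non-local dependencies is not reproduced.

```python
def normalize(p):
    """Rescale a non-negative vector to sum to 1."""
    s = sum(p)
    return [x / s for x in p]

def relaxation_labeling(local_scores, compat, iterations=10):
    """local_scores: per-position label distributions.
    compat[a][b]: how compatible label a is with a neighbor labeled b."""
    probs = [normalize(p) for p in local_scores]
    for _ in range(iterations):
        new = []
        for t, p in enumerate(probs):
            support = []
            for a in range(len(p)):
                s = 0.0
                # accumulate support from the two adjacent positions
                for u in (t - 1, t + 1):
                    if 0 <= u < len(probs):
                        s += sum(compat[a][b] * probs[u][b]
                                 for b in range(len(probs[u])))
                support.append(p[a] * (1.0 + s))
            new.append(normalize(support))
        probs = new
    return probs

# two labels; each label is compatible only with itself in neighbors
compat = [[1.0, 0.0], [0.0, 1.0]]
scores = [[0.9, 0.1], [0.55, 0.45], [0.6, 0.4]]
result = relaxation_labeling(scores, compat)
labels = [max(range(2), key=p.__getitem__) for p in result]
```

Because every position weakly prefers label 0 and the compatibilities reinforce agreement, the iteration sharpens all three positions toward label 0.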