Publications catalog - books

Knowledge Discovery in Inductive Databases: 4th International Workshop, KDID 2005, Porto, Portugal, October 3, 2005, Revised Selected and Invited Papers

Francesco Bonchi ; Jean-François Boulicaut (eds.)

Conference: 4th International Workshop on Knowledge Discovery in Inductive Databases (KDID). Porto, Portugal. October 3, 2005

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Not available.

Availability

Detected institution	Publication year	Browse	Download	Request
Not detected	2006	SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-33292-3

Electronic ISBN

978-3-540-33293-0

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

2006

Publication rights information

Subject coverage

Computer and information sciences

Electrical, electronic, and information engineering

Table of contents

Check that your institution gives you access to download or request the full book or any of its chapters.

doi: 10.1007/11733492_11

Shaping SQL-Based Frequent Pattern Mining Algorithms

Csaba István Sidló; András Lukács

Integration of data mining and database management systems could significantly ease the process of knowledge discovery in large databases. We consider implementations of frequent itemset mining algorithms, in particular pattern-growth algorithms similar to the top-down FP-growth variations, tightly coupled to relational database management systems. Our implementations remain within the confines of the conventional relational database facilities like tables, indices, and SQL operations. We compare our algorithm to the most promising previously proposed SQL-based FIM algorithm. Experiments show that our method performs better in many cases, but still has severe limitations compared to the traditional stand-alone pattern-growth method implementations. We identify the bottlenecks of our SQL-based pattern-growth methods and investigate the applicability of tightly coupled algorithms in practice.

- Contributed Papers | Pp. 188-201
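
As a rough illustration of what "tightly coupled" mining can look like when it stays within tables, indices, and SQL operations, the sketch below counts frequent single items and item pairs with plain SQL in an in-memory SQLite database. It is not the pattern-growth algorithm of the chapter; it is a naive self-join support count, and the table name `trans`, the function `frequent_pairs`, and the choice of SQLite are assumptions made only for this example.

```python
import sqlite3


def frequent_pairs(transactions, min_support):
    """Return frequent 1-itemsets and 2-itemsets using only tables and SQL.

    Hypothetical helper: a naive support count, not a pattern-growth method.
    """
    conn = sqlite3.connect(":memory:")
    # One row per (transaction id, item); duplicates within a transaction are dropped.
    conn.execute("CREATE TABLE trans (tid INTEGER, item TEXT)")
    conn.executemany(
        "INSERT INTO trans VALUES (?, ?)",
        [(tid, item) for tid, items in enumerate(transactions) for item in set(items)],
    )
    conn.execute("CREATE INDEX idx_trans_item ON trans (item)")

    # Support of single items: number of transactions containing the item.
    singles = conn.execute(
        "SELECT item, COUNT(*) FROM trans "
        "GROUP BY item HAVING COUNT(*) >= ? ORDER BY item",
        (min_support,),
    ).fetchall()

    # Support of pairs: self-join on the transaction id, keeping a.item < b.item
    # so that each unordered pair is counted once.
    pairs = conn.execute(
        """
        SELECT a.item, b.item, COUNT(*)
        FROM trans AS a JOIN trans AS b
          ON a.tid = b.tid AND a.item < b.item
        GROUP BY a.item, b.item
        HAVING COUNT(*) >= ?
        ORDER BY a.item, b.item
        """,
        (min_support,),
    ).fetchall()
    conn.close()
    return singles, pairs
```

The self-join grows quickly with transaction length, which hints at why SQL-only approaches can lag behind stand-alone implementations, as the abstract reports.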

doi: 10.1007/11733492_12

Exploiting Virtual Patterns for Automatically Pruning the Search Space

Arnaud Soulet; Bruno Crémilleux

Many works address the mining of patterns under constraints. The search space is reduced by taking advantage of pruning conditions on patterns, typically by using anti-monotone and monotone properties. In this paper, we introduce two virtual patterns in order to automatically deduce pruning conditions from constraints coming from the primitive-based framework, which gathers a large set of varied constraints. These virtual patterns enable us to provide negative and positive pruning conditions according to the generalization and the specialization of patterns. We show that these pruning conditions are monotone or anti-monotone and can be pushed into usual constraint mining algorithms. Experiments carried out on several contexts show that our proposals improve the mining.

- Contributed Papers | Pp. 202-221
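
The anti-monotone pruning mentioned in the abstract can be illustrated with the classic minimum-support constraint: if an itemset is infrequent, every superset is infrequent too, so it never needs to be counted. The sketch below is a plain level-wise search under that assumption; it does not implement the virtual patterns of the chapter, and the function names `support` and `mine_frequent` are hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def mine_frequent(transactions, min_support):
    """Level-wise mining with anti-monotone pruning on minimum support."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({i for t in transactions for i in t})
    level = [
        frozenset([i]) for i in items
        if support(frozenset([i]), transactions) >= min_support
    ]
    result = {}
    while level:
        for itemset in level:
            result[itemset] = support(itemset, transactions)
        size = len(level[0]) + 1
        current = set(level)
        # Join step: unions of two frequent itemsets that are one item larger.
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        # Pruning step: drop a candidate if any of its (size - 1)-subsets is
        # infrequent, without ever scanning the transactions for it.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, size - 1))
        }
        level = [c for c in candidates if support(c, transactions) >= min_support]
    return result
```

A monotone constraint (for example, "the pattern contains at least one of a given set of items") prunes in the opposite direction: once a pattern satisfies it, all of its specializations do as well.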

doi: 10.1007/11733492_13

Constraint Based Induction of Multi-objective Regression Trees

Jan Struyf; Sašo Džeroski

Constraint-based inductive systems are a key component of inductive databases and are responsible for building the models that satisfy the constraints in the inductive queries. In this paper, we propose a constraint-based system for building multi-objective regression trees. A multi-objective regression tree is a decision tree capable of predicting several numeric variables at once. We focus on size and accuracy constraints. By specifying either a maximum size or a minimum accuracy, the user can trade off size (and thus interpretability) for accuracy. Our approach is to first build a large tree based on the training data and to prune it in a second step to satisfy the user constraints. This has the advantage that the tree can be stored in the inductive database and used for answering inductive queries with different constraints. Besides size and accuracy constraints, we also briefly discuss syntactic constraints. We evaluate our system on a number of real-world data sets and measure the size versus accuracy trade-off.

- Contributed Papers | Pp. 222-233
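
To make the idea of "predicting several numeric variables at once" concrete, the sketch below fits the smallest possible multi-objective regression tree: a single split on one numeric feature, with each leaf predicting a vector of target means. The split criterion (squared error summed over all targets) and the names `fit_stump` and `predict` are assumptions for illustration; the abstract does not specify the heuristic, and the pruning step described there is not shown.

```python
def mean_vector(rows):
    """Per-target mean of a non-empty list of target vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sse(rows):
    """Squared error around the mean vector, summed over all targets."""
    if not rows:
        return 0.0
    mean = mean_vector(rows)
    return sum((v - m) ** 2 for row in rows for v, m in zip(row, mean))


def fit_stump(xs, ys):
    """Fit a one-split tree; returns (threshold, left_mean, right_mean).

    xs: one numeric feature value per example.
    ys: one vector of numeric targets per example.
    """
    best = None
    values = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        err = sse(left) + sse(right)
        if best is None or err < best[0]:
            best = (err, thr, mean_vector(left), mean_vector(right))
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]


def predict(stump, x):
    """Return the mean target vector of the leaf that x falls into."""
    thr, left_mean, right_mean = stump
    return left_mean if x <= thr else right_mean
```

A full tree would apply the same split search recursively; a size constraint would then be met by collapsing subtrees back into leaves until the node count fits.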

doi: 10.1007/11733492_14

Learning Predictive Clustering Rules

Bernard Ženko; Sašo Džeroski; Jan Struyf

The two most commonly addressed data mining tasks are predictive modelling and clustering. Here we address the task of predictive clustering, which contains elements of both and generalizes them to some extent. Predictive clustering has been mainly evaluated in the context of trees. In this paper, we extend predictive clustering toward rules. Each cluster is described by a rule, and different clusters are allowed to overlap, since the sets of examples covered by different rules do not need to be disjoint. We propose a system for learning these predictive clustering rules, which is based on a heuristic sequential covering algorithm. The heuristic takes into account both the precision of the rules (compactness w.r.t. the target space) and the compactness w.r.t. the input space, and the two can be traded off by means of a parameter. We evaluate our system in the context of several multi-objective classification problems.

- Contributed Papers | Pp. 234-250
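
The trade-off described in the abstract, between compactness in the target space and compactness in the input space, can be pictured as a weighted sum of two dispersion terms. The sketch below is one possible reading under stated assumptions: dispersion is taken as the average population variance of numeric columns, the weight is called `tau`, and both function names are hypothetical. The chapter's actual heuristic may differ.

```python
from statistics import pvariance


def dispersion(rows, columns):
    """Average population variance of the given columns over the covered rows."""
    return sum(pvariance([row[c] for row in rows]) for c in columns) / len(columns)


def combined_dispersion(rows, target_cols, input_cols, tau):
    """Score the examples covered by a rule; lower means a more compact cluster.

    tau = 1.0 looks only at the target attributes (a purely predictive view);
    tau = 0.0 looks only at the input attributes (a purely clustering view).
    """
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [0, 1]")
    return (
        tau * dispersion(rows, target_cols)
        + (1.0 - tau) * dispersion(rows, input_cols)
    )
```

In a sequential covering loop, a score of this kind would be evaluated on the examples each candidate rule covers, and the best-scoring rule would be added before the search continues on the remaining examples.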