Publications catalog - books
Selected Contributions in Data Analysis and Classification
Paula Brito ; Guy Cucumel ; Patrice Bertrand ; Francisco de Carvalho (eds.)
Abstract/Description – provided by the publisher
Not available.
Keywords – provided by the publisher
Statistical Theory and Methods; Data Mining and Knowledge Discovery; Pattern Recognition
Availability
Detected institution | Publication year | Browse | Download | Request |
---|---|---|---|---|
Not detected | 2007 | SpringerLink |
Information
Resource type:
books
Print ISBN
978-3-540-73558-8
Electronic ISBN
978-3-540-73560-1
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2007
Copyright information
© Springer-Verlag Berlin Heidelberg 2007
Subject coverage
Table of contents
Quality Issues in Symbolic Data Analysis
Haralambos Papageorgiou; Maria Vardaki
Symbolic Data Analysis is an extension of Classical Data Analysis to more complex data types and tables through the application of certain conditions, where underlying concepts are vital for their further processing. Therefore, the assessment of the quality of Symbolic Data depends extensively on the quality of the collected classical data. However, even though various criteria and indicators have been established to assess quality in classical statistics, the specificities of Symbolic Data construction challenge the efficacy of the classical quality assessment components. In this paper we first review the quality dimensions that can be considered for classical data and then examine the extent to which these can be applied to symbolic data, taking into account the peculiarities of the symbolic approach.
Part I - Analysis of Symbolic Data | Pp. 113-122
Dynamic Clustering of Histogram Data: Using the Right Metric
Rosanna Verde; Antonio Irpino
In this paper we present a review of some metrics that can be proposed as allocation functions in the Dynamic Clustering Algorithm (DCA) when data are distributions or histograms of values. The choice of the most suitable distance plays a central role in the DCA because it is related to the criterion function that is optimized. Moreover, it has to be consistent with the prototype which represents the cluster. Accordingly, for each proposed metric, we identify the corresponding prototype according to the minimization of the criterion function and hence to the best fit between the partition and the best representation of the clusters. Finally, we focus our attention on a Wasserstein-based distance, showing its optimality in partitioning a set of histogram data with respect to a representation of the clusters by means of their barycenter expressed in terms of distributions.
Part I - Analysis of Symbolic Data | Pp. 123-134
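As a rough illustration of the kind of distance the abstract above refers to, the following sketch approximates the squared L2 Wasserstein distance between two histograms by comparing their quantile functions. It is not taken from the paper: the `(edges, weights)` histogram encoding, the assumption that values are uniformly spread inside each bin, the midpoint-rule integration, and the function names are all illustrative choices.

```python
def quantile(edges, weights, t):
    """Inverse CDF of a histogram at level t in (0, 1).

    Assumption: values are uniformly distributed inside each bin.
    edges has one more element than weights.
    """
    total = sum(weights)
    cum = 0.0
    for lo, hi, w in zip(edges[:-1], edges[1:], weights):
        p = w / total
        if p > 0 and t <= cum + p:
            # Linear interpolation inside the bin that contains level t.
            return lo + (hi - lo) * (t - cum) / p
        cum += p
    return edges[-1]


def wasserstein2_squared(h1, h2, steps=1000):
    """Midpoint-rule approximation of the integral over [0, 1] of
    (Q1(t) - Q2(t))**2, where Q1 and Q2 are the quantile functions."""
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) / steps
        d = quantile(h1[0], h1[1], t) - quantile(h2[0], h2[1], t)
        total += d * d
    return total / steps


# Two made-up histograms: the second is the first shifted by one unit.
h_a = ([0.0, 1.0, 2.0], [1.0, 1.0])
h_b = ([1.0, 2.0, 3.0], [1.0, 1.0])
shift_distance = wasserstein2_squared(h_a, h_b)
```

For a pure shift, the two quantile functions differ by a constant, so the squared distance is the square of the shift. Under this distance, averaging quantile functions level by level gives a natural candidate for a cluster barycenter.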
Beyond the Pyramids: Rigid Clustering Systems
Jean-Pierre Barthélemy; Gentian Gusho; Christophe Osswald
This paper is devoted to more or less new extensions of the notion of pyramid introduced by Diday (1984, 1986) and Fichet (1984, 1986). It is related to the notion of “rigid clustering system” or “rigid hypergraph” (topics related to combinatorial theory). Pyramids are representations of clustering systems whose classes are connected subgraphs of a path (or, in other words, intervals of some linear order). More generally, we shall consider clustering systems whose classes are connected components of some graph. After reviewing some classical results, we shall emphasize relations between rigidity and minimal spanning trees.
Part II - Clustering Methods | Pp. 137-150
Indirect Blockmodeling of 3-Way Networks
Vladimir Batagelj; Anuška Ferligoj; Patrick Doreian
An approach to the indirect blockmodeling of 3-way network data is presented for structural equivalence. This equivalence type is defined formally and expressed in terms of an interchangeability condition that is used to construct a compatible dissimilarity. Using Ward’s method, the three dimensional partitioning is obtained via hierarchical clustering and represented diagrammatically. Artificial and real data are used to illustrate these methods.
Part II - Clustering Methods | Pp. 151-159
Clustering Methods: A History of k-Means Algorithms
Hans-Hermann Bock
This paper surveys some historical issues related to the well-known k-means algorithm in cluster analysis. It shows to which authors the different versions of this algorithm can be traced back, and which were the underlying applications. We sketch various generalizations (with references also to Diday’s work) and thereby underline the usefulness of the k-means approach in data analysis.
Part II - Clustering Methods | Pp. 161-172
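For readers unfamiliar with the algorithm surveyed above, here is a minimal sketch of the batch (Lloyd-style) variant of k-means, which alternates an assignment step and a centroid update step. It is only one of the versions that exist, and the function name, the caller-supplied initial centers, and the sample points are assumptions made for illustration.

```python
def kmeans(points, centers, max_iter=100):
    """Batch k-means with squared Euclidean distance.

    points and centers are sequences of equal-length numeric tuples.
    Initial centers are supplied by the caller (an assumption of this
    sketch; many initialization schemes exist).
    """
    centers = [tuple(c) for c in centers]
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(
                range(len(centers)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])),
            )
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster;
        # an empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centers)
        ]
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return centers, clusters


# Made-up data: two well-separated pairs of points.
sample = [(0, 0), (0, 2), (10, 0), (10, 2)]
final_centers, final_clusters = kmeans(sample, [(0, 0), (10, 0)])
```

Each step can only lower or preserve the within-cluster sum of squares, which is why the loop stops once the centers no longer move.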
Overlapping Clustering in a Graph Using k-Means and Application to Protein Interactions Networks
Irène Charon; Lucile Denoeud; Olivier Hudry
In this article, we design an overlapping clustering method in a graph in order to deal with a biological issue: protein annotation. Given an unweighted and undirected graph G, we search for subgraphs of G that are dense in edges. The method consists of three steps. First we determine some initial kernels of the classes by means of a local density function; then we improve these kernels using a k-means process; last, the kernels are extended to overlapping classes. The method is tested on random graphs and finally applied to a protein interactions network.
Part II - Clustering Methods | Pp. 173-182
Species Clustering via Classical and Interval Data Representation
Marie Chavent
Consider a data table where objects are described by numerical variables and a qualitative variable with m categories. Interval data representation and interval data clustering methods are useful for clustering the categories. We study in this paper a data set of fish contaminated with mercury. We will see how classical or interval data representation can be used for clustering the species of fish and not the fishes themselves. We will compare the results obtained with the two approaches (classical or interval) in the particular case of this application in Ecotoxicology.
Part II - Clustering Methods | Pp. 183-191
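The two representations contrasted in the abstract above can be sketched as follows. This is an assumption-laden illustration, not the paper's procedure: it takes the per-category mean as the classical summary and the per-category [min, max] range as the interval summary, and the function name and sample values are made up.

```python
from collections import defaultdict


def summarize_by_category(rows, labels):
    """Aggregate numerical rows by the category given in labels.

    Returns two dicts keyed by category:
      classical: one mean per variable (assumed classical summary)
      interval:  one (min, max) pair per variable (assumed interval summary)
    """
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)

    classical, interval = {}, {}
    for label, members in groups.items():
        columns = list(zip(*members))  # one tuple of values per variable
        classical[label] = [sum(col) / len(col) for col in columns]
        interval[label] = [(min(col), max(col)) for col in columns]
    return classical, interval


# Made-up measurements for two hypothetical categories "a" and "b".
rows = [(1.0, 10.0), (3.0, 20.0), (5.0, 30.0)]
labels = ["a", "a", "b"]
classical, interval = summarize_by_category(rows, labels)
```

Either table then has one row per category, so any clustering method suited to that data type groups the categories and not the individual objects.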
Looking for High Density Zones in a Graph
Tristan Colombo; Alain Guénoche
The aim of this paper is to introduce new methods to build dense classes of vertices in a graph. These classes correspond to connected parts having a proportion of inner edges which is higher than the average on the whole graph. They are progressively built; a kernel of each class is first established, then they are extended to connected elements and finally to a partition. Several density functions are compared. A Monte-Carlo validation of this method is made from random graphs fulfilling some density conditions.
Part II - Clustering Methods | Pp. 193-201
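The notion of a class whose proportion of inner edges exceeds the graph-wide average can be made concrete with a small sketch. The paper compares several density functions that are not specified here; the plain edge density below (inner edges divided by possible vertex pairs), the function name, and the toy graph are assumptions for illustration.

```python
def edge_density(vertices, edges):
    """Fraction of vertex pairs inside `vertices` joined by an edge.

    edges is an iterable of (u, v) pairs of an undirected simple graph,
    each edge listed once.
    """
    vertices = set(vertices)
    pairs = len(vertices) * (len(vertices) - 1) // 2
    if pairs == 0:
        return 0.0
    inner = sum(1 for u, v in edges if u in vertices and v in vertices)
    return inner / pairs


# Toy graph: a triangle {1, 2, 3} with a tail 3-4-5.
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)]
whole = edge_density({1, 2, 3, 4, 5}, edges)
triangle = edge_density({1, 2, 3}, edges)
is_dense_class = triangle > whole
```

Here the triangle qualifies as a dense class because its density is above that of the whole graph.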
Block Bernoulli Parsimonious Clustering Models
Gérard Govaert; Mohamed Nadif
When the data consist of a set of objects described by a set of binary variables, we have embedded the block clustering problem for a binary table in the mixture approach. Using a Bernoulli model and adopting the classification maximum likelihood principle, we perform an adapted version of the block CEM algorithm. In this paper, we propose different parsimonious models by imposing constraints on the Bernoulli parameter.
Part II - Clustering Methods | Pp. 203-212
Cluster Analysis Based on Posets
Melvin F. Janowitz
When dissimilarities are measured in a space other than the reals, it is argued that previous models for cluster analysis are not adequate. Possible new models will be explored. It is also shown that formal concept analysis may be viewed as a special case of a Boolean dissimilarity coefficient. A persistent underlying theme involves generalized notions of adjoints of order preserving mappings between posets.
Part II - Clustering Methods | Pp. 213-223