Publications catalog - books

Selected Contributions in Data Analysis and Classification

Paula Brito ; Guy Cucumel ; Patrice Bertrand ; Francisco de Carvalho (eds.)

Abstract/Description - provided by the publisher

Not available.

Keywords - provided by the publisher

Statistical Theory and Methods; Data Mining and Knowledge Discovery; Pattern Recognition

Availability
Detected institution: none detected
Year of publication: 2007
Available through: SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-73558-8

Electronic ISBN

978-3-540-73560-1

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Table of contents

Group Average Representations in Euclidean Distance Cones

Casper J. Albers; Frank Critchley; John C. Gower

The set of Euclidean distance matrices has a well-known representation as a convex cone. The problems of representing the group averages of distance matrices are discussed, but not fully resolved, in the context of SMACOF, Generalized Orthogonal Procrustes Analysis and Individual Differences Scaling. The polar (or dual) cone representation, corresponding to inner-products around a centroid, is also discussed. Some new characterisations of distance cones in terms of circumhyperspheres are presented.

Part VI - Dissimilarities: Structures and Indices | Pp. 445-454
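
The abstract above refers to inner products around a centroid. As a rough illustration only, the sketch below applies the classical double-centering formula to a symmetric matrix of squared distances. It is not taken from the chapter; the function name `centered_inner_products` and the three sample points are assumptions made for this example.

```python
def centered_inner_products(sq_dist):
    """Double-center a symmetric matrix of squared distances.

    Returns B with
    b[i][j] = -0.5 * (d2[i][j] - mean_i - mean_j + grand_mean),
    i.e. the inner products of the points taken around their centroid.
    Symmetry is assumed, so row means double as column means.
    """
    n = len(sq_dist)
    row_mean = [sum(row) / n for row in sq_dist]
    grand_mean = sum(row_mean) / n
    return [
        [
            -0.5 * (sq_dist[i][j] - row_mean[i] - row_mean[j] + grand_mean)
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical data: three points on a line at 0, 2 and 4.
coords = [0.0, 2.0, 4.0]
sq_dist = [[(a - b) ** 2 for b in coords] for a in coords]
inner = centered_inner_products(sq_dist)
```

For these sample points the centroid is 2, so `inner` should agree, up to rounding, with the pairwise products of the centered coordinates -2, 0 and 2.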

On Lower-Maximal Paired-Ultrametrics

Patrice Bertrand; François Brucker

Weakly indexed paired-hierarchies (p-hierarchies for short) provide a clustering model that allows overlapping clusters and extends the hierarchical model. There is a bijection between weakly indexed p-hierarchies and the so-called paired-ultrametrics (p-ultrametrics for short); this correspondence is a restriction of the bijection between weakly indexed pyramids and Robinsonian dissimilarities. This paper proposes a generalization of the well-known HAC clustering method to compute a weakly indexed p-hierarchy from a given dissimilarity. Moreover, the p-ultrametric associated with such a weakly indexed p-hierarchy is proved to be lower-maximal for that dissimilarity and larger than its sub-dominant ultrametric.

Part VI - Dissimilarities: Structures and Indices | Pp. 455-464
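
The abstract above mentions the sub-dominant ultrametric of a dissimilarity. The sketch below computes it with the standard minimax-path recurrence (the same quantity that single linkage yields). It is not the generalized HAC method the chapter proposes, and the function names and sample matrix are assumptions for illustration.

```python
def subdominant_ultrametric(d):
    """Return the largest ultrametric that is nowhere above d.

    u[i][j] is the smallest possible value, over all paths from i to j,
    of the largest step on the path (a Floyd-Warshall style update).
    The input is assumed symmetric with a zero diagonal.
    """
    n = len(d)
    u = [row[:] for row in d]  # work on a copy
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(u[i][k], u[k][j])
                if via_k < u[i][j]:
                    u[i][j] = via_k
    return u


def is_ultrametric(u):
    """Check u[i][j] <= max(u[i][k], u[k][j]) for every triple."""
    n = len(u)
    return all(
        u[i][j] <= max(u[i][k], u[k][j])
        for i in range(n)
        for j in range(n)
        for k in range(n)
    )


# Hypothetical dissimilarity on three objects.
d = [
    [0, 1, 4],
    [1, 0, 2],
    [4, 2, 0],
]
u = subdominant_ultrametric(d)
```

In this small example only the entry between the first and third objects changes: the direct value 4 is replaced by 2, the largest step on the path through the second object.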

A Note on Three-Way Dissimilarities and Their Relationship with Two-Way Dissimilarities

Victor Chepoi; Bernard Fichet

This note is devoted to three-way dissimilarities defined on unordered triples. Some of them are derived from two-way dissimilarities via an Lp-transformation (1 ≤ p ≤ ∞). For p < ∞, a six-point condition of Menger type is established. Based on the definitions of Joly-Le Calvé and Heiser-Bennani Dosse, the concepts of three-way distances are also discussed. Particular attention is paid to three-way ultrametrics and three-way tree distances.

Part VI - Dissimilarities: Structures and Indices | Pp. 465-475
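
The abstract above speaks of deriving three-way dissimilarities from two-way ones through a transformation with a parameter between 1 and ∞. The catalog text does not give the formula, so the sketch below assumes one natural reading: the p-norm of the three pairwise dissimilarities of a triple (the perimeter for p = 1, the largest side for p = ∞). The function name `lp_three_way` and the data are hypothetical.

```python
import math


def lp_three_way(d, i, j, k, p):
    """Three-way dissimilarity of the triple {i, j, k}.

    ASSUMPTION: defined here as the p-norm of the three pairwise
    two-way dissimilarities, with p = infinity giving their maximum.
    """
    sides = (d[i][j], d[j][k], d[i][k])
    if math.isinf(p):
        return max(sides)
    return sum(s ** p for s in sides) ** (1.0 / p)


# Hypothetical two-way dissimilarities (a 3-4-5 triangle).
d = [
    [0, 3, 5],
    [3, 0, 4],
    [5, 4, 0],
]
perimeter = lp_three_way(d, 0, 1, 2, 1)
largest_side = lp_three_way(d, 0, 1, 2, float("inf"))
euclidean_like = lp_three_way(d, 0, 1, 2, 2)
```

Because the value depends only on the three pairwise dissimilarities, it is the same for every ordering of the triple, which matches the abstract's focus on unordered triples.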

One-to-One Correspondence Between Indexed Cluster Structures and Weakly Indexed Closed Cluster Structures

Jean Diatta

We place ourselves in a setting where singletons are not all required to be clusters, and we show that the resulting cluster structures and their corresponding closure under finite nonempty intersections still have the same minimal members. Moreover, we show that indexed cluster structures and weakly indexed closed cluster structures correspond in a one-to-one way.

Part VI - Dissimilarities: Structures and Indices | Pp. 477-482

Adaptive Dissimilarity Index for Gene Expression Profiles Classification

Ahlame Douzal Chouakria; Alpha Diallo; Françoise Giroud

DNA microarray technology allows simultaneous monitoring of the expression levels of thousands of genes during important biological processes and across collections of related experiments. Clustering and classification techniques have proved helpful for understanding gene function, gene regulation, and cellular processes. However, the conventional proximity measures between gene expression data, used for clustering or classification purposes, do not fit gene expression specifications, as they are based on the closeness of the expression magnitudes regardless of the overall gene expression profile (shape). We propose in this paper an adaptive dissimilarity index that covers proximity in both values and behavior. The effectiveness of the adaptive dissimilarity index is illustrated through a classification process for identifying the cell-cycle phases of genes.

Part VI - Dissimilarities: Structures and Indices | Pp. 483-494

Lower (Anti-)Robinson Rank Representations for Symmetric Proximity Matrices

Lawrence J. Hubert; Hans-Friedrich Köhn

Edwin Diday, some two decades ago, was among the first few individuals to recognize the importance of the (anti-)Robinson form for representing a proximity matrix, and was the leader in suggesting how such matrices might be depicted graphically (as pyramids). We characterize the notions of an anti-Robinson (AR) and strongly anti-Robinson (SAR) matrix, and provide open-source M-files within a MATLAB environment to effect additive decompositions of a given proximity matrix into sums of AR (or SAR) matrices. We briefly introduce how the AR (or SAR) rank of a matrix might be specified.

Part VI - Dissimilarities: Structures and Indices | Pp. 495-504

Density-Based Distances: a New Approach for Evaluating Proximities Between Objects. Applications in Clustering and Discriminant Analysis

Jean-Paul Rasson; François Roland

The aim of this paper is twofold. First, it is shown that taking densities between objects into account when defining proximities between them is an intuitively sound way to proceed. Second, some new distances based on density estimates are defined and some of their properties are presented. Many algorithms in clustering or discriminant analysis require the choice of a dissimilarity: two applications are presented, one in clustering and the other in discriminant analysis, to illustrate the benefits of using these new distances.

Part VI - Dissimilarities: Structures and Indices | Pp. 505-514

Robinson Cubes

Matthijs J. Warrens; Willem J. Heiser

A square similarity matrix is called a Robinson matrix if the highest entries within each row and column are on the main diagonal and if, when moving away from this diagonal, the entries never increase. This paper formulates Robinson cubes as three-way generalizations of Robinson matrices. The first definition involves only those entries that are in a row, column or tube with an entry of the main diagonal. A stronger definition, called a regular Robinson cube, involves all entries. Several examples of the definitions are presented.

Part VI - Dissimilarities: Structures and Indices | Pp. 515-523
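
The Robinson matrix definition quoted in the abstract above can be checked mechanically. The sketch below is a minimal reading of that definition for a square similarity matrix: along every row and every column, entries must never increase when moving away from the main diagonal. The function name `is_robinson` and the two sample matrices are assumptions, and the three-way Robinson cubes of the chapter are not covered.

```python
def is_robinson(s):
    """Return True if the square similarity matrix s is a Robinson matrix.

    For each index i, row i and column i are scanned outwards from the
    diagonal entry; any increase along the way violates the property.
    """
    n = len(s)
    for i in range(n):
        row = s[i]
        col = [s[k][i] for k in range(n)]
        for line in (row, col):
            # Moving right (or down) from the diagonal.
            if any(line[j] < line[j + 1] for j in range(i, n - 1)):
                return False
            # Moving left (or up) from the diagonal.
            if any(line[j] < line[j - 1] for j in range(1, i + 1)):
                return False
    return True


# Hypothetical examples.
robinson = [
    [5, 3, 1],
    [3, 5, 2],
    [1, 2, 5],
]
not_robinson = [
    [5, 1, 3],  # rises from 1 to 3 while moving away from the diagonal
    [1, 5, 2],
    [3, 2, 5],
]
first_ok = is_robinson(robinson)
second_ok = is_robinson(not_robinson)
```

Scanning outwards from the diagonal also guarantees that the diagonal entry is the largest in its row and column, so that part of the definition needs no separate test.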

Relative and Absolute Contributions to Aid Strata Interpretation

M. Carmen Bravo; José M. García-Santesmases

Strata generalisation by symbolic objects is presented when there is a class variable to be explained simultaneously in all strata. This is attained by a generalised recursive tree-building algorithm for populations partitioned into strata and described by symbolic data, that is, more complex data structures than classical data. Symbolic objects describe decisional nodes and strata. This paper presents some measures to interpret strata and nodes. The method is integrated into the SODAS Software (Symbolic Official Data Analysis System), partially supported by ESPRIT-20821 SODAS and IST-25161 ASSO.

Part VII - Multivariate Statistics | Pp. 527-537

Classification and Generalized Principal Component Analysis

Henri Caussinus; Anne Ruiz-Gazen

In previous papers, we proposed a generalized principal component analysis (GPCA) aimed at displaying salient features of a multidimensional data set, in particular the existence of clusters. Using an example, this article shows how GPCA and clustering methods complement each other. The projections provided by GPCA and the sequence of eigenvalues give useful indications of the number and type of clusters to be expected; submitting GPCA principal components to a clustering algorithm instead of the raw data can improve the classification. The use of a convenient robustification of GPCA is also mentioned.

Part VII - Multivariate Statistics | Pp. 539-548