Publications catalog - books

Selected Contributions in Data Analysis and Classification

Paula Brito ; Guy Cucumel ; Patrice Bertrand ; Francisco de Carvalho (eds.)

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Statistical Theory and Methods; Data Mining and Knowledge Discovery; Pattern Recognition

Availability
Detected institution Publication year Browse Download Request
Not detected 2007 SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-73558-8

Electronic ISBN

978-3-540-73560-1

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Table of contents

Locally Linear Regression and the Calibration Problem for Micro-Array Analysis

Antonio Ciampi; Benjamin Rich; Alina Dyachenko; Isadora Antoniano Villalobos; Carl Murie; Robert Nadon

We review the concept of locally linear regression and its relationship to Diday’s [...] and to tree-structured linear regression. We describe the calibration problem in microarray analysis and propose a Bayesian approach based on tree-structured linear regression. Using the proposed approach, we analyze a subset of a large data set from an Affymetrix microarray calibration experiment. In this example, a tree-structured regression model outperforms a multiple regression model. We calculated 95% credible intervals for a sample of the data, obtaining reasonably good results. Future research will consider and compare several other approaches to locally linear regression.

Part VII - Multivariate Statistics | Pp. 549-555

Sanskrit Manuscript Comparison for Critical Edition and Classification

Marc Csernel; Patrice Bertrand

A critical edition takes into account all the known versions of the same text in order to show the differences between any two distinct versions. Constructing a critical edition is a long and sometimes tedious task. To make it easier, software that assists the philologist is now available for European languages. Because of the complex graphical characteristics of Sanskrit, which call for computationally expensive solutions to the problems that arise in text comparison, such software does not yet exist for the Sanskrit language.

This paper describes the characteristics of Sanskrit that make text comparison different, presents computationally feasible solutions for producing a computer-assisted critical edition of Sanskrit texts, and provides, as a by-product, a distance between two versions of the edited text.
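The abstract does not say how the distance between two versions is defined. As a purely illustrative sketch, assuming a generic word-level Levenshtein (edit) distance rather than the authors' method, the idea of measuring how far apart two versions are could look like this. The function name `edit_distance` and the placeholder tokens are hypothetical.

```python
def edit_distance(version_a, version_b):
    """Levenshtein distance between two sequences of tokens.

    Counts the minimum number of insertions, deletions and
    substitutions needed to turn version_a into version_b.
    This is a generic stand-in, not the distance used in the paper.
    """
    # prev[j] holds the distance between the tokens of version_a seen
    # so far (minus the current one) and the first j tokens of version_b.
    prev = list(range(len(version_b) + 1))
    for i, tok_a in enumerate(version_a, start=1):
        curr = [i]  # turning i tokens into an empty sequence costs i deletions
        for j, tok_b in enumerate(version_b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # delete tok_a
                            curr[j - 1] + 1,      # insert tok_b
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


# Placeholder tokens standing in for the words of two manuscript versions.
version_a = "a b c d".split()
version_b = "a x c".split()
print(edit_distance(version_a, version_b))  # → 2 (one substitution, one deletion)
```

Working on tokens instead of characters is a design choice made here for readability. The abstract indicates that Sanskrit raises additional difficulties in text comparison that this sketch does not address.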

Part VII - Multivariate Statistics | Pp. 557-566

Divided Switzerland

Yadolah Dodge; Gérard Geiser; Valentin Rousson

On the 6th of December, 1992, the Swiss population voted against the “Adhesion of Switzerland to the European Economic Area”. The Swiss German cantons, except Basel-Stadt and Basel-Land, voted against, and all French-speaking cantons voted in favour of adhesion. Shocked by this outcome, the media, the politicians, and the population itself took this date as the beginning of a divided Switzerland. The purpose of this article is to show that what happened on that day was not a new phenomenon but was in line with more than a century of popular votes.

Part VII - Multivariate Statistics | Pp. 567-576

Prediction with Confidence

Alexander Gammerman

The paper outlines an efficient way to complement predictions produced by new and traditional machine-learning methods with measures of their accuracy and reliability. These measures are not only valid and informative, but also take full account of the special features of the object to be predicted. They are based on computable approximations of Kolmogorov’s algorithmic notion of randomness. Using these measures, it becomes possible to control the number of erroneous predictions by selecting a suitable confidence level. Further development of these ideas can lead to useful links with Diday’s Symbolic Data Analysis.
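One common way to attach confidence to predictions in this spirit is a conformal predictor. The sketch below is a minimal illustration under stated assumptions, not the paper's implementation: it uses one-dimensional features, a nearest-neighbour nonconformity score chosen only for simplicity, and hypothetical function names (`nonconformity`, `p_value`, `prediction_set`). A candidate label is kept in the prediction set when its p-value exceeds the chosen significance level, so a confidence level of 1 minus that value is the knob that controls how many errors are tolerated.

```python
def nonconformity(examples, index):
    """Distance from example `index` to its nearest neighbour with the same label.

    A large value means the example looks strange for its label.
    (Illustrative choice of score; many others are possible.)
    """
    x, label = examples[index]
    distances = [abs(x - other_x)
                 for k, (other_x, other_label) in enumerate(examples)
                 if k != index and other_label == label]
    return min(distances) if distances else float("inf")


def p_value(train, x_new, candidate_label):
    """Fraction of examples at least as strange as the new one,
    when the new object is tentatively given `candidate_label`."""
    augmented = train + [(x_new, candidate_label)]
    scores = [nonconformity(augmented, i) for i in range(len(augmented))]
    at_least_as_strange = sum(1 for s in scores if s >= scores[-1])
    return at_least_as_strange / len(augmented)


def prediction_set(train, x_new, labels, significance):
    """All labels whose p-value exceeds the significance level."""
    return {label for label in labels
            if p_value(train, x_new, label) > significance}


# Toy data: (feature, label) pairs, invented for illustration.
train = [(10, "a"), (12, "a"), (9, "a"), (50, "b"), (53, "b"), (48, "b")]
print(prediction_set(train, 11, ["a", "b"], significance=0.2))  # → {'a'}
```

With these toy data the label "b" receives a p-value of 1/7, so it is excluded at significance 0.2 but would be retained at significance 0.1, where the predictor prefers a larger, safer prediction set.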

Part VII - Multivariate Statistics | Pp. 577-580

Which Bootstrap for Principal Axes Methods?

Ludovic Lebart

This paper deals with validation techniques in the context of exploratory methods involving singular value decomposition, namely Principal Components Analysis and Simple and Multiple Correspondence Analysis. We briefly show that, depending on the purpose of the analysis, at least five types of resampling techniques can be carried out to assess the quality of the resulting visualisations:

a) Partial bootstrap, which treats the replications as supplementary data, without diagonalization of the replicated moment-product matrices.
b) Total bootstrap type 1, which performs a new diagonalization for each replicate, with corrections limited to possible changes of sign of the axes.
c) Total bootstrap type 2, which adds to the preceding one a correction for possible exchanges of axes.
d) Total bootstrap type 3, which applies Procrustean transformations to all the replicates in order to take into account both rotations and exchanges of axes.
e) Specific bootstrap, which resamples at a different level (the case of a hierarchy of statistical units).

An example is presented for each type of resampling.
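As a rough illustration of the total bootstrap type 1 described above (a new diagonalization per replicate, followed only by a sign correction), the following sketch tracks the first principal axis of a small data set. It is an assumption-laden toy, not the author's procedure: it uses plain power iteration instead of a full eigendecomposition, handles only the first axis, and the names `first_axis`, `align_sign` and `total_bootstrap_type1` are hypothetical.

```python
import random


def first_axis(rows, iterations=200):
    """Leading eigenvector of the covariance matrix, by power iteration."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / n for b in range(p)]
           for a in range(p)]
    # Asymmetric start vector; power iteration fails only if the start
    # happens to be orthogonal to the leading eigenvector.
    v = [1.0 / (j + 1) for j in range(p)]
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0:
            raise ValueError("degenerate sample: zero covariance")
        v = [x / norm for x in w]
    return v


def align_sign(axis, reference):
    """Flip the axis if it points against the reference axis."""
    dot = sum(a * r for a, r in zip(axis, reference))
    return axis if dot >= 0 else [-a for a in axis]


def total_bootstrap_type1(rows, replicates=200, seed=0):
    """Recompute the first axis on each bootstrap replicate and correct
    only for a possible change of sign relative to the original axis."""
    rng = random.Random(seed)
    reference = first_axis(rows)
    axes = []
    for _ in range(replicates):
        sample = [rng.choice(rows) for _ in rows]  # resample with replacement
        axes.append(align_sign(first_axis(sample), reference))
    return reference, axes


# Toy data scattered around the line y = x, invented for illustration.
noise = random.Random(1)
data = [(i / 10, i / 10 + noise.gauss(0, 0.1)) for i in range(30)]
reference, axes = total_bootstrap_type1(data, replicates=50)
spread = max(abs(a[0] - reference[0]) for a in axes)
print("largest deviation of the first coordinate:", spread)
```

The spread of the replicated axes around the reference axis gives a first impression of how stable the axis is. Exchanges of axes and rotations, which types 2 and 3 correct for, are deliberately ignored here.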

Part VII - Multivariate Statistics | Pp. 581-588

PCR and PLS for Clusterwise Regression on Functional Data

Cristian Preda; Gilbert Saporta

Clusterwise regression is applied to functional data, using PCR and PLS as regularization methods for the functional linear regression model. We compare these two approaches on simulated data as well as on stock-exchange data.

Part VII - Multivariate Statistics | Pp. 589-598

A New Method for Ranking Statistical Units

Alfredo Rizzi

In many research problems it is useful to summarize several indices or indicators in order to obtain a synthetic, indirect measure of a concept that is revealed by the variables observed on each statistical unit. This is because the variables are considered to be indirect measures of a complex (perhaps indefinable) concept. Within this context, and for ranking the statistical units, the author suggests an index [...] based on the values of the p principal components associated with each statistical unit. This index is applied to rank the 20 Italian regions for [...] for the years 2000–2002. The results are compared with those furnished by [...].

Part VII - Multivariate Statistics | Pp. 599-607

About Relational Correlations

Yves Schektman

Using particular Euclidean geometries called relational, one can go deeper into the usual concepts and methods of Data Analysis, and even generalize them or propose new ones. Inner products in these particular Euclidean spaces are built from correlations between principal components of observed sets of variables. A summary of the main topics of an essay in progress is presented.

Part VII - Multivariate Statistics | Pp. 609-618

Dynamic Features Extraction in Soybean Futures Market of China

Huiwen Wang; Jie Meng

By applying Symbolic Data Analysis (SDA), this paper investigates the dynamic features of the soybean futures market of the Dalian Commodity Exchange (DCE) of China from 2002 to 2004. First, interval data are created by classifying a large number of futures contracts by year and by maturity date. The DIV clustering method is then applied to these interval data, which produces further simplified three-way interval symbolic data and greatly reduces the sample size. On that basis, factor analysis of interval data is used to extract the dynamic principal characteristics of soybean futures, which reduces the dimension of the variable space. The results of the empirical research, which are consistent with reality, confirm the value of SDA for analyzing massive, dynamic and complex data.

Part VII - Multivariate Statistics | Pp. 619-627