Publications catalog - books
Selected Contributions in Data Analysis and Classification
Paula Brito ; Guy Cucumel ; Patrice Bertrand ; Francisco de Carvalho (eds.)
Abstract/Description – provided by the publisher
Not available.
Keywords – provided by the publisher
Statistical Theory and Methods; Data Mining and Knowledge Discovery; Pattern Recognition
Availability
Detected institution | Publication year | Browse | Download | Request |
---|---|---|---|---|
Not detected | 2007 | SpringerLink | | |
Information
Resource type:
books
Print ISBN
978-3-540-73558-8
Electronic ISBN
978-3-540-73560-1
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2007
Publication rights information
© Springer-Verlag Berlin Heidelberg 2007
Subject coverage
Table of contents
Locally Linear Regression and the Calibration Problem for Micro-Array Analysis
Antonio Ciampi; Benjamin Rich; Alina Dyachenko; Isadora Antoniano Villalobos; Carl Murie; Robert Nadon
We review the concept of locally linear regression and its relationship to Diday’s and to tree-structured linear regression. We describe the calibration problem in microarray analysis and propose a Bayesian approach based on tree-structured linear regression. Using the proposed approach, we analyze a subset of a large data set from an Affymetrix microarray calibration experiment. In this example, a tree-structured regression model outperforms a multiple regression model. We calculated 95% Credible Intervals for a sample of the data, obtaining reasonably good results. Future research will consider and compare several other approaches to locally linear regression.
Part VII - Multivariate Statistics | Pp. 549-555
Sanskrit Manuscript Comparison for Critical Edition and Classification
Marc Csernel; Patrice Bertrand
A critical edition takes into account all the different known versions of the same text in order to show the differences between any two distinct versions. The construction of a critical edition is long and sometimes tedious work. To make it easier, software that helps the philologist is now available for European languages. Because of the complex graphical characteristics of Sanskrit, which involve computationally expensive solutions to problems occurring in text comparisons, such software does not yet exist for the Sanskrit language.
This paper describes the Sanskrit characteristics that make text comparisons different, presents computationally feasible solutions for the elaboration of a computer-assisted critical edition of Sanskrit texts, and provides, as a byproduct, a distance between two versions of the edited text.
Part VII - Multivariate Statistics | Pp. 557-566
Divided Switzerland
Yadolah Dodge; Gérard Geiser; Valentin Rousson
On 6 December 1992, the Swiss population voted against the “Adhesion of Switzerland to the European Economic Area”. The Swiss German cantons, except Basel-Stadt and Basel-Land, voted against, and all French-speaking cantons voted in favour. Shocked by this outcome, the media, the politicians, and the population itself took this date as the beginning of a divided Switzerland. The purpose of this article is to show that what happened on that day was not a new phenomenon but was in line with more than a century of popular votes.
Part VII - Multivariate Statistics | Pp. 567-576
Prediction with Confidence
Alexander Gammerman
The paper outlines an efficient way to complement predictions, produced by new and traditional machine-learning methods, with measures of their accuracy and reliability. These measures are not only valid and informative, but they also take full account of the special features of the object to be predicted. They are based on computable approximations of Kolmogorov’s algorithmic notion of randomness. Using these measures, it becomes possible to control the number of erroneous predictions by selecting a suitable confidence level. Further development of these ideas can lead to establishing useful links with Diday’s Symbolic Data Analysis.
Part VII - Multivariate Statistics | Pp. 577-580
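As an illustration of how a confidence level can bound the number of erroneous predictions, the sketch below uses conformal prediction, one framework known to have this property when the examples are exchangeable. Treating it as the approach meant in the abstract above is an assumption, since the abstract does not name the method. The nonconformity score (distance to the nearest example with the same label), the one-dimensional features, and all function names are hypothetical choices made for this example.

```python
def nonconformity(index, examples):
    """Strangeness of examples[index]: distance to its nearest neighbour
    that carries the same label (hypothetical choice of score).
    Returns infinity when no other example shares the label."""
    x, y = examples[index]
    dists = [abs(x - xo)
             for k, (xo, yo) in enumerate(examples)
             if k != index and yo == y]
    return min(dists) if dists else float("inf")


def p_value(train, x_new, label):
    """Fraction of examples at least as strange as (x_new, label)
    once that candidate pair is added to the training bag."""
    bag = train + [(x_new, label)]
    scores = [nonconformity(i, bag) for i in range(len(bag))]
    new_score = scores[-1]
    return sum(1 for s in scores if s >= new_score) / len(bag)


def prediction_set(train, x_new, labels, epsilon):
    """Labels whose p-value exceeds the significance level epsilon.
    The confidence level is 1 - epsilon."""
    return {lab for lab in labels if p_value(train, x_new, lab) > epsilon}
```

Lowering `epsilon` (raising the confidence level) can only enlarge the returned set, which is the trade-off between reliability and informativeness that the abstract alludes to.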
Which Bootstrap for Principal Axes Methods?
Ludovic Lebart
This paper deals with validation techniques for exploratory methods based on singular value decomposition, namely Principal Components Analysis and Simple and Multiple Correspondence Analysis. We briefly show that, depending on the purpose of the analysis, at least five types of resampling techniques can be used to assess the quality of the resulting visualisations: a) Partial bootstrap, which treats the replications as supplementary data, without diagonalization of the replicated moment-product matrices. b) Total bootstrap type 1, which performs a new diagonalization for each replicate, with corrections limited to possible changes of sign of the axes. c) Total bootstrap type 2, which adds to the preceding one a correction for possible exchanges of axes. d) Total bootstrap type 3, which applies Procrustean transformations to all the replicates in order to take into account both rotations and exchanges of axes. e) Specific bootstrap, which resamples at a different level (the case of a hierarchy of statistical units). An example is presented for each type of resampling.
Part VII - Multivariate Statistics | Pp. 581-588
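The "total bootstrap type 1" idea in the abstract above (re-diagonalize each replicate, then correct only the sign of the axes) can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's procedure: it handles only the first principal axis of a covariance-based PCA, finds that axis by power iteration, and uses hypothetical function names throughout.

```python
import math
import random


def covariance(rows):
    """Sample covariance matrix of a list of equal-length numeric rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
            for a in range(p)]


def leading_axis(cov, iters=500):
    """First principal axis (unit vector) by power iteration.
    The start vector is arbitrary; it only has to be non-orthogonal
    to the leading eigenvector."""
    p = len(cov)
    start = [1.0 + 0.1 * j for j in range(p)]
    norm = math.sqrt(sum(x * x for x in start))
    v = [x / norm for x in start]
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # degenerate replicate: keep the current vector
            break
        v = [x / norm for x in w]
    return v


def total_bootstrap_type1(rows, n_rep=200, seed=0):
    """Resample rows with replacement, recompute the first axis for each
    replicate, and flip its sign when it points away from the reference
    axis computed on the original data."""
    rng = random.Random(seed)
    reference = leading_axis(covariance(rows))
    replicates = []
    for _ in range(n_rep):
        sample = [rng.choice(rows) for _ in rows]
        axis = leading_axis(covariance(sample))
        if sum(a * b for a, b in zip(axis, reference)) < 0:
            axis = [-a for a in axis]  # sign correction only
        replicates.append(axis)
    return reference, replicates
```

The spread of the sign-aligned replicate axes around the reference axis gives a rough picture of its stability. Exchanges of axes and rotations, which types 2 and 3 address, are deliberately not handled here.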
PCR and PLS for Clusterwise Regression on Functional Data
Cristian Preda; Gilbert Saporta
Clusterwise regression is applied to functional data, using PCR and PLS as regularization methods for the functional linear regression model. We compare these two approaches on simulated data as well as on stock-exchange data.
Part VII - Multivariate Statistics | Pp. 589-598
A New Method for Ranking Statistical Units
Alfredo Rizzi
In many research problems it is useful to summarize some indices or indicators to express a synthetic, indirect measure of a concept which is revealed by variables observed in each statistical unit. This is because the variables are considered to be indirect measures of a complex (perhaps indefinable) concept. Within this context and for ranking the statistical units the author suggests the index: where the ( = 1, 2,...,; = 1, 2,..., ) represent the values of the p principal components connected with the statistical unit. This index is applied for ranking the 20 Italian Regions for for the years 2000–2002. The results are compared with those that are furnished by the .
Part VII - Multivariate Statistics | Pp. 599-607
About Relational Correlations
Yves Schektman
Using particular Euclidean geometries called relational, one can go deeper into the usual concepts and methods of Data Analysis, and even generalize them or propose new ones. Inner products in these particular Euclidean spaces are built using correlations between principal components of observed sets of variables. A summary of the main topics of an essay in progress is presented.
Part VII - Multivariate Statistics | Pp. 609-618
Dynamic Features Extraction in Soybean Futures Market of China
Huiwen Wang; Jie Meng
By applying Symbolic Data Analysis (SDA), this paper investigates the dynamic features of the soybean futures market of the Dalian Commodity Exchange (DCE) in China from 2002 to 2004. First, interval data are created by classifying a large mass of futures contracts by year and by maturity date; the DIV clustering method is then applied to these interval data, which produces further simplified three-way interval symbolic data and greatly reduces the sample size. On that basis, factor analysis of interval data is used to extract the dynamic principal characteristics of soybean futures, which reduces the dimension of the variable space. The results of the empirical research, which agree closely with market reality, confirm the value of SDA for analyzing massive, dynamic and complex data.
Part VII - Multivariate Statistics | Pp. 619-627