Publication catalog - books
Statistical Methods for Biostatistics and Related Fields
Wolfgang Härdle; Yuichi Mori; Philippe Vieu
Abstract/Description – provided by the publisher
Not available.
Keywords – provided by the publisher
Not available.
Availability
Detected institution | Publication year | Browse | Download | Request |
---|---|---|---|---|
Not detected | 2007 | SpringerLink | | |
Information
Resource type:
books
Print ISBN
978-3-540-32690-8
Electronic ISBN
978-3-540-32691-5
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2007
Publication rights information
© Springer-Verlag Berlin Heidelberg 2007
Subject coverage
Table of contents
Survival Analysis
Makoto Tomita
This paper focuses on the Fréchet distance, introduced by Maurice Fréchet in 1906 to measure the proximity between curves (Fréchet (1906)). The major limitation of this proximity measure is that it is based on the closeness of the values, independently of the local trends. To alleviate this setback, we propose a dissimilarity index that extends the above estimate to include information on the dependency between local trends. A synthetic dataset is generated to reproduce and show the conditions under which the Fréchet distance is limited. The proposed dissimilarity index is then compared with the Fréchet estimate, and results illustrating its efficiency are reported.
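The limitation described in this abstract can be illustrated with a minimal sketch. It assumes series sampled at the same time points and uses the standard dynamic-programming form of the discrete Fréchet distance. The `trend_agreement` helper and the three toy series are hypothetical illustrations and are not the dissimilarity index proposed by the authors.

```python
def discrete_frechet(p, q):
    """Discrete Fréchet distance between two sequences of real values."""
    n, m = len(p), len(q)
    # ca[i][j] holds the coupling distance between p[:i + 1] and q[:j + 1].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = abs(p[i] - q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[i][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][j], d)
            else:
                best_previous = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_previous, d)
    return ca[n - 1][m - 1]


def trend_agreement(p, q):
    """Hypothetical helper: share of steps where both series move the same way."""
    steps = list(zip(zip(p, p[1:]), zip(q, q[1:])))
    same = sum(1 for (a0, a1), (b0, b1) in steps if (a1 - a0) * (b1 - b0) > 0)
    return same / len(steps)


reference = [0, 1, 0, 1, 0]
same_trend = [2, 3, 2, 3, 2]  # same shape as the reference, shifted up by 2
opposite = [1, 0, 1, 0, 1]    # close in value, but always moving the other way

print(discrete_frechet(reference, same_trend), trend_agreement(reference, same_trend))
print(discrete_frechet(reference, opposite), trend_agreement(reference, opposite))
```

On this toy data the series that always moves against the reference gets the smaller Fréchet distance (1 versus 2), although it agrees with the reference on none of the local trends. The distance reflects closeness of values only.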
Part I - Biostatistics | Pp. 207-217
Ozone Pollution Forecasting Using Conditional Mean and Conditional Quantiles with Functional Covariates
Hervé Cardot; Christophe Crambes; Pascal Sarda
Microarrays are part of a new class of biotechnologies that allow the expression levels of thousands of genes to be monitored simultaneously. In microarray data analysis, the comparison of gene expression profiles with respect to different conditions and the selection of biologically interesting genes are crucial tasks. Multivariate statistical methods have been applied to analyze these large data sets. To identify genes with altered expression under two experimental conditions, we describe in this chapter a new nonparametric statistical approach. Specifically, we propose estimating the distributions of a t-type statistic and its null statistic using kernel methods. A comparison of these two distributions by means of a likelihood ratio test can identify genes with significantly changed expression. A method for the calculation of the cut-off point and the acceptance region is also derived. This methodology is applied to a leukemia data set containing expression levels of 7129 genes. The corresponding results are compared to the traditional t-test and the normal mixture model.
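As a rough sketch of the two building blocks named in this abstract, the following computes a t-type statistic per gene and a Gaussian kernel density estimate of the resulting values. The Welch form of the statistic, the fixed bandwidth, and the expression values are assumptions made for illustration. The chapter's null statistic, likelihood ratio test, and cut-off calculation are not reproduced here.

```python
import math
import statistics


def welch_t(x, y):
    """Two-sample t-type statistic with unequal variances (Welch form)."""
    standard_error = math.sqrt(
        statistics.variance(x) / len(x) + statistics.variance(y) / len(y)
    )
    return (statistics.mean(x) - statistics.mean(y)) / standard_error


def gaussian_kde(sample, bandwidth):
    """Return a Gaussian kernel density estimate built from `sample`."""
    def density(z):
        total = sum(math.exp(-0.5 * ((z - s) / bandwidth) ** 2) for s in sample)
        return total / (len(sample) * bandwidth * math.sqrt(2 * math.pi))
    return density


# Made-up expression levels: (condition A replicates, condition B replicates).
genes = [
    ([5.1, 4.9, 5.0], [5.0, 5.2, 4.8]),
    ([5.0, 5.3, 4.7], [7.9, 8.1, 8.3]),
    ([6.2, 6.0, 6.1], [6.1, 6.3, 5.9]),
]
t_values = [welch_t(a, b) for a, b in genes]
density = gaussian_kde(t_values, bandwidth=1.0)  # bandwidth chosen arbitrarily
print(t_values, density(0.0))
```

In practice the bandwidth would be chosen from the data rather than fixed, and a second density would be estimated for the null statistic before the two are compared.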
Part II - Related Sciences | Pp. 221-243
Nonparametric Functional Methods: New Tools for Chemometric Analysis
Frédéric Ferraty; Aldo Goia; Philippe Vieu
In this contribution, we have shown how spectrometric data can be successfully analysed by treating them as curve data and applying recent nonparametric methodology for curve data. Note, however, that all the statistical background is presented in a general way (and not only for spectrometric data). Similarly, the XploRe quantlets that we provide can be used directly in any other applied setting involving curve data. For brevity, and because it was not the purpose here, we only presented the results given by the nonparametric functional methodology without discussing any comparison with alternative methods (but relevant references on these points are given throughout the contribution).
Also for brevity, we presented just two statistical problems (namely regression from curve data and curve discrimination) among the several problems that can be treated by nonparametric functional methods (on this point too, our contribution contains several references to other problems that could be attacked similarly). We chose these two problems for two reasons: first, these issues are highly relevant to many applied studies involving curve analysis, and second, their theoretical and practical importance has led to the emergence of different automated computer procedures.
Part II - Related Sciences | Pp. 245-264
Variable Selection in Principal Component Analysis
Yuichi Mori; Masaya Iizuka; Tomoyuki Tarumi; Yutaka Tanaka
While there exist several criteria by which to select a reasonable subset of variables in the context of PCA, we introduce herein variable selection using criteria in modified PCA (M.PCA) among others.
In order to perform such variable selection via XploRe, the quantlib vaspca, which reads all the necessary quantlets for selection, is first called, and then the quantlet mpca is run using a number of selection parameters.
In the first four sections we present brief explanations of variable selection in PCA, an outline of M.PCA, and the flows of four selection procedures. In the last two sections, we illustrate the quantlet mpca and its performance with two numerical examples.
Part II - Related Sciences | Pp. 265-283
Spatial Statistics
Pavel Čížek; Wolfgang Härdle; Jürgen Symanzik
Part II - Related Sciences | Pp. 285-304
Functional Data Analysis
Michal Benko
In many fields of applied statistics, the object of interest depends on some continuous parameter, e.g. continuous time. Typical examples in biostatistics are growth curves or temperature measurements. Although, for technical reasons, we can measure temperature only at discrete intervals, it is clear that temperature is a continuous process. Temperature during one year is a function with argument “time”. By collecting one-year temperature functions for several years or for different weather stations, we obtain a sample of functions. The questions arising in the statistical analysis of functional data are basically identical to those in the standard statistical analysis of univariate or multivariate objects. From a theoretical point of view, the design of a stochastic model for functional data and the statistical analysis of a functional data set can often be taken one-to-one from conventional multivariate analysis. In fact, the first method for dealing with functional data is to discretize them and perform a standard multivariate analysis on the resulting random vectors. The aim of this chapter is to introduce functional data analysis (FDA) and to discuss the practical usage and implementation of FDA methods.
This chapter is organized as follows: Section 16.1 defines the basic mathematical and statistical framework for FDA, and Section 16.2 introduces the most popular implementation of functional data analysis, the functional basis expansion. In Section 16.4 we present the basic theory of functional principal components and smoothed functional principal components, together with a practical application to the temperature data set of the Canadian weather stations.
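The "discretize, then apply standard multivariate analysis" idea mentioned in this abstract can be sketched as follows. This is a minimal illustration that assumes every curve is observed on the same grid. It extracts only the leading principal component by power iteration, and it uses synthetic curves rather than the Canadian temperature data. It is neither the basis-expansion approach nor the XploRe implementation described in the chapter.

```python
import math


def first_principal_component(curves, iterations=100):
    """Mean curve and leading eigenvector of the sample covariance matrix.

    Each curve is a list of values on a common grid. Power iteration assumes
    the start vector is not orthogonal to the leading eigenvector.
    """
    n, p = len(curves), len(curves[0])
    mean = [sum(c[j] for c in curves) / n for j in range(p)]
    centered = [[c[j] - mean[j] for j in range(p)] for c in curves]
    cov = [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
        for a in range(p)
    ]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


# Synthetic curves: one common shape, sin(pi * t), with varying amplitude.
grid = [j / 10 for j in range(11)]
curves = [[a * math.sin(math.pi * t) for t in grid] for a in (1.0, 2.0, 3.0, 4.0)]

mean_curve, pc1 = first_principal_component(curves)
print([round(x, 3) for x in pc1])
```

Because the synthetic curves differ only in amplitude, the leading component is proportional to the shared shape evaluated on the grid. Real data would call for a full eigendecomposition and, as the chapter discusses, usually some smoothing.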
Part II - Related Sciences | Pp. 305-327
Analysis of Failure Time with Microearthquakes Applications
Graciela Estévez-Pérez; Alejandro Quintela del Rio
Some gene expression data contain outliers and noise because of experimental error. In clustering, outliers and noise can result in false positives and false negatives. This motivates us to develop a weighting method that adjusts the expression data so that the effect of outliers and noise decreases, thereby reducing false positives and false negatives in clustering.
In this paper, we describe the weighting adjustment method and apply it to a yeast cell cycle data set. Based on the adjusted yeast cell cycle expression data, the hierarchical clustering method with a correlation coefficient measure performs better than that based on standardized expression data. The clustering method based on the adjusted data can group some functionally related genes together and yields higher quality clusters.
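Hierarchical clustering with a correlation coefficient measure, as mentioned in this abstract, can be sketched as below. The sketch assumes the distance 1 - r between profiles and average linkage, neither of which is specified in the abstract. The profiles are made up, and the authors' weighting adjustment is not implemented.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_clusters(profiles, k):
    """Average-linkage agglomerative clustering on distance 1 - r, cut at k clusters."""
    clusters = [[i] for i in range(len(profiles))]

    def distance(a, b):
        total = sum(1 - pearson(profiles[i], profiles[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        # Merge the two clusters with the smallest average distance.
        i, j = min(pairs, key=lambda ij: distance(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Made-up profiles: two rising and two falling, at very different magnitudes.
profiles = [
    [1, 2, 3, 4],
    [10, 20, 30, 41],
    [4, 3, 2, 1],
    [40, 31, 20, 10],
]
clusters = correlation_clusters(profiles, k=2)
print(clusters)
```

The correlation measure groups profiles by shape rather than magnitude, so the two rising profiles end up together and the two falling ones together, despite the tenfold difference in scale.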
Part II - Related Sciences | Pp. 329-345
Polychotomous Regression: Application to Landcover Prediction
Frédéric Ferraty; Martin Paegelow; Pascal Sarda
Part II - Related Sciences | Pp. 347-356
The Application of Fuzzy Clustering to Satellite Images Data
Hizir Sofyan; Muzailin Affan; Khaled Bawahidi
Part II - Related Sciences | Pp. 357-366