Publications catalog - books


Statistical Methods for Biostatistics and Related Fields

Wolfgang Härdle; Yuichi Mori; Philippe Vieu

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Not available.

Availability
Detected institution | Year of publication | Browse | Download | Request
Not detected | 2007 | SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-32690-8

Electronic ISBN

978-3-540-32691-5

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Table of contents

Discriminant Analysis Based on Continuous and Discrete Variables

Avner Bar-Hen; Jean-Jacques Daudin

Part I - Biostatistics | Pp. 3-27

Longitudinal Data Analysis with Linear Regression

Jörg Breitung; Rémy Slama; Axel Werwatz

This paper focuses on the Fréchet distance, introduced by Maurice Fréchet in 1906 to account for the proximity between curves (Fréchet, 1906). The major limitation of this proximity measure is that it is based on the closeness of the values, independently of the local trends. To alleviate this drawback, we propose a dissimilarity index that extends the above measure to include information on the dependency between local trends. A synthetic dataset is generated to reproduce and illustrate the conditions under which the Fréchet distance is limited. The proposed dissimilarity index is then compared with the Fréchet distance, and results illustrating its efficiency are reported.
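To make the proximity measure mentioned in this abstract concrete, here is a minimal sketch of the discrete Fréchet distance between two sampled curves, computed by dynamic programming. It is an illustration under stated assumptions, not the authors' code: the discrete variant stands in for the continuous Fréchet distance, the function name `discrete_frechet` is hypothetical, and the proposed dissimilarity index itself is not reproduced.

```python
import math


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two curves sampled as lists of (x, y) points.

    Assumption: this is the discrete (coupling) variant, used here only as a
    stand-in for the continuous Fréchet distance named in the abstract.
    """
    n, m = len(p), len(q)
    # ca[i][j] holds the coupling distance between the prefixes p[:i+1] and q[:j+1].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])  # Euclidean distance between the two points
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_previous = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_previous, d)
    return ca[n - 1][m - 1]


flat_low = [(0, 0), (1, 0), (2, 0)]
flat_high = [(0, 1), (1, 1), (2, 1)]
gap = discrete_frechet(flat_low, flat_high)
```

Because the computation uses only distances between point values, two curves can be close under this measure even when their local trends differ, which is the limitation the abstract describes.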

Part I - Biostatistics | Pp. 29-43

A Kernel Method Used for the Analysis of Replicated Micro-array Experiments

Ali Gannoun; Benoît Liquet; Jérôme Saracco; Wolfgang Urfer

Microarrays are part of a new class of biotechnologies which allow the monitoring of expression levels of thousands of genes simultaneously. In microarray data analysis, the comparison of gene expression profiles with respect to different conditions and the selection of biologically interesting genes are crucial tasks. Multivariate statistical methods have been applied to analyze these large data sets. To identify genes with altered expression under two experimental conditions, we describe in this chapter a new nonparametric statistical approach. Specifically, we propose estimating the distributions of a t-type statistic and its null statistic, using kernel methods. A comparison of these two distributions by means of a likelihood ratio test can identify genes with significantly changed expressions. A method for the calculation of the cut-off point and the acceptance region is also derived. This methodology is applied to a leukemia data set containing expression levels of 7129 genes. The corresponding results are compared to the traditional t-test and the normal mixture model.
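The two ingredients named in this abstract, a t-type statistic per gene and a kernel estimate of its distribution, can be sketched as follows. This is a rough illustration only: the Welch-type form of the statistic, the Gaussian kernel, the fixed bandwidth, and the function names `t_statistic` and `kernel_density` are all assumptions, and the chapter's null statistic, likelihood ratio test, and cut-off calculation are not reproduced.

```python
import math
from statistics import mean, variance


def t_statistic(a, b):
    """Welch-type two-sample t statistic for one gene's replicates (assumed form)."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def kernel_density(values, bandwidth):
    """Return a Gaussian kernel density estimate built from the sample `values`."""
    norm = len(values) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        # Sum of Gaussian bumps centred on each observed value.
        return sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values) / norm

    return density


# Hypothetical toy data: three replicates per condition for two genes.
genes = [
    ([2.0, 3.0, 4.0], [1.0, 2.0, 3.0]),
    ([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]),
]
stats = [t_statistic(a, b) for a, b in genes]
density = kernel_density(stats, bandwidth=0.5)
```

In practice the density of the observed statistics would be compared with a density estimated from null statistics; how those are built is specific to the chapter and is left out here.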

Part I - Biostatistics | Pp. 45-61

Kernel Estimates of Hazard Functions for Biomedical Data Sets

Ivana Horová; Jiří Zelinka

Part I - Biostatistics | Pp. 63-86

Partially Linear Models

Wolfgang Härdle; Hua Liang

In this contribution, we have shown how spectrometric data can be successfully analysed by treating them as curve data and applying recent nonparametric methodology for curve data. Note, however, that all the statistical background is presented in a general way (and not only for spectrometric data). Similarly, the XploRe quantlets that we provide can be used directly in any other applied setting involving curve data. For brevity, and because it was not our purpose here, we presented only the results given by the nonparametric functional methodology without discussing any comparison with alternative methods (relevant references on these points are given throughout the contribution).

Also for brevity, we presented just two statistical problems (namely regression from curve data and curve discrimination) among the many problems that can be treated by nonparametric functional methods (on this point too, our contribution contains several references to other problems that could be approached similarly). We chose these two problems for two reasons: first, they are highly relevant to many applied studies involving curve analysis; second, their theoretical and practical importance has led to the emergence of various automated computer procedures.

Part I - Biostatistics | Pp. 87-103

Analysis of Contingency Tables

Masahiro Kuroda

Part I - Biostatistics | Pp. 105-124

Identifying Coexpressed Genes

Qihua Wang

Some gene expression data contain outliers and noise because of experimental error. In clustering, outliers and noise can result in false positives and false negatives. This motivates us to develop a weighting method that adjusts the expression data so that the effect of outliers and noise decreases, thereby reducing false positives and false negatives in clustering.

In this paper, we describe the weighting adjustment method and apply it to a yeast cell cycle data set. Based on the adjusted yeast cell cycle expression data, the hierarchical clustering method with a correlation coefficient measure performs better than that based on standardized expression data. The clustering method based on the adjusted data can group some functionally related genes together and yields higher quality clusters.

Part I - Biostatistics | Pp. 125-145

Bootstrap Methods for Testing Interactions in GAMs

Javier Roca-Pardiñas; Carmen Cadarso-Suárez; Wenceslao González-Manteiga

While there exist several criteria by which to select a reasonable subset of variables in the context of PCA, we introduce herein variable selection using criteria in modified PCA (M.PCA) among others.

In order to perform such variable selection via XploRe, the quantlib vaspca, which reads all the necessary quantlets for selection, is first called, and then the quantlet mpca is run using a number of selection parameters.

In the first four sections we present brief explanations of variable selection in PCA, an outline of M.PCA and flows of four selection procedures, based mainly on , , and . In the last two sections, we illustrate the quantlet mpca and its performance by two numerical examples.

Part I - Biostatistics | Pp. 147-166

Survival Trees

Carmela Cappelli; Heping Zhang

Part I - Biostatistics | Pp. 167-179

A Semiparametric Approach to Estimate Reference Curves for Biophysical Properties of the Skin

Jérôme Saracco; Ali Gannoun; Christiane Guinot; Benoît Liquet

Reference curves that take one covariate, such as age, into account are often required in medicine, but simple, systematic, and efficient statistical methods for constructing them are lacking. Classical methods are based on parametric fitting (polynomial curves). In this chapter, we describe a new methodology for the estimation of reference curves for data sets, based on nonparametric estimation of conditional quantiles. The derived method should be applicable to all clinical or, more generally, biological variables that are measured on a continuous quantitative scale. To avoid the curse of dimensionality when the covariate is multidimensional, a new semiparametric approach is proposed. This procedure combines a dimension-reduction step (based on sliced inverse regression) with a kernel estimation step for conditional quantiles. The usefulness of this semiparametric estimation procedure is illustrated on a simulated data set and on a real data set collected in order to establish reference curves for biophysical properties of the skin of healthy French women.
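The kernel estimation step for conditional quantiles mentioned in this abstract can be sketched for a single covariate as follows. This is a minimal illustration under assumptions, not the chapter's procedure: it inverts a kernel-weighted empirical distribution function with a Gaussian kernel and a fixed bandwidth, the function names and toy data are hypothetical, and the sliced inverse regression step used for multidimensional covariates is omitted.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def conditional_quantile(xs, ys, x0, alpha, bandwidth):
    """Kernel estimate of the alpha-quantile of Y given X = x0 (one covariate).

    Assumption: the quantile is read off a kernel-weighted empirical
    distribution function of the responses.
    """
    weights = [gaussian_kernel((x0 - x) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no observations carry weight at x0; increase the bandwidth")
    cumulative = 0.0
    # Walk through the responses in increasing order, accumulating normalized weights.
    for y, w in sorted(zip(ys, weights)):
        cumulative += w / total
        if cumulative >= alpha:
            return y
    return max(ys)


# Hypothetical toy data: a skin measurement observed at two ages.
ages = [20, 20, 20, 60, 60, 60]
values = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
median_at_20 = conditional_quantile(ages, values, x0=20, alpha=0.5, bandwidth=5.0)
median_at_60 = conditional_quantile(ages, values, x0=60, alpha=0.5, bandwidth=5.0)
```

Evaluating the function over a grid of ages for, say, alpha = 0.05, 0.5, and 0.95 would trace out a set of reference curves.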

Part I - Biostatistics | Pp. 181-205