Publications catalog - books
Statistical Methods for Biostatistics and Related Fields
Wolfgang Härdle; Yuichi Mori; Philippe Vieu
Abstract/Description – provided by the publisher
Not available.
Keywords – provided by the publisher
Not available.
Availability
Detected institution | Publication year | Browse | Download | Request |
---|---|---|---|---|
Not detected | 2007 | SpringerLink | | |
Information
Resource type:
books
Print ISBN
978-3-540-32690-8
Electronic ISBN
978-3-540-32691-5
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2007
Publication rights information
© Springer-Verlag Berlin Heidelberg 2007
Subject coverage
Table of contents
Discriminant Analysis Based on Continuous and Discrete Variables
Avner Bar-Hen; Jean-Jacques Daudin
Part I - Biostatistics | Pp. 3-27
Longitudinal Data Analysis with Linear Regression
Jörg Breitung; Rémy Slama; Axel Werwatz
This paper focuses on the Fréchet distance, introduced by Maurice Fréchet in 1906 to measure the proximity between curves (Fréchet (1906)). The major limitation of this proximity measure is that it is based on the closeness of the values, independently of the local trends. To alleviate this setback, we propose a dissimilarity index that extends the above measure to include information on the dependency between local trends. A synthetic dataset is generated to reproduce and illustrate the conditions under which the Fréchet distance is limited. The proposed dissimilarity index is then compared with the Fréchet distance, and results illustrating its efficiency are reported.
Part I - Biostatistics | Pp. 29-43
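The limitation described in the abstract above can be illustrated with a minimal sketch. It uses the discrete Fréchet distance (a dynamic-programming approximation on sampled points, swapped in here for the continuous definition); the function name and the toy curves are assumptions made for illustration, and the dissimilarity index proposed by the authors is not reproduced.

```python
from math import dist


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two curves given as lists of points."""
    n, m = len(p), len(q)
    # ca[i][j] holds the coupling distance between p[:i+1] and q[:j+1].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_previous = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_previous, d)
    return ca[-1][-1]


# Hypothetical toy curves as (time, value) pairs.
reference = [(0, 0.0), (1, 1.0), (2, 0.0), (3, 1.0)]    # zigzag
same_trend = [(0, 0.5), (1, 1.5), (2, 0.5), (3, 1.5)]   # same zigzag, shifted up
no_trend = [(0, 0.5), (1, 0.5), (2, 0.5), (3, 0.5)]     # flat line

d_same = discrete_frechet(reference, same_trend)
d_flat = discrete_frechet(reference, no_trend)
print(d_same, d_flat)
```

In this toy setting both distances are equal, although only one of the two curves follows the local trends of the reference, which is the kind of situation the abstract refers to.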
A Kernel Method Used for the Analysis of Replicated Micro-array Experiments
Ali Gannoun; Benoît Liquet; Jérôme Saracco; Wolfgang Urfer
Microarrays are part of a new class of biotechnologies that allow the expression levels of thousands of genes to be monitored simultaneously. In microarray data analysis, the comparison of gene expression profiles with respect to different conditions and the selection of biologically interesting genes are crucial tasks. Multivariate statistical methods have been applied to analyze these large data sets. To identify genes with altered expression under two experimental conditions, we describe in this chapter a new nonparametric statistical approach. Specifically, we propose estimating the distributions of a t-type statistic and its null statistic using kernel methods. A comparison of these two distributions by means of a likelihood ratio test can identify genes with significantly changed expression. A method for the calculation of the cut-off point and the acceptance region is also derived. This methodology is applied to a leukemia data set containing expression levels of 7129 genes. The corresponding results are compared to the traditional t-test and the normal mixture model.
Part I - Biostatistics | Pp. 45-61
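The two basic ingredients named in the abstract above, a t-type statistic per gene and a kernel estimate of its distribution, can be sketched as follows. This is a minimal illustration under stated assumptions: a Welch-type statistic and a Gaussian kernel with a hand-picked bandwidth are used, all names and values are hypothetical, and the chapter's null statistic, likelihood ratio comparison, and cut-off calculation are not reproduced.

```python
from math import exp, pi, sqrt
from statistics import mean, variance


def t_statistic(a, b):
    """Welch-type two-sample t statistic for the expression levels of one gene."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def kde(values, h):
    """Return a Gaussian kernel density estimate built from `values` (bandwidth h)."""
    n = len(values)

    def density(z):
        total = sum(exp(-0.5 * ((z - v) / h) ** 2) for v in values)
        return total / (n * h * sqrt(2 * pi))

    return density


# Hypothetical expression levels: gene -> (condition 1 replicates, condition 2 replicates).
genes = {
    "gene_a": ([2.0, 4.0, 6.0], [1.0, 2.0, 3.0]),
    "gene_b": ([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]),
    "gene_c": ([5.0, 5.5, 6.5], [1.0, 1.5, 2.5]),
}
scores = {name: t_statistic(a, b) for name, (a, b) in genes.items()}
f_hat = kde(list(scores.values()), h=1.0)  # bandwidth chosen by hand (assumption)
print(scores)
print(f_hat(0.0))
```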
Kernel Estimates of Hazard Functions for Biomedical Data Sets
Ivana Horová; Jiří Zelinka
Part I - Biostatistics | Pp. 63-86
Partially Linear Models
Wolfgang Härdle; Hua Liang
In this contribution, we have shown how spectrometric data can be successfully analysed by treating them as curve data and applying recent nonparametric methodology for curve data. Note, however, that all the statistical background is presented in a general way (and not only for spectrometric data). Similarly, the XploRe quantlets that we provide can be used directly in any other applied setting involving curve data. For brevity, and because it was not the purpose here, we presented only the results given by the nonparametric functional methodology without discussing any comparison with alternative methods (relevant references on these points are given throughout the contribution).
Also for brevity, we presented only two statistical problems (namely regression from curve data and curve discrimination) among the several problems that can be treated by nonparametric functional methods (on this point too, our contribution contains several references to other problems that could be addressed similarly). We chose these two problems for two reasons: first, they are highly relevant to many applied studies involving curve analysis, and second, their theoretical and practical importance has led to the emergence of various automated computer procedures.
Part I - Biostatistics | Pp. 87-103
Analysis of Contingency Tables
Masahiro Kuroda
Part I - Biostatistics | Pp. 105-124
Identifying Coexpressed Genes
Qihua Wang
Some gene expression data contain outliers and noise because of experimental error. In clustering, outliers and noise can result in false positives and false negatives. This motivates us to develop a weighting method that adjusts the expression data so that the effect of outliers and noise decreases, which in turn reduces false positives and false negatives in clustering.
In this paper, we describe the weighting adjustment method and apply it to a yeast cell cycle data set. Hierarchical clustering with a correlation coefficient measure performs better on the adjusted yeast cell cycle expression data than on the standardized expression data. The clustering method based on the adjusted data can group some functionally related genes together and yields higher quality clusters.
Part I - Biostatistics | Pp. 125-145
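Hierarchical clustering with a correlation coefficient measure, as mentioned in the abstract above, can be sketched in a few lines. The sketch assumes average linkage and the distance 1 - r (r being the Pearson correlation); neither choice is stated in the abstract, the gene names and profiles are hypothetical, and the weighting adjustment itself is not reproduced.

```python
from math import sqrt


def pearson(u, v):
    """Pearson correlation coefficient between two expression profiles."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


def cluster(profiles, k):
    """Agglomerative clustering into k groups (average linkage, distance 1 - r)."""
    names = list(profiles)
    d = {(a, b): 1 - pearson(profiles[a], profiles[b]) for a in names for b in names}
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters.
        return sum(d[a, b] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters.pop(j)  # j > i, so index i stays valid after the pop
        clusters[i].extend(merged)
    return clusters


# Hypothetical expression profiles over four time points.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [8.0, 6.0, 5.0, 2.0],
}
groups = cluster(profiles, k=2)
print(groups)
```

Because the distance depends only on correlation, genes whose profiles rise and fall together end up in the same group regardless of their absolute expression level.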
Bootstrap Methods for Testing Interactions in GAMs
Javier Roca-Pardiñas; Carmen Cadarso-Suárez; Wenceslao González-Manteiga
While several criteria exist for selecting a reasonable subset of variables in the context of PCA, we introduce here variable selection using, among others, criteria from modified PCA (M.PCA).
In order to perform such variable selection via XploRe, the quantlib vaspca, which reads all the necessary quantlets for selection, is first called, and then the quantlet mpca is run using a number of selection parameters.
In the first four sections we present brief explanations of variable selection in PCA, an outline of M.PCA and flows of four selection procedures, based mainly on , , and . In the last two sections, we illustrate the quantlet mpca and its performance by two numerical examples.
Part I - Biostatistics | Pp. 147-166
Survival Trees
Carmela Cappelli; Heping Zhang
Part I - Biostatistics | Pp. 167-179
A Semiparametric Approach to Estimate Reference Curves for Biophysical Properties of the Skin
Jérôme Saracco; Ali Gannoun; Christiane Guinot; Benoît Liquet
Reference curves that take one covariate, such as age, into account are often required in medicine, but simple, systematic, and efficient statistical methods for constructing them are lacking. Classical methods are based on parametric fitting (polynomial curves). In this chapter, we describe a new methodology for estimating reference curves for data sets, based on nonparametric estimation of conditional quantiles. The derived method should be applicable to all clinical or, more generally, biological variables that are measured on a continuous quantitative scale. To avoid the curse of dimensionality when the covariate is multidimensional, a new semiparametric approach is proposed. This procedure combines a dimension-reduction step (based on sliced inverse regression) with a kernel estimation step for the conditional quantiles. The usefulness of this semiparametric estimation procedure is illustrated on a simulated data set and on a real data set collected in order to establish reference curves for biophysical properties of the skin of healthy French women.
Part I - Biostatistics | Pp. 181-205
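The kernel estimation of conditional quantiles mentioned in the abstract above can be sketched for a single covariate. The sketch assumes a Gaussian kernel and inverts a kernel-weighted empirical conditional distribution function; the function names, bandwidth, and toy data are hypothetical, and the sliced inverse regression step used for multidimensional covariates is not shown.

```python
from math import exp


def gaussian_kernel(u):
    """Unnormalized Gaussian kernel (the constant cancels in the weights)."""
    return exp(-0.5 * u * u)


def conditional_quantile(x0, xs, ys, alpha, h):
    """Kernel estimate of the alpha-quantile of Y given X = x0 (bandwidth h)."""
    weights = [gaussian_kernel((x0 - x) / h) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no observation has positive weight near x0")
    # Walk through the responses in increasing order, accumulating the
    # weighted conditional distribution function until it reaches alpha.
    cumulative = 0.0
    for y, w in sorted(zip(ys, weights)):
        cumulative += w / total
        if cumulative >= alpha:
            return y
    return max(ys)


# Hypothetical toy data: two age groups with clearly different responses.
ages = [20, 20, 20, 60, 60, 60]
values = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]

median_at_20 = conditional_quantile(20, ages, values, alpha=0.5, h=1.0)
median_at_60 = conditional_quantile(60, ages, values, alpha=0.5, h=1.0)
print(median_at_20, median_at_60)
```

Evaluating the estimator on a grid of ages for several values of alpha (for example 0.05, 0.5, and 0.95) would trace out a set of reference curves.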