Publications catalog - books
Methods of Microarray Data Analysis
Jennifer S. Shoemaker ; Simon M. Lin (eds.)
IV.
Abstract/Description – provided by the publisher
Not available.
Keywords – provided by the publisher
Human Genetics; Cancer Research
Availability
| Detected institution | Publication year | Browse | Download | Request |
|---|---|---|---|---|
| Not detected | 2005 | SpringerLink | | |
Information
Resource type:
books
Print ISBN
978-0-387-23074-0
Electronic ISBN
978-0-387-23077-1
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2005
Copyright information
© Springer Science + Business Media, Inc. Boston 2005
Subject coverage
Table of contents
Probabilistic Lung Cancer Models Conditioned on Gene Expression Microarray Data
Craig Friedman; Wenbo Cao; Cheng Fan
A number of quantitative methods have been applied to the classification and clustering of microarray data (see, for example, [Tibshirani et al., 2001]). In this article, we describe a statistical learning theory-based method to construct lung cancer probability models that are conditioned on gene expression microarray data. Our models do more than classify: they provide an estimate of the probability. We find our estimate for the conditional probability distribution by choosing a model that balances consistency with the training data and consistency with a prior distribution. This formulation leads to an optimization problem that has a mathematically equivalent problem with an objective function that is a penalized log-likelihood. We discuss three particular estimation problems: 1) find the conditional probability that a sample is adenocarcinoma or normal, given gene expression levels, 2) find the conditional probability for each of six disjoint categories related to lung cancer, given gene expression levels, and 3) find the conditional probability distribution for survival time, given gene expression levels. We describe the features that we select and measure the performance of the models that we create in economic terms. For the conditional probability of adenocarcinoma, we condition on probeset identifiers common to both the Harvard and Michigan data sets. When we trained on either data set, we were able to nearly perfectly classify adenocarcinoma on the other set.
Keywords: Microarray; ontology; adenocarcinoma; conditional probability; gene expression; features.
Pp. 133-146
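The abstract above mentions an objective that is a penalized log-likelihood. As a rough illustration only, the sketch below fits a two-class logistic model by maximizing a log-likelihood minus an L2 penalty. The logistic form, the L2 penalty (which corresponds to a Gaussian prior on the weights), the function names, and the toy data are all assumptions made for this example; the chapter's actual model and prior are not specified in this record.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_probability(weights, bias, x):
    # Estimated probability that sample x belongs to class 1.
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))

def penalized_log_likelihood(weights, bias, samples, labels, penalty):
    # Log-likelihood of the labels minus an L2 penalty on the weights.
    total = 0.0
    for x, y in zip(samples, labels):
        p = predict_probability(weights, bias, x)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total - penalty * sum(w * w for w in weights)

def fit(samples, labels, penalty=0.1, rate=0.1, steps=2000):
    # Maximize the penalized log-likelihood by plain gradient ascent.
    n_features = len(samples[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(samples, labels):
            err = y - predict_probability(weights, bias, x)
            grad_b += err
            for j, v in enumerate(x):
                grad_w[j] += err * v
        for j in range(n_features):
            weights[j] += rate * (grad_w[j] - 2.0 * penalty * weights[j])
        bias += rate * grad_b
    return weights, bias

# Hypothetical toy data: two expression features per sample, label 1 = tumor.
samples = [[2.0, 0.5], [1.8, 0.7], [2.2, 0.4], [0.3, 1.9], [0.5, 2.1], [0.2, 1.7]]
labels = [1, 1, 1, 0, 0, 0]
weights, bias = fit(samples, labels)
print(predict_probability(weights, bias, [2.0, 0.6]))
```

The output is a probability rather than a hard class label, which is the distinction the abstract draws between probability models and plain classifiers.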
Integration of Microarray Data for a Comparative Study of Classifiers and Identification of Marker Genes
Daniel Berrar; Brian Sturgeon; Ian Bradbury; C. Stephen Downes; Werner Dubitzky
Novel diagnostic tools promise the development of patient-tailored cancer treatment. However, one major step towards individualized therapy is to use a combination of various data sources, e.g. transcriptomic, proteomic, and clinical data. We have integrated clinical data and lung cancer microarray data that were generated on two different oligonucleotide platforms. We were interested in the question of whether the prediction of survival outcome benefits from the integration of clinical and transcriptomic data. In addition, we attempted to identify those genes whose expression profiles correlate with survival outcome. We applied five machine learning techniques to predict survival risk groups, and we compared the models with respect to their performance and general user acceptance. Based on quantitative and qualitative evaluation criteria, we chose decision trees as the most relevant technique for this type of analysis. Our in silico analysis corroborates the role of numerous marker genes already described in lung adenocarcinomas. In addition, our study reveals a set of highly interesting genes whose expression profiles correlate with genetic risk groups of unexpected survival outcomes.
Keywords: Microarray; lung cancer; survival analysis; machine learning.
Pp. 147-162
Use of Micro Array Data via Model-based Classification in the Study and Prediction of Survival from Lung Cancer
Liat Ben-Tovim Jones; Shu-Kay Ng; Christophe Ambroise; Katrina Monico; Nazim Khan; Geoff McLachlan
We applied a model-based clustering approach to classify tumor tissues on the basis of microarray gene expression. The impact of this classification on cancer biology and clinical outcome was studied. In particular, the association between the clusters so formed and patient survival (recurrence) times was examined. The approach was illustrated using the four CAMDA’03 lung cancer datasets. We showed that the gene expression-based clustering is a powerful predictor of the outcome of disease, in addition to current systems based on histopathology criteria and extent of disease at presentation.
Keywords: Mixture models; EMMIX-GENE algorithm; selection bias; microarrays; survival analysis; Cox proportional hazards; Kaplan-Meier survival curve.
Pp. 163-173
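The entry above lists the Kaplan-Meier survival curve among its keywords and relates clusters to survival times. A minimal sketch of the standard Kaplan-Meier product-limit estimator is given here for orientation. The function name, the cluster labels, and the follow-up data are hypothetical, and this is not the chapter's own code.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  -- follow-up time for each patient
    events -- 1 if the event (death or recurrence) was observed, 0 if censored
    """
    survival, curve = 1.0, []
    for t in sorted(set(times)):
        # Patients still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        # Events observed exactly at time t (censored cases do not count).
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
    return curve

# Hypothetical follow-up times (months) for two expression-based clusters.
cluster_a = kaplan_meier([5, 8, 8, 12, 15], [1, 1, 0, 1, 0])
cluster_b = kaplan_meier([20, 25, 30, 30, 40], [0, 1, 0, 0, 0])
for name, curve in (("cluster A", cluster_a), ("cluster B", cluster_b)):
    print(name, curve)
```

Comparing the per-cluster curves gives a first visual impression of whether cluster membership is associated with outcome; a formal comparison would need a test such as the log-rank test.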
Microarray Data Analysis of Survival Times of Patients with Lung Adenocarcinomas Using ADC and K-Medians Clustering
Wenting Zhou; Weichen Wu; Nathan Palmer; Emily Mower; Noah Daniels; Lenore Cowen; Anselm Blumer
We experiment with two types of clustering, K-medians and a dimension-reduction technique known as approximate distance clustering (ADC) [Cowen and Priebe 1997], for classifying lung adenocarcinomas into high-risk and low-risk groups according to gene expression values from microarray data. The microarrays were Affymetrix oligonucleotide arrays used in studies at Michigan and Harvard, with 12,600 and 7,129 probesets respectively. We show that we can obtain accurate classification based on a reduced set of genes obtained by nearest shrunken mean (NSM) [Tibshirani et al. 2002] or a combination of a variance-based approach with hierarchical clustering. The quality of the clustering is measured by using the p-values from log-rank tests, and the results are confirmed using cross-validation and by using the reduced set of genes obtained from one dataset to cluster the other.
Keywords: Microarray; ADC clustering; K-medians; adenocarcinoma; survival time.
Pp. 175-190
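For readers unfamiliar with K-medians, the following is a minimal sketch of the general technique: points are assigned to the nearest center under Manhattan (L1) distance, and each center is updated to the coordinate-wise median of its members. The initialization from the first k points, the function names, and the toy data are assumptions for illustration and are not taken from the chapter.

```python
import statistics

def manhattan(a, b):
    # L1 distance between two expression profiles.
    return sum(abs(x - y) for x, y in zip(a, b))

def k_medians(points, k, max_iter=100):
    """Cluster points with L1 distance and coordinate-wise median centers."""
    centers = [list(p) for p in points[:k]]  # simple deterministic start (an assumption)
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(k), key=lambda c: manhattan(p, centers[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # assignments stopped changing
        assignment = new_assignment
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = [statistics.median(col) for col in zip(*members)]
    return assignment, centers

# Hypothetical two-gene profiles forming two well-separated groups.
points = [[1.0, 1.2], [5.0, 5.2], [0.8, 1.0], [5.1, 4.9], [1.1, 0.9], [4.8, 5.0]]
assignment, centers = k_medians(points, k=2)
print(assignment, centers)
```

Medians make the centers less sensitive to outlying expression values than the means used in K-means, which is the usual motivation for the variant.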
Higher Dimensional Approach for Classification of Lung Cancer Microarray Data
F. Crimins; R. Dimitri; T. Klein; N. Palmer; L. Cowen
A lung cancer microarray dataset is re-examined using simple techniques, but retaining more of the high-dimensional structure. In particular, instead of discarding genes that look uninformative when considered in isolation, pairs, triples and quartets of genes are selected using kNN classifiers. Genes of potential biological importance are also uncovered.
Keywords: Lung cancer; microarray; classification; high-dimensional data.
Pp. 191-205
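The idea of selecting gene pairs rather than single genes can be sketched as follows: score every gene subset of a given size by the leave-one-out accuracy of a kNN classifier restricted to those genes, and keep the best. The scoring rule, the function names, and the toy data (constructed so that genes 0 and 1 are uninformative alone but informative together) are assumptions for illustration; the chapter's actual selection procedure is not described in this record.

```python
from itertools import combinations

def loo_knn_accuracy(samples, labels, genes, k=1):
    """Leave-one-out accuracy of a kNN classifier restricted to `genes`."""
    correct = 0
    for i, x in enumerate(samples):
        # Squared Euclidean distance to every other sample on the chosen genes.
        neighbors = sorted(
            (sum((x[g] - y[g]) ** 2 for g in genes), j)
            for j, y in enumerate(samples) if j != i
        )[:k]
        votes = [labels[j] for _, j in neighbors]
        predicted = max(set(votes), key=votes.count)
        correct += predicted == labels[i]
    return correct / len(samples)

def best_gene_subset(samples, labels, size=2, k=1):
    # Exhaustive search over all gene subsets of the given size.
    n_genes = len(samples[0])
    return max(
        combinations(range(n_genes), size),
        key=lambda genes: loo_knn_accuracy(samples, labels, genes, k),
    )

# Hypothetical data: the label depends jointly on genes 0 and 1 (XOR-like),
# while gene 2 carries no class information.
samples = [
    [0.0, 0.0, 0.5], [0.1, 0.1, 0.2], [1.0, 1.0, 0.7], [0.9, 0.9, 0.4],
    [0.0, 1.0, 0.5], [0.1, 0.9, 0.2], [1.0, 0.0, 0.7], [0.9, 0.1, 0.4],
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
print(best_gene_subset(samples, labels, size=2))
```

Exhaustive search grows combinatorially with the number of genes, so on real arrays with thousands of probesets some pre-filtering or heuristic search would be needed.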
Microarray Data Analysis Using Neural Network Classifiers and Gene Selection Methods
Gaolin Zheng; E. Olusegun George; Giri Narasimhan
Different research groups have conducted independent gene expression studies on tissue samples from human lung adenocarcinomas [Bhattacharjee et al. 2001; Beer et al. 2002]. In this paper we (a) investigate methods to integrate data obtained from independent studies, (b) experiment with different gene selection methods to find genes that have significantly differential expression among different tumor stages, (c) study the performance of neural network classifiers with correlated weights, and (d) compare the performance of classifiers based on neural networks and their many variants on gene expression data. Raw cell intensity data were preprocessed for our analyses. Affymetrix array comparison spreadsheets were used to extract the overlapping probe sets for the data integration study. We considered neural network classifiers with random weights selected from a univariate normal distribution and optimized using Bayesian methods. The performance of the neural network was further enhanced using ensemble techniques such as bagging and boosting. The performance of all the resulting classifiers was compared using the Michigan and Harvard data sets from the CAMDA website. Three gene selection methods were used to find significant genes that could discriminate between the various stages of lung cancer. Significant genes, which were mined from the Gene Ontology (GO) database using the GoMiner and AmiGO packages, were found to be involved in apoptosis, angiogenesis, and cell growth and differentiation. Neural networks enhanced with bagging exhibited the best performance among all the classifiers we tested.
Keywords: Microarray; lung adenocarcinoma; robust multiarray averaging; gene selection; neural network classifiers; gene ontology.
Pp. 207-222
A Combinatorial Approach to the Analysis of Differential Gene Expression Data
Michael A. Langston; Lan Lin; Xinxia Peng; Nicole E. Baldwin; Christopher T. Symons; Bing Zhang; Jay R. Snoddy
Combinatorial methods are studied in an effort to gauge their potential utility in the analysis of differential gene expression data. Patient and gene relationships are modeled using edge-weighted graphs. Two algorithms with different but complementary approaches are devised and implemented. One is based on finding optimal cliques within general graphs, the other on isolating near-optimal dominating sets within bipartite graphs. A main goal is to develop methodologies for training algorithms on patient populations with known disease profiles, so that they can be employed to classify and predict the likelihood of disease in patient populations whose profiles are not known. These novel strategies are in marked contrast with Bayesian and other well-known techniques. Encouraging results are reported.
Keywords: Combinatorial methods; discrete mathematics; disease prediction and screening; graph algorithms; graph theory; microarray analysis.
Pp. 223-238
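As background for the clique-based approach mentioned above, the sketch below thresholds an edge-weighted graph and enumerates its maximal cliques with the classic Bron-Kerbosch algorithm with pivoting. This is a textbook method chosen for illustration, not necessarily the algorithm devised in the chapter; the gene names, edge weights, and cutoff are hypothetical.

```python
def threshold_graph(weights, cutoff):
    """Keep an undirected edge only when its weight reaches the cutoff."""
    adjacency = {}
    for (u, v), w in weights.items():
        adjacency.setdefault(u, set())
        adjacency.setdefault(v, set())
        if w >= cutoff:
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency

def maximal_cliques(adjacency):
    """Yield every maximal clique (Bron-Kerbosch with pivoting)."""
    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            yield set(clique)
            return
        # Pivot on the vertex with the most neighbors among the candidates.
        pivot = max(candidates | excluded,
                    key=lambda v: len(adjacency[v] & candidates))
        for v in list(candidates - adjacency[pivot]):
            yield from expand(clique | {v},
                              candidates & adjacency[v],
                              excluded & adjacency[v])
            candidates = candidates - {v}
            excluded = excluded | {v}
    yield from expand(set(), set(adjacency), set())

# Hypothetical pairwise weights between genes (e.g. correlation strength).
weights = {
    ("g1", "g2"): 0.90, ("g1", "g3"): 0.85, ("g2", "g3"): 0.80,
    ("g3", "g4"): 0.75, ("g4", "g5"): 0.30, ("g1", "g4"): 0.20,
}
adjacency = threshold_graph(weights, cutoff=0.7)
cliques = list(maximal_cliques(adjacency))
print(max(cliques, key=len))
```

Maximum clique is NP-hard in general, so exact enumeration like this is only practical on graphs that are small or sparse after thresholding.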
Genes Associated with Prognosis in Adenocarcinoma Across Studies at Multiple Institutions
Andrew V. Kossenkov; Ghislain Bidaut; Michael F. Ochs
Cancer is a complex disease, comprising many different specific malfunctions within the body. Because many biological processes occur simultaneously within all cells, the gene expression related to tumor behavior is generally confounded with expression due to routine metabolic processes and additional processes unrelated to tumorigenesis. Bayesian Decomposition has been used to isolate expression signatures related to these processes as well as signatures related to patient prognosis. The signatures related to prognosis have been analyzed to identify biological processes as well as specific genes whose presence appears related to outcome in all studies.
Keywords: Bayesian methods; gene expression; gene ontology; cancer.
Pp. 239-253