Publications catalog - books
Artificial Intelligence in Medicine: 10th Conference on Artificial Intelligence in Medicine, AIME 2005, Aberdeen, UK, July 23-27, 2005, Proceedings
Silvia Miksch ; Jim Hunter ; Elpida T. Keravnou (eds.)
In conference: 10th Conference on Artificial Intelligence in Medicine in Europe (AIME). Aberdeen, UK. July 23, 2005 - July 27, 2005
Abstract/Description – provided by the publisher
Not available.
Keywords – provided by the publisher
Artificial Intelligence (incl. Robotics); Health Informatics; Image Processing and Computer Vision; Information Systems Applications (incl. Internet); Information Storage and Retrieval; Database Management
Availability
Detected institution | Publication year | Browse | Download | Request |
---|---|---|---|---|
Not detected | 2005 | SpringerLink | | |
Information
Resource type:
books
Print ISBN
978-3-540-27831-3
Electronic ISBN
978-3-540-31884-2
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2005
Copyright information
© Springer-Verlag Berlin Heidelberg 2005
Table of contents
doi: 10.1007/11527770_61
Subgroup Mining for Interactive Knowledge Refinement
Martin Atzmueller; Joachim Baumeister; Achim Hemsing; Ernst-Jürgen Richter; Frank Puppe
When knowledge systems are deployed into a real-world application, then the maintenance of the knowledge is a crucial success factor. In the past, some approaches for the automatic refinement of knowledge bases have been proposed. Many only provide limited control during the modification and refinement process, and often assumptions about the correctness of the knowledge base and case base are made. However, such assumptions do not necessarily hold for real-world applications.
In this paper, we present a novel interactive approach for the user-guided refinement of knowledge bases. Subgroup mining methods are used to discover local patterns that describe factors potentially causing incorrect behavior of the knowledge system. We provide a case study of the presented approach with a fielded system in the medical domain.
- Machine Learning, Knowledge Discovery and Data Mining | Pp. 453-462
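To make the subgroup mining idea in the abstract above concrete, here is a minimal sketch that scores a candidate subgroup by how strongly it is associated with incorrectly solved cases. It uses weighted relative accuracy (WRAcc), a common subgroup quality function; the abstract does not say which quality function the authors use, so this choice, the `wracc` helper, and the toy case records are all assumptions for illustration.

```python
def wracc(cases, condition, target):
    """Weighted relative accuracy of the subgroup selected by `condition`.

    WRAcc = (n / N) * (p - p0), where n is the subgroup size, N the number
    of cases, p the target share in the subgroup and p0 the overall share.
    """
    total = len(cases)
    subgroup = [c for c in cases if condition(c)]
    if total == 0 or not subgroup:
        return 0.0
    p0 = sum(1 for c in cases if target(c)) / total
    p = sum(1 for c in subgroup if target(c)) / len(subgroup)
    return (len(subgroup) / total) * (p - p0)


# Hypothetical case base: "wrong" marks cases the knowledge system solved incorrectly.
cases = [
    {"fever": True, "wrong": True},
    {"fever": True, "wrong": True},
    {"fever": True, "wrong": False},
    {"fever": True, "wrong": True},
    {"fever": False, "wrong": False},
    {"fever": False, "wrong": False},
    {"fever": False, "wrong": True},
    {"fever": False, "wrong": False},
]

with_fever = wracc(cases, lambda c: c["fever"], lambda c: c["wrong"])
without_fever = wracc(cases, lambda c: not c["fever"], lambda c: c["wrong"])
```

A positive score flags a subgroup in which incorrect solutions are over-represented, which a domain expert could then inspect before changing the knowledge base.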
doi: 10.1007/11527770_62
Evidence Accumulation to Identify Discriminatory Signatures in Biomedical Spectra
A. Bamgbade; R. Somorjai; B. Dolenko; E. Pranckeviciene; A. Nikulin; R. Baumgartner
Extraction of meaningful spectral signatures (sets of features) from high-dimensional biomedical datasets is an important stage of biomarker discovery. We present a novel feature extraction algorithm for supervised classification, based on the evidence accumulation framework, originally proposed by Fred and Jain for unsupervised clustering. By taking advantage of the randomness in genetic-algorithm-based feature extraction, we generate interpretable spectral signatures, which serve as hypotheses for corroboration by further research. As a benchmark, we used the state-of-the-art support vector machine classifier. Using external crossvalidation, we were able to obtain candidate biomarkers without sacrificing prediction accuracy.
- Machine Learning, Knowledge Discovery and Data Mining | Pp. 463-467
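The evidence accumulation idea in the abstract above can be sketched as counting how often each feature is chosen across many randomized selection runs and keeping the features that recur. The runs below are hand-written stand-ins for the output of a genetic-algorithm selector, and the function names and the frequency threshold are assumptions, not the authors' procedure.

```python
from collections import Counter


def accumulate_evidence(runs):
    """Return the fraction of runs in which each feature index was selected."""
    counts = Counter()
    for selected in runs:
        counts.update(set(selected))  # count each feature once per run
    n_runs = len(runs)
    return {feature: count / n_runs for feature, count in counts.items()}


def signature(frequencies, threshold=0.5):
    """Features selected in at least `threshold` of the runs, in sorted order."""
    return sorted(f for f, freq in frequencies.items() if freq >= threshold)


# Hypothetical output of four randomized selection runs (feature indices).
runs = [[3, 7, 12], [3, 12, 40], [3, 7, 99], [12, 3, 5]]
freqs = accumulate_evidence(runs)
stable = signature(freqs, threshold=0.75)
```

Features that survive a high threshold are candidates for a spectral signature; the threshold trades signature size against stability.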
doi: 10.1007/11527770_63
On Understanding and Assessing Feature Selection Bias
Šarunas Raudys; Richard Baumgartner; Ray Somorjai
Feature selection in high-dimensional biomedical data, such as gene expression arrays or biomedical spectra, constitutes an important step towards biomarker discovery. Controlling feature selection bias is considered a major issue for a realistic assessment of the feature selection process. We propose a theoretical, probabilistic framework for the analysis of selection bias. In particular, we derive the means of calculating the true selection error when the performance estimates of the feature subsets are mutually dependent and the distribution density of the true error is arbitrary. We demonstrate in an extensive series of experiments the utility of the theoretical derivations with real-world datasets. We discuss the importance of understanding feature selection bias for the small sample size (n) / high dimensionality (p) situation, typical for biomedical data (genomics, proteomics, spectroscopy).
- Machine Learning, Knowledge Discovery and Data Mining | Pp. 468-472
doi: 10.1007/11527770_64
A Model-Based Approach to Visualizing Classification Decisions for Patient Diagnosis
Keith Marsolo; Srinivasan Parthasarathy; Michael Twa; Mark Bullimore
Automated classification systems are often used for patient diagnosis. In many cases, the rationale behind a decision is as important as the decision itself. Here we detail a method of visualizing the criteria used by a decision tree classifier to provide support for clinicians interested in diagnosing corneal disease. We leverage properties of our data transformation to create surfaces highlighting the details deemed important in classification. Preliminary results indicate that the features illustrated by our visualization method are indeed the criteria that often lead to a correct diagnosis and that our system also seems to find favor with practicing clinicians.
- Machine Learning, Knowledge Discovery and Data Mining | Pp. 473-483
doi: 10.1007/11527770_65
Learning Rules from Multisource Data for Cardiac Monitoring
Élisa Fromont; René Quiniou; Marie-Odile Cordier
This paper aims at formalizing the concept of learning rules from multisource data in a cardiac monitoring context. Our method has been implemented and evaluated on learning from data describing cardiac behaviors from different viewpoints, here electrocardiograms and arterial blood pressure measures. In order to cope with the dimensionality problems of multisource learning, we propose an Inductive Logic Programming method using a two-step strategy. Firstly, rules are learned independently from each source. Secondly, the learned rules are used to bias a new learning process from the aggregated data. The results show that the proposed method is much more efficient than learning directly from the aggregated data. Furthermore, it yields rules whose accuracy is better than or equal to that of rules obtained by monosource learning.
- Machine Learning, Knowledge Discovery and Data Mining | Pp. 484-493
doi: 10.1007/11527770_66
Effective Confidence Region Prediction Using Probability Forecasters
David G. Lindsay; Siân Cox
Confidence region prediction is a practically useful extension to the commonly studied pattern recognition problem. Instead of predicting a single label, the constraint is relaxed to allow prediction of a subset of labels given a desired confidence level 1 – δ. Ideally, effective region predictions should be (1) well calibrated – predictive regions at confidence level 1 – δ should err with relative frequency at most δ and (2) be as narrow (or certain) as possible. We present a simple technique to generate confidence region predictions from conditional probability estimates (probability forecasts). We use this ‘conversion’ technique to generate confidence region predictions from probability forecasts output by standard machine learning algorithms when tested on 15 multi-class datasets. Our results show that approximately 44% of experiments demonstrate well-calibrated confidence region predictions, with the k-Nearest Neighbour algorithm tending to perform consistently well across all data. Our results illustrate the practical benefits of effective confidence region prediction with respect to medical diagnostics, where guarantees of capturing the true disease label can be given.
- Machine Learning, Knowledge Discovery and Data Mining | Pp. 494-503
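One plausible way to turn a probability forecast into a confidence region, given here only as a sketch, is to add labels in order of decreasing forecast probability until their total mass reaches the desired confidence level. The abstract does not spell out the authors' conversion technique, so this rule, the function names, and the toy forecast are assumptions; `delta` stands for the permitted error rate.

```python
def confidence_region(forecast, delta):
    """Top-ranked labels whose summed forecast probability reaches 1 - delta.

    `forecast` maps each label to its estimated conditional probability.
    """
    if not 0.0 <= delta < 1.0:
        raise ValueError("delta must lie in [0, 1)")
    ranked = sorted(forecast.items(), key=lambda item: item[1], reverse=True)
    region, mass = [], 0.0
    for label, probability in ranked:
        region.append(label)
        mass += probability
        if mass >= 1.0 - delta:
            break
    return region


def error_rate(regions, true_labels):
    """Fraction of cases whose true label falls outside the predicted region."""
    misses = sum(1 for region, label in zip(regions, true_labels) if label not in region)
    return misses / len(true_labels)


# Hypothetical forecast for one patient.
forecast = {"flu": 0.7, "cold": 0.2, "allergy": 0.1}
narrow = confidence_region(forecast, delta=0.5)
wider = confidence_region(forecast, delta=0.25)
widest = confidence_region(forecast, delta=0.05)
```

Comparing `error_rate` on held-out cases against `delta` is a simple empirical check of calibration, while the average region size measures how narrow the predictions are.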
doi: 10.1007/11527770_67
Signature Recognition Methods for Identifying Influenza Sequences
Jitimon Keinduangjun; Punpiti Piamsa-nga; Yong Poovorawan
One of the most important issues in identifying biological sequences is accuracy; however, because of the exponential growth and great diversity of biological data, the requirement to compute within a reasonable time usually compromises accuracy. We propose novel approaches for accurately identifying DNA sequences in less time by discovering sequence patterns (signatures) that carry enough distinctive information for sequence identification. The approaches find the best combination of n-gram patterns and six statistical scoring algorithms, which are regularly used in Information Retrieval research, and then employ the signatures to create a similarity scoring model for identifying the DNA. We develop two approaches to discover the signatures. The first uses only statistical information extracted directly from the sequences. The second uses prior knowledge of the DNA in the signature discovery process. From our experiments on influenza virus, we found that: 1) our technique can identify the influenza virus with an accuracy of up to 99.69% when 11-grams are used and prior knowledge is applied; 2) signatures that are too short or too long produce lower efficiency; and 3) most scoring algorithms are good for identification except the “”, whose results are approximately 9% lower than the others. Moreover, this technique can be applied to identifying other organisms.
- Machine Learning, Knowledge Discovery and Data Mining | Pp. 504-513
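A minimal sketch of n-gram signatures, under simplifying assumptions: a signature is taken to be any n-gram present in every target sequence and absent from all other sequences, and a query is scored by the share of signatures it contains. This selection rule replaces the six statistical scoring algorithms mentioned in the abstract, which are not named here; the helper names and toy sequences are hypothetical.

```python
def ngrams(sequence, n):
    """Set of all length-n substrings of a sequence."""
    return {sequence[i:i + n] for i in range(len(sequence) - n + 1)}


def discover_signatures(targets, others, n):
    """n-grams shared by every target sequence and absent from all others.

    Assumes `targets` is non-empty.
    """
    shared = set.intersection(*(ngrams(s, n) for s in targets))
    background = set().union(*(ngrams(s, n) for s in others))
    return shared - background


def similarity(query, signatures, n):
    """Fraction of the signatures that occur in the query sequence."""
    if not signatures:
        return 0.0
    return len(ngrams(query, n) & signatures) / len(signatures)


# Toy sequences, far shorter than real viral genomes.
targets = ["ACGTAC", "TACGTA"]
others = ["GGGTAC"]
signatures = discover_signatures(targets, others, n=3)

score_match = similarity("AACGTT", signatures, n=3)
score_partial = similarity("TTACGA", signatures, n=3)
score_none = similarity("GGGTAC", signatures, n=3)
```

With very small n most n-grams also appear in the background and few signatures survive, while with very large n few n-grams are shared by all targets, which is consistent with the abstract's remark that signatures that are too short or too long work less well.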
doi: 10.1007/11527770_68
Conquering the Curse of Dimensionality in Gene Expression Cancer Diagnosis: Tough Problem, Simple Models
Minca Mramor; Gregor Leban; Janez Demšar; Blaž Zupan
In this paper we study the properties of cancer gene expression data sets from the perspective of classification and tumor diagnosis. Our findings and case studies are based on several recently published data sets. We find that these data sets typically include a subset of about 100 highly discriminating features whose predictive power can be further enhanced by exploring their interactions. This finding speaks against the often-used univariate feature selection methods, and may explain the superior performance of support vector machines recently reported in related work. We argue that a much simpler technique that directly finds visualizations with clear separation of diagnostic classes may be used instead. Furthermore, it may perform better at inferring an understandable classifier that includes only a few relevant features.
- Machine Learning, Knowledge Discovery and Data Mining | Pp. 514-523
doi: 10.1007/11527770_69
An Algorithm to Learn Causal Relations Between Genes from Steady State Data: Simulation and Its Application to Melanoma Dataset
Xin Zhang; Chitta Baral; Seungchan Kim
In recent years, a few researchers have challenged past dogma and suggested methods (such as the IC algorithm) for inferring relationships among variables using steady state observations. In this paper, we present a modified IC (mIC) algorithm that uses entropy to test conditional independence and combines the steady state data with partial prior knowledge of the topological ordering in a gene regulatory network to jointly learn the causal relationships among genes. We evaluate our mIC algorithm using simulated data. The results show that the precision and recall rates are significantly improved compared with the IC algorithm. Finally, we apply the mIC algorithm to microarray data for melanoma. The algorithm identified the important causal relations associated with WNT5A, a gene playing an important role in melanoma, as verified by the literature.
- Machine Learning, Knowledge Discovery and Data Mining | Pp. 524-534
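An entropy-based conditional independence test can be sketched with conditional mutual information, I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z), estimated from discretized observations. This is a generic illustration rather than the mIC algorithm itself; the function names, the decision threshold, and the toy data are assumptions.

```python
import math
from collections import Counter


def entropy(outcomes):
    """Empirical Shannon entropy, in bits, of a list of hashable outcomes."""
    n = len(outcomes)
    return -sum((c / n) * math.log2(c / n) for c in Counter(outcomes).values())


def conditional_mutual_information(x, y, z):
    """Estimate I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z) from samples."""
    xz = list(zip(x, z))
    yz = list(zip(y, z))
    xyz = list(zip(x, y, z))
    return entropy(xz) + entropy(yz) - entropy(xyz) - entropy(list(z))


def conditionally_independent(x, y, z, threshold=1e-9):
    """Treat X and Y as independent given Z when the estimate is near zero.

    The threshold is a hypothetical choice; real data would need a
    statistically motivated cut-off.
    """
    return conditional_mutual_information(x, y, z) <= threshold


# Toy discretized expression levels for three genes.
z = [0, 0, 0, 0, 1, 1, 1, 1]
x = [0, 0, 1, 1, 0, 0, 1, 1]
y_unrelated = [0, 1, 0, 1, 0, 1, 0, 1]  # independent of x within each z value
y_copy = list(x)                         # fully determined by x

cmi_unrelated = conditional_mutual_information(x, y_unrelated, z)
cmi_copy = conditional_mutual_information(x, y_copy, z)
```

In a constraint-based search of the IC kind, an edge between two genes would be dropped whenever some conditioning set makes them conditionally independent.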
doi: 10.1007/11527770_70
Relation Mining over a Corpus of Scientific Literature
Fabio Rinaldi; Gerold Schneider; Kaarel Kaljurand; Michael Hess; Christos Andronis; Andreas Persidis; Ourania Konstanti
The number of new discoveries (as published in the scientific literature) in the area of Molecular Biology is currently growing at an exponential rate. This growth makes it very difficult to filter the most relevant results, and the extraction of the core information, for inclusion in one of the knowledge resources being maintained by the research community, becomes very expensive. Therefore, there is a growing interest in text processing approaches that can deliver selected information from scientific publications and thereby limit the amount of human intervention normally needed to gather those results.
This paper presents and evaluates an approach aimed at automating the process of extracting semantic relations (e.g. interactions between genes and proteins) from scientific literature in the domain of Molecular Biology. The approach, using a novel dependency-based parser, is based on a complete syntactic analysis of the corpus.
- Machine Learning, Knowledge Discovery and Data Mining | Pp. 535-544