Publications catalogue - books
Machine Learning and Data Mining in Pattern Recognition: 5th International Conference, MLDM 2007, Leipzig, Germany, July 18-20, 2007. Proceedings
Petra Perner (ed.)
Conference: 5th International Conference on Machine Learning and Data Mining in Pattern Recognition (MLDM), Leipzig, Germany, July 18-20, 2007
Abstract/Description – provided by the publisher
Not available.
Keywords – provided by the publisher
Artificial Intelligence (incl. Robotics); Mathematical Logic and Formal Languages; Database Management; Data Mining and Knowledge Discovery; Pattern Recognition; Image Processing and Computer Vision
Availability
| Detected institution | Publication year | Browse | Download | Request |
|---|---|---|---|---|
| Not detected | 2007 | SpringerLink | | |
Information
Resource type:
books
Print ISBN
978-3-540-73498-7
Electronic ISBN
978-3-540-73499-4
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2007
Copyright information
© Springer-Verlag Berlin Heidelberg 2007
Table of contents
Categorizing Evolved CoreWar Warriors Using EM and Attribute Evaluation
Doni Pracner; Nenad Tomašev; Miloš Radovanović; Mirjana Ivanović
CoreWar is a computer simulation in which two programs written in an assembly language called redcode compete in a virtual memory array. These programs are referred to as warriors. Over more than twenty years of development a number of different battle strategies have emerged, making it possible to identify distinct warrior types. Systems for automatic warrior creation appeared more recently, with evolvers being the dominant kind. This paper describes an attempt to analyze the output of the CCAI evolver, and explores the possibilities for automatic categorization by warrior type using representations based on redcode source, as opposed to instruction execution frequency. The analysis was performed using EM clustering, as well as information gain and gain ratio attribute evaluators, and revealed that mainly brute-force types of warriors were being generated. This, along with the observed correlation between the clustering and the workings of the evolutionary algorithm, justifies our approach and calls for more extensive experiments based on annotated warrior benchmark collections.
- Image Mining | Pp. 681-693
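The pipeline described above — cluster feature vectors with EM, then rank attributes by information gain against the resulting clusters — can be sketched on synthetic data. Everything below (the two-attribute toy features, the spherical two-component mixture, the quantile discretization) is an illustrative assumption, not the authors' setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for redcode-derived feature vectors: two warrior "types"
# differing mainly in the frequency of one instruction (attribute 0).
X = np.vstack([rng.normal([0.8, 0.5], 0.1, size=(50, 2)),
               rng.normal([0.2, 0.5], 0.1, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)  # reference labels, used only for checking

# --- EM for a 2-component spherical Gaussian mixture ---
mu = X[[0, -1]].copy()              # deterministic initial means
var, pi = np.ones(2), np.full(2, 0.5)
for _ in range(50):
    # E-step: responsibilities (2-D spherical density, up to a constant)
    d2 = ((X[:, None, :] - mu[None]) ** 2).sum(-1)
    resp = pi * np.exp(-d2 / (2 * var)) / var
    resp /= resp.sum(1, keepdims=True)
    # M-step: update mixture parameters
    nk = resp.sum(0)
    mu = (resp.T @ X) / nk[:, None]
    var = (resp * d2).sum(0) / (2 * nk)
    pi = nk / len(X)
clusters = resp.argmax(1)

# --- information gain of each (discretized) attribute w.r.t. the clusters ---
def entropy(labels):
    p = np.bincount(labels) / len(labels)
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

def info_gain(attr, labels, bins=4):
    cut = np.digitize(attr, np.quantile(attr, [0.25, 0.5, 0.75]))
    h = entropy(labels)
    for b in range(bins):
        mask = cut == b
        if mask.any():
            h -= mask.mean() * entropy(labels[mask])
    return h

gains = [info_gain(X[:, j], clusters) for j in range(X.shape[1])]
# Attribute 0 separates the two types, so it carries the higher gain.
```

The gain-ratio evaluator mentioned in the abstract would simply divide each gain by the split entropy of the discretized attribute.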
Restricted Sequential Floating Search Applied to Object Selection
J. Arturo Olvera-López; J. Francisco Martínez-Trinidad; J. Ariel Carrasco-Ochoa
Object selection is an important task for instance-based classifiers: through this process the size of the training set can be reduced, which in turn reduces the runtimes of both the training and classification steps. Several object selection methods have been proposed, but some of them discard objects that are relevant for classification. In this paper, we propose an object selection method based on the idea of sequential floating search; it reconsiders the inclusion of relevant objects that were previously discarded. Experimental results obtained with our method are shown and compared against other object selection methods.
- Image Mining | Pp. 694-702
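A minimal sketch of floating-style instance selection with a 1-NN criterion: greedily discard objects whose removal does not hurt leave-one-out accuracy, then reconsider previously discarded objects. The toy data and the exact acceptance rule are assumptions; the paper's restricted variant differs in its details:

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy 2-class training set
X = np.vstack([rng.normal(0, 1, (30, 2)), rng.normal(3, 1, (30, 2))])
y = np.array([0] * 30 + [1] * 30)

def loo_1nn_acc(idx):
    """Leave-one-out 1-NN accuracy of the full set classified by subset idx."""
    correct = 0
    for i in range(len(X)):
        pool = [j for j in idx if j != i]
        if not pool:
            return 0.0
        d = np.linalg.norm(X[pool] - X[i], axis=1)
        correct += y[pool[int(d.argmin())]] == y[i]
    return correct / len(X)

selected = list(range(len(X)))
best = loo_1nn_acc(selected)
for _ in range(20):                       # bounded number of sweeps
    improved = False
    # Backward step: try removing each object
    for j in list(selected):
        trial = [s for s in selected if s != j]
        acc = loo_1nn_acc(trial)
        if acc >= best:
            selected, best, improved = trial, acc, True
    # Floating step: reconsider previously discarded objects
    for j in range(len(X)):
        if j not in selected:
            trial = selected + [j]
            acc = loo_1nn_acc(trial)
            if acc > best:
                selected, best, improved = trial, acc, True
    if not improved:
        break
```

The floating step is what distinguishes this family from plain backward elimination: a discarded object can re-enter the subset whenever its inclusion strictly improves the criterion.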
Color Reduction Using the Combination of the Kohonen Self-Organized Feature Map and the Gustafson-Kessel Fuzzy Algorithm
Konstantinos Zagoris; Nikos Papamarkos; Ioannis Koustoudis
Color is one of the most important components of digital images in the image processing research area. In many applications, such as image segmentation, analysis, compression and transmission, it is preferable to reduce the number of colors as much as possible. In this paper, a color clustering technique that combines a neural network with a fuzzy algorithm is proposed. Initially, the Kohonen Self-Organized Feature Map (KSOFM) is applied to the original image. Then, the KSOFM results are fed to the Gustafson-Kessel (GK) fuzzy clustering algorithm as starting values. Finally, the output classes of the GK algorithm define the number of colors to which the image will be reduced.
- Image Mining | Pp. 703-715
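The first stage of the pipeline — color quantization with a Kohonen SOM — can be sketched as below; the Gustafson-Kessel fuzzy refinement stage is omitted. The synthetic three-color "image", the 1-D map topology, and the annealing schedules are all assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
# Toy "image": pixels drawn around three dominant colors (RGB in [0, 1])
palette_true = np.array([[0.9, 0.1, 0.1], [0.1, 0.8, 0.2], [0.2, 0.2, 0.9]])
pixels = palette_true[rng.integers(0, 3, 600)] + rng.normal(0, 0.03, (600, 3))

# 1-D Kohonen SOM with k neurons; each weight vector becomes a palette color
k = 4
w = rng.random((k, 3))
for t in range(2000):
    x = pixels[rng.integers(len(pixels))]
    winner = np.argmin(((w - x) ** 2).sum(1))
    lr = 0.5 * (1 - t / 2000)                 # decaying learning rate
    sigma = max(1.0 * (1 - t / 2000), 0.01)   # decaying neighbourhood width
    dist = np.abs(np.arange(k) - winner)      # distance on the 1-D grid
    h = np.exp(-dist ** 2 / (2 * sigma ** 2))
    w += lr * h[:, None] * (x - w)            # pull neighbourhood toward pixel

# Quantize: map every pixel to its nearest learned color
labels = np.argmin(((pixels[:, None] - w[None]) ** 2).sum(-1), axis=1)
reduced = w[labels]
```

In the paper's scheme, the learned SOM prototypes would then seed the GK algorithm, whose fuzzy covariance matrices can adapt cluster shapes beyond the spherical neighbourhoods used here.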
A Hybrid Algorithm Based on Evolution Strategies and Instance-Based Learning, Used in Two-Dimensional Fitting of Brightness Profiles in Galaxy Images
Juan Carlos Gomez; Olac Fuentes
The hybridization of optimization techniques can exploit the strengths of different approaches while avoiding their weaknesses. In this work we present a hybrid optimization algorithm based on the combination of Evolution Strategies (ES) and Locally Weighted Linear Regression (LWLR). In this hybrid, the local algorithm (LWLR) proposes a new solution that the global algorithm (ES) then uses to produce new, better solutions. The hybrid is applied to an interesting and difficult problem in astronomy: the two-dimensional fitting of brightness profiles in galaxy images.
The use of standardized fitting functions is arguably the most powerful method for measuring the large-scale features (e.g. brightness distribution) and structure of galaxies, yielding parameters that can provide insight into their formation and evolution. Here we employ the hybrid ES+LWLR algorithm to find models that describe the two-dimensional brightness profiles of a set of optical galaxy images. Models are created using two functions, de Vaucouleurs and exponential, and are expressed as sets of concentric generalized ellipses that represent the brightness profiles of the images.
The task can be seen as an optimization problem, since we need to minimize the difference between the flux of the model and the flux of the original optical image under a normalized Euclidean distance. We solved this optimization problem with our hybrid ES+LWLR algorithm, obtaining results for a set of 100 galaxies that show the hybrid algorithm is well suited to this problem.
- Image Mining | Pp. 716-726
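A minimal (μ+λ) evolution strategy fitting an exponential brightness profile to synthetic 1-D data gives the flavor of the global half of the hybrid; the LWLR local step is omitted here, and the data, cost normalization and annealing schedule are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic 1-D "brightness profile": exponential disc I(r) = I0 * exp(-r / h)
r = np.linspace(0, 10, 50)
true_I0, true_h = 5.0, 2.0
flux = true_I0 * np.exp(-r / true_h) + rng.normal(0, 0.05, r.size)

def cost(p):
    """Normalized Euclidean distance between model flux and observed flux."""
    I0, h = p
    model = I0 * np.exp(-r / max(h, 1e-6))
    return np.sqrt(((model - flux) ** 2).sum() / (flux ** 2).sum())

# (mu + lambda) evolution strategy over the parameter vector (I0, h)
mu_n, lam = 5, 20
pop = rng.uniform([0.1, 0.1], [10, 5], (mu_n, 2))
sigma = 1.0
for gen in range(60):
    parents = pop[rng.integers(0, mu_n, lam)]
    children = parents + rng.normal(0, sigma, (lam, 2))   # Gaussian mutation
    children = np.clip(children, 1e-3, None)
    everyone = np.vstack([pop, children])
    everyone = everyone[np.argsort([cost(p) for p in everyone])]
    pop = everyone[:mu_n]          # survival of the mu best
    sigma *= 0.95                  # simple step-size annealing
best = pop[0]
```

In the hybrid of the paper, a locally weighted linear model fitted around the current population would periodically propose an extra candidate solution, injected into the population alongside the mutated children.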
Gait Recognition by Applying Multiple Projections and Kernel PCA
Murat Ekinci; Murat Aykut; Eyup Gedikli
Recognizing people by gait has a unique advantage over other biometrics: it has potential for use at a distance, when other biometrics might be at too low a resolution or might be obscured. In this paper, an improved method for gait recognition is proposed. The proposed work introduces a nonlinear machine learning method, kernel Principal Component Analysis (KPCA), to extract gait features from silhouettes for individual recognition. The binarized silhouette of a moving object is first represented by four 1-D signals, the basic image features called distance vectors. The distance vectors are the differences between the bounding box and the silhouette, extracted using four projections of the silhouette. Classic linear feature extraction approaches, such as PCA, LDA, and FLDA, take only second-order statistics among gait patterns into account and are not sensitive to higher-order statistics of the data. Therefore, KPCA is used to extract higher-order relations among gait patterns for recognition. The Fast Fourier Transform (FFT) is employed as a preprocessing step to achieve translation invariance on the gait patterns accumulated from silhouette sequences, which are extracted from subjects walking at different speeds and/or at different times. The experiments are carried out on the CMU and USF gait databases, and results are presented for different numbers of training gait cycles. Finally, the performance of the proposed algorithm is compared against published gait recognition approaches.
- Image Mining | Pp. 727-741
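Kernel PCA itself is standard and can be sketched in a few lines: build an RBF kernel over the feature vectors, double-center it, and project onto the leading eigenvectors. The toy "distance-vector" features for two subjects and the kernel width are assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
# Toy stand-in for distance-vector gait features: two "subjects", 20 samples each
X = np.vstack([rng.normal(0, 1, (20, 10)), rng.normal(1.5, 1, (20, 10))])

# --- kernel PCA with an RBF kernel ---
gamma, n_comp = 0.05, 2
sq = ((X[:, None] - X[None]) ** 2).sum(-1)
K = np.exp(-gamma * sq)

# Double-center the kernel matrix (data centered in feature space)
n = len(X)
one = np.full((n, n), 1.0 / n)
Kc = K - one @ K - K @ one + one @ K @ one

# Project onto the leading kernel principal components
vals, vecs = np.linalg.eigh(Kc)
order = np.argsort(vals)[::-1][:n_comp]
alphas = vecs[:, order] / np.sqrt(np.maximum(vals[order], 1e-12))
Z = Kc @ alphas        # nonlinear gait features for each silhouette sample
```

The nonlinearity enters only through the kernel: with a linear kernel this reduces to ordinary PCA, which is exactly the second-order limitation the abstract argues against.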
A Machine Learning Approach to Test Data Generation: A Case Study in Evaluation of Gene Finders
Henning Christiansen; Christina Mackeprang Dahmcke
Programs for gene prediction in computational biology are examples of systems for which the acquisition of authentic test data is difficult, as these require years of extensive research. This has led to test methods based on semiartificially produced test data, often generated by techniques complemented by statistical models such as Hidden Markov Models (HMM). The quality of such a test method depends on how well the test data reflect the regularities in known data and how well they generalize these regularities. So far only very simplified and generalized artificial data sets have been tested, and a more thorough statistical foundation is required.
We propose to use logic-statistical modelling methods from machine learning to analyze existing, manually marked-up data, integrated with the generation of new, artificial data. More specifically, we suggest using the PRISM system developed by Sato and Kameya. Based on logic programming extended with random variables and parameter learning, PRISM is a powerful modelling environment that subsumes HMMs and a wide range of other methods, all embedded in a declarative language. We illustrate these principles here, showing parts of a model under development for genetic sequences, and describe initial experiments producing test data for the evaluation of existing gene finders, exemplified by GENSCAN, HMMGene and genemark.hmm.
- Medical, Biological, and Environmental Data Mining | Pp. 742-755
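The kind of generative model the paper builds on (PRISM subsumes HMMs) can be illustrated with a two-state HMM that emits annotated artificial DNA-like sequences; the states, alphabet and probabilities below are invented for illustration, not taken from the paper's model:

```python
import numpy as np

rng = np.random.default_rng(5)

# Two hidden states ("coding" / "non-coding") over the DNA alphabet
states = ["coding", "noncoding"]
symbols = "ACGT"
trans = np.array([[0.9, 0.1],            # P(next state | current state)
                  [0.05, 0.95]])
emit = np.array([[0.1, 0.4, 0.4, 0.1],   # coding: GC-rich emissions
                 [0.3, 0.2, 0.2, 0.3]])  # non-coding: AT-rich emissions

def generate(n):
    """Sample an artificial, annotated test sequence from the HMM."""
    s = 1  # start in non-coding
    seq, ann = [], []
    for _ in range(n):
        seq.append(symbols[rng.choice(4, p=emit[s])])
        ann.append(states[s])
        s = rng.choice(2, p=trans[s])
    return "".join(seq), ann

seq, ann = generate(2000)
```

A gene finder run on `seq` could then be scored against the annotation `ann`, which is the role the generated semiartificial data play in the evaluation described above. In PRISM the same model would be a handful of declarative clauses with learned `msw` distributions.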
Discovering Plausible Explanations of Carcinogenecity in Chemical Compounds
Eva Armengol
The goal of predictive toxicology is the automatic construction of carcinogenicity models. The most common artificial intelligence techniques used to construct these models are inductive learning methods. In a previous work we presented an approach that uses lazy learning methods to solve the problem of predicting carcinogenicity. Lazy learning methods solve new problems based on their similarity to already solved problems. Nevertheless, a weakness of these kinds of methods is that sometimes the result is not completely understandable by the user. In this paper we propose an explanation scheme for a concrete lazy learning method. This scheme is particularly interesting for justifying predictions about the carcinogenicity of chemical compounds.
- Medical, Biological, and Environmental Data Mining | Pp. 756-769
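The general idea — a lazy (nearest-neighbor) prediction justified by the attributes the query shares with the retrieved cases — can be sketched as follows. The toy compound features and the simple "shared attributes" justification are assumptions, not the paper's explanation scheme:

```python
import numpy as np

# Toy compound descriptions: binary structural features
features = ["aromatic_ring", "nitro_group", "halogen", "long_chain"]
train = np.array([[1, 1, 0, 0],
                  [1, 1, 1, 0],
                  [0, 0, 1, 1],
                  [0, 0, 0, 1]])
labels = ["carcinogenic", "carcinogenic", "inert", "inert"]

def predict_with_explanation(x, k=2):
    """Lazy prediction plus a justification: the features the query
    shares with all retrieved neighbours of the predicted class."""
    d = np.abs(train - x).sum(1)            # Hamming distance to each case
    nn = np.argsort(d)[:k]                  # retrieve the k most similar cases
    votes = [labels[i] for i in nn]
    pred = max(set(votes), key=votes.count)
    support = [i for i in nn if labels[i] == pred]
    shared = [f for j, f in enumerate(features)
              if all(train[i, j] == x[j] == 1 for i in support)]
    return pred, shared

pred, why = predict_with_explanation(np.array([1, 1, 1, 0]))
```

The explanation is the symbolic description shared by the query and its supporting cases, which is more interpretable than a bare similarity score.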
One Lead ECG Based Personal Identification with Feature Subspace Ensembles
Hugo Silva; Hugo Gamboa; Ana Fred
In this paper we present results on real data, focusing on personal identification based on one-lead ECG using a reduced number of heartbeat waveforms. A wide range of features can be used to characterize the ECG signal trace for personal identification. We apply feature selection (FS) to the problem with the dual purpose of improving the recognition rate and reducing data dimensionality. A feature subspace ensemble (FSE) method is described, which combines FS with parallel classifier combination techniques to overcome some difficulties of FS. With this approach, the discriminative information provided by multiple feature subspaces, determined by means of FS, contributes to the global classification decision, leading to improved classification performance. Furthermore, by considering more than one heartbeat waveform in the decision process through sequential classifier combination, higher recognition rates were obtained.
- Medical, Biological, and Environmental Data Mining | Pp. 770-783
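The parallel-combination idea can be sketched: train a simple classifier on each feature subspace and average their posterior-like scores. The synthetic features, the fixed subspaces (standing in for feature-selection output), and the nearest-mean base classifier are assumptions:

```python
import numpy as np

rng = np.random.default_rng(6)

# Toy heartbeat features: 3 subjects, 30 samples each, 12 features
n_cls, n_per, n_feat = 3, 30, 12
means = rng.normal(0, 2, (n_cls, n_feat))
X = np.vstack([rng.normal(means[c], 1.0, (n_per, n_feat)) for c in range(n_cls)])
y = np.repeat(np.arange(n_cls), n_per)

# Feature subspaces, e.g. as produced by several runs of feature selection
subspaces = [np.arange(0, 4), np.arange(4, 8), np.arange(8, 12)]

def soft_nearest_mean(Xtr, ytr, Xte):
    """Posterior-like scores from distances to the class means."""
    cm = np.array([Xtr[ytr == c].mean(0) for c in range(n_cls)])
    d = np.linalg.norm(Xte[:, None] - cm[None], axis=2)
    p = np.exp(-d)
    return p / p.sum(1, keepdims=True)

# Hold out every 3rd sample for testing
test = np.arange(len(X)) % 3 == 0
Xtr, ytr, Xte, yte = X[~test], y[~test], X[test], y[test]

# Parallel combination: average the per-subspace posteriors
P = np.mean([soft_nearest_mean(Xtr[:, s], ytr, Xte[:, s]) for s in subspaces],
            axis=0)
ens_acc = (P.argmax(1) == yte).mean()
```

The sequential combination over several heartbeats mentioned in the abstract would additionally multiply or average `P` across consecutive waveforms of the same subject before taking the argmax.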
Classification of Breast Masses in Mammogram Images Using Ripley’s K Function and Support Vector Machine
Leonardo de Oliveira Martins; Erick Corrêa da Silva; Aristófanes Corrêa Silva; Anselmo Cardoso de Paiva; Marcelo Gattass
Female breast cancer is a major cause of death in Western countries. Several computer techniques have been developed to help radiologists improve their performance in the detection and diagnosis of breast abnormalities. In point pattern analysis, there is a statistic known as Ripley's K function that is frequently applied to spatial analysis in ecology, such as mapping specimens of plants. This paper proposes a new way of applying Ripley's K function to classify breast masses in mammogram images. The features of each nodule image are obtained by computing that function. The resulting samples are then classified by a Support Vector Machine (SVM) as benign or malignant masses. SVM is a machine learning method, based on the principle of structural risk minimization, which performs well when applied to data outside the training set. The best result achieved was 94.94% accuracy, 92.86% sensitivity and 93.33% specificity.
- Medical, Biological, and Environmental Data Mining | Pp. 784-794
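A naive estimate of Ripley's K function (without edge correction) illustrates why it discriminates point patterns: clustering inflates K(t) above the value expected under complete spatial randomness, roughly πt². The point patterns below are synthetic, and the SVM classification stage is omitted:

```python
import numpy as np

rng = np.random.default_rng(7)

def ripley_k(points, t, area):
    """Naive Ripley's K estimate: area / (n(n-1)) * #{pairs closer than t}.
    No edge correction is applied here."""
    n = len(points)
    d = np.linalg.norm(points[:, None] - points[None], axis=2)
    pairs = (d <= t).sum() - n          # exclude self-pairs on the diagonal
    return area * pairs / (n * (n - 1))

# Random (CSR-like) pattern vs a clustered pattern on the unit square
random_pts = rng.random((200, 2))
clustered = (rng.random((20, 2))[rng.integers(0, 20, 200)]
             + rng.normal(0, 0.02, (200, 2)))

t = 0.1
k_rand = ripley_k(random_pts, t, 1.0)
k_clus = ripley_k(clustered, t, 1.0)
# Under complete spatial randomness K(t) ~ pi * t^2; clustering inflates it.
```

Evaluating K at several radii t yields a feature vector per region of interest, which is the kind of input the paper then feeds to an SVM.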
Selection of Experts for the Design of Multiple Biometric Systems
Roberto Tronci; Giorgio Giacinto; Fabio Roli
In the biometric field, different experts are combined to improve system reliability, as in many applications the performance attained by individual experts (i.e., different sensors or processing algorithms) does not provide the required reliability. However, there is no guarantee that combining an arbitrary ensemble of experts provides performance superior to that of the individual experts. Thus, an open problem in multiple biometric systems is the selection of the experts to combine, provided that a bag of experts for the problem at hand is available. In this paper we present an extensive experimental evaluation of four combination methods: the mean rule, the product rule, the Dynamic Score Selection technique, and a linear combination based on Linear Discriminant Analysis. The performance of the combinations has been evaluated by the Area Under the Curve (AUC) and the Equal Error Rate (EER). Four measures have then been used to characterize the performance of the individual experts included in each ensemble, namely the AUC, the EER, and two measures of class separability, i.e., the d' and an integral separability measure. The experimental results clearly show that the larger the d' of the individual experts, the higher the performance that can be attained by their combination.
- Medical, Biological, and Environmental Data Mining | Pp. 795-809
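The mean and product rules, the AUC, and the d' separability measure mentioned above can be sketched on synthetic matcher scores; the two score distributions below are assumptions, not data from the paper:

```python
import numpy as np

rng = np.random.default_rng(8)

# Similarity scores from two hypothetical matchers (genuine vs impostor)
n = 500
genuine = np.column_stack([rng.normal(0.7, 0.12, n), rng.normal(0.65, 0.15, n)])
impostor = np.column_stack([rng.normal(0.4, 0.12, n), rng.normal(0.45, 0.15, n)])

def auc(pos, neg):
    """AUC as the probability a genuine score exceeds an impostor score."""
    return (pos[:, None] > neg[None, :]).mean()

def d_prime(pos, neg):
    """d' class-separability measure of a single expert's scores."""
    return abs(pos.mean() - neg.mean()) / np.sqrt((pos.var() + neg.var()) / 2)

# Individual experts
auc_single = [auc(genuine[:, j], impostor[:, j]) for j in range(2)]
dp_single = [d_prime(genuine[:, j], impostor[:, j]) for j in range(2)]

# Mean rule and product rule fusion of the two experts' scores
auc_mean = auc(genuine.mean(1), impostor.mean(1))
auc_prod = auc(genuine.prod(1), impostor.prod(1))
```

On these synthetic scores the expert with the larger d' also has the larger AUC, mirroring the correlation between individual d' and combined performance reported in the abstract; the EER would be read off the same score distributions where the false-accept and false-reject rates cross.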