Publications catalog - books



Machine Learning and Data Mining in Pattern Recognition: 5th International Conference, MLDM 2007, Leipzig, Germany, July 18-20, 2007. Proceedings

Petra Perner (ed.)

Conference: 5th International Workshop on Machine Learning and Data Mining in Pattern Recognition (MLDM). Leipzig, Germany. July 18, 2007 - July 20, 2007

Abstract/Description - provided by the publisher

Not available.

Keywords - provided by the publisher

Artificial Intelligence (incl. Robotics); Mathematical Logic and Formal Languages; Database Management; Data Mining and Knowledge Discovery; Pattern Recognition; Image Processing and Computer Vision

Availability
Detected institution | Publication year | Browse | Download | Request
Not detected | 2007 | SpringerLink | |

Information

Resource type:

books

Print ISBN

978-3-540-73498-7

Electronic ISBN

978-3-540-73499-4

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Table of contents

Comparing State-of-the-Art Collaborative Filtering Systems

Laurent Candillier; Frank Meyer; Marc Boullé

Collaborative filtering aims at helping users find items they should appreciate from huge catalogues. In that field, we can distinguish user-based, item-based, and model-based approaches. For each of them, many options play a crucial role in performance, in particular the similarity function defined between users or items, the number of neighbors considered for user- or item-based approaches, the number of clusters for model-based approaches using clustering, and the prediction function used.

In this paper, we review the main collaborative filtering methods proposed in the literature and compare them on the same widely used real dataset, using the same widely used performance measure, Mean Absolute Error (MAE). This study thus allows us to highlight the advantages and drawbacks of each approach, and to propose some default options that we think should be used when applying a given approach or designing a new one.

- Mining Marketing Data | Pp. 548-562
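As an illustration of the ingredients named in the abstract above (a similarity function between users, a neighborhood size, a prediction function, and MAE as the evaluation measure), here is a minimal user-based sketch. The choice of Pearson correlation, the mean-centered weighted prediction, and all function names are assumptions made for illustration; they are not taken from the paper.

```python
from math import sqrt


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and true ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def pearson(ratings_u, ratings_v):
    """Pearson correlation computed over the items both users rated."""
    common = sorted(set(ratings_u) & set(ratings_v))
    if len(common) < 2:
        return 0.0
    mu = mean(ratings_u[i] for i in common)
    mv = mean(ratings_v[i] for i in common)
    num = sum((ratings_u[i] - mu) * (ratings_v[i] - mv) for i in common)
    den = sqrt(sum((ratings_u[i] - mu) ** 2 for i in common)
               * sum((ratings_v[i] - mv) ** 2 for i in common))
    return num / den if den else 0.0


def predict(ratings, user, item, k=2):
    """User's mean rating plus the similarity-weighted, mean-centered
    ratings of the k most similar users who rated the item."""
    base = mean(ratings[user].values())
    scored = [(pearson(ratings[user], r), v)
              for v, r in ratings.items() if v != user and item in r]
    neighbours = sorted(scored, reverse=True)[:k]
    num = sum(s * (ratings[v][item] - mean(ratings[v].values()))
              for s, v in neighbours)
    den = sum(abs(s) for s, _ in neighbours)
    return base + num / den if den else base


# Hypothetical toy data: ratings[user][item] = rating
ratings = {"a": {"x": 4, "y": 2}, "b": {"x": 5, "y": 3, "w": 5}}
estimate = predict(ratings, "a", "w", k=1)
```

Varying `k` or swapping `pearson` for another similarity function is how the options listed in the abstract would be compared under the same MAE measure.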

Reducing the Dimensionality of Vector Space Embeddings of Graphs

Kaspar Riesen; Vivian Kilchherr; Horst Bunke

Graphs are a convenient representation formalism for structured objects, but they suffer from the fact that only a few algorithms for graph classification and clustering exist. In this paper we propose a new approach to graph classification by embedding graphs in real vector spaces. This approach allows us to apply advanced classification tools while retaining the high representational power of graphs. The basic idea of our approach is to regard the edit distances of a given graph to a set of training graphs as a vectorial description of that graph. Once a graph has been transformed into a vector, different dimensionality reduction algorithms are applied such that redundancies are eliminated. Pattern classification algorithms can then be applied to this reduced vectorial representation. Through various experimental results we show that the proposed vector space embedding and subsequent classification with the reduced vectors outperform the classification algorithms in the original graph domain.

- Structural Data Mining | Pp. 563-573
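The embedding idea in the abstract above can be sketched as follows: each graph is mapped to the vector of its distances to a fixed set of training graphs. In this sketch a graph is simply a set of edges, and `edge_set_distance` is a toy stand-in for a real graph edit distance; both are assumptions for illustration. The dimensionality reduction step that the paper applies afterwards is not shown.

```python
def edge_set_distance(g, h):
    """Toy stand-in for graph edit distance (an assumption, not the
    paper's measure): number of edges present in exactly one graph."""
    return len(set(g) ^ set(h))


def embed(graph, prototypes, distance=edge_set_distance):
    """Map a graph to the vector of its distances to each prototype graph."""
    return [distance(graph, p) for p in prototypes]


# Hypothetical training graphs, each given as a set of undirected edges
prototypes = [{(1, 2), (2, 3)}, {(1, 2)}, set()]
vector = embed({(1, 2), (2, 3), (3, 4)}, prototypes)
# vector == [1, 2, 3]
```

Once every graph is a fixed-length vector, any standard vector-space classifier or dimensionality reduction method can be applied to it.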

PE-PUC: A Graph Based PU-Learning Approach for Text Classification

Shuang Yu; Chunping Li

This paper presents a novel solution to the problem of building a text classifier using positive documents (P) and unlabeled documents (U). Here, the unlabeled documents are a mixture of positive and negative documents. This problem is also called PU-Learning. The key feature of PU-Learning is that no negative documents are available for training. Recently, several approaches have been proposed for solving this problem. Most of them are based on the same idea, which builds a classifier in two steps. Each existing technique uses a different method for each step. Generally speaking, these existing approaches do not perform well when the size of P is small. In this paper, we propose a new approach that aims to improve performance when the size of P is small. This approach combines the graph-based semi-supervised learning method with the two-step method. Experiments indicate that our proposed method performs well, especially when the size of P is small.

- Structural Data Mining | Pp. 574-584

Efficient Subsequence Matching Using the Longest Common Subsequence with a Dual Match Index

Tae Sik Han; Seung-Kyu Ko; Jaewoo Kang

The purpose of subsequence matching is to find a query sequence in a long data sequence. Due to the abundance of applications, many solutions have been proposed. Virtually all previous solutions use the Euclidean measure as the basis for measuring distance between sequences. Recent studies, however, suggest that the Euclidean distance often fails to produce proper results due to irregularity in the data, which is not uncommon in our problem domain. To address this problem, some non-Euclidean measures, including the longest common subsequence (LCS), have been proposed. However, most of the previous work in this direction focused on the whole sequence matching problem, where query and data sequences are the same length. In this paper, we propose a novel subsequence matching framework using a non-Euclidean measure, in particular LCS, and a new index query scheme. The proposed framework is based on the Dual Match framework, where data sequences are divided into a series of disjoint equi-length subsequences and then indexed in an R-tree. We introduce a similarity bound for index matching with LCS. The proposed query matching scheme significantly reduces the number of false positives in the match result. Furthermore, we developed an algorithm to skip expensive computations by observing the warping paths. We validated our framework through extensive experiments using 48 different time series datasets. The results of the experiments suggest that our approach significantly improves subsequence matching performance on various metrics.

- Structural Data Mining | Pp. 585-600
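To make the LCS measure mentioned in the abstract above concrete, here is a minimal dynamic-programming sketch for numeric sequences. The matching tolerance `eps` and the normalization by the shorter length are common conventions assumed here for illustration; the paper's exact definition, its Dual Match index, and its similarity bound are not reproduced.

```python
def lcs_length(a, b, eps=0.0):
    """Length of the longest common subsequence of two numeric sequences,
    where two values match if they differ by at most eps."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if abs(x - y) <= eps:
                curr.append(prev[j - 1] + 1)   # extend the match diagonally
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a, b, eps=0.0):
    """LCS length normalized by the length of the shorter sequence."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b, eps) / min(len(a), len(b))


# The outlier 9.0 is skipped rather than penalized, unlike with a Euclidean distance.
score = lcs_similarity([1.0, 2.0, 9.0, 3.0], [1.1, 2.1, 3.1], eps=0.2)
# score == 1.0
```

This tolerance for skipped elements is what makes LCS less sensitive to irregular data than a point-by-point Euclidean comparison.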

A Direct Measure for the Efficacy of Bayesian Network Structures Learned from Data

Gary F. Holness

Current metrics for evaluating the performance of Bayesian network structure learning include order statistics of the data likelihood of learned structures, the average data likelihood, and average convergence time. In this work, we define a new metric that directly measures a structure learning algorithm’s ability to correctly model causal associations among variables in a data set. By treating membership in a Markov blanket as a retrieval problem, we use ROC analysis to compute a structure learning algorithm’s efficacy in capturing causal associations at varying strengths. Because our metric moves beyond error rate and data likelihood with a measurement of stability, it is a better characterization of structure learning performance. Because the structure learning problem is NP-hard, practical algorithms are either heuristic or approximate. For this reason, an understanding of a structure learning algorithm’s stability and boundary value conditions is necessary. We contribute to the state of the art in the data mining community with a new tool for understanding the behavior of structure learning techniques.

- Structural Data Mining | Pp. 601-615
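The retrieval view of Markov blanket membership described in the abstract above can be sketched as follows. The Markov blanket of a node in a directed acyclic graph consists of its parents, its children, and the other parents of its children. The function `roc_point` computes a single (false positive rate, true positive rate) pair for one node; this is only an assumed simplification, since the abstract does not specify how the full ROC analysis across association strengths is carried out.

```python
def markov_blanket(edges, node):
    """Parents, children, and children's other parents of `node`.
    `edges` is a list or set of (parent, child) pairs."""
    parents = {p for p, c in edges if c == node}
    children = {c for p, c in edges if p == node}
    spouses = {p for p, c in edges if c in children and p != node}
    return parents | children | spouses


def roc_point(true_edges, learned_edges, node, variables):
    """(false positive rate, true positive rate) of the learned Markov
    blanket of `node`, measured against the true structure."""
    truth = markov_blanket(true_edges, node)
    found = markov_blanket(learned_edges, node)
    negatives = set(variables) - truth - {node}
    tpr = len(truth & found) / len(truth) if truth else 0.0
    fpr = len(found - truth) / len(negatives) if negatives else 0.0
    return fpr, tpr


# Hypothetical true and learned structures over variables A..G
true_edges = [("A", "C"), ("B", "C"), ("C", "D"), ("E", "D")]
learned_edges = [("A", "C"), ("C", "D"), ("F", "C")]
point = roc_point(true_edges, learned_edges, "C", "ABCDEFG")
# point == (0.5, 0.5)
```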

A New Combined Fractal Scale Descriptor for Gait Sequence

Li Cui; Hua Li

In this paper, we present a new combined fractal scale descriptor based on wavelet moments for gait recognition. This method is likely to be useful for general 2-D object pattern recognition. By introducing the Mallat wavelet algorithm, it reduces the computational complexity compared with wavelet moments. Moreover, fractal scale has an advantage in describing the self-similarity of signals. Because it is based on wavelet moments, it remains invariant to translation, scale, and rotation, and has strong noise resistance and occlusion-handling performance. For completely decomposed signals, we obtain the new descriptor by combining the global and local fractal scale at each level. Experiments on a medium-sized database of gait sequences show that the new combined fractal scale method is simple to compute and is an effective descriptor for 2-D objects.

- Image Mining | Pp. 616-627

Palmprint Recognition by Applying Wavelet Subband Representation and Kernel PCA

Murat Ekinci; Murat Aykut

This paper presents a novel Daubechies-based kernel Principal Component Analysis (PCA) method that integrates the Daubechies wavelet representation of palm images with the kernel PCA method for palmprint recognition. The palmprint is first transformed into the wavelet domain to decompose palm images, and the lowest-resolution subband coefficients are chosen for palm representation. The kernel PCA method is then applied to extract non-linear features from the subband coefficients. Finally, a weighted Euclidean linear distance-based nearest neighbor (NN) classifier and a support vector machine (SVM) are compared for similarity measurement. Experimental results on the PolyU Palmprint Databases demonstrate that the proposed approach achieves highly competitive performance with respect to published palmprint recognition approaches.

- Image Mining | Pp. 628-642

A Filter-Refinement Scheme for 3D Model Retrieval Based on Sorted Extended Gaussian Image Histogram

Zhiwen Yu; Shaohong Zhang; Hau-San Wong; Jiqi Zhang

In this paper, we propose a filter-refinement scheme based on a new approach called the Sorted Extended Gaussian Image histogram (SEGI) to address the problems of the traditional EGI. Specifically, SEGI first constructs a 2D histogram based on the EGI histogram and the shell histogram. Then, SEGI extracts two kinds of descriptors from each 3D model: (i) the descriptor from the sorted histogram bins, which is used to perform approximate 3D model retrieval in the filter step, and (ii) the descriptor that records the relations between the histogram bins, which is used to refine the approximate results and obtain the final query results. The experiments show that SEGI outperforms most state-of-the-art approaches (e.g., EGI, shell histogram) on the public Princeton Shape Benchmark.

- Image Mining | Pp. 643-652

Fast-Maneuvering Target Seeking Based on Double-Action Q-Learning

Daniel C. K. Ngai; Nelson H. C. Yung

In this paper, a reinforcement learning method called DAQL is proposed to solve the problem of seeking and homing onto a fast-maneuvering target, within the context of mobile robots. This Q-learning based method considers both target and obstacle actions when determining its own action decisions, which enables the agent to learn more effectively in a dynamically changing environment. It particularly suits fast-maneuvering target cases, in which the maneuvers of the target are unknown a priori. Simulation results show that the proposed method is able to choose a less convoluted path to reach the target than the ideal proportional navigation (IPN) method when handling fast-maneuvering and randomly moving targets. Furthermore, it can learn to adapt to the physical limitations of the system and does not require specific initial conditions to be satisfied for successful navigation towards the moving target.

- Image Mining | Pp. 653-666

Mining Frequent Trajectories of Moving Objects for Location Prediction

Mikołaj Morzy

Advances in wireless and mobile technology flood us with amounts of moving object data that preclude all means of manual data processing. The volume of data gathered from position sensors of mobile phones, PDAs, or vehicles, defies human ability to analyze the stream of input data. On the other hand, vast amounts of gathered data hide interesting and valuable knowledge patterns describing the behavior of moving objects. Thus, new algorithms for mining moving object data are required to unearth this knowledge. An important function of the mobile objects management system is the prediction of the unknown location of an object. In this paper we introduce a data mining approach to the problem of predicting the location of a moving object. We mine the database of moving object locations to discover frequent trajectories and movement rules. Then, we match the trajectory of a moving object with the database of movement rules to build a probabilistic model of object location. Experimental evaluation of the proposal reveals prediction accuracy close to 80%. Our original contribution includes the elaboration on the location prediction model, the design of an efficient mining algorithm, introduction of movement rule matching strategies, and a thorough experimental evaluation of the proposed model.

- Image Mining | Pp. 667-680
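The mine-then-match pipeline described in the abstract above can be sketched in a heavily simplified form. Here a trajectory is a list of visited cells, and only single-step movement rules (from one cell to the next) are mined, each with a confidence value. The paper mines longer frequent trajectories and uses several matching strategies; the function names, the `min_support` threshold, and the toy data below are assumptions for illustration.

```python
from collections import Counter


def mine_movement_rules(trajectories, min_support=2):
    """Return {(a, b): confidence} for moves a -> b observed at least
    min_support times; confidence = count(a -> b) / count(moves leaving a)."""
    moves = Counter()
    departures = Counter()
    for trajectory in trajectories:
        for a, b in zip(trajectory, trajectory[1:]):
            moves[(a, b)] += 1
            departures[a] += 1
    return {(a, b): n / departures[a]
            for (a, b), n in moves.items() if n >= min_support}


def predict_next(rules, current):
    """Most confident next location from `current`, or None if no rule applies."""
    candidates = {b: conf for (a, b), conf in rules.items() if a == current}
    return max(candidates, key=candidates.get) if candidates else None


# Hypothetical trajectories over grid cells labeled A..E
trajectories = [["A", "B", "C"], ["A", "B", "D"], ["A", "B", "C"], ["E", "B", "C"]]
rules = mine_movement_rules(trajectories)
next_cell = predict_next(rules, "B")
# next_cell == "C"
```

The confidence values give a simple probabilistic model of the next location: from cell B, the rule B -> C holds in three of the four observed departures.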