Publications catalog - books


Advances in Knowledge Discovery and Data Mining: 10th Pacific-Asia Conference, PAKDD 2006, Singapore, April 9-12, 2006, Proceedings

Wee-Keong Ng ; Masaru Kitsuregawa ; Jianzhong Li ; Kuiyu Chang (eds.)

Conference: 10th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD). Singapore, Singapore. April 9, 2006 - April 12, 2006

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Not available.

Availability
Detected institution  Year of publication  Browse  Download  Request
Not detected  2006  SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-33206-0

Electronic ISBN

978-3-540-33207-7

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Table of contents

A Wavelet Analysis Based Data Processing for Time Series of Data Mining Predicting

Weimin Tong; Yijun Li; Qiang Ye

This paper presents a wavelet method for time series forecasting in the business field. An autoregressive moving average (ARMA) model is used, which can model the near-periodicity, nonstationarity and nonlinearity that exist in short-term business time series. Through wavelet denoising, wavelet decomposition and wavelet reconstruction, the hidden periodicity and the nonstationarity in the time series are extracted and separated by the wavelet transform. The characteristics of the wavelet-decomposed series are applied to BP networks and an ARMA model. The results show that the proposed method is more accurate than conventional techniques, such as those using only BP networks or only ARMA models.

- Temporal Data Mining | Pp. 780-789
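
The abstract above names three wavelet steps: decomposition, denoising and reconstruction. A minimal sketch of those steps follows. It assumes a one-level Haar wavelet and hard thresholding, neither of which the abstract specifies; the function names and the sample series are hypothetical, and the BP network and ARMA stages are not shown.

```python
import math


def haar_decompose(signal):
    """One-level Haar transform: returns (approximation, detail) coefficients."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_reconstruct(approx, detail):
    """Inverse of haar_decompose: rebuilds the signal from both coefficient sets."""
    s = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) / s)
        signal.append((a - d) / s)
    return signal


def denoise(signal, threshold):
    """Zero out small detail coefficients (hard threshold), then reconstruct."""
    approx, detail = haar_decompose(signal)
    kept = [d if abs(d) > threshold else 0.0 for d in detail]
    return haar_reconstruct(approx, kept)


# Hypothetical short-term series, invented for illustration only.
series = [10.0, 10.4, 12.0, 11.6, 15.0, 15.2, 13.0, 13.4]
smooth = denoise(series, threshold=0.5)
```

With this threshold every detail coefficient is dropped, so each pair of neighbouring values is replaced by its mean. A forecasting model would then be fitted to the smoothed or decomposed series.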

Intelligent Particle Swarm Optimization in Multi-objective Problems

Shinn-Jang Ho; Wen-Yuan Ku; Jun-Wun Jou; Ming-Hao Hung; Shinn-Ying Ho

In this paper, we propose a novel intelligent multi-objective particle swarm optimization (IMOPSO) algorithm to solve multi-objective optimization problems. The high performance of IMOPSO arises mainly from two parts. The first is a generalized Pareto-based scale-independent fitness function (GPSISF), which efficiently assigns every candidate solution a score and thereby determines its level. The second is replacing the conventional particle move process of PSO with an intelligent move mechanism (IMM) based on orthogonal experimental design to enhance search ability. IMM evenly samples and analyzes the best experience of an individual particle and of the group by using a systematic reasoning method, and then efficiently generates a good candidate solution for the particle's next move. Several benchmark functions are used to evaluate the performance of IMOPSO and to compare it with some existing multi-objective evolutionary algorithms. The experimental results and analysis show that IMOPSO performs well.

- Temporal Data Mining | Pp. 790-800
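
The abstract above says that a Pareto-based fitness function gives every candidate solution a score, but it does not define that function. The sketch below is a generic dominance-count score used as an illustrative stand-in, not the authors' GPSISF. It assumes all objectives are minimized, and the function names and sample objective vectors are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_scores(population):
    """Score each solution as (number it dominates) - (number dominating it)."""
    scores = []
    for a in population:
        p = sum(dominates(a, b) for b in population)  # solutions a dominates
        q = sum(dominates(b, a) for b in population)  # solutions dominating a
        scores.append(p - q)
    return scores


# Hypothetical two-objective values, invented for illustration only.
objectives = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0), (4.0, 4.0)]
scores = pareto_scores(objectives)
```

Because the score depends only on dominance counts, it does not change when an objective is rescaled, which is one way a fitness function can be scale-independent.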

Hidden Space Principal Component Analysis

Weida Zhou; Li Zhang; Licheng Jiao

This paper presents a new nonlinear principal component analysis (PCA) method, hidden space principal component analysis (HSPCA). First, the data in the input space are mapped into a high-dimensional hidden space by a nonlinear function whose role is similar to that of hidden neurons in artificial neural networks. Feature extraction and data compression are then achieved by performing PCA on the mapped data in the hidden space. Compared with linear PCA, our algorithm is essentially nonlinear and can extract data features more efficiently. Compared with the recently presented kernel PCA method, the mapped samples are exactly known and the conditions that the nonlinear mapping functions must satisfy are more relaxed: the only condition on the kernel function in HSPCA is symmetry. Finally, experimental results on artificial and real-world data show the feasibility and validity of HSPCA.

- Temporal Data Mining | Pp. 801-805

Neighbor Line-Based Locally Linear Embedding

De-Chuan Zhan; Zhi-Hua Zhou

Locally linear embedding (LLE) is a powerful approach for mapping high-dimensional data nonlinearly to a lower-dimensional space. However, when the training examples are not densely sampled, LLE often returns invalid results. In this paper, a Neighbor Line-based LLE approach is proposed, which generates some virtual examples with the help of neighbor lines so that LLE learning can be executed on an enriched training set. Experiments show that the proposed approach outperforms LLE in visualization.

- Temporal Data Mining | Pp. 806-815

Predicting Rare Extreme Values

Luis Torgo; Rita Ribeiro

Modelling extreme data is very important in several application domains, such as finance, meteorology and ecology. This paper addresses the problem of predicting extreme values of a continuous variable. The main distinguishing feature of our target applications lies in the fact that these values are rare. Any prediction model is obtained by some sort of search process guided by a pre-specified evaluation criterion. In this work we argue against the use of standard criteria for evaluating regression models in the context of our target applications. We propose a new predictive performance metric for this class of problems, which our experiments show is better at distinguishing models that are more accurate at rare extreme values. This new evaluation metric could be used as the basis for developing models that better predict rare extreme values.

- Temporal Data Mining | Pp. 816-820

Domain-Driven Actionable Knowledge Discovery in the Real World

Longbing Cao; Chengqi Zhang

Actionable knowledge discovery is one of the Grand Challenges in KDD. To this end, many methodologies have been developed. However, they either view data mining as an autonomous, data-driven, trial-and-error process, or only analyze the issues in an isolated, case-by-case manner. As a result, the knowledge discovered is often not actionable in constrained business settings. This paper proposes a practical perspective, referred to as domain-driven in-depth pattern discovery (DDID-PD). It presents a domain-driven view of discovering knowledge that satisfies real business needs. Its main ideas include constraint mining, in-depth mining, human-cooperated mining, and loop-closed mining. We demonstrate its deployment in mining actionable trading strategies in Australian Stock Exchange data.

- Temporal Data Mining | Pp. 821-830

Evaluation of Attribute-Aware Recommender System Algorithms on Data with Varying Characteristics

Karen H. L. Tso; Lars Schmidt-Thieme

The growth of Internet commerce has spurred the use of Recommender Systems (RS). Adequate datasets of users and products have always been in demand for better evaluation of RS algorithms. Yet the amount of public data, especially data containing content information (attributes), is limited. In addition, the performance of RS is highly dependent on various characteristics of the datasets. Thus, a few others have conducted studies on synthetically generated datasets that mimic the user-product relationship. Evaluating algorithms on only one or two datasets is often not sufficient. A more thorough analysis can be conducted by applying systematic changes to the data, which cannot be done with real data. However, synthetic datasets that include attributes are rarely investigated. In this paper, we review synthetic datasets applied in RS and present our synthetic data generation methodology, which considers attributes. Furthermore, we conduct empirical evaluations of existing hybrid recommendation algorithms and other state-of-the-art algorithms using these variable synthetic data, and observe their behavior as the characteristics of the data vary. We also introduce the use of entropy to control the randomness of the generated data.

- Temporal Data Mining | Pp. 831-840
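
The abstract above mentions using entropy to control the randomness of generated data without saying how. As a hedged illustration, the sketch below only measures the Shannon entropy of an attribute's empirical distribution, which is the quantity a generator could tune. The function name and both sample attribute columns are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of values."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical attribute columns, invented for illustration only.
uniform = ["a", "b", "c", "d"] * 25          # every value equally likely
skewed = ["a"] * 97 + ["b", "c", "d"]        # one value dominates

h_uniform = shannon_entropy(uniform)
h_skewed = shannon_entropy(skewed)
```

The uniform column reaches the maximum entropy for four values (2 bits), while the skewed column scores far lower. A generator could sweep between such extremes to produce datasets of varying randomness.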

An Intelligent System Based on Kernel Methods for Crop Yield Prediction

A. Majid Awan; Mohd. Noor Md. Sap

This paper presents work on developing a software system for predicting crop yield from climate and plantation data. At the core of this system is a method for unsupervised partitioning of data to find spatio-temporal patterns in climate data using kernel methods, which are well suited to complex data. For this purpose, a robust weighted kernel k-means algorithm incorporating spatial constraints is presented. The algorithm can effectively handle noise, outliers and auto-correlation in spatial data, enabling effective and efficient data analysis, and can thus be used for predicting oil-palm yield by analyzing various factors affecting the yield.

- Innovative Applications | Pp. 841-846

A Machine Learning Application for Human Resource Data Mining Problem

Zhen Xu; Binheng Song

Applying machine learning methods to the data mining domain can help extract useful knowledge for problems with changing conditions. Human resource allocation is one such problem. This paper presents machine learning techniques to solve it. First, we construct a new model that optimizes the multi-objective allocation problem using a fuzzy logic strategy. One of the most important problems in the model is how to obtain precise individual capability matrices. In this paper, machine learning by being told is used to address this problem. In the model, appraisal values for employees are stored in a knowledge warehouse. Before task allocation, the machine learning approach provides the capability matrices based on the existing data sets. The Task-Arrange or Hungarian algorithm then provides the final solution using our proposed matrices. After the current tasks are finished, machine learning by being told updates the matrices according to the specialists' suggestions on employees' performance. Useful knowledge can thus be mined in cycles by the learning approach. As a numerical example demonstrates, the approach helps organizations make realistic decisions on human resource allocation in a dynamic environment.

- Innovative Applications | Pp. 847-856
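
The abstract above describes an allocation step in which a capability matrix is passed to an assignment algorithm. The sketch below illustrates the same one-to-one assignment problem on a tiny matrix. It uses exhaustive search as a plainly named stand-in for the Hungarian algorithm, which reaches the same optimum in polynomial time. The matrix values, the function name, and the choice to maximize total capability are assumptions made for illustration.

```python
from itertools import permutations


def best_assignment(capability):
    """Return (assignment, total) maximizing the summed capability.

    capability[i][j] is employee i's score on task j (square matrix).
    assignment[i] is the task given to employee i.
    Exhaustive search is O(n!) and only suitable for tiny examples.
    """
    n = len(capability)
    best, best_total = None, float("-inf")
    for perm in permutations(range(n)):
        total = sum(capability[i][perm[i]] for i in range(n))
        if total > best_total:
            best, best_total = perm, total
    return list(best), best_total


# Hypothetical capability matrix: rows are employees, columns are tasks.
capability = [
    [9, 2, 7],
    [6, 4, 3],
    [5, 8, 1],
]
assignment, total = best_assignment(capability)
```

Here employee 0 takes task 2, employee 1 takes task 0 and employee 2 takes task 1, for a total capability of 21. Once the matrix is updated after a round of tasks, rerunning the assignment gives the next allocation.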

Towards Automated Design of Large-Scale Circuits by Combining Evolutionary Design with Data Mining

Shuguang Zhao; Mingying Zhao; Jun Zhao; Licheng Jiao

As an important branch of evolvable hardware, evolutionary design of circuits (EDC) is a promising way to realize the automated design of complex electronic circuits. To improve the efficiency, scalability and optimization capability of EDC, a novel technique was developed. It features an adaptive multi-objective genetic algorithm and interactions between EDC and data mining. It was validated by experiments on arithmetic circuits, with some exciting results: some of the evolved circuits are the best reported so far in terms of gate count and operating speed. Moreover, some novel knowledge, e.g., efficient and scalable design formulae and generalized transform rules, has been discovered by mining the data and results of EDC; this knowledge is easy to verify but difficult for human experts to uncover with existing knowledge.

- Innovative Applications | Pp. 857-866