Publications catalogue - books



MICAI 2006: Advances in Artificial Intelligence: 5th Mexican International Conference on Artificial Intelligence, Apizaco, Mexico, November 13-17, 2006, Proceedings

Alexander Gelbukh; Carlos Alberto Reyes-Garcia (eds.)

In conference: 5th Mexican International Conference on Artificial Intelligence (MICAI). Apizaco, Mexico. November 13-17, 2006

Abstract/description – provided by the publisher

Not available.

Keywords – provided by the publisher

Artificial Intelligence (incl. Robotics); Computation by Abstract Devices; Mathematical Logic and Formal Languages; Image Processing and Computer Vision

Availability

Detected institution: not detected
Year of publication: 2006
Browse: SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-49026-5

Electronic ISBN

978-3-540-49058-6

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Publication rights information

© Springer-Verlag Berlin Heidelberg 2006

Table of contents

On Combining Fractal Dimension with GA for Feature Subset Selecting

GuangHui Yan; ZhanHuai Li; Liu Yuan

Selecting a set of features that is optimal for a given task plays an important role in a wide variety of contexts, including pattern recognition, adaptive control, and machine learning. Exploiting the fractal dimension to reduce the features of a dataset is a relatively recent approach. FDR (Fractal Dimensionality Reduction), proposed by Traina in 2000, is the best-known fractal-dimension-based feature selection algorithm. However, it is intractable in high-dimensional data spaces because it scans the dataset multiple times, and it cannot eliminate two or more features simultaneously. In this paper we combine a GA with Z-ordering-based FDR to address this problem and present a new algorithm, GAZBFDR (Genetic Algorithm and Z-ordering Based FDR). The proposed algorithm directly selects a fixed number of features from the feature space and uses the variation of the fractal dimension to evaluate the selected features in a comparatively lower-dimensional space. Experimental results show that GAZBFDR achieves better performance on high-dimensional datasets.

- Machine Learning and Feature Selection | Pp. 543-553
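As a rough sketch of the search loop this abstract describes, the following GA selects a fixed-size feature subset. The fractal-dimension-variation fitness of GAZBFDR is not reproduced here; the caller supplies any scoring function, so this illustrates only the GA side of the combination.

```python
import random

def ga_select_features(data, k, fitness, pop_size=20, generations=30, seed=0):
    """Select a fixed-size subset of k feature indices with a simple GA.

    `fitness` scores an index tuple; it stands in for the fractal-dimension
    variation used by GAZBFDR (a placeholder, not the paper's measure).
    """
    rng = random.Random(seed)
    n_features = len(data[0])

    def random_individual():
        return tuple(sorted(rng.sample(range(n_features), k)))

    def mutate(ind):
        chosen = set(ind)
        chosen.remove(rng.choice(sorted(chosen)))          # drop one feature
        candidates = [f for f in range(n_features) if f not in chosen]
        chosen.add(rng.choice(candidates))                 # add a replacement
        return tuple(sorted(chosen))

    pop = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]                   # elitist selection
        pop = survivors + [mutate(rng.choice(survivors))
                           for _ in range(pop_size - len(survivors))]
    return max(pop, key=fitness)
```

Note that every individual always encodes exactly k features, mirroring the abstract's claim that the algorithm "directly selects a fixed number of features".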

Locally Adaptive Nonlinear Dimensionality Reduction

Yuexian Hou; Hongmin Yang; Pilian He

Popular nonlinear dimensionality reduction algorithms, e.g., SIE and Isomap, share a common difficulty: global neighborhood parameters often fail on data sets with high variation in the local manifold. To improve the applicability of nonlinear dimensionality reduction algorithms in machine learning, this paper proposes an adaptive neighbor-selection scheme based on locally principal direction reconstruction. Our method involves two main computation steps. First, it selects an appropriate neighbor set for each data point such that all neighbors in the set approximately form a d-dimensional linear subspace, and computes the locally principal directions of each neighbor set. Second, it fits each neighbor using the locally principal directions of the corresponding neighbor set and deletes the neighbors whose fitting error exceeds a predefined threshold. Simulations show that our proposal can effectively handle data sets with high variation in the local manifold. Moreover, compared with other adaptive neighbor-selection strategies, our method can circumvent false connectivity induced by noise or high local curvature.

- Machine Learning and Feature Selection | Pp. 554-561
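The two-step scheme in this abstract can be sketched as follows: fit the d leading principal directions of a point's neighborhood, then drop neighbors whose reconstruction residual exceeds a threshold. Parameter names (`k`, `d`, `threshold`) are ours, not the paper's.

```python
import numpy as np

def prune_neighbors(X, i, k=6, d=1, threshold=1.0):
    """Keep only neighbors of X[i] that are well fit by the d leading
    principal directions of the neighborhood (a sketch of the paper's
    locally-principal-direction reconstruction test, not its exact rule).
    """
    dists = np.linalg.norm(X - X[i], axis=1)
    order = np.argsort(dists)
    nbrs = order[1 : k + 1]                        # k nearest, excluding i
    P = X[nbrs] - X[nbrs].mean(axis=0)             # center the neighborhood
    _, _, Vt = np.linalg.svd(P, full_matrices=False)
    basis = Vt[:d]                                 # d leading principal dirs
    residual = np.linalg.norm(P - (P @ basis.T) @ basis, axis=1)
    return nbrs[residual <= threshold]             # delete badly-fit neighbors
```

A neighbor far off the local linear subspace (e.g., a point connected only through noise) gets a large residual and is removed, which is exactly how the method avoids false connectivity.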

Fuzzy Pairwise Multiclass Support Vector Machines

J. M. Puche; J. M. Benítez; J. L. Castro; C. J. Mantas

Support vector machines (SVMs) were first applied to binary classification problems. They can also be extended to multicategory problems by combining binary SVM classifiers. In this paper, we propose a new fuzzy model that retains the advantages of several previously published methods while solving their drawbacks. For each datum, a class is rejected using the information provided by every decision function related to it. Our proposal yields membership degrees in the unit interval and, in some cases, improves the performance of the former methods in the unclassified regions.

- Classification | Pp. 562-571
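A generic illustration of the idea above: combine the pairwise decision values relevant to each class into a membership degree in [0, 1]. Squashing with a logistic and taking the minimum over rivals is our illustrative combination rule, not the exact formula of Puche et al.

```python
import math

def pairwise_memberships(decision, classes, x):
    """Turn pairwise decision values into class memberships in [0, 1].

    `decision(a, b, x)` is the signed output of the a-vs-b classifier
    (positive favours a). A class's membership is the worst (minimum)
    logistic-squashed score against any rival, so a class beaten by any
    rival is effectively rejected.
    """
    sigmoid = lambda t: 1.0 / (1.0 + math.exp(-t))
    return {a: min(sigmoid(decision(a, b, x)) for b in classes if b != a)
            for a in classes}
```

Because every value passes through the logistic, the output always lies in the unit interval, matching the abstract's membership-degree claim.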

Support Vector Machine Classification Based on Fuzzy Clustering for Large Data Sets

Jair Cervantes; Xiaoou Li; Wen Yu

The support vector machine (SVM) has been successfully applied to a large number of classification problems. Despite its good theoretical foundations and generalization capability, it faces a big challenge on large data sets due to training complexity, high memory requirements, and slow convergence. In this paper, we present a new method. Before applying the SVM we use fuzzy clustering; at this stage the optimal number of clusters is not needed, which keeps the computational cost low, since we only need to partition the training data set coarsely. SVM classification is then performed on the cluster centers, followed by de-clustering and SVM classification on the reduced data. The proposed approach scales to large data sets with high classification accuracy and fast convergence. Empirical studies show that it achieves good performance on large data sets.

- Classification | Pp. 572-582
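The key cost-saving step above is the data reduction before SVM training. The sketch below shows that stage only, with plain per-class k-means standing in for fuzzy clustering; the SVM stages themselves would use any SVM library on the reduced set.

```python
import numpy as np

def reduce_by_clustering(X, y, clusters_per_class=4, iters=10, seed=0):
    """Replace each class's points by a few cluster centers, so the
    (expensive) SVM trains on far fewer points. Plain k-means stands in
    for the paper's fuzzy clustering; the cluster count is deliberately
    coarse, echoing the abstract's point that it need not be optimal.
    """
    rng = np.random.default_rng(seed)
    centers, labels = [], []
    for c in np.unique(y):
        P = X[y == c]
        k = min(clusters_per_class, len(P))
        C = P[rng.choice(len(P), size=k, replace=False)]     # initial centers
        for _ in range(iters):
            assign = np.argmin(((P[:, None] - C[None]) ** 2).sum(-1), axis=1)
            for j in range(k):
                if np.any(assign == j):
                    C[j] = P[assign == j].mean(axis=0)       # update center
        centers.append(C)
        labels.append(np.full(k, c))
    return np.vstack(centers), np.concatenate(labels)
```

An SVM fit on the returned centers sees `clusters_per_class * n_classes` points instead of the full data set, which is where the training-complexity saving comes from.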

Optimizing Weighted Kernel Function for Support Vector Machine by Genetic Algorithm

Ha-Nam Nguyen; Syng-Yup Ohn; Soo-Hoan Chae; Dong Ho Song; Inbok Lee

Determining an optimal decision model is a difficult combinatorial task in pattern classification, machine learning, and especially bioinformatics. Recently, the support vector machine (SVM) has shown better performance than conventional learning methods in many applications. This paper proposes a weighted kernel function for support vector machines and a learning method for it with fast convergence and good classification performance. We define the weighted kernel function as the weighted sum of a set of different types of basis kernel functions, such as neural, radial, and polynomial kernels, whose weights are trained by a learning method based on a genetic algorithm. The weights of the basis kernel functions are determined in the learning phase and used as parameters of the decision model in the classification phase. Experiments on several clinical datasets, such as colon cancer, leukemia, and lung cancer datasets, indicate that our weighted kernel function yields higher and more stable classification performance than other kernel functions. Our method also has comparable, and sometimes better, classification performance than other classification methods for certain applications.

- Classification | Pp. 583-592
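The weighted-sum kernel itself is simple to write down. This sketch combines RBF and polynomial Gram matrices (the paper also uses a neural/sigmoid kernel); the weights would come out of the GA learning phase, but here they are passed in directly.

```python
import numpy as np

def weighted_kernel(X, Y, weights, gamma=0.5, degree=2):
    """Gram matrix of a weighted sum of basis kernels.

    A convex combination of positive semidefinite kernels is itself a
    valid (PSD) kernel, so the result can be fed to any kernelized SVM.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                                # convex combination
    sq = ((X[:, None] - Y[None]) ** 2).sum(-1)     # pairwise squared distances
    rbf = np.exp(-gamma * sq)                      # radial basis kernel
    poly = (X @ Y.T + 1.0) ** degree               # polynomial kernel
    return w[0] * rbf + w[1] * poly
```

In the paper's setup, the GA would search over `weights` (and possibly `gamma`, `degree`) to maximize cross-validated classification performance.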

Decision Forests with Oblique Decision Trees

Peter J. Tan; David L. Dowe

Ensemble learning schemes have shown impressive increases in prediction accuracy over single-model schemes. We introduce a new decision forest learning scheme whose base learners are Minimum Message Length (MML) oblique decision trees. Unlike other tree inference algorithms, MML oblique decision tree learning does not over-grow the inferred trees; the resultant trees thus tend to be shallow and do not require pruning. MML decision trees are known to be resistant to over-fitting and excellent at probabilistic prediction. A novel weighted averaging scheme is also proposed which takes advantage of the high probabilistic prediction accuracy produced by MML oblique decision trees. The experimental results show that the new weighted averaging offers solid improvement over other averaging schemes, such as majority vote. Our MML decision forest scheme also returns favourable results compared to other ensemble learning algorithms on data sets with binary classes.

- Classification | Pp. 593-603
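To see why weighted averaging of probability vectors can differ from majority vote, consider this minimal combiner. The weighting here is generic; the paper's actual weights are MML-based.

```python
def weighted_average_vote(tree_probs, tree_weights):
    """Combine per-tree class-probability vectors by a weighted average
    and return the index of the winning class. With uniform weights this
    is probability averaging, which can already disagree with majority
    vote when one tree is very confident.
    """
    total = sum(tree_weights)
    n_classes = len(tree_probs[0])
    avg = [sum(w * p[c] for p, w in zip(tree_probs, tree_weights)) / total
           for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c])
```

For probs `[[0.9, 0.1], [0.4, 0.6], [0.45, 0.55]]`, majority vote picks class 1 (two trees lean that way), but the average favours class 0 because the first tree's confidence dominates; this is the kind of effect the paper's weighting exploits.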

Using Reliable Short Rules to Avoid Unnecessary Tests in Decision Trees

Hyontai Sug

It is known that in decision trees the reliability of lower branches is worse than that of upper branches due to the data fragmentation problem. As a result, unnecessary attribute tests may be performed, because decision trees may require tests that are not best for some portion of the data objects. To compensate for this weak point, we suggest using reliable short rules together with the decision tree, where the short rules come from a limited application of association-rule-finding algorithms. Experiments show that the method can not only generate more reliable decisions but also save test costs by using the short rules.

- Classification | Pp. 604-611
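The control flow the abstract describes is a simple rule-first fallback: consult the reliable short rules, and only descend the (possibly fragmented) tree when none fires. The rule representation below is our own simplification.

```python
def classify_with_short_rules(x, rules, tree_predict):
    """Try reliable short rules first; fall back to the decision tree only
    when no rule fires. `rules` is a list of (condition, label) pairs as
    might be mined by an association-rule algorithm; `tree_predict` is the
    full decision-tree classifier. Returns (label, source).
    """
    for condition, label in rules:
        if condition(x):
            return label, "rule"       # short rule fired: no further tests
    return tree_predict(x), "tree"     # fall back to the full tree
```

When a short rule fires, the remaining attribute tests of the tree are skipped entirely, which is where the test-cost saving claimed in the abstract comes from.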

Selection of the Optimal Wavebands for the Variety Discrimination of Chinese Cabbage Seed

Di Wu; Lei Feng; Yong He

This paper presents a chemometrics-based method to select the optimal wavebands for variety discrimination of Chinese cabbage seed using a visible/near-infrared spectroscopy (Vis/NIRS) system. A total of 120 seed samples were investigated using a field spectroradiometer. Chemometrics was used to model the relationship between the absorbance spectra and the varieties. Principal component analysis (PCA) was not suitable for variety discrimination, as the plot of the three primary principal components (PCs) could not intuitively distinguish the varieties well. Partial least squares regression (PLS) was used to select six optimal wavebands, 730 nm, 420 nm, 675 nm, 620 nm, 604 nm and 609 nm, based on loading values. Two chemometric methods, multiple linear regression (MLR) and stepwise discrimination analysis (SDA), were used to establish recognition models. The MLR model was not suitable in this study because of its unsatisfactory predictive ability. The SDA model was preferred for its advantage in variable selection. The final SDA model showed excellent performance, with a high discrimination rate of 99.167%. This also shows that the selected optimal wavebands are suitable for variety discrimination.

- Classification | Pp. 612-621
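The waveband-selection step reduces, under a simplified reading, to ranking wavelengths by the magnitude of their PLS loadings and keeping the top few. The sketch below assumes the loadings have already been obtained from a fitted PLS model (fitting PLS itself is out of scope here).

```python
import numpy as np

def top_wavebands(loadings, wavelengths, n=6):
    """Pick the n wavebands whose PLS loadings have the largest magnitude.

    `loadings` is a 1-D array of loading values for one PLS component,
    aligned with `wavelengths`. This is a simplified version of the
    paper's loading-value criterion.
    """
    idx = np.argsort(np.abs(loadings))[::-1][:n]   # largest |loading| first
    return [wavelengths[i] for i in sorted(idx)]   # report in spectral order
```

Applied to real Vis/NIR loadings, this kind of ranking is what yields a short list like the paper's 420-730 nm wavebands out of a full spectroradiometer sweep.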

Hybrid Method for Detecting Masqueraders Using Session Folding and Hidden Markov Models

Román Posadas; Carlos Mex-Perera; Raúl Monroy; Juan Nolazco-Flores

This paper studies a new method for detecting masqueraders in computer systems. The main feature of such masqueraders is that they have knowledge of the behavior profile of legitimate users. The dataset provided by Schonlau [1], called SEA, has been modified to include synthetic sessions created by masqueraders using the behavior profile of the users they intend to impersonate. A hybrid method for masquerader detection is proposed, based on compression of the users' sessions and Hidden Markov Models. The performance of the proposed method is evaluated using ROC curves and compared against other known methods. As shown by our experimental results, the proposed detection mechanism is the best of the methods considered here.

- Classification | Pp. 622-631
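The compression half of the hybrid can be illustrated with a generic conditional-compressibility score: a session that resembles the user's profile adds few bytes when compressed alongside it. This is a stdlib sketch of the general compression-based idea, not the paper's session-folding algorithm, and the HMM half is omitted.

```python
import zlib

def compression_score(profile_cmds, session_cmds):
    """Score a session by how many extra compressed bytes it adds on top
    of the user's command profile. Small scores mean the session looks
    like the profile; large scores suggest a masquerader.
    """
    profile = " ".join(profile_cmds).encode()
    session = " ".join(session_cmds).encode()
    base = len(zlib.compress(profile))
    joint = len(zlib.compress(profile + b" " + session))
    return joint - base
```

A session made of the user's habitual commands compresses almost for free against the profile, while a session of unfamiliar commands cannot reuse the profile's patterns and scores much higher.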

Toward Lightweight Detection and Visualization for Denial of Service Attacks

Dong Seong Kim; Sang Min Lee; Jong Sou Park

In this paper, we present a lightweight detection and visualization methodology for Denial of Service (DoS) attacks. First, we propose a new approach based on Random Forests (RF) to detect DoS attacks. The classification accuracy of RF is comparable to that of Support Vector Machines (SVM), and RF can also produce an importance value for each individual feature. We adopt RF to select intrinsically important features for detecting DoS attacks in a lightweight way. Then, using the selected features, we plot both DoS attacks and normal traffic in a two-dimensional space using Multi-Dimensional Scaling (MDS). The visualization results show that simple MDS can help one visualize DoS attacks without any expert domain knowledge. Experimental results on the KDD 1999 intrusion detection dataset validate the feasibility of our approach.

- Classification | Pp. 632-640
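The visualization stage can be sketched with classical MDS: given pairwise distances between traffic records (computed on the RF-selected features), embed them in 2-D via double centering and an eigendecomposition. The RF feature-selection stage is assumed done and is not shown.

```python
import numpy as np

def classical_mds(D, dim=2):
    """Embed points into `dim` dimensions from a pairwise-distance matrix
    using classical (Torgerson) MDS: double-center the squared distances
    and keep the top eigenpairs of the resulting Gram matrix.
    """
    n = len(D)
    J = np.eye(n) - np.ones((n, n)) / n            # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                    # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:dim]           # largest eigenvalues first
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))
```

Plotting the two returned coordinates for attack and normal records gives exactly the kind of 2-D scatter the abstract describes; for truly Euclidean distances in 2-D, the embedding reproduces them exactly.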