Publications catalog - books

Intelligent Information Processing and Web Mining: Proceedings of the International IIS: IIPWM'06 Conference held in Ustroń, Poland, June 19-22, 2006

Mieczysław A. Kłopotek ; Sławomir T. Wierzchoń ; Krzysztof Trojanowski (eds.)

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Not available.

Availability
Detected institution | Year of publication | Browse | Download | Request
Not detected | 2006 | SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-33520-7

Electronic ISBN

978-3-540-33521-4

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Publication rights information

© Springer 2006

Table of contents

Visualizing Latent Structures in Grade Correspondence Cluster Analysis and Generalized Association Plots

Wieslaw Szczesny; Marek Wiech

The latent structure of a psychological data set concerning superstitions is investigated by means of two recent exploratory methods: Grade Correspondence Cluster Analysis (GCCA) and Generalized Association Plots (GAP). The paper compares the visualized results of GCCA and GAP. Moreover, it shows how the two methodologies differ and what their intrinsic similarity is, according to which the revealed latent structures become equivalent whenever the data set is sufficiently regular. Therefore, on the basis of the real data set, two types of highly regular simulated data were constructed, with the same size and the same multivariate dependence index. These simulated data were then analyzed.

VI - Regular Sessions: Statistical Methods in Knowledge Discovery | Pp. 211-220

Converting a Naive Bayes Model into a Set of Rules

Bartłomiej Śnieżyński

Knowledge representation based on probability theory is currently the most popular way of handling uncertainty. However, rule-based systems are still popular. Their advantage is that rules are usually easier to interpret than probabilistic models. A conversion method would make it possible to exploit the advantages of both techniques. In this paper an algorithm that converts Naive Bayes models into rule sets is proposed. Preliminary experimental results show that rules generated from Naive Bayes models are compact and that the accuracy of such rule-based classifiers is relatively high.

VI - Regular Sessions: Statistical Methods in Knowledge Discovery | Pp. 221-229

Improving Quality of Agglomerative Scheduling in Concurrent Processing of Frequent Itemset Queries

Pawel Boinski; Konrad Jozwiak; Marek Wojciechowski; Maciej Zakrzewicz

Frequent itemset mining is often regarded as advanced querying where a user specifies the source dataset and pattern constraints using a given constraint model. Recently, a new problem of optimizing processing of batches of frequent itemset queries has been considered. The best technique for this problem proposed so far is Common Counting, which consists in concurrent processing of frequent itemset queries and integrating their database scans. Common Counting requires that data structures of several queries are stored in main memory at the same time. Since in practice memory is limited, the crucial problem is scheduling the queries to Common Counting phases so that the I/O cost is optimized. According to our previous studies, the best algorithm for this task, applicable to large batches of queries, is CCAgglomerative. In this paper we present a novel query scheduling method CCAgglomerativeNoise, built around CCAgglomerative, increasing its chances of finding an optimal solution.

VII - Regular Sessions: Knowledge Discovery in Applications | Pp. 233-242
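The idea behind Common Counting described in the abstract above (several queries sharing one database scan) can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `common_counting`, the representation of a query as a `(predicate, min_support)` pair, the absolute support threshold, and the restriction to itemsets of one fixed size are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def common_counting(transactions, queries, size):
    """Count itemsets of a fixed size for several queries in ONE scan.

    transactions: iterable of (transaction_id, set_of_items)
    queries: dict mapping a query name to (predicate, min_support), where
             predicate(transaction_id) selects that query's source dataset
             and min_support is an absolute count (hypothetical format).
    """
    counts = {name: Counter() for name in queries}
    for tid, items in transactions:  # single shared pass over the data
        itemsets = list(combinations(sorted(items), size))
        for name, (predicate, _) in queries.items():
            if predicate(tid):  # transaction belongs to this query's dataset
                counts[name].update(itemsets)
    # Keep only the itemsets that reach each query's own support threshold.
    return {
        name: {s: n for s, n in counts[name].items() if n >= queries[name][1]}
        for name in queries
    }
```

All per-query counters live in memory during the scan, which is the memory pressure that motivates scheduling queries into phases when a batch is large.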

Analysis of the Structure of Online Marketplace Graph

Andrzej Dominik; Jacek Wojciechowski

In this paper the structure of the online marketplace graph is studied. We based our research on one of the biggest and most popular online auction services in Poland. Our graph is created from the data obtained from the Transactions Rating System of this service. We discuss properties of the considered graph and its dynamics. It turns out that such a graph has scale-free topology and shows small-world behaviour. We also discovered a few interesting features (e.g. high clustering mutual coefficient) which are not present in other real-life networks.

VII - Regular Sessions: Knowledge Discovery in Applications | Pp. 243-252
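Small-world behaviour, mentioned in the abstract above, is commonly assessed through the clustering coefficient of a graph. The sketch below computes the standard local and average clustering coefficients for an undirected simple graph stored as a dictionary of neighbor sets. The function names and the graph representation are assumptions for illustration, and this is the textbook coefficient, not necessarily the "mutual" variant the paper refers to.

```python
def local_clustering(adj, node):
    """Fraction of pairs of neighbors of `node` that are themselves linked."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pairs to connect
    # Count each linked neighbor pair once (u < v avoids double counting).
    links = sum(1 for u in neighbors for v in neighbors if u < v and v in adj[u])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)
```

For example, in a triangle with one extra pendant node, the two nodes that lie only on the triangle have coefficient 1, the node carrying the pendant edge has 1/3, and the pendant node has 0.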

Trademark Retrieval in the Presence of Occlusion

Dariusz Frejlichowski

Applying content-based image retrieval (CBIR) methods to trademark registration can greatly improve and accelerate the checking process. Amongst all the features used in CBIR, shape seems to be the most appropriate for this task. It is, however, usually only utilized for non-occluded and noise-free objects. In this paper the emphasis is put on the atypical case of the fraudulent creation of a new trademark based on a popular registered one. One can just modify an existing logo by, for example, removing or inserting a part into it. Another method is to modify even smaller subparts, which is close to adding noise to its silhouette. A method of template matching is therefore described that uses a shape descriptor which is robust to rotation, scaling, shifting, and also to occlusion and noise.

VII - Regular Sessions: Knowledge Discovery in Applications | Pp. 253-262

On Allocating Limited Sampling Resources Using a Learning Automata-based Solution to the Fractional Knapsack Problem

Ole-Christoffer Granmo; B. John Oommen

In this paper, we consider the problem of allocating limited sampling resources in a “real-time” manner with the purpose of estimating multiple binomial proportions. This is the scenario encountered when evaluating multiple web sites by accessing a limited number of web pages, where the proportions of interest are the fractions of each web site that are successfully validated by an HTML validator [11]. Our novel solution is based on mapping the problem onto the so-called nonlinear fractional knapsack problem with separable and concave criterion functions [3], which, in turn, is solved using a team of deterministic Learning Automata (LA). To render the problem even more meaningful, since the binomial proportions are unknown and must be sampled, we particularly consider the scenario when the target criterion functions are stochastic with unknown distributions. Using the general LA paradigm, our scheme improves a current solution in an online manner, through a series of informed guesses which move towards the optimal solution. At the heart of our scheme, a team of deterministic LA performs a controlled random walk on a discretized solution space. Comprehensive experimental results demonstrate that the discretization resolution determines the precision of our scheme, and that for a given precision, the current resource allocation solution is consistently improved, until a near-optimal solution is found – even for periodically switching environments. Thus, our scheme, while being novel to the entire field of LA, also efficiently handles a class of resource allocation problems previously not addressed in the literature.

VII - Regular Sessions: Knowledge Discovery in Applications | Pp. 263-272

Learning Symbolic User Models for Intrusion Detection: A Method and Initial Results

Ryszard S. Michalski; Kenneth A. Kaufman; Jaroslaw Pietrzykowski; Bartłomiej Śnieżyński; Janusz Wojtusiak

This paper briefly describes the LUS-MT method for automatically learning user signatures (models of computer users) from data streams capturing users’ interactions with computers. The signatures are in the form of collections of multistate templates (MTs), each characterizing a pattern in the user’s behavior. By applying the models to new user activities, the system can detect an imposter or verify legitimate user activity. Advantages of the method include the high expressive power of the models (a single template can characterize a large number of different user behaviors) and the ease of their interpretation, which makes it possible for an expert to edit or enhance them. Initial results are very promising and show the potential of the method for user modeling.

VII - Regular Sessions: Knowledge Discovery in Applications | Pp. 273-285

Multichannel Color Image Watermarking Using PCA Eigenimages

Kazuyoshi Miyara; Thai Duy Hien; Hanane Harrak; Yasunori Nagata; Zensho Nakao

In the field of image watermarking, research has mainly focused on gray image watermarking, whereas the extension to the color case is usually accomplished by marking the image luminance, or by processing color channels separately. In this paper we propose a new digital watermarking method for three-band RGB color images based on Principal Component Analysis (PCA). This research, which is an extension of our earlier work, consists of embedding the same digital watermark into the three RGB channels of the color image based on PCA eigenimages. We evaluated the effectiveness of the method against some watermark attacks. Experimental results show that the performance of the proposed method against the most prominent attacks is good.

VII - Regular Sessions: Knowledge Discovery in Applications | Pp. 287-296

Developing a Model Agent-based Airline Ticket Auctioning System

Mladenka Vukmirovic; Maria Ganzha; Marcin Paprzycki

A large body of recent work has been devoted to multi-agent systems utilized in e-commerce scenarios. In particular, autonomous software agents participating in auctions have attracted a lot of attention. Interestingly, most of these studies involve purely virtual scenarios. In an initial attempt to fill this gap we discuss a model agent-based e-commerce system modified to serve as an airline ticket auctioning system. Here, the implications of forcing agents to obey actual rules that govern ticket sales are discussed and illustrated by UML-formalized depictions of agents, their relations and functionalities.

VII - Regular Sessions: Knowledge Discovery in Applications | Pp. 297-306

Multi-Label Classification of Emotions in Music

Alicja Wieczorkowska; Piotr Synak; Zbigniew W. Raś

This paper addresses the problem of multi-label classification of emotions in musical recordings. The testing data set contains 875 samples (30 seconds each). The samples were manually labelled into 13 classes, without limits regarding the number of labels for each sample. The experiments and test results are presented.

VII - Regular Sessions: Knowledge Discovery in Applications | Pp. 307-315