Publications catalog - books



Foundations of Intelligent Systems: 16th International Symposium, ISMIS 2006, Bari, Italy, September 27-29, 2006, Proceedings

Floriana Esposito ; Zbigniew W. Raś ; Donato Malerba ; Giovanni Semeraro (eds.)

In conference: 16th International Symposium on Methodologies for Intelligent Systems (ISMIS). Bari, Italy. September 27, 2006 - September 29, 2006

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Artificial Intelligence (incl. Robotics); Information Storage and Retrieval; Information Systems Applications (incl. Internet); Database Management; User Interfaces and Human Computer Interaction; Computation by Abstract Devices

Availability
Detected institution    Publication year    Browse    Download    Request
Not detected    2006    SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-45764-0

Electronic ISBN

978-3-540-45766-4

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Publication rights information

© Springer-Verlag Berlin Heidelberg 2006

Table of contents

Simulated Annealing Algorithm with Biased Neighborhood Distribution for Training Profile Models

Anton Bezuglov; Juan E. Vargas

Functional biological sequences, which typically come in families, have retained some level of similarity and function during evolution. Finding consensus regions, aligning sequences, and identifying the relationship between a sequence and a family allow inferences about the function of the sequences. Profile hidden Markov models (HMMs) are generally used to identify those relationships. A profile HMM can be trained on unaligned members of the family using conventional algorithms such as Baum-Welch, Viterbi, and their modifications. The overall quality of the alignment depends on the quality of the trained model. Unfortunately, the conventional training algorithms converge to suboptimal models most of the time. This work proposes a training algorithm that identifies many imperfect models early. The method is based on the Simulated Annealing approach widely used in discrete optimization problems. The training algorithm is implemented as a component in HMMER. The performance of the algorithm is discussed on protein sequence data.

- Machine Learning | Pp. 642-651
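The abstract above builds on simulated annealing. As a rough illustration only, the following is a minimal sketch of the generic technique with the standard Metropolis acceptance rule; it is not the authors' biased-neighborhood variant and has nothing to do with HMMER's code. The function names, the toy cost function, and the cooling parameters are all assumptions chosen for the example.

```python
import math
import random


def simulated_annealing(cost, neighbor, start, t_start=1.0, t_end=1e-3,
                        cooling=0.95, steps_per_temp=50, seed=0):
    """Generic simulated annealing (hypothetical helper, not from the paper)."""
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    t = t_start
    while t > t_end:
        for _ in range(steps_per_temp):
            candidate = neighbor(current, rng)
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            # Metropolis rule: always accept improvements; accept a worse
            # candidate with probability exp(-delta / t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        t *= cooling  # geometric cooling schedule (an assumption)
    return best, best_cost


def toy_cost(x):
    # Toy discrete objective with its minimum at x = 3.
    return (x - 3) ** 2


def toy_neighbor(x, rng):
    # Unbiased neighborhood: move one step left or right.
    return x + rng.choice((-1, 1))


best, best_cost = simulated_annealing(toy_cost, toy_neighbor, start=20)
print(best, best_cost)
```

A biased neighborhood distribution, as named in the paper title, would presumably replace the uniform choice in `toy_neighbor` with a non-uniform proposal; the details are not given in the abstract.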

A Conditional Model for Tonal Analysis

Daniele P. Radicioni; Roberto Esposito

Tonal harmony analysis is arguably one of the most sophisticated tasks that musicians deal with. It combines general knowledge with contextual cues and involves both multifaceted and evolving objects, such as musical language, execution style, or even taste. In the present work we introduce a system for tonal analysis that automatically learns to analyse music using the recently developed framework of conditional models. The system is presented and assessed on a corpus of Western classical pieces from the 18th to the late 19th century repertoire. The results are discussed and interesting issues in modeling this problem are raised.

- Machine Learning | Pp. 652-661

Hypothesis Diversity in Ensemble Classification

Lorenza Saitta

The paper discusses the issue of hypothesis diversity in ensemble classifiers. The measures of diversity previously proposed in the literature are analyzed inside a unifying framework based on Monte Carlo stochastic algorithms. The paper shows that no measure is useful to predict ensemble performance, because all of them have only a very loose relation with the expected accuracy of the classifier.

- Machine Learning | Pp. 662-670
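To make the notion of ensemble diversity in the abstract above concrete, here is a minimal sketch of one simple measure, the mean pairwise disagreement rate between classifier predictions. Which measures the paper actually analyzes is not stated in the abstract, so this choice, the function names, and the toy predictions are assumptions for illustration.

```python
from itertools import combinations


def disagreement(preds_a, preds_b):
    """Fraction of instances on which two classifiers predict differently."""
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)


def mean_pairwise_disagreement(all_preds):
    """Average the disagreement rate over every pair of ensemble members."""
    pairs = list(combinations(all_preds, 2))
    return sum(disagreement(a, b) for a, b in pairs) / len(pairs)


# Toy predictions of three hypothetical classifiers on four instances.
h1 = [1, 1, 0, 0]
h2 = [1, 0, 0, 1]
h3 = [1, 1, 1, 0]

diversity = mean_pairwise_disagreement([h1, h2, h3])
print(diversity)
```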

Complex Adaptive Systems: Using a Free-Market Simulation to Estimate Attribute Relevance

Christopher N. Eichelberger; Mirsad Hadžikadić

The authors have implemented a complex adaptive simulation of an agent-based exchange to estimate the relative importance of attributes in a data set. This simulation uses an individual, transaction-based voting mechanism to help the system estimate the importance of each variable at the system/aggregate level. Two variations of information gain, one using entropy and one using similarity, were used to demonstrate that the resulting estimates can be computed using a smaller subset of the data, and with greater tolerance for missing and erroneous data, than traditional methods.

- Machine Learning | Pp. 671-680
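The entropy-based variant of information gain mentioned in the abstract above is the textbook quantity sketched below. This is only the conventional baseline computation, not the authors' market simulation; the function names and the toy data are assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(values, labels):
    """Reduction in label entropy after splitting on an attribute's values."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the label subsets induced by the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


labels = ["yes", "yes", "no", "no"]
informative = ["a", "a", "b", "b"]    # separates the classes perfectly
uninformative = ["a", "b", "a", "b"]  # carries no information about the class

gain_informative = information_gain(informative, labels)
gain_uninformative = information_gain(uninformative, labels)
print(gain_informative, gain_uninformative)
```

An attribute with higher gain would be ranked as more relevant; the paper's contribution is to estimate such rankings through simulated transactions rather than a full pass over the data.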

Exploring Phrase-Based Classification of Judicial Documents for Criminal Charges in Chinese

Chao-Lin Liu; Chwen-Dar Hsieh

Phrases provide a better foundation for indexing and retrieving documents than individual words. The constituents of a phrase make its other component words less ambiguous than when those words appear separately. Intuitively, classifiers that employ phrases for indexing should perform better than those that use words. Although pioneers explored this possibility for indexing English documents decades ago, there have been relatively few similar attempts for Chinese documents, partly because segmenting Chinese text into words correctly is itself not easy. We build a domain-dependent word list with the help of Chien’s PAT tree-based method and HowNet, and use the resulting word list to define relevant phrases for classifying Chinese judicial documents. Experimental results indicate that using phrases for indexing indeed allows us to classify judicial documents that are closely similar to each other. With a relatively more efficient algorithm, our classifier offers better performance than that reported in related work.

- Text Mining | Pp. 681-690

Regularization for Unsupervised Classification on Taxonomies

Diego Sona; Sriharsha Veeramachaneni; Nicola Polettini; Paolo Avesani

We study unsupervised classification of text documents into a taxonomy of concepts annotated by only a few keywords. Our central claim is that the structure of the taxonomy encapsulates background knowledge that can be exploited to improve classification accuracy. Under our generative model for the document corpus, we show that the unsupervised classification algorithm provides robust estimates of the classification parameters by performing , and that our algorithm can be interpreted as a regularized algorithm. We also propose a technique for the automatic choice of the regularization parameter. In addition we propose a regularization scheme for for hierarchies. We experimentally demonstrate that both our regularized clustering algorithms achieve a higher classification accuracy over simple models like minimum distance, , and .

- Text Mining | Pp. 691-696

A Proximity Measure and a Clustering Method for Concept Extraction in an Ontology Building Perspective

Guillaume Cleuziou; Sylvie Billot; Stanislas Lew; Lionel Martin; Christel Vrain

In this paper, we study the problem of clustering textual units in the framework of helping an expert to build a specialized ontology. This work has been carried out in the context of a French project handling botany corpora. Building an ontology, either automatically or semi-automatically, is a difficult task. We focus on one of the main steps of that process, namely structuring the textual units occurring in the texts into classes likely to represent concepts of the domain. The approach that we propose relies on the definition of a new non-symmetrical measure for evaluating the semantic proximity between lemmas, taking into account the contexts in which they occur in the documents. Moreover, we present an unsupervised classification algorithm designed for the task at hand and for that kind of data. The first experiments performed on botanical data have given relevant results.

- Text Mining | Pp. 697-706

An Intelligent Personalized Service for Conference Participants

Marco Degemmis; Pasquale Lops; Pierpaolo Basile

This paper presents the integration of linguistic knowledge in learning user profiles able to represent user interests in a more effective way with respect to classical keyword-based profiles. Semantic profiles are obtained by integrating a naïve Bayes approach for text categorization with a word sense disambiguation strategy based on the WordNet lexical database (Section 2). Semantic profiles are exploited by the “conference participant advisor” service in order to suggest papers to be read and talks to be attended by a conference participant. Experiments on a real dataset show the effectiveness of the service (Section 3).

- Text Mining | Pp. 707-712

Contextual Maps for Browsing Huge Document Collections

Krzysztof Ciesielski; Mieczysław A. Kłopotek

The increasing number of documents returned by search engines for typical requests makes it necessary to look for new methods of representing the contents of the results, such as document maps. Though visually impressive, document maps (e.g. WebSOM) are highly resource-intensive and hard to use for huge collections.

In this paper, we present a novel approach, which does not require creation of a complex, global map-based model for the whole document collection. Instead, a hierarchy of topic-sensitive maps is created. We argue that such an approach is not only much less complex in terms of processing time and memory requirements, but also leads to robust map-based browsing of the document collection.

- Text Mining | Pp. 713-722

Classification of Polish Email Messages: Experiments with Various Data Representations

Jerzy Stefanowski; Marcin Zienkowicz

Machine classification of Polish-language emails into user-specific folders is considered. We experimentally evaluate the impact of different approaches to constructing the data representation of emails on the accuracy of classifiers. Our results show that language processing techniques have a smaller influence than an appropriate selection of features, in particular ones coming from the email header or its attachments.

- Web Intelligence | Pp. 723-728