Publications catalog - books
Foundations of Intelligent Systems: 16th International Symposium, ISMIS 2006, Bari, Italy, September 27-29, 2006, Proceedings
Floriana Esposito ; Zbigniew W. Raś ; Donato Malerba ; Giovanni Semeraro (eds.)
Conference: 16th International Symposium on Methodologies for Intelligent Systems (ISMIS). Bari, Italy. September 27, 2006 - September 29, 2006
Abstract/Description – provided by the publisher
Not available.
Keywords – provided by the publisher
Artificial Intelligence (incl. Robotics); Information Storage and Retrieval; Information Systems Applications (incl. Internet); Database Management; User Interfaces and Human Computer Interaction; Computation by Abstract Devices
Availability
| Detected institution | Publication year | Browse | Download | Request |
|---|---|---|---|---|
| Not detected | 2006 | SpringerLink | | |
Information
Resource type:
books
Print ISBN
978-3-540-45764-0
Electronic ISBN
978-3-540-45766-4
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2006
Copyright information
© Springer-Verlag Berlin Heidelberg 2006
Table of contents
doi: 10.1007/11875604_71
Simulated Annealing Algorithm with Biased Neighborhood Distribution for Training Profile Models
Anton Bezuglov; Juan E. Vargas
Functional biological sequences, which typically come in families, have retained some level of similarity and function during evolution. Finding consensus regions, aligning sequences, and identifying the relationship between a sequence and a family allow inferences about the function of the sequences. Profile hidden Markov models (HMMs) are generally used to identify those relationships. A profile HMM can be trained on unaligned members of the family using conventional algorithms such as Baum-Welch, Viterbi, and their modifications. The overall quality of the alignment depends on the quality of the trained model. Unfortunately, the conventional training algorithms converge to suboptimal models most of the time. This work proposes a training algorithm that identifies many imperfect models early. The method is based on the Simulated Annealing approach widely used in discrete optimization problems. The training algorithm is implemented as a component in HMMER. The performance of the algorithm is discussed on protein sequence data.
- Machine Learning | Pp. 642-651
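For readers unfamiliar with the Simulated Annealing approach mentioned in this abstract, the following is a minimal generic sketch on a toy discrete problem. It is not the authors' training algorithm and does not use HMMER; the function names, the bit-string cost function, the uniform neighborhood, and the cooling parameters are all assumptions chosen for illustration.

```python
import math
import random


def simulated_annealing(cost, neighbor, start, t0=1.0, cooling=0.95, steps=2000, seed=0):
    """Generic simulated annealing that minimises `cost` starting from `start`."""
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    temperature = t0
    for _ in range(steps):
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept a worse move with probability exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        # Geometric cooling schedule, floored to avoid division by zero.
        temperature = max(temperature * cooling, 1e-9)
    return best, best_cost


# Hypothetical toy problem: recover a target bit string by minimising Hamming distance.
TARGET = (1, 0, 1, 1, 0, 0, 1, 0)


def hamming_cost(bits):
    return sum(b != t for b, t in zip(bits, TARGET))


def flip_one_bit(bits, rng):
    # Uniform neighborhood: flip one randomly chosen position.
    i = rng.randrange(len(bits))
    return bits[:i] + (1 - bits[i],) + bits[i + 1:]


start = (0,) * len(TARGET)
best, best_cost = simulated_annealing(hamming_cost, flip_one_bit, start)
```

The sketch draws neighbors uniformly; a biased neighborhood distribution, as named in the paper title, would replace `flip_one_bit` with a sampler that favors some moves over others.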
doi: 10.1007/11875604_72
A Conditional Model for Tonal Analysis
Daniele P. Radicioni; Roberto Esposito
Tonal harmony analysis is arguably one of the most sophisticated tasks that musicians deal with. It combines general knowledge with contextual cues, being ingrained with both faceted and evolving objects, such as musical language, execution style, or even taste. In the present work we introduce a system for tonal analysis. The system automatically learns to analyse music using the recently developed framework of conditional models. The system is presented and assessed on a corpus of Western classical pieces from the 18th to the late 19th century repertoire. The results are discussed and interesting issues in modeling this problem are drawn.
- Machine Learning | Pp. 652-661
doi: 10.1007/11875604_73
Hypothesis Diversity in Ensemble Classification
Lorenza Saitta
The paper discusses the issue of hypothesis diversity in ensemble classifiers. The measures of diversity previously proposed in the literature are analyzed inside a unifying framework based on Monte Carlo stochastic algorithms. The paper shows that no measure is useful to predict ensemble performance, because all of them have only a very loose relation with the expected accuracy of the classifier.
- Machine Learning | Pp. 662-670
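As an illustration of what a diversity measure looks like, here is a small sketch of the pairwise disagreement measure, one of the measures commonly discussed in the ensemble literature. The abstract does not say which measures the paper analyzes, so this choice, the function names, and the example data are assumptions.

```python
from itertools import combinations


def disagreement(preds_a, preds_b, truth):
    """Fraction of samples on which exactly one of the two classifiers is correct."""
    differ = sum((a == y) != (b == y) for a, b, y in zip(preds_a, preds_b, truth))
    return differ / len(truth)


def mean_pairwise_disagreement(all_preds, truth):
    """Average the disagreement over every pair of ensemble members."""
    pairs = list(combinations(all_preds, 2))
    return sum(disagreement(a, b, truth) for a, b in pairs) / len(pairs)


# Hypothetical predictions of three classifiers on four samples.
truth = [1, 0, 1, 0]
clf_a = [1, 0, 1, 0]  # correct everywhere
clf_b = [1, 0, 0, 1]  # correct on the first two samples
clf_c = [0, 1, 1, 0]  # correct on the last two samples
diversity = mean_pairwise_disagreement([clf_a, clf_b, clf_c], truth)
```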
doi: 10.1007/11875604_74
Complex Adaptive Systems: Using a Free-Market Simulation to Estimate Attribute Relevance
Christopher N. Eichelberger; Mirsad Hadžikadić
The authors have implemented a complex adaptive simulation of an agent-based exchange to estimate the relative importance of attributes in a data set. This simulation uses an individual, transaction-based voting mechanism to help the system estimate the importance of each variable at the system/aggregate level. Two variations of information gain – one using entropy and one using similarity – were used to demonstrate that the resulting estimates can be computed using a smaller subset of the data and greater accommodation for missing and erroneous data than traditional methods.
- Machine Learning | Pp. 671-680
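The entropy-based information gain referred to in this abstract can be sketched as follows. This is the conventional textbook computation for a single categorical attribute, not the authors' agent-based simulation; the function names and the toy data are assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Reduction in label entropy after splitting on one categorical attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical data: one attribute separates the classes perfectly, the other not at all.
labels = [1, 1, 0, 0]
informative = ["x", "x", "y", "y"]
uninformative = ["x", "y", "x", "y"]
gain_informative = information_gain(informative, labels)
gain_uninformative = information_gain(uninformative, labels)
```

Ranking attributes by this quantity gives the kind of relevance estimate that the simulation's output would be compared against.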
doi: 10.1007/11875604_75
Exploring Phrase-Based Classification of Judicial Documents for Criminal Charges in Chinese
Chao-Lin Liu; Chwen-Dar Hsieh
Phrases provide a better foundation for indexing and retrieving documents than individual words. Constituents of phrases make other component words in the phrase less ambiguous than when the words appear separately. Intuitively, classifiers that employ phrases for indexing should perform better than those that use words. Although pioneers explored the possibility of indexing English documents decades ago, there have been relatively few similar attempts for Chinese documents, partly because segmenting Chinese text into words correctly is itself not easy. We build a domain-dependent word list with the help of Chien’s PAT tree-based method and HowNet, and use the resulting word list for defining relevant phrases for classifying Chinese judicial documents. Experimental results indicate that using phrases for indexing indeed allows us to classify judicial documents that are closely similar to each other. With a relatively more efficient algorithm, our classifier offers better performance than that reported in related works.
- Text Mining | Pp. 681-690
doi: 10.1007/11875604_76
Regularization for Unsupervised Classification on Taxonomies
Diego Sona; Sriharsha Veeramachaneni; Nicola Polettini; Paolo Avesani
We study unsupervised classification of text documents into a taxonomy of concepts annotated by only a few keywords. Our central claim is that the structure of the taxonomy encapsulates background knowledge that can be exploited to improve classification accuracy. Under our generative model for the document corpus, we show that the unsupervised classification algorithm provides robust estimates of the classification parameters, and that our algorithm can be interpreted as a regularized algorithm. We also propose a technique for the automatic choice of the regularization parameter. In addition we propose a regularization scheme for hierarchies. We experimentally demonstrate that both our regularized clustering algorithms achieve a higher classification accuracy than simple models such as minimum distance.
- Text Mining | Pp. 691-696
doi: 10.1007/11875604_77
A Proximity Measure and a Clustering Method for Concept Extraction in an Ontology Building Perspective
Guillaume Cleuziou; Sylvie Billot; Stanislas Lew; Lionel Martin; Christel Vrain
In this paper, we study the problem of clustering textual units in the framework of helping an expert to build a specialized ontology. This work has been achieved in the context of a French project handling botany corpora. Building an ontology, either automatically or semi-automatically, is a difficult task. We focus on one of the main steps of that process, namely structuring the textual units occurring in the texts into classes likely to represent concepts of the domain. The approach that we propose relies on the definition of a new non-symmetrical measure for evaluating the semantic proximity between lemmas, taking into account the contexts in which they occur in the documents. Moreover, we present an unsupervised classification algorithm designed for the task at hand and that kind of data. The first experiments performed on botanical data have given relevant results.
- Text Mining | Pp. 697-706
doi: 10.1007/11875604_78
An Intelligent Personalized Service for Conference Participants
Marco Degemmis; Pasquale Lops; Pierpaolo Basile
This paper presents the integration of linguistic knowledge in learning user profiles able to represent user interests in a more effective way with respect to classical keyword-based profiles. Semantic profiles are obtained by integrating a naïve Bayes approach for text categorization with a word sense disambiguation strategy based on the WordNet lexical database (Section 2). Semantic profiles are exploited by the “conference participant advisor” service in order to suggest papers to be read and talks to be attended by a conference participant. Experiments on a real dataset show the effectiveness of the service (Section 3).
- Text Mining | Pp. 707-712
doi: 10.1007/11875604_79
Contextual Maps for Browsing Huge Document Collections
Krzysztof Ciesielski; Mieczysław A. Kłopotek
The increasing number of documents returned by search engines for typical requests makes it necessary to look for new methods of representing the contents of the results, such as document maps. Though visually impressive, document maps (e.g. WebSOM) are extremely resource-consuming and hard to use for huge collections.
In this paper, we present a novel approach, which does not require creation of a complex, global map-based model for the whole document collection. Instead, a hierarchy of topic-sensitive maps is created. We argue that such an approach is not only much less complex in terms of processing time and memory requirements, but also leads to robust map-based browsing of the document collection.
- Text Mining | Pp. 713-722
doi: 10.1007/11875604_80
Classification of Polish Email Messages: Experiments with Various Data Representations
Jerzy Stefanowski; Marcin Zienkowicz
Machine classification of Polish-language emails into user-specific folders is considered. We experimentally evaluate how different approaches to constructing the data representation of emails affect the accuracy of classifiers. Our results show that language processing techniques have a smaller influence than an appropriate selection of features, in particular ones coming from the email header or its attachments.
- Web Intelligence | Pp. 723-728