Publications catalog - books
Discovery Science: 9th International Conference, DS 2006, Barcelona, Spain, October 7-10, 2006, Proceedings
Ljupčo Todorovski ; Nada Lavrač ; Klaus P. Jantke (eds.)
Conference: 9th International Conference on Discovery Science (DS). Barcelona, Spain. October 7, 2006 - October 10, 2006
Abstract/Description – provided by the publisher
Not available.
Keywords – provided by the publisher
Philosophy of Science; Artificial Intelligence (incl. Robotics); Database Management; Information Storage and Retrieval; Computer Appl. in Administrative Data Processing; Computer Appl. in Social and Behavioral Sciences
Availability
Detected institution | Publication year | Browse | Download | Request |
---|---|---|---|---|
Not detected | 2006 | SpringerLink | | |
Information
Resource type:
books
Print ISBN
978-3-540-46491-4
Electronic ISBN
978-3-540-46493-8
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2006
Publication rights information
© Springer-Verlag Berlin Heidelberg 2006
Subject coverage
Table of contents
doi: 10.1007/11893318_1
e-Science and the Semantic Web: A Symbiotic Relationship
Carole Goble; Oscar Corcho; Pinar Alper; David De Roure
e-Science is scientific investigation performed through distributed global collaborations between scientists and their resources, and the computing infrastructure that enables this. Scientific progress increasingly depends on pooling know-how and results; making connections between ideas, people, and data; and finding and reusing knowledge and resources generated by others in perhaps unintended ways. It is about harvesting and harnessing the “collective intelligence” of the scientific community. The Semantic Web is an extension of the current Web in which information is given well-defined meaning to facilitate sharing and reuse, better enabling computers and people to work in cooperation. Applying the Semantic Web paradigm to e-Science has the potential to bring significant benefits to scientific discovery. We identify the benefits of lightweight and heavyweight approaches, based on our experiences in the Life Sciences.
I - Invited Papers | Pp. 1-12
doi: 10.1007/11893318_2
Data-Driven Discovery Using Probabilistic Hidden Variable Models
Padhraic Smyth
Generative probabilistic models have proven to be a very useful framework for machine learning from scientific data. Key ideas that underlie the generative approach include (a) representing complex stochastic phenomena using the structured language of graphical models, (b) using latent (hidden) variables to make inferences about unobserved phenomena, and (c) leveraging Bayesian ideas for learning and prediction. This talk will begin with a brief review of learning from data with hidden variables and then discuss some exciting recent work in this area that has direct application to a broad range of scientific problems. A number of different scientific data sets will be used as examples to illustrate the application of these ideas in probabilistic learning, such as time-course microarray expression data, functional magnetic resonance imaging (fMRI) data of the human brain, text documents from the biomedical literature, and sets of cyclone trajectories.
I - Invited Papers | Pp. 13-13
doi: 10.1007/11893318_3
Reinforcement Learning and Apprenticeship Learning for Robotic Control
Andrew Ng
Many control problems, such as autonomous helicopter flight, legged robot locomotion, and autonomous driving are difficult because (i) It is hard to write down, in closed form, a formal specification of the control task (for example, what is the cost function for “driving well”?), (ii) It is difficult to learn good models of the robot’s dynamics, and (iii) It is expensive to find closed-loop controllers for high dimensional, highly stochastic domains. Using apprenticeship learning—in which we learn from a human demonstration of a task—as a unifying theme, I will present formal results showing how many control problems can be efficiently addressed given access to a demonstration. In presenting these ideas, I will also draw from a number of case studies, including applications in autonomous helicopter flight, quadruped obstacle negotiation, snake robot locomotion, and high-speed off-road navigation.
Finally, I will also describe the application of these ideas to the STAIR (STanford AI Robot) project, which has the long term goal of integrating methods from all major areas of AI—including spoken dialog/NLP, manipulation, vision, navigation, and planning—to build a general-purpose, “intelligent” home/office robotic assistant.
I - Invited Papers | Pp. 14-14
doi: 10.1007/11893318_4
The Solution of Semi-Infinite Linear Programs Using Boosting-Like Methods
Gunnar Rätsch
We consider methods for the solution of large linear optimization problems, in particular so-called Semi-Infinite Linear Programs (SILPs) that have a finite number of variables but infinitely many linear constraints. We illustrate that such optimization problems frequently appear in machine learning and discuss several examples including maximum margin boosting, multiple kernel learning and structure learning. In the second part we review methods for solving SILPs. Here, we are particularly interested in methods related to boosting. We review recent theoretical results concerning the convergence of these algorithms and conclude this work with a discussion of empirical results comparing these algorithms.
I - Invited Papers | Pp. 15-15
doi: 10.1007/11893318_5
Spectral Norm in Learning Theory: Some Selected Topics
Hans Ulrich Simon
In this paper, we review some known results that relate the statistical query complexity of a concept class to the spectral norm of its correlation matrix. Since spectral norms are widely used in various other areas, we are then able to put statistical query complexity in a broader context. We briefly describe some non-trivial connections to (seemingly) different topics in learning theory, complexity theory, and cryptography. A connection to the so-called Hidden Number Problem, which plays an important role for proving bit-security of cryptographic functions, will be discussed in somewhat more detail.
I - Invited Papers | Pp. 16-16
doi: 10.1007/11893318_6
Classification of Changing Regions Based on Temporal Context in Local Spatial Association
Jae-Seong Ahn; Yang-Won Lee; Key-Ho Park
We propose a method of modeling regional changes in local spatial association and classifying the changing regions based on the similarity of time-series signature of local spatial association. For intuitive recognition of time-series local spatial association, we employ Moran scatterplot and extend it to QS-TiMoS (Quadrant Sequence on Time-series Moran Scatterplot) that allows for examining temporal context in local spatial association using a series of categorical variables. Based on the QS-TiMoS signature of nodes and edges, we develop the similarity measures for “state sequence” and “clustering transition” of time-series local spatial association. The similarity matrices generated from the similarity measures are then used for producing the classification maps of time-series local spatial association that present the history of changing regions in clusters. The feasibility of the proposed method is tested by a case study on the rate of land price fluctuation of 232 administrative units in Korea, 1995-2004.
II - Long Papers | Pp. 17-28
doi: 10.1007/11893318_7
Kalman Filters and Adaptive Windows for Learning in Data Streams
Albert Bifet; Ricard Gavaldà
We study the combination of a Kalman filter and a recently proposed algorithm for dynamically maintaining a sliding window, for learning from streams of examples. We integrate this idea into two well-known learning algorithms, the Naïve Bayes algorithm and the k-means clusterer. We show on synthetic data that the new algorithms never do worse, and in some cases do much better, than the algorithms using only memoryless Kalman filters or sliding windows with no filtering.
II - Long Papers | Pp. 29-40
doi: 10.1007/11893318_8
Scientific Discovery: A View from the Trenches
Catherine Blake; Meredith Rendall
One of the primary goals in discovery science is to understand the human scientific reasoning processes. Despite sporadic success of automated discovery systems, few studies have systematically explored the socio-technical environments in which a discovery tool will ultimately be embedded. Modeling day-to-day activities of experienced scientists as they develop and verify hypotheses provides both a glimpse into the human cognitive processes surrounding discovery and a deeper understanding of the characteristics that are required for a discovery system to be successful. In this paper, we describe a study of experienced faculty in chemistry and chemical engineering as they engage in what Kuhn would call “normal” science, focusing in particular on how these scientists characterize discovery, how they arrive at their research question, and the processes they use to transform an initial idea into a subsequent publication. We discuss gaps between current definitions used in discovery science, and examples of system design improvements that would better support the information environment and activities in normal science.
II - Long Papers | Pp. 41-52
doi: 10.1007/11893318_9
Optimal Bayesian 2D-Discretization for Variable Ranking in Regression
Marc Boullé; Carine Hue
In supervised machine learning, variable ranking aims at sorting the input variables according to their relevance w.r.t. an output variable. In this paper, we propose a new relevance criterion for variable ranking in a regression problem with a large number of variables. This criterion comes from a discretization of both input and output variables, derived as an extension of a Bayesian nonparametric discretization method for the classification case. For that, we introduce a family of discretization grid models and a prior distribution defined on this model space. For this prior, we then derive the exact Bayesian model selection criterion. The obtained most probable grid-partition of the data emphasizes the relation (or the absence of relation) between inputs and output and provides a ranking criterion for the input variables. Preliminary experiments both on synthetic and real data demonstrate the criterion's capacity to select the most relevant variables and to improve a regression tree.
II - Long Papers | Pp. 53-64
doi: 10.1007/11893318_10
Text Data Clustering by Contextual Graphs
Krzysztof Ciesielski; Mieczysław A. Kłopotek
In this paper, we focus on the class of graph-based clustering models, such as growing neural gas or idiotypic nets, for the purpose of high-dimensional text data clustering. We present a novel approach, which does not require operation on the complex overall graph of clusters, but rather allows shifting the majority of the effort to context-sensitive, local subgraph and local sub-space processing. Savings of orders of magnitude in processing time and memory can be achieved, while the quality of clusters is improved, as the presented experiments demonstrate.
II - Long Papers | Pp. 65-76