Catálogo de publicaciones - libros

Compartir en
redes sociales


Computing and Combinatorics: 13th Annual International Conference, COCOON 2007, Banff, Canada, July 16-19, 2007. Proceedings

Guohui Lin (eds.)

En conferencia: 13º International Computing and Combinatorics Conference (COCOON) . Banff, AB, Canada . July 16, 2007 - July 19, 2007

Resumen/Descripción – provisto por la editorial

No disponible.

Palabras clave – provistas por la editorial

Theory of Computation; Algorithm Analysis and Problem Complexity; Discrete Mathematics in Computer Science; Computer Communication Networks; Data Structures; Computer Graphics

Disponibilidad
Institución detectada Año de publicación Navegá Descargá Solicitá
No detectada 2007 SpringerLink

Información

Tipo de recurso:

libros

ISBN impreso

978-3-540-73544-1

ISBN electrónico

978-3-540-73545-8

Editor responsable

Springer Nature

País de edición

Reino Unido

Fecha de publicación

Información sobre derechos de publicación

© Springer-Verlag Berlin Heidelberg 2007

Tabla de contenidos

The Combinatorics of Sequencing the Corn Genome

Srinivas Aluru

The scientific community is engaged in an ongoing, concerted effort to sequence the corn (also known as maize) genome. This genome is approximately 2.5 billion nucleotides long with an estimated 65-80team of university and private laboratory researchers under the auspices of NSF/USDA/DOE is working towards deciphering the majority of the sequence information including all genes, determining their order and orientation, and anchoring them to genetic/physical maps. In this talk, I will present some of the combinatorial problems that arise in this context and outline the role of graph, string and parallel algorithms in solving them.

Pp. 1-1

Online Frequency Assignment in Wireless Communication Networks

Francis Y. L. Chin

Wireless communication has many applications since its invention more than a century ago. The frequency spectrum used for communication is a scarce resource and the Frequency Assignment Problem (FAP), aiming for better utilization of the frequencies, has been extensively studied in the past 20-30 years. Because of the rapid development of new wireless applications such as digital cellular network, cellular phone, the FAP problem has become more important.

In Frequency Division Multiplexing (FDM) networks, a geographic area is divided into small cellular regions or cells, usually regular hexagons in shape. Each cell contains one base station that communicates with other base stations via a high-speed wired network. Calls between any two clients (even within the same cell) must be established through base stations. When a call arrives, the nearest base station must assign a frequency from the available spectrum to the call without causing any interference with other calls. Interference may occur, which distorts the radio signals, when the same frequency is assigned to two different calls emanating from cells that are geographically close to each other. Thus the FAP problem can be viewed as a problem of multi-coloring a hexagon graph with the minimum number of colors when each vertex of the graph is associated with an integer that represents the number of calls in a cell.

FAP has attracted more attention recently because of the following:

a) Online analysis techniques: FAP problem is known to be NP-complete and many approximation algorithms have been proposed in the past. As frequency assignments have to be done without knowledge of future call requests and releases, online algorithms have been proposed and competitive analysis has been used to measure their performance.

b) New technology and application: Wideband Code-Division Multiple-Access (W-CDMA) technology is a new technology used for the implementation of 3G cellular system. Orthogonal Variable Spreading Factor (OVSF) codes are used to satisfy requests with different data rate requirements. FAP with OVSF code trees representing the frequency spectrum becomes an important problem.

Pp. 2-2

Information Distance from a Question to an Answer

Ming Li

We know how to measure distance from Beijing to Toronto. However, do you know how to measure the distance between two information carrying entities? For example: two genomes, two music scores, two programs, two articles, two emails, or from a question to an answer? Furthermore, such a distance measure must be application-independent, must be universal in the sense it is provably better than all other distances, and must be applicable.

From a simple and accepted assumption in thermodynamics, we have developed such a theory. I will present this theory and will present one of the new applications of this theory: a question answering system.

Pp. 3-3

A New Field Splitting Algorithm for Intensity-Modulated Radiation Therapy

Danny Z. Chen; Mark A. Healy; Chao Wang; Xiaodong Wu

In this paper, we present an almost linear time algorithm for the problem of splitting an intensity map of radiation (represented as an integer matrix) into multiple subfields (submatrices), subject to a given maximum allowable subfield width, to minimize the total delivery error caused by the splitting. This problem arises in intensity-modulated radiation therapy (IMRT) for cancer treatments. This is the first field splitting result on minimizing the total delivery error of the splitting. Our solution models the problem as a shortest path problem on a directed layered graph, which satisfies the staircase Monge property. Consequently, the resulting algorithm runs in almost linear time and generates an optimal quality field splitting.

Pp. 4-15

A New Recombination Lower Bound and the Minimum Perfect Phylogenetic Forest Problem

Yufeng Wu; Dan Gusfield

Understanding recombination is a central problem in population genetics. In this paper, we address an established problem in Computational Biology: compute lower bounds on the minimum number of historical recombinations for generating a set of sequences [11,13,9,1,2,15]. In particular, we propose a new recombination lower bound: the forest bound. We show that the forest bound can be formulated as the minimum perfect phylogenetic forest problem, a natural extension to the classic binary perfect phylogeny problem, which may be of interests on its own. We then show that the forest bound is provably higher than the optimal haplotype bound [13], a very good lower bound in practice [15]. We prove that, like several other lower bounds [2], computing the forest bound is NP-hard. Finally, we describe an integer linear programming (ILP) formulation that computes the forest bound precisely for certain range of data. Simulation results show that the forest bound may be useful in computing lower bounds for low quality data.

Pp. 16-26

Seed-Based Exclusion Method for Non-coding RNA Gene Search

Jean-Eudes Duchesne; Mathieu Giraud; Nadia El-Mabrouk

Given an RNA family characterized by conserved sequences and folding constraints, the problem is to search for all the instances of the RNA family in a genomic database. As seed-based heuristics have been proved very efficient to accelerate the classical homology based search methods such as BLAST, we use a similar idea for RNA structures. We present an exclusion method for RNA search allowing for possible nucleotide insertion, deletion and substitution. It is based on a partition of the RNA stem-loops into consecutive seeds and a preprocessing of the target database. This algorithm can be used to improve time efficiency of current methods, and is guaranteed to find all occurrences that contain at least one exact seed.

Pp. 27-39

A New Quartet Approach for Reconstructing Phylogenetic Trees: Quartet Joining Method

Lei Xin; Bin Ma; Kaizhong Zhang

In this paper we introduce a new quartet-based method for phylogenetic inference. This method concentrates on reconstructing reliable phylogenetic trees while tolerating as many quartet errors as possible. This is achieved by carefully selecting two possible neighbor leaves to merge and assigning weights intelligently to the quartets that contain newly merged leaves. Theoretically we prove that this method will always reconstruct the correct tree when a completely consistent quartet set is given. Intensive computer simulations show that our approach outperforms widely used quartet-based program TREE-PUZZLE in most of cases. Under the circumstance of low quartet accuracy, our method still can outperform distance-based method such as Neighbor-joining. Experiments on the real data set also shows the potential of this method. We also propose a simple technique to improve the quality of quartet set. Using this technique we can improve the results of our method.

Pp. 40-50

Integer Programming Formulations and Computations Solving Phylogenetic and Population Genetic Problems with Missing or Genotypic Data

Dan Gusfield; Yelena Frid; Dan Brown

Several central and well-known combinatorial problems in phylogenetics and population genetics have efficient, elegant solutions when the input is complete or consists of haplotype data, but lack efficient solutions when input is either incomplete, consists of genotype data, or is for problems generalized from decision questions to optimization questions. Unfortunately, in biological applications, these harder problems arise very often. Previous research has shown that integer-linear programming can sometimes be used to solve hard problems in practice on a range of data that is realistic for current biological applications. Here, we describe a set of related integer linear programming (ILP) formulations for several additional problems, most of which are known to be NP-hard. These ILP formulations address either the issue of missing data, or solve with objective functions that model more complex biological phenomena than previous formulations. These ILP formulations solve efficiently on data whose composition reflects a range of data of current biological interest. We also assess the biological quality of the ILP solutions: some of the problems, although not all, solve with excellent quality. These results give a practical way to solve instances of some central, hard biological problems, and give practical ways to assess how well certain natural objective functions reflect complex biological phenomena. Perl code to generate the ILPs (for input to CPLEX) is on the web at wwwcsif.cs.ucdavis.edu/ gusfield.

Pp. 51-64

Improved Exact Algorithms for Counting 3- and 4-Colorings

Fedor V. Fomin; Serge Gaspers; Saket Saurabh

We introduce a generic algorithmic technique and apply it on decision and counting versions of graph coloring. Our approach is based on the following idea: either a graph has nice (from the algorithmic point of view) properties which allow a simple recursive procedure to find the solution fast, or the pathwidth of the graph is small, which in turn can be used to find the solution by dynamic programming. By making use of this technique we obtain the fastest known exact algorithms

Pp. 65-74

Connected Coloring Completion for General Graphs: Algorithms and Complexity

Benny Chor; Michael Fellows; Mark A. Ragan; Igor Razgon; Frances Rosamond; Sagi Snir

An - of a graph is a coloring of the vertices so that each color class induces a subgraph having at most connected components. The concept has been well-studied for  = 1, in the case of trees, under the rubric of , used in modeling perfect phylogenies. Several applications in bioinformatics of connected coloring problems on general graphs are discussed, including analysis of protein-protein interaction networks and protein structure graphs, and of phylogenetic relationships modeled by splits trees. We investigate the - (-CCC) problem, that takes as input a partially colored graph, having uncolored vertices, and asks whether the partial coloring can be completed to an -component connected coloring. For  = 1 this problem is shown to be NP-hard, but fixed-parameter tractable when parameterized by the number of uncolored vertices, solvable in time (8). We also show that the 1-CCC problem, parameterized (only) by the treewidth of the graph, is fixed-parameter tractable; we show this by a method that is of independent interest. The -CCC problem is shown to be [1]-hard, when parameterized by the treewidth bound , for any  ≥ 2. Our proof also shows that the problem is NP-complete for  = 2, for general graphs.

Pp. 75-85