Catálogo de publicaciones - libros

Compartir en
redes sociales


High Performance Computing: HiPC 2007: 14th International Conference, Goa, India, December 18-21, 2007. Proceedings

Srinivas Aluru ; Manish Parashar ; Ramamurthy Badrinath ; Viktor K. Prasanna (eds.)

En conferencia: 14º International Conference on High-Performance Computing (HiPC) . Goa, India . December 18, 2007 - December 21, 2007

Resumen/Descripción – provisto por la editorial

No disponible.

Palabras clave – provistas por la editorial

Processor Architectures; Software Engineering/Programming and Operating Systems; Computer Systems Organization and Communication Networks; Algorithm Analysis and Problem Complexity; Computation by Abstract Devices; Mathematics of Computing

Disponibilidad
Institución detectada Año de publicación Navegá Descargá Solicitá
No detectada 2007 SpringerLink

Información

Tipo de recurso:

libros

ISBN impreso

978-3-540-77219-4

ISBN electrónico

978-3-540-77220-0

Editor responsable

Springer Nature

País de edición

Reino Unido

Fecha de publicación

Información sobre derechos de publicación

© Springer-Verlag Berlin Heidelberg 2007

Tabla de contenidos

Multi-objective Peer-to-Peer Neighbor-Selection Strategy Using Genetic Algorithm

Ajith Abraham; Benxian Yue; Chenjing Xian; Hongbo Liu; Millie Pant

Peer-to-peer (P2P) topology has significant influence on the performance, search efficiency and functionality, and scalability of the application. In this paper, we present a Genetic Agorithm (GA) approach to the problem of multi-objective Neighbor Selection (NS) in P2P Networks. The encoding representation is from the upper half of the peer-connection matrix through the undirected graph, which reduces the search space dimension. Experiment results indicate that GA usually could obtain better results than Particle Swarm Optimization (PSO).

- Session VII - P2P and Internet Applications | Pp. 443-451

Effect of Dynamicity on Peer to Peer Networks

Bivas Mitra; Sujoy Ghose; Niloy Ganguly

In this paper, we propose based on percolation theory to assess the robustness of peer to peer networks in face of user churns and/or attacks targeted towards important nodes. It is observed in practice that in spite of churn of peers, superpeer networks show exceptional robustness and do not disintegrate into disconnected components. With the help of the analytical framework developed, we formally measure its stability against user churn and validate the general observation. The effect of intentional attacks upon the superpeer networks is also investigated. Our analysis shows that fraction of superpeers in the network and their connectivity have profound impact upon the stability of the network. The results obtained from the theoretical analysis are validated through simulation. The simulation results and theoretical predictions match with high degree of precision.

- Session VII - P2P and Internet Applications | Pp. 452-463

Hierarchical Multicast Routing Scheme for Mobile Ad Hoc Network

A. Shajin Nargunam; M. P. Sebastian

Ad hoc network is a dynamic multi-hop wireless network that is established by a group of mobile nodes on a shared wireless channel. The plain flooding algorithm provokes a high number of unnecessary packet rebroadcasts, causing contention, packet collisions and ultimately wasting precious limited bandwidth. Another barrier is that routing in ad hoc networks does not scale up as easily as in fixed network. Hierarchical techniques have long been known to afford scalability in networks. By introducing hierarchical routing scheme to ad hoc networks, we can effectively address this problem. Clustering provides a method to build and maintain hierarchical routing scheme in ad hoc networks. By summarizing topology detail via a hierarchical map of the network topology, network nodes are able to conserve memory and link resources. This paper proposes a fully distributed cluster based routing algorithm for mobile ad hoc networks. Non-overlapping clusters are formed using the dynamic cluster creation algorithm. Packets are routed according to the routing information available with each gateway node. The mobility issues are also handled locally in this routing architecture. The proposed cluster maintenance algorithm dynamically adapts to the changes and hence the efficiency is not degraded by node mobility. In addition, our analysis shows that building clustered hierarchies is affordable and that clustering algorithms can also be used to enhance network quality of service.

- Session VII - P2P and Internet Applications | Pp. 464-475

The Impact of Noise on the Scaling of Collectives: The Nearest Neighbor Model [Extended Abstract]

Nisheeth K. Vishnoi

This paper presents a theoretical study of the impact of noise on the scaling of a cluster when the processors participate in “local” collectives with their nearest neighbors. The model considered here is an extension of that introduced in [9] for understanding the effect of noise on the scaling of “global” collectives in large clusters. In this paper, the scaling is studied with respect to three fundamental aspects: (1) the distribution of noise: whether it is heavy or light tailed; (2) the temporal independence of noise; (3) the topology of the cluster. When the noise has a “light” tail and is temporally independent, it is shown that the cluster scales well, i.e., the slowdown per phase is just proportional to the (logarithm of the) maximum degree of the communication topology. This implies that for popular topologies such as grids and toruses the slowdown per phase is just a constant factor, which is independent of the number of processors. In the light tailed case, assuming only a weak temporal independence, a general upper bound is derived in terms of an “expansion” parameter of the communication topology. For grid-like graphs this establishes an exponential speedup compared to what was shown for global collective operations in [9].

- Session VIII - Communication and Routing | Pp. 476-487

Optimization of Collective Communication in Intra-cell MPI

M. K. Velamati; A. Kumar; N. Jayam; G. Senthilkumar; P. K. Baruah; R. Sharma; S. Kapoor; A. Srinivasan

The Cell is a heterogeneous multi-core processor, which has eight co-processors, called SPEs. The SPEs can access a common shared main memory through DMA, and each SPE can directly operate on a small distinct local store. An MPI implementation can use each SPE as if it were a node for an MPI process. In this paper, we discuss the efficient implementation of collective communication operations for intra-Cell MPI, both for cores on a single chip, and for a Cell blade. While we have implemented all the collective operations, we describe in detail the following: barrier, broadcast, and reduce. The main contributions of this work are (i) describing our implementation, which achieves low latencies and high bandwidths using the unique features of the Cell, and (ii) comparing different algorithms, and evaluating the influence of the architectural features of the Cell processor on their effectiveness.

- Session VIII - Communication and Routing | Pp. 488-499

Routing-Contained Virtualization Based on Up*/Down* Forwarding

Åshild Grønstad Solheim; Olav Lysne; Thomas Sødring; Tor Skeie; Jakob Aleksander Libak

Virtualization of computing resources is becoming increasingly important both for high-end servers and multi-core CPUs. In a virtualized system, the set of resources that constitute a virtual compute entity should be spatially separated from each other. Dividing the cores on a chip, or the CPUs in a high end server into disjoint sets for each task is a trivial problem. Ensuring that they use disjoint parts of the interconnection network is, however, complex, and in existing methods the requirement of routing-containment of each virtual partition severely degrades the utilization of the system. In this paper, we present an allocation strategy that is based on Up*/Down* routing. Through simulations, we demonstrate increases (in some cases above 30%) in system utilization relative to state-of-the-art in a Dimension Order routed mesh - a topology that is assumed to be widely deployed in Networks on Chip.

- Session VIII - Communication and Routing | Pp. 500-513

A Routing Methodology for Dynamic Fault Tolerance in Meshes and Tori

Nils Agne Nordbotten; Tor Skeie

This paper proposes a fully distributed fault-tolerant routing methodology for tori and meshes. A dynamic fault-model is supported, enabling the network to remain fully operational at all times. Contrary to most previous proposals that support a dynamic fault-model, the methodology is able to tolerate concave fault regions, thereby avoiding disabling healthy nodes in most practical scenarios. The methodology provides high network performance through the use of adaptive routing and provides graceful performance degradation in the presence of faults.

- Session VIII - Communication and Routing | Pp. 514-527

Fault-Tolerant Topology Adaptation by Localized Distributed Protocol Switching

Sushanta Karmakar; Arobinda Gupta

Adaptation is a desirable requirement in a distributed system. For many problems, there exists more than one protocol such that one protocol performs better in one environment while the other performs better in another. In such cases, adaptive distributed systems can be designed by dynamically switching between the protocols as the environment changes. In this work, we present distributed algorithms to switch from a BFS tree to a DFS tree and from a DFS tree to a BFS tree. For low network load, a BFS tree is a better choice for broadcast since it also minimizes delay, whereas for higher network load, a DFS tree may be more suitable to reduce the load on any one node. The proposed switching algorithms can handle arbitrary crash failures. They ensure that switching eventually completes in spite of failures with the desired tree as the output. Also, all messages are correctly broadcast in the absence of failures even in the presence of switching.

- Session VIII - Communication and Routing | Pp. 528-539

Accomplishing Approximate FCFS Fairness Without Queues

K. Subramani; K. Madduri

First Come First Served (FCFS) is a policy that is accepted for implementing fairness in a number of application domains such as scheduling in Operating Systems, scheduling web requests, and so on. We also have orthogonal applications of FCFS policies in proving correctness of search algorithms such as Breadth-First Search, the Bellman-Ford FIFO implementation for finding single-source shortest paths, program verification and static analysis. The data structure used to implementing FCFS policies, the , suffers from two principal drawbacks, viz., non-trivial verifiability and lack of scalability. In case of large distributed networks, maintaining an explicit queue to enforce FCFS is prohibitively expensive. The question of interest then, is whether queues are to implement FCFS policies; this paper provides empirical evidence answering this question in the negative. The principal contribution of this paper is the design and analysis of a randomized protocol to implement approximate FCFS policies without queues. From the Software Engineering perspective, the techniques that are developed find direct applications in program verification, model checking, in the implementation of distributed queues and in the design of incremental algorithms for Shortest path problems.

- Session IX - Cluster and Grid Applications | Pp. 540-551

A Novel Force Matrix Transformation with Optimal Load-Balance for 3-Body Potential Based Parallel Molecular Dynamics Using Atom-Decomposition in a Heterogeneous Cluster Environment

J. V. Sumanth; David Swanson; Hong Jiang

Evaluating the Force Matrix constitutes the most computationally intensive part of a Molecular Dynamics (MD) simulation. In three-body MD simulations, the total energy of the system is determined by the energy of every unique triple in the system and the force matrix is three-dimensional. The execution time of a three-body MD algorithm is thus proportional to the cube of the number of atoms in the system. Fortunately, there exist symmetries in the Force Matrix that can be exploited to improve the running time of the algorithm. While this optimization is straight forward to implement in the case of sequential code, it has proven to be nontrivial for parallel code even in a homogeneous environment.

In this paper, we present a force matrix transformation that is capable of exploiting the symmetries in the force matrix in both a homogeneous and a heterogeneous environment while balancing the load among all the participating processors. The proposed transformation distributes the number of interactions to be computed uniformly among all the slices of the force matrix along any of the axes. The transformed matrix can be scheduled using any well known heterogeneous slice-level scheduling technique. We also derive theoretical bounds for efficiency and load balance for prior work in the literature. We then prove some interesting and useful properties of our transformation and evaluate its advantages and disadvantages. A loop reordering optimization for the symmetric transformation is described. The performance of an MPI implementation of the transformation is studied in terms of the Step Time Variation Ratio (STVR) in a homogeneous and heterogeneous environment.

- Session IX - Cluster and Grid Applications | Pp. 552-565