Publications catalog - books


Parallel Computing Technologies: 9th International Conference, PaCT 2007, Pereslavl-Zalessky, Russia, September 3-7, 2007. Proceedings

Victor Malyshkin (ed.)

Conference: 9th International Conference on Parallel Computing Technologies (PaCT). Pereslavl-Zalessky, Russia. September 3, 2007 - September 7, 2007

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Programming Techniques; Computer System Implementation; Software Engineering/Programming and Operating Systems; Computer Systems Organization and Communication Networks; Computation by Abstract Devices; Algorithm Analysis and Problem Complexity

Availability
Detected institution Publication year Browse Download Request
None detected 2007 SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-73939-5

Electronic ISBN

978-3-540-73940-1

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Table of contents

CAOS: A Domain-Specific Language for the Parallel Simulation of Cellular Automata

Clemens Grelck; Frank Penczek; Kai Trojahner

We present the design and implementation of CAOS, a domain-specific high-level programming language for the parallel simulation of extended cellular automata. CAOS allows scientists to specify complex simulations with limited programming skills and effort. Yet the CAOS compiler generates efficiently executable code that automatically harnesses the potential of contemporary multi-core processors, shared memory multiprocessors, workstation clusters and supercomputers.

- Cellular Automata | Pp. 410-417

Parallel Hardware Architecture to Simulate Movable Creatures in the CA Model

Mathias Halbach; Rolf Hoffmann

The general question of our investigation is: how can the simulation of moving objects (or agents) in a cellular automaton (CA) be accelerated by hardware architectures? We exemplify our approach using the creatures’ exploration problem: creatures move around in an unknown environment in order to visit all cells in the shortest time. This problem is modeled as a CA because this model is massively parallel and can therefore be perfectly supported by hardware (FPGA technology). We need a very fast simulation because we want to observe and evaluate the collaborative performance for different numbers of creatures, different behaviors of the creatures and many different environments. As a main result of these simulations and evaluations we expect to find the best algorithms which can fulfill the task with the lowest number of work units (generations × creatures). In this contribution we have investigated how the creatures’ exploration problem can be accelerated in hardware with a minimum of hardware resources. We have designed and evaluated five different architectures that vary in the combination or separation of the logic for the environment, for the creatures and for the collision detection. A speedup in the range of thousands compared to software can be reached using an architecture which separates the environment from the creatures and makes use of the memory banks embedded in the FPGA.

- Cellular Automata | Pp. 418-431

Comparison of Evolving Uniform, Non-uniform Cellular Automaton, and Genetic Programming for Centroid Detection with Hardware Agents

Marcus Komann; Andreas Mainka; Dietmar Fey

Current industrial applications require fast and robust image processing in systems with low size and power dissipation. One of the main tasks in industrial vision is fast detection of centroids of objects. This paper compares three different approaches for finding algorithms for centroid detection which are appropriate for a fine-grained parallel hardware architecture in an embedded vision chip. The algorithms shall comprise emergent capabilities and high problem-specific functionality without requiring large amounts of states or memory. For that problem, we consider uniform and non-uniform cellular automata (CA) as well as genetic programming. Due to the inherent complexity of the problem, an evolutionary approach is applied. The appropriateness of these approaches for centroid detection is discussed.

- Cellular Automata | Pp. 432-441

Associative Version of Italiano’s Decremental Algorithm for the Transitive Closure Problem

Anna Nepomniaschaya

We propose a natural implementation of Italiano’s algorithm for updating the transitive closure of directed graphs after deletion of an edge on a model of associative (content addressable) parallel systems with vertical processing (the STAR–machine). The associative version of Italiano’s decremental algorithm is given as procedure DeleteArc, whose correctness is proved and time complexity is evaluated. We compare implementations of Italiano’s decremental algorithm and its associative version and enumerate the main advantages of the associative version.

- Cellular Automata | Pp. 442-452

Support for Fine-Grained Synchronization in Shared-Memory Multiprocessors

Vladimir Vlassov; Oscar Sierra Merino; Csaba Andras Moritz; Konstantin Popov

It has already been verified that hardware-supported fine-grain synchronization provides a significant performance improvement over coarse-grained synchronization mechanisms, such as barriers. Support for fine-grain synchronization on individual data items becomes notably important in order to efficiently exploit the thread-level parallelism available on multi-threading and multi-core processors. Fine-grained synchronization can be achieved using full/empty tagged shared memory. We define the complete set of synchronizing memory instructions as well as the architecture of the full/empty tagged shared memory that provides support for these operations. We develop a snoopy cache coherency protocol for an SMP with centralized full/empty tagged memory.

- Cellular Automata | Pp. 453-467
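
As a rough software analogy for the full/empty tagged memory mentioned in the abstract above, the following Python sketch models a single tagged cell. It is only an illustration under stated assumptions: the class `FullEmptyCell` and the method names `write_when_empty` and `read_and_empty` are hypothetical, the semantics shown (a write waits for the empty tag, a read waits for the full tag and resets it) are one common convention, and nothing here reproduces the paper's instruction set or its cache coherency protocol.

```python
import threading


class FullEmptyCell:
    """Hypothetical model of one memory word with a full/empty tag."""

    def __init__(self):
        self._cond = threading.Condition()
        self._full = False   # the tag bit: False = empty, True = full
        self._value = None

    def write_when_empty(self, value):
        """Block until the cell is empty, store the value, mark it full."""
        with self._cond:
            self._cond.wait_for(lambda: not self._full)
            self._value = value
            self._full = True
            self._cond.notify_all()

    def read_and_empty(self):
        """Block until the cell is full, take the value, mark it empty."""
        with self._cond:
            self._cond.wait_for(lambda: self._full)
            value = self._value
            self._full = False
            self._cond.notify_all()
            return value


def producer_consumer_demo(n):
    """Pass n integers through one cell; each item synchronizes on its own."""
    cell = FullEmptyCell()
    received = []

    def producer():
        for i in range(n):
            cell.write_when_empty(i)

    def consumer():
        for _ in range(n):
            received.append(cell.read_and_empty())

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

Synchronization here happens per data item rather than at a barrier, which is the contrast the abstract draws between fine-grained and coarse-grained mechanisms.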

Self-organised Criticality in a Model of the Rat Somatosensory Cortex

Grzegorz M. Wojcik; Wieslaw A. Kaminski; Piotr Matejanka

Large Hodgkin-Huxley (HH) neural networks were examined and the structures discussed in this article simulated a part of the rat somatosensory cortex. We used a modular architecture of the network divided into layers and sub-regions. Because of the high degree of complexity, effective parallelisation of algorithms was required. The results of parallel simulations were presented. An occurrence of self-organised criticality (SOC) was demonstrated. Most notably, in large biological neural networks consisting of artificial HH neurons, the SOC was shown to manifest itself in the frequency of its appearance as a function of the size of spike potential avalanches generated within such nets. These two parameters followed the power law characteristic of other systems exhibiting SOC behaviour.

- Cellular Automata | Pp. 468-476

Control of Fuzzy Cellular Automata: The Case of Rule 90

Samira El Yacoubi; Angelo B. Mingarelli

This paper is dedicated to the study of fuzzy rule 90 in relation to control theory. The dynamics and global evolution of fuzzy rules have recently been investigated and some interesting results have been obtained in [10,15,16]. The long-term evolution of all 256 one-dimensional fuzzy cellular automata (FCA) has been determined using an analytical approach. In this paper we are interested in the FCA state at a given time and ask whether it can coincide with a desired state by controlling only the initial condition. We investigate two initial states: a single control value on a background of zeros, and one seed adjacent to the controlled site on a background of zeros.

- Cellular Automata | Pp. 477-486
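
To make the fuzzy rule 90 entry above more concrete, here is a minimal Python sketch of one update step. It assumes the usual fuzzification of the Boolean rule (XOR of the two neighbours), which gives x_i' = x_(i-1) + x_(i+1) - 2·x_(i-1)·x_(i+1). The function names and the finite, zero-padded row are illustrative assumptions; the paper's own setting and notation may differ.

```python
def fuzzy_rule90_step(cells):
    """One synchronous update of fuzzy rule 90 on a finite row.

    Cells outside the row are assumed to be 0 (zero background).
    """
    padded = [0.0] + list(cells) + [0.0]
    return [
        padded[i - 1] + padded[i + 1] - 2.0 * padded[i - 1] * padded[i + 1]
        for i in range(1, len(padded) - 1)
    ]


def evolve(cells, steps):
    """Apply fuzzy_rule90_step repeatedly and return the final row."""
    for _ in range(steps):
        cells = fuzzy_rule90_step(cells)
    return cells


# Initial state of the first kind described in the abstract:
# a single control value u on a background of zeros.
u = 0.5
row = [0.0, 0.0, u, 0.0, 0.0]
next_row = fuzzy_rule90_step(row)
```

On Boolean inputs (values 0 and 1) the update reduces to ordinary rule 90, which gives a quick sanity check of the formula.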

Intensive Atmospheric Vortices Modeling Using High Performance Cluster Systems

Arutyun I. Avetisyan; Varvara V. Babkova; Sergey S. Gaissaryan; Alexander Yu. Gubar

The goal of the paper is the development of a scalable parallel program calculating the numerical solution of the system of equations modeling the processes and origin conditions of intensive atmospheric vortices (IAV) in a 3D compressible atmosphere according to the theory of mesovortice turbulence by Nikolaevskiy. The original system of non-linear equations and its initial and boundary conditions are discussed. The structure of a parallel program for a high-performance cluster is developed. Problems concerning the optimization of the program in order to increase its scalability are studied. In summary, the results of numerical computations are discussed.

- Methods and Tools of Parallel Programming of Multicomputers | Pp. 487-495

Dynamic Strategy of Placement of the Replicas in Data Grid

Ghalem Belalem; Farouk Bouhraoua

Grid computing is a type of parallel and distributed system that is designed to provide pervasive and reliable access to data and computational resources over wide area networks. Data Grids connect a collection of geographically distributed computers and storage resources located in different parts of the world to facilitate sharing of data and resources. These grids concentrate on reducing the execution time of applications that require a great number of processing cycles. In such an environment, these advantages are not possible without replication. The latter is considered an important technique for reducing the cost of access to data in the grid. In this paper, we present our contribution to a cost model whose objective is to reduce the cost of access to replicated data. These costs depend on many factors, such as the bandwidth, data size, network latency and the number of read/write operations.

- Methods and Tools of Parallel Programming of Multicomputers | Pp. 496-506

ISO: Comprehensive Techniques Toward Efficient GEN_BLOCK Redistribution with Multidimensional Arrays

Shih-Chang Chen; Ching-Hsien Hsu

Runtime data redistribution is usually required in parallel algorithms to enhance data locality, achieve dynamic load balancing and reduce remote data access on distributed memory multicomputers. In this paper, we present comprehensive techniques to implement GEN_BLOCK redistribution in parallelizing compilers, including schemes for communication set generation, a contention-free communication algorithm and a technique for improving communication efficiency. Both theoretical analysis and experimental results show that the proposed techniques can efficiently perform GEN_BLOCK data redistribution during runtime.

- Methods and Tools of Parallel Programming of Multicomputers | Pp. 507-515
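
The idea of communication set generation in the GEN_BLOCK entry above can be sketched for the simplest case. The Python sketch below assumes a one-dimensional array split into contiguous blocks of arbitrary size, one per processor, and lists which index ranges each source processor must send to each destination processor. The function names and the message tuple layout are hypothetical; the paper's multidimensional schemes and its contention-free scheduling are not reproduced here.

```python
from itertools import accumulate


def block_bounds(sizes):
    """Turn per-processor block sizes into half-open (start, end) ranges."""
    ends = list(accumulate(sizes))
    starts = [0] + ends[:-1]
    return list(zip(starts, ends))


def gen_block_messages(src_sizes, dst_sizes):
    """List (src_proc, dst_proc, start, end) for every overlapping range."""
    if sum(src_sizes) != sum(dst_sizes):
        raise ValueError("both distributions must cover the same array")
    src = block_bounds(src_sizes)
    dst = block_bounds(dst_sizes)
    messages = []
    s = d = 0
    # Sweep both block lists once, left to right.
    while s < len(src) and d < len(dst):
        lo = max(src[s][0], dst[d][0])
        hi = min(src[s][1], dst[d][1])
        if lo < hi:
            messages.append((s, d, lo, hi))
        # Advance whichever block ends first.
        if src[s][1] <= dst[d][1]:
            s += 1
        else:
            d += 1
    return messages


# Example: 12 elements move from blocks of size 2, 6, 4 to blocks of 4, 4, 4.
example = gen_block_messages([2, 6, 4], [4, 4, 4])
```

Messages where the source and destination processor coincide, such as the last one in the example, stand for data that stays local and would not need to be communicated.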