Publications catalog - books


Shared Memory Parallel Programming with Open MP: 5th International Workshop on Open MP Applications and Tools, WOMPAT 2004, Houston, TX, USA, May 17-18, 2004

Barbara M. Chapman (ed.)

Summary/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Software Engineering/Programming and Operating Systems; Computer Systems Organization and Communication Networks; Theory of Computation; Mathematics of Computing

Availability
Detected institution: none detected
Year of publication: 2005
Access: SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-24560-5

Electronic ISBN

978-3-540-31832-3

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Copyright information

© Springer-Verlag Berlin/Heidelberg 2005

Table of contents

An Evaluation of Auto-Scoping in OpenMP

Michael Voss; Eric Chiu; Patrick Man Yan Chow; Catherine Wong; Kevin Yuen

In [1], Dieter an Mey proposes the addition of an AUTO attribute to the OpenMP DEFAULT clause. A DEFAULT(AUTO) clause would cause the compiler to automatically determine the scoping of variables that are not explicitly classified as shared, private or reduction. While this new feature would be useful and powerful, its implementation would rely on automatic parallelization technology, which has been shown to have significant limitations. In this paper, we implement support for the DEFAULT(AUTO) clause in the Polaris parallelizing compiler. Our modified version of the compiler will translate regions with DEFAULT(AUTO) clauses into regions that have explicit scopings for all variables. An evaluation of our implementation on a subset of the SPEC OpenMP Benchmark Suite shows that with current automatic parallelization technologies, a number of important regions cannot be statically scoped, resulting in a significant loss of speedup. We also compare our compiler’s performance to that of an Early Access version of the Sun Studio 9 Fortran 95 compiler [2].

Keywords: Benchmark Suite; Parallel Region; Parallel Loop; Automatic Parallelization; OpenMP Directive.

Pp. 98-109
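
To make the idea of scoping under DEFAULT(AUTO) concrete, here is a toy sketch of how scalars in one loop iteration might be classified as shared, private or reduction. It is not the analysis used by Polaris or by the Sun compiler: the function `auto_scope`, the statement encoding and the classification rules are all assumptions made for illustration, and the sketch ignores arrays, the reduction operator and whether a value is needed after the loop.

```python
def auto_scope(statements, loop_var):
    """Classify the scalars used in one loop iteration (toy rules).

    statements: list of (target, operands) pairs in program order, where
    `target` is the scalar assigned and `operands` are the scalars read.
    Hypothetical encoding, chosen only for this sketch.
    """
    first = {}              # name -> "read" or "write", whichever occurs first
    written = set()         # scalars assigned somewhere in the body
    plain_written = set()   # assigned by a statement that does not read them
    read_elsewhere = set()  # read by a statement that assigns another scalar

    for target, operands in statements:
        # Operands are read before the target is written.
        for name in operands:
            first.setdefault(name, "read")
            if name != target:
                read_elsewhere.add(name)
        first.setdefault(target, "write")
        written.add(target)
        if target not in operands:
            plain_written.add(target)

    scopes = {}
    for name in first:
        if name == loop_var:
            scopes[name] = "private"
        elif name not in written:
            scopes[name] = "shared"        # read-only in the loop body
        elif name not in plain_written and name not in read_elsewhere:
            scopes[name] = "reduction"     # only ever updated as s = s op ...
        elif first[name] == "write":
            scopes[name] = "private"       # defined before use in each iteration
        else:
            scopes[name] = "unresolved"    # these toy rules cannot scope it
    return scopes


# Loop body:  t = a * i;  s = s + t;  k = k + 1;  u = k * t
body = [("t", ["a", "i"]), ("s", ["s", "t"]), ("k", ["k"]), ("u", ["k", "t"])]
scopes = auto_scope(body, "i")
```

In this example `k` ends up unresolved because it is carried from one iteration to the next and also read by another statement. A region containing such a variable is one that simple rules like these cannot scope, which is the kind of case the abstract reports as costing speedup.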

Structure and Algorithm for Implementing OpenMP Workshares

Guansong Zhang; Raul Silvera; Roch Archambault

Although OpenMP has become the leading standard in parallel programming languages, the implementation of its runtime environment is not well discussed in the literature. In this paper, we introduce some of the key data structures required to implement OpenMP workshares in our runtime library and also discuss considerations on how to improve its performance. This includes items such as how to set up a workshare control block queue, how to initialize the data within a control block, how to improve barrier performance and how to handle implicit barrier and nowait situations. Finally, we discuss the performance of this implementation focusing on the EPCC benchmark.

Keywords: OpenMP; parallel region; workshare; barrier; nowait.

Pp. 110-120

Efficient Implementation of OpenMP for Clusters with Implicit Data Distribution

Zhenying Liu; Lei Huang; Barbara Chapman; Tien-Hsiung Weng

This paper discusses an approach to implement OpenMP on clusters by translating it to Global Arrays (GA). The basic translation strategy from OpenMP to GA is described. GA requires a data distribution; we do not expect the user to supply this; rather, we show how we perform data distribution and work distribution according to OpenMP static loop scheduling. An inspector-executor strategy is employed for irregular applications in order to gather information on accesses to potentially non-local data, group non-local data transfers and overlap communications with local computations. Furthermore, a new directive INVARIANT is proposed to provide information about the dynamic scope of data access patterns. This directive can help us generate efficient codes for irregular applications using the inspector-executor approach. Our experiments show promising results for the corresponding regular and irregular GA codes.

Keywords: Hash Table; Parallel Loop; Work Distribution; Loop Schedule; Data Access Pattern.

Pp. 121-136
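
The two ideas in this abstract, distributing data and work by static loop scheduling and using an inspector-executor pass for irregular accesses, can be sketched in a few lines. This is a single-process simulation under stated assumptions, not the authors' translation and not the Global Arrays API: the names `static_chunks`, `owner`, `inspect` and `execute` are hypothetical, the array is assumed to be block-distributed with the same bounds as the loop iterations, and a dictionary copy stands in for a grouped remote transfer.

```python
from collections import defaultdict


def static_chunks(n_iters, n_procs):
    """Split range(n_iters) into one contiguous block per process,
    with block sizes differing by at most one (one common way to
    realize a static schedule without a chunk size)."""
    base, extra = divmod(n_iters, n_procs)
    bounds, start = [], 0
    for p in range(n_procs):
        size = base + (1 if p < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def owner(index, bounds):
    """Return the process whose block contains `index`."""
    for p, (lo, hi) in enumerate(bounds):
        if lo <= index < hi:
            return p
    raise IndexError(index)


def inspect(me, bounds, indirection):
    """Inspector: scan my iterations once and group the non-local
    indices they touch by owning process."""
    lo, hi = bounds[me]
    needed = defaultdict(set)
    for i in range(lo, hi):
        j = indirection[i]
        p = owner(j, bounds)
        if p != me:
            needed[p].add(j)
    return {p: sorted(js) for p, js in needed.items()}


def execute(me, bounds, indirection, x, schedule):
    """Executor: fetch each owner's elements in one grouped step,
    then run my iterations of y[i] = x[indirection[i]]."""
    ghost = {}
    for p, js in schedule.items():
        # Stand-in for one bulk transfer from process p.
        ghost.update({j: x[j] for j in js})
    lo, hi = bounds[me]
    return [ghost.get(indirection[i], x[indirection[i]]) for i in range(lo, hi)]


bounds = static_chunks(10, 3)
indirection = [9, 0, 1, 5, 4, 8, 2, 7, 3, 6]
x = [10 * k for k in range(10)]
schedule = inspect(1, bounds, indirection)
result = execute(1, bounds, indirection, x, schedule)
```

If the access pattern in `indirection` does not change between executions of the loop, the `schedule` computed by the inspector can be reused. That is the kind of information the proposed INVARIANT directive is described as providing.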

Runtime Adjustment of Parallel Nested Loops

Alejandro Duran; Raúl Silvera; Julita Corbalán; Jesús Labarta

OpenMP allows programmers to specify nested parallelism in parallel applications. In the case of scientific applications, parallel loops are the most important source of parallelism. In this paper we present an automatic mechanism to dynamically detect the best way to exploit the parallelism when having nested parallel loops. This mechanism is based on the number of threads, the problem size, and the number of iterations on the loop. To do that, we claim that programmers must specify the potential application parallelism and give the runtime the responsibility to decide the best way to exploit it. We have implemented this mechanism inside the IBM XL runtime library. Evaluation shows that our mechanism dynamically adapts the parallelism generated to the application and runtime parameters, reaching the same speedup as the best static parallelization (with a priori information).

Keywords: Outer Loop; Nest Loop; Mixed Approach; Parallel Loop; Outer Level.

Pp. 137-147
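
As an illustration of a runtime decision driven by the thread count and the iteration counts of a two-level loop nest, the sketch below picks how many threads to give each level. The rule is a hypothetical stand-in, not the mechanism implemented in the IBM XL runtime: `split_threads` and its cost estimate (iterations executed by the busiest thread) are assumptions, and ties are broken in favor of the outer level.

```python
def split_threads(n_threads, outer_iters, inner_iters):
    """Return (outer_threads, inner_threads) for a two-level loop nest.

    Hypothetical rule: try every number of outer-level threads, give the
    remaining threads to the inner level, and keep the split that
    minimizes the iterations run by the most loaded thread.
    """
    best = None
    for outer in range(1, n_threads + 1):
        inner = n_threads // outer
        # Ceiling divisions: work of the busiest thread at each level.
        outer_load = -(-outer_iters // outer)
        inner_load = -(-inner_iters // inner)
        cost = outer_load * inner_load
        # "<=" prefers more outer-level threads when costs tie.
        if best is None or cost <= best[0]:
            best = (cost, outer, inner)
    return best[1], best[2]


print(split_threads(4, 100, 100))   # plenty of outer iterations
print(split_threads(8, 2, 1000))    # short outer loop, long inner loop
```

With many outer iterations the rule keeps all threads at the outer level. With only two outer iterations and eight threads it chooses a mixed split, two outer groups of four inner threads each.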