Publication catalog - books
Shared Memory Parallel Programming with OpenMP: 5th International Workshop on OpenMP Applications and Tools, WOMPAT 2004, Houston, TX, USA, May 17-18, 2004
Barbara M. Chapman (ed.)
Abstract/Description – provided by the publisher
Not available.
Keywords – provided by the publisher
Software Engineering/Programming and Operating Systems; Computer Systems Organization and Communication Networks; Theory of Computation; Mathematics of Computing
Availability
Institution detected | Publication year | Browse | Download | Request |
---|---|---|---|---|
Not detected | 2005 | SpringerLink | | |
Information
Resource type:
books
Print ISBN
978-3-540-24560-5
Electronic ISBN
978-3-540-31832-3
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2005
Publication rights information
© Springer-Verlag Berlin/Heidelberg 2005
Table of contents
An Evaluation of Auto-Scoping in OpenMP
Michael Voss; Eric Chiu; Patrick Man Yan Chow; Catherine Wong; Kevin Yuen
In [1], Dieter an Mey proposes the addition of an AUTO attribute to the OpenMP DEFAULT clause. A DEFAULT(AUTO) clause would cause the compiler to automatically determine the scoping of variables that are not explicitly classified as shared, private, or reduction. While this new feature would be useful and powerful, its implementation would rely on automatic parallelization technology, which has been shown to have significant limitations. In this paper, we implement support for the DEFAULT(AUTO) clause in the Polaris parallelizing compiler. Our modified compiler translates regions with DEFAULT(AUTO) clauses into regions with explicit scoping for all variables. An evaluation of our implementation on a subset of the SPEC OpenMP Benchmark Suite shows that, with current automatic parallelization technology, a number of important regions cannot be statically scoped, resulting in a significant loss of speedup. We also compare our compiler's performance to that of an Early Access version of the Sun Studio 9 Fortran 95 compiler [2].
Keywords: Benchmark Suite; Parallel Region; Parallel Loop; Automatic Parallelization; OpenMP Directive.
Pp. 98-109
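For illustration only, here is a minimal C sketch of what such an analysis must infer; the paper's implementation targets Fortran via Polaris, so this fragment is an assumed analogue, with the explicit clauses written out as the translated region would carry them:

```c
#include <stdio.h>

#define N 1000

int main(void) {
    double a[N], sum = 0.0;

    for (int i = 0; i < N; i++)
        a[i] = 0.5 * i;

    /* Under a DEFAULT(AUTO)-style clause, the compiler itself would have
       to classify each variable:
         a   -> shared    (read-only inside the region)
         i   -> private   (loop index, predetermined)
         sum -> reduction (summed across iterations)
       The explicit clauses below spell out the scoping such an analysis
       should produce. */
    #pragma omp parallel for default(none) shared(a) reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += a[i];

    printf("sum = %f\n", sum);
    return 0;
}
```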
Structure and Algorithm for Implementing OpenMP Workshares
Guansong Zhang; Raul Silvera; Roch Archambault
Although OpenMP has become the leading standard among parallel programming languages, the implementation of its runtime environment is not well covered in the literature. In this paper, we introduce some of the key data structures required to implement OpenMP workshares in our runtime library and discuss how to improve its performance. This includes how to set up a workshare control block queue, how to initialize the data within a control block, how to improve barrier performance, and how to handle implicit barrier and NOWAIT situations. Finally, we discuss the performance of this implementation, focusing on the EPCC benchmark.
Keywords: OpenMP; parallel region; workshare; barrier; nowait.
Pp. 110-120
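As a rough illustration of the kind of structures the abstract describes, here is a hypothetical control-block sketch in C; the field names, layout, and queue discipline are assumptions, not the IBM runtime's actual design:

```c
#include <stdbool.h>

/* Hypothetical control block for one workshare (e.g., a worksharing loop). */
typedef struct workshare_cb {
    long lb, ub, chunk;          /* iteration space handed out to threads    */
    int  arrived;                /* threads that have entered this workshare */
    int  finished;               /* threads that have completed their share  */
    bool nowait;                 /* true: skip the implied barrier           */
    struct workshare_cb *next;   /* link in the team's queue                 */
} workshare_cb_t;

/* With NOWAIT, threads of one team may be inside different workshares at
   the same time, so control blocks are kept in a per-team queue rather
   than reused in place: the first thread to arrive initializes a new
   block, and the last thread to finish can retire it. */
typedef struct team {
    int nthreads;
    workshare_cb_t *head, *tail;
} team_t;
```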
Efficient Implementation of OpenMP for Clusters with Implicit Data Distribution
Zhenying Liu; Lei Huang; Barbara Chapman; Tien-Hsiung Weng
This paper discusses an approach to implementing OpenMP on clusters by translating it to Global Arrays (GA). The basic translation strategy from OpenMP to GA is described. GA requires a data distribution; rather than expecting the user to supply one, we show how we perform data distribution and work distribution according to OpenMP static loop scheduling. An inspector-executor strategy is employed for irregular applications in order to gather information on accesses to potentially non-local data, group non-local data transfers, and overlap communication with local computation. Furthermore, a new directive, INVARIANT, is proposed to provide information about the dynamic scope of data access patterns. This directive can help us generate efficient code for irregular applications using the inspector-executor approach. Our experiments show promising results for the corresponding regular and irregular GA codes.
Keywords: Hash Table; Parallel Loop; Work Distribution; Loop Schedule; Data Access Pattern.
Pp. 121-136
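The inspector-executor pattern from the abstract can be sketched in plain C as below; the ownership test and the `fetch_remote` stub stand in for Global Arrays one-sided communication and are assumptions for illustration, not the paper's actual translation:

```c
#include <stdlib.h>

/* Assumed block distribution: element i lives on process i / block. */
static int owner(long i, long block) { return (int)(i / block); }

/* Stub standing in for a batched one-sided GA get of the listed elements. */
static void fetch_remote(const long *idx, int n,
                         const double *global_x, double *buf) {
    for (int k = 0; k < n; k++)
        buf[k] = global_x[idx[k]];   /* real code would communicate here */
}

/* y[i] += x[col[i]] with indirect accesses: the inspector records which
   accesses are non-local so the executor can prefetch them in one batch
   and overlap communication with local computation. */
void irregular_update(double *y, const double *x, const long *col,
                      long n, long block, int me) {
    long *remote = malloc(n * sizeof *remote);
    int nremote = 0;

    /* Inspector: classify each access once. The result is reusable as
       long as the access pattern stays invariant, which is what the
       proposed INVARIANT directive would assert. */
    for (long i = 0; i < n; i++)
        if (owner(col[i], block) != me)
            remote[nremote++] = col[i];

    /* Executor: batch-fetch non-local data, then compute. */
    double *buf = malloc((nremote > 0 ? nremote : 1) * sizeof *buf);
    fetch_remote(remote, nremote, x, buf);

    int k = 0;
    for (long i = 0; i < n; i++)
        y[i] += (owner(col[i], block) == me) ? x[col[i]] : buf[k++];

    free(buf);
    free(remote);
}
```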
Runtime Adjustment of Parallel Nested Loops
Alejandro Duran; Raúl Silvera; Julita Corbalán; Jesús Labarta
OpenMP allows programmers to specify nested parallelism in parallel applications. In scientific applications, parallel loops are the most important source of parallelism. In this paper we present an automatic mechanism that dynamically detects the best way to exploit the parallelism of nested parallel loops, based on the number of threads, the problem size, and the number of loop iterations. We argue that programmers should specify the application's potential parallelism and give the runtime the responsibility of deciding how best to exploit it. We have implemented this mechanism inside the IBM XL runtime library. Evaluation shows that our mechanism dynamically adapts the generated parallelism to the application and runtime parameters, reaching the same speedup as the best static parallelization (with a priori information).
Keywords: Outer Loop; Nested Loop; Mixed Approach; Parallel Loop; Outer Level.
Pp. 137-147
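A hedged sketch of the idea follows; the decision rule below, based only on the thread count and the outer trip count, is illustrative and far simpler than the mechanism the paper implements inside the IBM XL runtime:

```c
#include <omp.h>

/* Enable a second level of loop parallelism only when the outer loop
   cannot keep every available thread busy on its own. */
void compute(double **a, int n, int m) {
    int nthreads = omp_get_max_threads();
    int use_nested = (n < nthreads);      /* illustrative heuristic only */

    omp_set_max_active_levels(use_nested ? 2 : 1);

    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        /* The inner region is only spawned when the heuristic asks for it. */
        #pragma omp parallel for if(use_nested)
        for (int j = 0; j < m; j++)
            a[i][j] = 0.5 * (i + j);
    }
}
```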