Publications catalog - books
High Performance Computing for Computational Science: VECPAR 2006: 7th International Conference, Rio de Janeiro, Brazil, June 10-13, 2006, Revised Selected and Invited Papers
Michel Daydé ; José M. L. M. Palma ; Álvaro L. G. A. Coutinho ; Esther Pacitti ; João Correia Lopes (eds.)
Conference: 7th International Conference on High Performance Computing for Computational Science (VECPAR). Rio de Janeiro, Brazil. June 10, 2006 - June 13, 2006
Abstract/Description – provided by the publisher
Not available.
Keywords – provided by the publisher
Computer System Implementation; Software Engineering/Programming and Operating Systems; Theory of Computation; Computer Communication Networks; Mathematics of Computing
Availability
| Detected institution | Publication year | Browse | Download | Request |
|---|---|---|---|---|
| Not detected | 2007 | SpringerLink | | |
Information
Resource type:
books
Print ISBN
978-3-540-71350-0
Electronic ISBN
978-3-540-71351-7
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2007
Publication rights information
© Springer-Verlag Berlin Heidelberg 2007
Subject coverage
Table of contents
Combinatorial Scientific Computing: The Enabling Power of Discrete Algorithms in Computational Science
Bruce Hendrickson; Alex Pothen
Combinatorial algorithms have long played a crucial, albeit under-recognized role in scientific computing. This impact ranges well beyond the familiar applications of graph algorithms in sparse matrices to include mesh generation, optimization, computational biology and chemistry, data analysis and parallelization. Trends in science and in computing suggest strongly that the importance of discrete algorithms in computational science will continue to grow. This paper reviews some of these many past successes and highlights emerging areas of promise and opportunity.
- Chapter 3: Numerical Methods | Pp. 260-280
Improving the Numerical Simulation of an Airflow Problem with the BlockCGSI Algorithm
C. Balsa; M. Braza; M. Daydé; J. Palma; D. Ruiz
Partial spectral information associated with the smallest eigenvalues can be used to improve the solution of successive linear systems of equations, notably in the simulation of time-dependent partial differential equations, where at each time step several systems with the same spectral properties must be solved. We propose to perform a partial spectral decomposition with the BlockCGSI algorithm in the first time step and to exploit this information to improve the convergence of the Conjugate Gradient algorithm when solving the subsequent linear systems. We briefly describe the BlockCGSI algorithm, which combines the block Conjugate Gradient (blockCG) with Inverse Subspace Iteration. We then validate the acceleration strategy in the simulation of the flow around an airplane wing, where the Conjugate Gradient is accelerated through deflation of the starting residual.
- Chapter 3: Numerical Methods | Pp. 281-291
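Illustrative note on the entry above: the idea of deflating the starting residual can be sketched on a toy problem. The sketch below is not the BlockCGSI algorithm; it assumes a diagonal matrix, so that the eigenvectors for the smallest eigenvalues are simply unit vectors and no spectral decomposition has to be computed. The function names `cg_diagonal` and `deflated_start` are hypothetical. With eigenpairs (λ_i, v_i), the deflated starting guess is x0 = Σ (v_iᵀ b / λ_i) v_i, which removes those eigencomponents from the initial residual.

```python
import math

def cg_diagonal(d, b, x0, tol=1e-8, max_iter=1000):
    """Plain Conjugate Gradient for A = diag(d); returns (x, iterations)."""
    x = list(x0)
    r = [bi - di * xi for bi, di, xi in zip(b, d, x)]
    p = list(r)
    rr = sum(ri * ri for ri in r)
    b_norm = math.sqrt(sum(bi * bi for bi in b))
    for it in range(max_iter):
        if math.sqrt(rr) <= tol * b_norm:
            return x, it
        ap = [di * pi for di, pi in zip(d, p)]
        alpha = rr / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = sum(ri * ri for ri in r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, max_iter

def deflated_start(d, b, k):
    """Starting guess that solves exactly in the k smallest eigendirections.

    Assumption: A = diag(d), so the eigenpairs are (d[i], e_i).
    """
    smallest = sorted(range(len(d)), key=lambda i: d[i])[:k]
    x0 = [0.0] * len(d)
    for i in smallest:
        x0[i] = b[i] / d[i]
    return x0

# Three small, isolated eigenvalues plus a well-conditioned cluster.
d = [1e-3, 1e-2, 1e-1] + [1.0 + 0.1 * i for i in range(20)]
b = [1.0] * len(d)

x_plain, it_plain = cg_diagonal(d, b, [0.0] * len(d))
x_defl, it_defl = cg_diagonal(d, b, deflated_start(d, b, 3))
print("iterations without deflation:", it_plain)
print("iterations with deflation:   ", it_defl)
```

In this toy setting the deflated run only has to resolve the well-conditioned part of the spectrum, so it should need fewer iterations than the plain run.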
EdgePack: A Parallel Vertex and Node Reordering Package for Optimizing Edge-Based Computations in Unstructured Grids
Marcos Martins; Renato Elias; Alvaro Coutinho
A new and simple method is proposed for choosing the best data configuration, in terms of processing-phase time, based on prior probing of edge-based matrix-vector products in codes that use iterative solvers for unstructured grid problems. The method is implemented as a suite of routines named EdgePack, which acts during both the pre-solution and solution phases and relies on data locality optimization techniques and variations of the matrix-vector product algorithm. Results demonstrate the flexibility and simplicity of the method, which is suitable for distributed memory platforms in which different data configurations can coexist.
- Chapter 3: Numerical Methods | Pp. 292-304
Parallel Processing of Matrix Multiplication in a CPU and GPU Heterogeneous Environment
Satoshi Ohshima; Kenji Kise; Takahiro Katagiri; Toshitsugu Yuba
GPUs are becoming an attractive alternative for numerical computations in research. In this paper, we propose a new parallel processing environment for matrix multiplication that uses both CPUs and GPUs. With our method, the execution time of matrix multiplication can be decreased to 40.1% of that of the faster of the CPU-only and GPU-only cases. Our method performs well when matrix sizes are large.
- Chapter 3: Numerical Methods | Pp. 305-318
Robust Two-Level Lower-Order Preconditioners for a Higher-Order Stokes Discretization with Highly Discontinuous Viscosities
Duilio Conceição; Paulo Goldfeld; Marcus Sarkis
The main goal of this paper is to present new robust and scalable preconditioned conjugate gradient algorithms for solving Stokes equations with large viscosity jumps across subregion interfaces and discretized on non-structured meshes. The proposed algorithms do not require the construction of a coarse mesh and avoid expensive communications between coarse and fine levels. The algorithms belong to the family of preconditioners based on non-overlapping decomposition of subregions known as balancing domain decomposition methods. The local problems employ two-level element-wise/subdomain-wise direct factorizations to reduce the size and the cost of the local Dirichlet and Neumann Stokes solvers. The Stokes coarse problem is based on subdomain constant pressures and on connected subdomain interface flux functions and rigid body motions. This guarantees scalability and solvability of the local Neumann problems. Estimates on the condition numbers and numerical experiments based on a parallel implementation for unstructured meshes are also discussed.
- Chapter 3: Numerical Methods | Pp. 319-333
The Impact of Parallel Programming Models on the Performance of Iterative Linear Solvers for Finite Element Applications
Kengo Nakajima
Parallel iterative linear solvers for unstructured grids in FEM applications, originally developed for the Earth Simulator (ES), are ported to various types of parallel computer. The performance of flat MPI and hybrid parallel programming models is compared for the ES, Hitachi SR8000, IBM SP-3 and IBM p5-model 595 supercomputers. The effects of coloring and of different storage methods for coefficient matrices are evaluated in various types of application. Performance for more than 10 processors is estimated using measured data for up to 10 processors.
- Chapter 3: Numerical Methods | Pp. 334-348
Efficient Parallel Algorithm for Constructing a Unit Triangular Matrix with Prescribed Singular Values
Georgina Flores-Becerra; Victor M. Garcia; Antonio M. Vidal
The problem tackled in this paper is the parallel construction of a unit triangular matrix with prescribed singular values, when these fulfill Weyl’s conditions [9]; this is a particular case of the Inverse Singular Value Problem. A sequential algorithm for this problem was proposed in [10] by Kosowsky and Smoktunowicz. In this paper, parallel versions of this algorithm are described for both shared memory and distributed memory architectures. The proposed parallel implementation is better suited to the shared memory paradigm, as the numerical experiments confirm: the shared memory version reaches an efficiency above 90% and substantially reduces execution times compared with the sequential algorithm.
- Chapter 3: Numerical Methods | Pp. 349-362
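Illustrative note on the entry above: for a unit triangular matrix all eigenvalues equal 1, so Weyl's multiplicative conditions on singular values σ1 ≥ ... ≥ σn > 0 are usually stated as σ1···σk ≥ 1 for k < n and σ1···σn = 1. The sketch below only checks these conditions; it does not construct the matrix and is not the algorithm of the paper. The function names and the tolerance are assumptions made for illustration.

```python
import math

def satisfies_weyl_unit_triangular(sigmas, rel_tol=1e-12):
    """Check Weyl's conditions for singular values of a unit triangular matrix."""
    s = sorted(sigmas, reverse=True)
    if not s or s[-1] <= 0:
        return False
    partial = 1.0
    # Partial products must be >= 1. For a descending sequence this already
    # follows from the final equality; it is kept to mirror the usual statement.
    for sigma in s[:-1]:
        partial *= sigma
        if partial < 1.0 and not math.isclose(partial, 1.0, rel_tol=rel_tol):
            return False
    # The full product equals |det| = 1.
    return math.isclose(partial * s[-1], 1.0, rel_tol=rel_tol)

def singular_values_2x2_unit_upper(a):
    """Singular values of [[1, a], [0, 1]] from the eigenvalues of A^T A."""
    t = 2.0 + a * a                  # trace of A^T A; its determinant is 1
    disc = math.sqrt(t * t - 4.0)
    return [math.sqrt((t + disc) / 2.0), math.sqrt((t - disc) / 2.0)]

print(satisfies_weyl_unit_triangular(singular_values_2x2_unit_upper(1.5)))
print(satisfies_weyl_unit_triangular([2.0, 1.0]))
```

The second call uses values whose product is 2, so no unit triangular matrix can have them as singular values.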
A Rewriting System for the Vectorization of Signal Transforms
Franz Franchetti; Yevgen Voronenko; Markus Püschel
We present a rewriting system that automatically vectorizes signal transform algorithms at a high level of abstraction. The input to the system is a transform algorithm given as a formula in the well-known Kronecker product formalism. The output is a “vectorized” formula, which means it consists exclusively of constructs that can be directly mapped into short vector code. This approach obviates compiler vectorization, which is known to be limited in this domain. We integrated formula vectorization into the Spiral program generator for signal transforms, which enables us to generate vectorized code and further optimize for the memory hierarchy through search over alternative algorithms. Benchmarks for the discrete Fourier transform (DFT) show that our generated floating-point code is competitive with, and that our fixed-point code clearly outperforms, the best available libraries.
- Chapter 3: Numerical Methods | Pp. 363-377
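Illustrative note on the entry above: the Kronecker product formalism expresses a fast transform as a product of structured matrices. A standard example is the Cooley-Tukey factorization DFT_mn = (DFT_m ⊗ I_n) · T · (I_m ⊗ DFT_n) · L, where T is a diagonal twiddle matrix and L a stride permutation. The sketch below only verifies this identity numerically with dense matrices; it does not show the rewriting rules or the vectorization step. The helper names and index conventions are assumptions spelled out in the comments, since notation varies between sources.

```python
import cmath

def dft(n):
    """Dense DFT matrix with entries w^(k*l), w = exp(-2*pi*i/n)."""
    w = cmath.exp(-2j * cmath.pi / n)
    return [[w ** (k * l) for l in range(n)] for k in range(n)]

def identity(n):
    return [[1 if r == c else 0 for c in range(n)] for r in range(n)]

def kron(a, b):
    """Kronecker product of two dense matrices."""
    rb, cb = len(b), len(b[0])
    return [[a[r // rb][c // cb] * b[r % rb][c % cb]
             for c in range(len(a[0]) * cb)]
            for r in range(len(a) * rb)]

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]

def stride_perm(m, n):
    """Permutation L with (L x)[i*n + j] = x[j*m + i], 0 <= i < m, 0 <= j < n."""
    size = m * n
    p = [[0] * size for _ in range(size)]
    for i in range(m):
        for j in range(n):
            p[i * n + j][j * m + i] = 1
    return p

def twiddle(m, n):
    """Diagonal matrix T with entry w^(i*k) at position i*n + k, w = exp(-2*pi*i/(m*n))."""
    w = cmath.exp(-2j * cmath.pi / (m * n))
    size = m * n
    t = [[0] * size for _ in range(size)]
    for i in range(m):
        for k in range(n):
            t[i * n + k][i * n + k] = w ** (i * k)
    return t

def cooley_tukey(m, n):
    """Assemble (DFT_m ⊗ I_n) T (I_m ⊗ DFT_n) L as a dense matrix."""
    f = matmul(kron(dft(m), identity(n)), twiddle(m, n))
    f = matmul(f, kron(identity(m), dft(n)))
    return matmul(f, stride_perm(m, n))

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

print(max_abs_diff(cooley_tukey(2, 4), dft(8)))
```

The printed difference should be at the level of floating-point rounding, which confirms the factorization for this choice of m and n.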
High Order Fourier-Spectral Solutions to Self Adjoint Elliptic Equations
Moshe Israeli; Alexander Sherman
We develop a high-order Fourier solver for nonseparable, self-adjoint elliptic equations with variable (diffusion) coefficients. The solution of an auxiliary constant-coefficient equation serves in a transformation of the dependent variable. The result is a “modified Helmholtz” elliptic equation with almost constant coefficients. The small deviations from constancy are treated as correction terms. We developed a highly accurate, fast Fourier-spectral algorithm to solve such constant-coefficient equations. Only a small number of correction steps is required to achieve very high accuracy; this is achieved by optimizing the coefficients in the auxiliary equation. For given coefficients, the approximation error becomes smaller as the domain decreases. A highly parallelizable hierarchical procedure allows a decomposition into smaller sub-domains where the solution is efficiently computed. This step is followed by hierarchical matching to reconstruct the global solution. Numerical experiments illustrate the high accuracy of the approach even at coarse resolutions.
- Chapter 3: Numerical Methods | Pp. 378-390
Multiresolution Simulations Using Particles
Michael Bergdorf; Petros Koumoutsakos
We present novel multiresolution particle methods with extended dynamic adaptivity in areas where increased resolution is required. In the framework of smooth particle methods we present two adaptive approaches: one based on globally adaptive mappings and one employing a wavelet-based multiresolution analysis to guide the allocation of computational elements. Preliminary results are presented from the application of these methods to problems involving the development of sharp vorticity gradients. The present particle methods are employed on large-scale parallel computer architectures, demonstrating a high degree of parallelization and enabling state-of-the-art large-scale simulations of continuum systems using particles.
- Chapter 3: Numerical Methods | Pp. 391-402