Publications catalog - books
Open Access title
Optimizing HPC Applications with Intel Cluster Tools
Summary/Description – provided by the publisher
Not available.
Keywords – provided by the publisher
Computer science
Availability

| Detected institution | Publication year | Browse | Download | Request |
|---|---|---|---|---|
| Not required | 2014 | Directory of Open Access Books | | |
| Not required | 2014 | SpringerLink | | |
Information
Resource type:
books
Print ISBN
978-1-4302-6496-5
Electronic ISBN
978-1-4302-6497-2
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2014
Thematic coverage
Table of contents
No Time to Read This Book?
Alexander Supalov; Andrey Semin; Michael Klemm; Christopher Dahnken
We know what it feels like to be under pressure. Try out a few quick and proven optimization stunts described below. They may provide a good enough performance gain right away.
Pp. 1-10
Overview of Platform Architectures
Alexander Supalov; Andrey Semin; Michael Klemm; Christopher Dahnken
In order to optimize software you need to understand hardware. In this chapter we give you a brief overview of the typical system architectures found in high-performance computing (HPC) today. We also introduce terminology that will be used throughout the book.
Pp. 11-37
Top-Down Software Optimization
Alexander Supalov; Andrey Semin; Michael Klemm; Christopher Dahnken
The tuning of a previously unoptimized hardware/software combination is a difficult task, one that even experts struggle with. Anything can go wrong here, from the proper setup to the compilation and execution of individual machine instructions. It is, therefore, of paramount importance to follow a logical and systematic approach to improve performance incrementally, continuously exposing the next bottleneck to be fixed.
Pp. 39-53
Addressing System Bottlenecks
Alexander Supalov; Andrey Semin; Michael Klemm; Christopher Dahnken
We start with a bold statement: every application has a bottleneck. By that we mean that there is always something that limits the performance of a given application on a given system. Even if the application is well optimized and no additional improvements seem possible by tuning it at other levels, it still has a bottleneck, and that bottleneck is in the system the program runs on. The tuning starts and ends at the system level.
Pp. 55-85
Addressing Application Bottlenecks: Distributed Memory
Alexander Supalov; Andrey Semin; Michael Klemm; Christopher Dahnken
The first application optimization level accessible to the ever-busy performance analyst is the distributed memory one, normally expressed in terms of the Message Passing Interface (MPI). By its very nature, the distributed memory paradigm is concerned with communication. Some people consider all communication overhead: that is, something intrinsically harmful that needs to be eliminated. We prefer to call it “investment.” Indeed, by moving data around in the right manner, you hope to get more computational power in return. The main point, then, is to optimize this investment so that your returns are maximized.
Pp. 87-171
Addressing Application Bottlenecks: Shared Memory
Alexander Supalov; Andrey Semin; Michael Klemm; Christopher Dahnken
Chapter 5 talks about the potential bottlenecks in your application and the system it runs on. In this chapter, we will have a close look at how the application code performs on the level of an individual cluster node. It is a fair assumption that there will also be bottlenecks on this level. Removing these bottlenecks will usually translate directly to increased performance, in addition to the optimizations discussed in the previous chapters.
Pp. 173-200
Addressing Application Bottlenecks: Microarchitecture
Alexander Supalov; Andrey Semin; Michael Klemm; Christopher Dahnken
Microarchitectural performance tuning is one of the most difficult parts of the performance tuning process. In contrast to other tuning activities, it is not immediately clear what the bottlenecks are. Usually, discovering this requires study of processor manuals, which provide the details of the execution flow. Furthermore, a certain understanding of assembly language is needed to reflect the findings back onto the original source code. Each processor model will also have its own microarchitectural characteristics that have to be considered when writing efficient software.
Pp. 201-246
Application Design Considerations
Alexander Supalov; Andrey Semin; Michael Klemm; Christopher Dahnken
In Chapters 5 to 7 we reviewed the methods, tools, and techniques for application tuning, explained by using examples of HPC applications and benchmarks. The whole process followed the top-down software optimization framework explained in Chapter 3. The general approach to the tuning process is based on a quantitative analysis of execution resources required by an application and how these match the capabilities of the platform the application is run on. The blueprint analysis of platform capabilities and system-level tuning considerations were provided in Chapter 4, based on several system architecture metrics discussed in Chapter 2.
Pp. 247-264