Publications catalog - books


Open Access Title

Optimizing HPC Applications with Intel Cluster Tools

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Computer science

Availability
Detected institution Publication year Browse Download Request
Not required 2014 Directory of Open Access Books open access
Not required 2014 SpringerLink open access

Information

Resource type:

books

Print ISBN

978-1-4302-6496-5

Electronic ISBN

978-1-4302-6497-2

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Table of contents

No Time to Read This Book?

Alexander Supalov; Andrey Semin; Michael Klemm; Christopher Dahnken

We know what it feels like to be under pressure. Try out a few quick and proven optimization stunts described below. They may provide a good enough performance gain right away.

Pp. 1-10

Overview of Platform Architectures

Alexander Supalov; Andrey Semin; Michael Klemm; Christopher Dahnken

To optimize software, you need to understand hardware. In this chapter we give you a brief overview of the typical system architectures found in high-performance computing (HPC) today. We also introduce terminology that will be used throughout the book.

Pp. 11-37

Top-Down Software Optimization

Alexander Supalov; Andrey Semin; Michael Klemm; Christopher Dahnken

The tuning of a previously unoptimized hardware/software combination is a difficult task, one that even experts struggle with. Anything can go wrong here, from the proper setup to the compilation and execution of individual machine instructions. It is, therefore, of paramount importance to follow a logical and systematic approach to improve performance incrementally, continuously exposing the next bottleneck to be fixed.

Pp. 39-53

Addressing System Bottlenecks

Alexander Supalov; Andrey Semin; Michael Klemm; Christopher Dahnken

We start with a bold statement: every application has a bottleneck. By that, we mean that there is always something that limits performance of a given application in a system. Even if the application is well optimized and it may seem that no additional improvements are possible by tuning it on the other levels, it still has a bottleneck, and that bottleneck is in the system the program runs on. The tuning starts and ends at the system level.

Pp. 55-85

Addressing Application Bottlenecks: Distributed Memory

Alexander Supalov; Andrey Semin; Michael Klemm; Christopher Dahnken

The first application optimization level accessible to the ever-busy performance analyst is the distributed memory one, normally expressed in terms of the Message Passing Interface (MPI). By its very nature, the distributed memory paradigm is concerned with communication. Some people consider all communication as overhead, that is, something intrinsically harmful that needs to be eliminated. We tend to call it “investment.” Indeed, by moving data around in the right manner, you hope to get more computational power in return. The main point, then, is to optimize this investment so that your returns are maximized.

Pp. 87-171

Addressing Application Bottlenecks: Shared Memory

Alexander Supalov; Andrey Semin; Michael Klemm; Christopher Dahnken

Chapter 5 talks about the potential bottlenecks in your application and the system it runs on. In this chapter, we will have a close look at how the application code performs on the level of an individual cluster node. It is a fair assumption that there will also be bottlenecks on this level. Removing these bottlenecks will usually translate directly to increased performance, in addition to the optimizations discussed in the previous chapters.

Pp. 173-200

Addressing Application Bottlenecks: Microarchitecture

Alexander Supalov; Andrey Semin; Michael Klemm; Christopher Dahnken

Microarchitectural performance tuning is one of the most difficult parts of the performance tuning process. In contrast to other tuning activities, it is not immediately clear what the bottlenecks are. Usually, discovering this requires study of processor manuals, which provide the details of the execution flow. Furthermore, a certain understanding of assembly language is needed to reflect the findings back onto the original source code. Each processor model will also have its own microarchitectural characteristics that have to be considered when writing efficient software.

Pp. 201-246

Application Design Considerations

Alexander Supalov; Andrey Semin; Michael Klemm; Christopher Dahnken

In Chapters 5 to 7 we reviewed the methods, tools, and techniques for application tuning, explained by using examples of HPC applications and benchmarks. The whole process followed the top-down software optimization framework explained in Chapter 3. The general approach to the tuning process is based on a quantitative analysis of execution resources required by an application and how these match the capabilities of the platform the application is run on. The blueprint analysis of platform capabilities and system-level tuning considerations were provided in Chapter 4, based on several system architecture metrics discussed in Chapter 2.

Pp. 247-264