Catálogo de publicaciones - libros

Compartir en
redes sociales


Compiler Construction: 16th International Conference, CC 2007, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2007, Braga, Portugal, March 26-30, 2007. Proceedings./

Shriram Krishnamurthi ; Martin Odersky (eds.)

En conferencia: 16º International Conference on Compiler Construction (CC) . Braga, Portugal . March 26, 2007 - March 30, 2007

Resumen/Descripción – provisto por la editorial

No disponible.

Palabras clave – provistas por la editorial

No disponibles.

Disponibilidad
Institución detectada Año de publicación Navegá Descargá Solicitá
No detectada 2007 SpringerLink

Información

Tipo de recurso:

libros

ISBN impreso

978-3-540-71228-2

ISBN electrónico

978-3-540-71229-9

Editor responsable

Springer Nature

País de edición

Reino Unido

Fecha de publicación

Información sobre derechos de publicación

© Springer-Verlag Berlin Heidelberg 2007

Tabla de contenidos

New Algorithms for SIMD Alignment

Liza Fireman; Erez Petrank; Ayal Zaks

Optimizing programs for modern multiprocessor or vector platforms is a major important challenge for compilers today. In this work, we focus on one challenging aspect: the simd alignment problem. Previously, only heuristics were used to solve this problem, without guarantees on the number of shifts in the obtained solution. We study two interesting and realistic special cases of the simd alignment problem and present two novel and efficient algorithms that provide optimal solutions for these two cases. The new algorithms employ dynamic programming and a min-cut / max-flow algorithm as subroutines. We also discuss the relation between the simd alignment problem and the multiway cut and multiway cut problems; and we show how to derive an approximated solution to the simd alignment problem based on approximation algorithms to these two known problems.

Palabras clave: Directed Acyclic Graph; Sink Node; Single Instruction Multiple Data; Shift Operation; Minimum Node.

- Architecture | Pp. 1-15

Preprocessing Strategy for Effective Modulo Scheduling on Multi-issue Digital Signal Processors

Doosan Cho; Ravi Ayyagari; Gang-Ryung Uh; Yunheung Paek

To achieve high resource utilization for multi-issue Digital Signal Processors (DSPs), production compilers commonly include variants of the iterative modulo scheduling algorithm. However, excessive cyclic data dependences, which exist in communication and media processing loops, often prevent the modulo scheduler from achieving ideal loop initiation intervals. As a result, replicated functional units in multi-issue DSPs are frequently underutilized. In response to this resource underutilization problem, this paper describes a compiler preprocessing strategy that capitalizes on two techniques for effective modulo scheduling, referred to as cloning1 and cloning2 . The core of the proposed techniques lies in the direct relaxation of cyclic data dependences by exploiting functional units which are otherwise left unused. Since our preprocessing strategy requires neither code duplication nor additional hardware support, it is relatively easy to implement in DSP compilers. The strategy proposed has been validated by an implementation for a StarCore SC140 optimizing C compiler.

Palabras clave: Splittable Point; Loop Body; Loop Schedule; Preprocessing Strategy; Loop Buffer.

- Architecture | Pp. 16-31

An Array Allocation Scheme for Energy Reduction in Partitioned Memory Architectures

K. Shyam; R. Govindarajan

This paper presents a compiler technique that reduces the energy consumption of the memory subsystem, for an off-chip partitioned memory architecture having multiple memory banks and various low-power operating modes for each of these banks. More specifically, we propose an efficient array allocation scheme to reduce the number of simultaneously active memory banks, so that the other memory banks that are inactive can be put to low power modes to reduce the energy. We model this problem as a graph partitioning problem, and use well known heuristics to solve the same. We also propose a simple Integer Linear Programming (ILP) formulation for the above problem. Our approach achieves, on an average, 20% energy reduction over the base scheme, and 8% to 10% energy reduction over previously suggested methods. Further, the results obtained using our heuristic are within 1% of optimal results obtained by using our ILP method.

Palabras clave: Integer Linear Program; Energy Reduction; Memory Bank; Memory Architecture; Integer Linear Program Problem.

- Architecture | Pp. 32-47

Using Prefetching to Improve Reference-Counting Garbage Collectors

Harel Paz; Erez Petrank

Reference counting is a classical garbage collection method. Recently, a series of papers have extended the basic method to drastically reduce its notorious overhead and extend the basic method to run concurrently and efficiently on a modern computing platform. In this paper we investigate the use of prefetching to further improve the efficiency of the reference-counting collector. We propose potential prefetching opportunities for the advanced reference-counting collector and report an implementation of a collector that employs such prefetching. The proposed prefetch instructions were inserted into the Jikes reference-counting collector obtaining an average reduction of 8.7% of the memory management overheads.

Palabras clave: Garbage Collection; Cache Line; Garbage Collector; Free List; Reference Counting.

- Garbage Collection and Program Analysis | Pp. 48-63

Accurate Garbage Collection in Uncooperative Environments with Lazy Pointer Stacks

Jason Baker; Antonio Cunei; Filip Pizlo; Jan Vitek

Implementing a new programming language by the means of a translator to an existing language is attractive as it provides portability over all platforms supported by the host language and reduces the development time as many low-level tasks can be delegated to the host compiler. The C and C++ programming languages are popular choices for many language implementations due to the availability of efficient compilers on many platforms, and good portability. For garbage-collected languages, however, they are not a perfect match as they provide no support for accurately discovering pointers to heap-allocated data. We evaluate the published techniques, and propose a new mechanism, lazy pointer stacks, for performing accurate garbage collection in such uncooperative environments. We implemented the new technique in the Ovm Java virtual machine with our own Java-to-C++ compiler and GCC as a back-end, and found that our technique outperforms existing approaches.

Palabras clave: Virtual Machine; Garbage Collection; Exception Handler; Garbage Collector; Safe Point.

- Garbage Collection and Program Analysis | Pp. 64-79

Correcting the Dynamic Call Graph Using Control-Flow Constraints

Byeongcheol Lee; Kevin Resnick; Michael D. Bond; Kathryn S. McKinley

To reason about programs, dynamic optimizers and analysis tools use sampling to collect a dynamic call graph (DCG). However, sampling has not achieved high accuracy with low runtime overhead. As object-oriented programmers compose increasingly complex programs, inaccurate call graphs will inhibit analysis and optimizations. This paper demonstrates how to use static and dynamic control flow graph (CFG) constraints to improve the accuracy of the DCG. We introduce the frequency dominator (FDOM), a novel CFG relation that extends the dominator relation to expose static relative execution frequencies of basic blocks. We combine conservation of flow and dynamic CFG basic block profiles to further improve the accuracy of the DCG. Together these approaches add minimal overhead (1%) and achieve 85% accuracy compared to a perfect call graph for SPEC JVM98 and DaCapo benchmarks. Compared to sampling alone, accuracy improves by 12 to 36%. These results demonstrate that static and dynamic control-flow information offer accurate information for efficiently improving the DCG.

Palabras clave: Basic Block; Correction Algorithm; Control Flow Graph; Call Graph; Call Site.

- Garbage Collection and Program Analysis | Pp. 80-95

Obfuscating Java: The Most Pain for the Least Gain

Michael Batchelder; Laurie Hendren

Bytecode, Java’s binary form, is relatively high-level and therefore susceptible to decompilation attacks. An obfuscator transforms code such that it becomes more complex and therefore harder to reverse engineer. We develop bytecode obfuscations that are complex to reverse engineer but also do not significantly degrade performance. We present three kinds of techniques that: (1) obscure intent at the operational level; (2) complicate control flow and object-oriented design (i.e. program structure); and (3) exploit the semantic gap between what is legal in source code and what is legal in bytecode. Obfuscations are applied to a benchmark suite to examine their affect on runtime performance, control flow graph complexity and decompilation. These results show that most of the obfuscations have only minor negative performance impacts and many increase complexity. In almost all cases, tested decompilers fail to produce legal source code or crash completely. Those obfuscations that are decompilable greatly reduce the readability of output source.

Palabras clave: Reverse Engineer; Java Source Code; Jump Target; Fail Fail; Opaque Predicate.

- Garbage Collection and Program Analysis | Pp. 96-110

A Fast Cutting-Plane Algorithm for Optimal Coalescing

Daniel Grund; Sebastian Hack

Recent work has shown that the subtasks of register allocation (spilling, register assignment, and coalescing) can be completely separated. This work presents an algorithm for the coalescing subproblem that relies on this separation. The algorithm uses 0/1 Linear Programming (ILP), a general-purpose optimization technique, to derive optimal solutions. We provide the first optimal solutions for a benchmark called “Optimal Coalescing Challenge”, i.e., our ILP model outperforms previous approaches. Additionally, we use these optimal solutions to assess the quality of well-known heuristics. A second benchmark on SPEC CPU2000 programs emphasizes the practicality of our algorithm.

Palabras clave: Integer Linear Programming; Chordal Graph; Integer Linear Programming Model; Integer Linear Programming Formulation; Register Allocation.

- Register Allocation | Pp. 111-125

Register Allocation and Optimal Spill Code Scheduling in Software Pipelined Loops Using 0-1 Integer Linear Programming Formulation

Santosh G. Nagarakatte; R. Govindarajan

In achieving higher instruction level parallelism, software pipelining increases the register pressure in the loop. The usefulness of the generated schedule may be restricted to cases where the register pressure is less than the available number of registers. Spill instructions need to be introduced otherwise. But scheduling these spill instructions in the compact schedule is a difficult task. Several heuristics have been proposed to schedule spill code. These heuristics may generate more spill code than necessary, and scheduling them may necessitate increasing the initiation interval. We model the problem of register allocation with spill code generation and scheduling in software pipelined loops as a 0-1 integer linear program. The formulation minimizes the increase in initiation interval (II) by optimally placing spill code and simultaneously minimizes the amount of spill code produced. To the best of our knowledge, this is the first integrated formulation for register allocation, optimal spill code generation and scheduling for software pipelined loops. The proposed formulation performs better than the existing heuristics by preventing an increase in II in 11.11% of the loops and generating 18.48% less spill code on average among the loops extracted from Perfect Club and SPEC benchmarks with a moderate increase in compilation time.

Palabras clave: Decision Variable; Register Pressure; Memory Unit; Integer Linear Programming Formulation; Register Allocation.

- Register Allocation | Pp. 126-140

Extended Linear Scan: An Alternate Foundation for Global Register Allocation

Vivek Sarkar; Rajkishore Barik

In this paper, we extend past work on Linear Scan register allocation, and propose two Extended Linear Scan ( ELS ) algorithms that retain the compile-time efficiency of past Linear Scan algorithms while delivering performance that can match or surpass that of Graph Coloring. Specifically, this paper makes the following contributions: – We highlight three fundamental theoretical limitations in using Graph Coloring as a foundation for global register allocation, and introduce a basic Extended Linear Scan algorithm, ELS _0, which addresses all three limitations for the problem of Spill-Free Register Allocation. – We introduce the ELS _1 algorithm which extends ELS _0 to obtain a greedy algorithm for the problem of Register Allocation with Total Spills. – Finally, we present experimental results to compare the Graph Coloring and Extended Linear Scan algorithms. Our results show that the compile-time speedups for ELS _1 relative to GC were significant, and varied from 15× to 68×. In addition, the resulting execution time improved by up to 5.8%, with an average improvement of 2.3%. Together, these results show that Extended Linear Scan is promising as an alternate foundation for global register allocation, compared to Graph Coloring, due to its compile-time scalability without loss of execution time performance.

Palabras clave: Extend Linear; Graph Coloring; Program Point; Register Allocation; Instruction Schedule.

- Register Allocation | Pp. 141-155