Publications catalog - books
Embedded Computer Systems: Architectures, Modeling, and Simulation: 5th International Workshop, SAMOS 2005, Samos, Greece, July 18-20, Proceedings
Timo D. Hämäläinen ; Andy D. Pimentel ; Jarmo Takala ; Stamatis Vassiliadis (eds.)
Conference: 5th International Workshop on Embedded Computer Systems (SAMOS). Samos, Greece. July 18, 2005 - July 20, 2005
Abstract/Description – provided by the publisher
Not available.
Keywords – provided by the publisher
Theory of Computation; Computer Hardware; Processor Architectures; Computer Communication Networks; System Performance and Evaluation; Computer System Implementation
Availability
Detected institution | Publication year | Browse | Download | Request |
---|---|---|---|---|
Not detected | 2005 | SpringerLink |
Information
Resource type:
books
Print ISBN
978-3-540-26969-4
Electronic ISBN
978-3-540-31664-0
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2005
Publication rights information
© Springer-Verlag Berlin Heidelberg 2005
Table of contents
doi: 10.1007/11512622_11
Flux Caches: What Are They and Are They Useful?
Georgi N. Gaydadjiev; Stamatis Vassiliadis
In this paper, we introduce the concept of flux caches, envisioned to improve processor performance by dynamically changing the cache organization and implementation. Contrary to traditional approaches, processors designed with flux caches do not assume a hardwired cache organization; instead, they change their cache "design" on program demand. Consequently, the dynamic behavior of the program (data and instructions) determines the cache hardware design. Experimental results confirming the potential of flux caches are also presented.
- Processor Architectures, Design and Simulation | Pp. 93-102
doi: 10.1007/11512622_12
First-Level Instruction Cache Design for Reducing Dynamic Energy Consumption
Cheol Hong Kim; Sunghoon Shim; Jong Wook Kwak; Sung Woo Chung; Chu Shik Jhon
Microarchitects should consider energy consumption, together with performance, when designing an instruction cache architecture, especially in embedded processors. This paper proposes a power-aware instruction cache architecture, named Partitioned Instruction Cache (PI-Cache), to reduce dynamic energy consumption in the instruction cache. The proposed PI-Cache is composed of several small sub-caches. When the PI-Cache is accessed, only one sub-cache is accessed by exploiting the locality of applications, while the other sub-caches remain idle, resulting in a dynamic energy reduction. The PI-Cache also reduces energy consumption by eliminating the energy consumed in tag matching. Moreover, the performance loss is small, considering the physical cache access time. We evaluated the energy efficiency by running a cycle-accurate simulator, SimpleScalar, with power parameters obtained from CACTI. Simulation results show that the PI-Cache reduces dynamic energy consumption by 42% – 59%.
- Processor Architectures, Design and Simulation | Pp. 103-111
doi: 10.1007/11512622_13
A Novel JAVA Processor for Embedded Devices
Yiyu Tan; Chihang Yau; Kaiman Lo; Paklun Mok; Anthony S. Fong
As a result of its object-oriented (OO) features and the corresponding advantages of security, robustness, and platform independence, Java is widely applied in embedded devices. However, in current Java execution engines, whether implemented in software or hardware, the overhead of executing OO-related bytecodes is costly and has a great impact on the overall performance of Java applications, especially in embedded devices, where real-time operation and low power consumption are required with limited memory. To solve this problem, a novel Java processor architecture called jHISC is proposed, in which the OO-related bytecodes are supported directly in hardware. In jHISC, an object is represented by a hardware-readable data structure, the object context, which makes it possible to implement complex OO-related bytecodes at the hardware level and to access some fields of an object in parallel to improve execution speed. jHISC mainly targets J2ME and implements about 93% of all bytecodes and 83% of the OO-related bytecodes directly in hardware, and OO-related operations are executed much faster in jHISC than by software traps.
- Processor Architectures, Design and Simulation | Pp. 112-121
doi: 10.1007/11512622_14
Formal Specification of a Protocol Processor
Tomi Westerlund; Juha Plosila
Ensuring the correctness of the functional and temporal properties of modern network hardware devices is becoming increasingly challenging because of growing complexity and demanding time-to-market requirements. In this paper, we address the problem by deriving a TACO protocol processor model in the formal framework of Timed Action Systems. Formal methods offer a prominent approach to specifying, designing, and verifying such devices with the benefits of a rigorous mathematical basis. The derivation demonstrates the capability of preserving correctness when considering an important hardware design decision.
- Processor Architectures, Design and Simulation | Pp. 122-131
doi: 10.1007/11512622_15
Tuning a Protocol Processor Architecture Towards DSP Operations
Jani Paakkulainen; Seppo Virtanen; Jouni Isoaho
In this paper we present an experiment in enhancing our transport triggered protocol processor hardware platform to support DSP applications. Our focus is on integrating support for both application domains into a single processor without loss of performance in either domain. Such a processor could be used in applications like Voice-over-IP communication on hand-held devices, where functionality from both domains is needed. As our first step in bridging the gap between the protocol processing and DSP domains, we implement support for FIR filtering. We analyze four different architectural instances for implementing FIR filters according to their performance and bus utilisation. We were able to determine that protocol processing and DSP operations can be executed in parallel very efficiently. The implementations were verified with VHDL simulations and synthesis using 0.18 µm CMOS technology.
- Processor Architectures, Design and Simulation | Pp. 132-141
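The abstract above names FIR filtering as the first DSP operation added to the protocol processor. As a reminder of what that operation computes, the following is a minimal Python sketch of a direct-form FIR filter. The function name `fir_filter` and the example coefficients are illustrative assumptions; the sketch does not model any of the four architectural instances evaluated in the paper.

```python
def fir_filter(samples, taps):
    """Direct-form FIR filter: y[n] = sum over k of taps[k] * x[n - k].

    Samples before the start of the input (n - k < 0) are treated as zero.
    Hypothetical helper for illustration only.
    """
    output = []
    for n in range(len(samples)):
        acc = 0.0
        for k, coeff in enumerate(taps):
            if n - k >= 0:
                # One multiply-accumulate per tap, the core FIR operation.
                acc += coeff * samples[n - k]
        output.append(acc)
    return output


if __name__ == "__main__":
    # Two-tap moving average applied to a short ramp.
    print(fir_filter([2, 4, 6], [0.5, 0.5]))
```

Each output sample costs one multiply-accumulate per tap, which is why the way operands are moved to the multiplier matters when comparing architectural instances by bus utilisation.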
doi: 10.1007/11512622_16
Observations on Power-Efficiency Trends in Mobile Communication Devices
Olli Silvén; Kari Jyrkkä
Computing solutions used in mobile communications equipment are essentially the same as those in personal and mainframe computers. The key differences between the implementations are found at the chip level: in mobile devices low leakage silicon technology and lower clock frequency are used. So far, the improvements of the silicon processes in mobile phones have been exploited by software designers to increase functionality and to cut development time, while usage times, and energy efficiency, have been kept at levels that satisfy the customers. In this paper, we explain some of the observed developments.
- Processor Architectures, Design and Simulation | Pp. 142-151
doi: 10.1007/11512622_17
CORDIC-Augmented Sandbridge Processor for Channel Equalization
Mihai Sima; John Glossner; Daniel Iancu; Hua Ye; Andrei Iancu; A. Joseph Hoane
In this paper we analyze an architectural extension for a Sandbridge processor which encompasses a CORDIC functional unit and the associated instructions. Specifically, the first instruction configures the CORDIC unit in either rotation or vectoring mode for circular, linear, and hyperbolic coordinate systems. The second instruction launches CORDIC operations into execution. As a case study, we consider channel estimation and correction in Orthogonal Frequency Division Multiplexing (OFDM) demodulation. In particular, we propose a scheme to implement OFDM channel correction within the extended instruction set. Preliminary results indicate a performance improvement over the base instruction set architecture of more than 80% for channel correction, which translates to an improvement of 50% for the entire channel estimation and correction task.
- Processor Architectures, Design and Simulation | Pp. 152-161
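For readers unfamiliar with CORDIC, the rotation mode in the circular coordinate system mentioned in the abstract above can be sketched in a few lines of Python. This is a floating-point illustration of the generic textbook algorithm, not the Sandbridge functional unit or its instructions; the function name `cordic_rotate` and the iteration count are assumptions made for the example.

```python
import math


def cordic_rotate(x, y, angle, iterations=32):
    """Rotate the vector (x, y) by `angle` radians with circular CORDIC.

    Rotation mode drives the residual angle z toward zero using only
    shift-and-add style micro-rotations by atan(2^-i). Converges for
    |angle| up to roughly 1.74 rad. Hypothetical helper for illustration.
    """
    # Elementary rotation angles and the accumulated gain of all stages.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    z = angle
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0  # rotate in the direction that shrinks z
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]

    # Each micro-rotation stretches the vector; divide the gain back out.
    return x / gain, y / gain


if __name__ == "__main__":
    print(cordic_rotate(1.0, 0.0, math.pi / 4))
```

In hardware the multiplications by powers of two become shifts and the gain is usually folded into a constant, which is what makes CORDIC attractive as a functional unit.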
doi: 10.1007/11512622_18
Power-Aware Branch Logic: A Hardware Based Technique for Filtering Access to Branch Logic
Sunghoon Shim; Jong Wook Kwak; Cheol Hong Kim; Sung Tae Jhang; Chu Shik Jhon
In this paper, we propose power-aware branch logic for high-performance embedded processors that filters accesses to the BTB and the branch predictor. To reduce the energy consumed in the BTB and the branch predictor, we present an aggressive hardware-based scheme that reduces the number of accesses to both structures. Moreover, compared with general branch logic, the proposed branch logic causes no performance degradation. The scheme reduces the number of accesses to the BTB and the branch predictor by 21% – 50% and reduces their energy consumption by 15% – 41%.
- Processor Architectures, Design and Simulation | Pp. 162-171
doi: 10.1007/11512622_19
Exploiting Intra-function Correlation with the Global History Stack
Fei Gao; Suleyman Sair
The demand for more computation power in high-end embedded systems has put embedded processors on an evolution track parallel to that of RISC processors. Caches and deeper pipelines are standard features of recent embedded microprocessors. As a result, some of the performance penalties associated with branch instructions in RISC processors are becoming more prevalent in these processors. As is the case in RISC architectures, designers have turned to dynamic branch prediction to alleviate this problem. Global correlating branch predictors take advantage of the influence past branches have on future ones. The conditional branch outcomes are recorded in a global history register (GHR). Based on the hypothesis that most correlation is among intra-function branches, we provide a detailed analysis of the Global History Stack (GHS) in this paper. The GHS saves the global history in the return address stack when a call instruction is executed. Following the subsequent return, the history is restored from the stack. In addition, to preserve the correlation between the callee branches and the caller branches following the call instruction, we save a few of the history bits coming from the end of the callee's execution. We also investigate saving the GHR of a function in the Branch Target Buffer (BTB) when it returns so that it can be restored when that function is called again. Our results show that these techniques improve the accuracy of several global-history-based prediction schemes by 4% on average. Consequently, performance improvements as high as 13% are attained.
- Processor Architectures, Design and Simulation | Pp. 172-181
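The save-and-restore behavior described in the abstract above can be illustrated with a small Python model. This is only a sketch under stated assumptions: the class name `GlobalHistoryStack`, the register width, the number of carried-over callee bits, and the way those bits are merged into the restored history (shifted in at the low end) are all hypothetical choices, and a plain list stands in for the return address stack.

```python
class GlobalHistoryStack:
    """Toy model: save the GHR on a call and restore it on the return."""

    def __init__(self, history_bits=8, carry_bits=2):
        self.history_bits = history_bits  # assumed GHR width
        self.carry_bits = carry_bits      # assumed number of callee bits kept
        self.ghr = 0
        self.stack = []                   # stands in for the return address stack

    def _mask(self):
        return (1 << self.history_bits) - 1

    def record_branch(self, taken):
        # Shift the newest conditional branch outcome into the low bit.
        self.ghr = ((self.ghr << 1) | int(taken)) & self._mask()

    def on_call(self):
        # Save the caller's history alongside the return address.
        self.stack.append(self.ghr)

    def on_return(self):
        if not self.stack:
            return
        # Keep the last few outcomes from the end of the callee (assumed merge).
        callee_tail = self.ghr & ((1 << self.carry_bits) - 1)
        saved = self.stack.pop()
        self.ghr = ((saved << self.carry_bits) | callee_tail) & self._mask()


if __name__ == "__main__":
    ghs = GlobalHistoryStack()
    for outcome in (True, False, True):
        ghs.record_branch(outcome)
    ghs.on_call()
    for outcome in (True, True, True, False):
        ghs.record_branch(outcome)
    ghs.on_return()
    print(bin(ghs.ghr))
```

After the return, the register holds the caller's pre-call history followed by the last two callee outcomes, so branches after the call site can still correlate with both.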
doi: 10.1007/11512622_20
Power Efficient Instruction Caches for Embedded Systems
Dinesh C. Suresh; Walid A. Najjar; Jun Yang
Instruction caches typically consume 27% of the total power in modern high-end embedded systems. We propose a compiler-managed instruction store architecture (K-store) that places the computation-intensive loops in a scratch-pad-like SRAM memory and allocates the remaining instructions to a regular instruction cache. At runtime, execution is switched dynamically between the instructions in the traditional instruction cache and the ones in the K-store by inserting jump instructions. The necessary jump instructions add 0.038% on average to the total dynamic instruction count. We compare the performance and energy consumption of our K-store with that of a conventional instruction cache of equal size. When used in lieu of an 8 KB, 4-way associative instruction cache, K-store provides a 32% reduction in energy and a 7% reduction in execution time. Unlike loop caches, K-store maps the frequent code into a reserved address space and hence can switch between the kernel memory and the instruction cache without any noticeable performance penalty.
- Processor Architectures, Design and Simulation | Pp. 182-191