Publications catalog - books

Advances in Computer Systems Architecture: 11th Asia-Pacific Conference, ACSAC 2006, Shanghai, China, September 6-8, 2006, Proceedings

Chris Jesshope ; Colin Egan (eds.)

Conference: 11th Asia-Pacific Conference on Advances in Computer Systems Architecture (ACSAC). Shanghai, China. September 6-8, 2006

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Computer System Implementation; Arithmetic and Logic Structures; Input/Output and Data Communications; Logic Design; Computer Communication Networks; Processor Architectures

Availability

Detected institution	Publication year	Browse	Download	Request
Not detected	2006	SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-40056-1

Electronic ISBN

978-3-540-40058-5

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

2006

Publication rights information

Subject coverage

Computer and information sciences

Arts

Table of contents

Check that your institution gives you access to download or request the complete book or any of its chapters.

doi: 10.1007/11859802_21

Combining Wireless Sensor Network with Grid for Intelligent City Traffic

Feilong Tang; Minglu Li; Chuliang Weng; Chongqing Zhang; Wenzhe Zhang; Hongyu Huang; Yi Wang

Intelligent city traffic systems for travel navigation, traffic prediction, and decision support need to collect large-scale real-time data from numerous vehicles. Wireless sensors, being small, economical, and reasonably efficient devices, are well suited to this purpose. In this paper, we investigate how to deploy wireless sensor networks in buses to gather traffic data for intelligent city traffic. The paper presents a self-organization mechanism and a routing protocol for the proposed sensor networks. Our work has three advantages: (1) an adaptive network topology, which suits the highly mobile city traffic environment; (2) directed data transmission, which saves energy in sensor nodes with limited power resources; and (3) a longer lifetime, owing to less redundant network communication and balanced power usage among the sensor nodes in a network.

Keywords: Sensor Network; Sensor Node; Wireless Sensor Network; Cluster Head; Super Node.

Pp. 260-269

doi: 10.1007/11859802_22

A Novel Processor Architecture for Real-Time Control

Xiaofeng Wu; Vassilios Chouliaras; Jose Nunez-Yanez; Roger Goodall; Tanya Vladimirova

This paper describes a control system processor architecture based on ΔΣ modulation (ΔΣ-CSP). The ΔΣ-CSP uses 1-bit processing, a new concept in digital control, to remove multi-bit multiplications. A simple conditional-negate-and-add (CNA) unit is proposed for most operations of control laws. As a result, the targeted processor is small and very fast, making it ideal for embedded real-time control applications. The ΔΣ-CSP has been implemented as a VLSI hard macro in a high-performance 0.13 μm silicon process. Results show that it compares very favorably to other digital processors in terms of area and clock frequency.

Keywords: Control System Processing; Program Counter; Digital Simulation; Digital Processor; Pulse Density Modulation.

Pp. 270-280
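
The conditional-negate-and-add idea named in the abstract above can be illustrated with a minimal Python sketch. It assumes, as is usual for ΔΣ bitstreams, that each 1-bit sample stands for +1 (bit 1) or -1 (bit 0), so multiplying a coefficient by a sample reduces to keeping or flipping its sign before accumulating. The function names and sample data are hypothetical and are not taken from the paper; the actual CNA unit is a hardware block whose details the abstract does not give.

```python
def cna_accumulate(bits, coeffs):
    """Accumulate coeff * s over all taps, where s = +1 for bit 1 and -1 for bit 0.

    No multiplier is used: each term is the coefficient, conditionally negated.
    """
    acc = 0
    for bit, coeff in zip(bits, coeffs):
        # Conditional negate: a 0 bit represents -1, so flip the sign.
        term = coeff if bit else -coeff
        # Add the (possibly negated) coefficient to the running sum.
        acc += term
    return acc


def multiply_reference(bits, coeffs):
    """Reference version using explicit multiplications, for comparison only."""
    return sum(c * (1 if b else -1) for b, c in zip(bits, coeffs))


if __name__ == "__main__":
    bits = [1, 0, 1, 1]        # hypothetical 1-bit samples
    coeffs = [3, 5, -2, 7]     # hypothetical integer coefficients
    print(cna_accumulate(bits, coeffs))      # 3 - 5 - 2 + 7 = 3
    print(multiply_reference(bits, coeffs))  # same value, computed with multiplies
```

Both functions return the same sum; the first replaces every multi-bit multiplication with a sign selection, which is the property the abstract credits for the small area and high clock rate.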

doi: 10.1007/11859802_23

A 0-1 Integer Linear Programming Based Approach for Global Locality Optimizations

Jun Xia; Li Luo; Xuejun Yang

Compiler optimizations aimed at improving cache locality are critical to realizing the performance potential of the memory subsystem. For scientific programs, loop and data transformations are two important compiler optimization methods for improving cache locality. In this paper, we combine loop and data transformations and present a 0-1 integer linear programming (0-1 ILP) based approach that attempts to solve global locality optimization problems. We use the treelike memory layout graph (TMLG) to describe a program’s locality characteristics, formulate the locality optimization problems as problems of finding the optimal path sets in TMLGs, and then use 0-1 ILP to find the optimal path sets. Our approach is applicable not only to perfectly nested loops but also to non-perfectly nested loops. Moreover, the approach can handle cases in which arrays are accessed not only along dimensions but also along diagonal-like directions. The experimental results show the effectiveness of our approach.

Keywords: Cache locality; compiler optimizations; memory layouts; loop transformations; data transformations; integer linear programming.

Pp. 281-294

doi: 10.1007/11859802_24

Design and Analysis of Low Power Image Filters Toward Defect-Resilient Embedded Memories for Multimedia SoCs

Kang Yi; Kyeong Hoon Jung; Shih-Yang Cheng; Young-Hwan Park; Fadi Kurdahi; Ahmed Eltawil

In the foreseeable future, System-on-Chip design will suffer from low yield, especially in embedded memories. This can be a critical problem in a multimedia application like H.264, since it needs a huge amount of embedded memory. Existing approaches to this problem are not feasible given the higher memory defect density rates in technologies below 90 nm. In this paper, we present a new defect-resilience technique which employs a directional image filter to recover data from corrupted embedded memory. According to a simulation-based analysis, the proposed filter can greatly improve the visual quality of corrupted H.264 video streams with data memory errors of up to 1.0% memory BER (bit error rate), while consuming less power than a conventional median filter. Therefore, the proposed method can be a good solution to the problem of low yield in multimedia SoC memory without incurring additional redundant memory overhead.

Keywords: Low power image filter design; Embedded memory; Memory yield enhancement; Memory-error resilient design; BIST; BISR; H.264 codec.

Pp. 295-308

doi: 10.1007/11859802_25

Entropy Throttling: A Physical Approach for Maximizing Packet Mobility in Interconnection Networks

Takashi Yokota; Kanemitsu Ootsu; Fumihito Furukawa; Takanobu Baba

A large-scale direct interconnection network usually consists of an enormous number of simple routers. However, its behavior is sometimes very complicated. Such complicated behavior prevents accurate understanding and efficient control of the network. Among the serious problems in interconnection networks, congestion control is of extreme importance, since network performance degrades drastically under congestion. We focus our discussion on throttling, in other words injection limitation, as one of the most promising solutions to the congestion problem. Our approach is inspired by physics. We define entropy as a desirable metric for representing the network’s congestion level. We also define the packet mobility ratio as a proper approximation of entropy. We thus arrive at a new throttling method called ‘Entropy Throttling’ that is based on a theoretical discussion of congestion. Evaluation results from our simulator show the effectiveness of the proposed method.

Keywords: Congestion Control; Interconnection Network; Average Latency; Entropy Measure; Congestion Level.

Pp. 309-322

doi: 10.1007/11859802_26

Design of an Efficient Flexible Architecture for Color Image Enhancement

Ming Z. Zhang; Li Tao; Ming-Jung Seow; Vijayan K. Asari

A novel architecture for performing digital color image enhancement based on the reflectance/illumination model is proposed in this paper. The approach uses log-domain computation to eliminate all multiplications, divisions and exponentiations, relying on approximation techniques for efficient estimation of log_2 and inverse-log_2. A new quadrant symmetric architecture is also incorporated into the design of the homomorphic filter, which is part of the V component enhancement in Hue-Saturation-Value (HSV) color space, to achieve a very high throughput rate. The pipelined design of the filter offers the flexibility to reload a wide range of kernels for different frequency responses. A generalized architecture of a max/min filter is also presented for efficient extraction of the V component. With effective color space conversion, the HSV-domain image enhancement architecture achieves a throughput rate of 182.65 million outputs per second (MOPS), or equivalently 52.8 billion operations per second, on Xilinx’s Virtex II XC2V2000-4ff896 field programmable gate array (FPGA) at a clock frequency of 182.65 MHz. It can process over 174.2 mega-pixel (1024×1024) frames per second and consumes approximately 70.7% less hardware resources than the design presented in [10].

Keywords: color image enhancement; reflectance/illumination model; HSV-domain image processing; log-domain computation; 2D convolution; multiplier-less architecture; homomorphic filter; quadrant symmetric architecture; parallel-pipelined architecture.

Pp. 323-336
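
To make the log-domain idea in the abstract above concrete, here is a small Python sketch of one well-known way to approximate log_2 and inverse-log_2 without multipliers: Mitchell's piecewise-linear approximation. This is a swapped-in textbook technique chosen for illustration; the abstract does not say which approximation the authors use, and all function names below are hypothetical. A multiplication then becomes an addition of two approximate logarithms followed by an approximate inverse logarithm.

```python
import math


def mitchell_log2(n: int) -> float:
    """Approximate log2(n) for an integer n >= 1.

    Writes n = 2**k * (1 + m) with 0 <= m < 1 and returns k + m,
    using the linear approximation log2(1 + m) ~ m.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    k = n.bit_length() - 1          # position of the leading one bit
    m = (n - (1 << k)) / (1 << k)   # fractional part below the leading one
    return k + m


def mitchell_exp2(x: float) -> float:
    """Approximate 2**x for x >= 0 as 2**k * (1 + f), with k = floor(x), f = x - k."""
    if x < 0:
        raise ValueError("this sketch only handles x >= 0")
    k = math.floor(x)
    f = x - k
    return (1 << k) * (1 + f)


def approx_multiply(a: int, b: int) -> float:
    """Approximate a * b by adding approximate logs, then inverting."""
    return mitchell_exp2(mitchell_log2(a) + mitchell_log2(b))


if __name__ == "__main__":
    print(mitchell_log2(12))       # 3.5 (exact log2(12) is about 3.585)
    print(approx_multiply(8, 4))   # 32.0 (exact for powers of two)
    print(approx_multiply(3, 5))   # 14.0 (exact product is 15)
```

The approximation is exact for powers of two and underestimates other products, so a hardware design built on it trades a bounded error for the removal of multipliers. In hardware, the division and multiplication by 2**k in this sketch correspond to bit shifts.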

doi: 10.1007/11859802_27

Hypercube Communications on Optical Chordal Ring Networks with Chord Length of Three

Yawen Chen; Hong Shen; Haibo Zhang

In this paper, we study routing and wavelength assignment for realizing hypercube communications on optical WDM chordal ring networks with a chord length of 3. Specifically, we design an embedding scheme, identify a lower bound on the number of wavelengths required, and provide a wavelength assignment algorithm which achieves the lower bound. Our result for this type of chordal ring is about half of that on a WDM ring with the same number of nodes.

Keywords: Wavelength Division Multiplexing (WDM); routing and wavelength assignment (RWA); hypercube communication; chordal ring.

Pp. 337-343

doi: 10.1007/11859802_28

PMPS(3): A Performance Model of Parallel Systems

Chen Yong-ran; Qi Xing-yun; Qian Yue; Dou Wen-hua

In this paper, an open performance model framework, PMPS(n), and a realization of this framework, PMPS(3), covering memory, I/O and network, are presented and used to predict the runtime of NPB benchmarks on a P4 cluster. The experimental results demonstrate that PMPS(3) works much better than PERC for I/O-intensive applications and as well as PERC for memory-intensive applications. Further analysis indicates that the results of the performance model can be influenced by data correlations, control correlations and operation overlaps, which must be considered in the models to improve prediction precision. The experimental results also show that PMPS(n) is highly scalable.

Keywords: Performance Model; Parallel; I/O; Convolution Methods.

Pp. 344-350

doi: 10.1007/11859802_29

Issues and Support for Dynamic Register Allocation

Abhinav Das; Rao Fu; Antonia Zhai; Wei-Chung Hsu

Post-link and dynamic optimizations have become important for achieving program performance. A major challenge in post-link and dynamic optimizations is acquiring registers for inserting optimization code into the main program. It is difficult to achieve both correctness and transparency when software-only schemes for acquiring registers are used, as described in [1]. We propose an architecture feature that builds upon existing hardware for stacked register allocation on the Itanium processor. The hardware impact of this feature is minimal, and it allows post-link and dynamic optimization systems to obtain registers for optimization in a “safe” manner, thus preserving the transparency and improving the performance of these systems.

Keywords: Dynamic Optimization; Architecture Feature; Regular Mode; Register Allocation; Register Window.

Pp. 351-358

doi: 10.1007/11859802_30

A Heterogeneous Multi-core Processor Architecture for High Performance Computing

Jianjun Guo; Kui Dai; Zhiying Wang

Increasing application demands put great pressure on high-performance processor design. This paper presents a multi-core System-on-Chip architecture for high performance computing. It is composed of a SPARC V8-compliant LEON3 host processor and a data parallel coprocessor based on transport triggered architecture, connected by a 32-bit AMBA AHB bus. The LEON3 processor performs control tasks and the data parallel coprocessor performs compute-intensive tasks. The chip is fabricated in 0.18 μm standard-cell technology, occupies about 5.3 mm^2 and runs at 266 MHz.

Keywords: SoC; heterogeneous; multi-core; TTA.

Pp. 359-365