Abstract: Session DISPS-1
|
DISPS-1.1
|
High-level Modeling of Switching Activity With Application to Low-power DSP System Synthesis
Magnus Lundberg (Lulea University of Technology, Lulea, Sweden),
Khurram Muhammad,
Kaushik Roy (Purdue University, West Lafayette, IN 47907),
Sarah Kate Wilson (Lulea University of Technology, Lulea, Sweden)
We address the issue of high-level synthesis of
low-power digital signal processing (DSP) systems
by proposing switching activity models. In particular,
we present a technology independent hierarchical scheme to
compare relative power performance of two competing DSP systems.
The basic building blocks considered for such systems are a
full-adder and a one-bit delay. Estimates of switching activity at the
output of these building blocks are used to model the activity
in different architectural primitives used for building
DSP systems. The method is fast and simple, and
simulations show accuracy within 4% of extensive bit-level
simulations. Therefore, it can easily be integrated into current
communications/DSP CAD tools for low-power applications.
The models show that the choice of multiplier/multiplicand is
important when using array multipliers in a data-path.
If the input signal with the smaller variance is chosen as the
multiplicand, up to 20% savings in switching
activity can be achieved. This observation is verified by
analog simulation.
|
DISPS-1.2
|
Activity Models for use in Low Power, High-Level Synthesis
Russell E. Henning,
Chaitali Chakrabarti (Dept of Electrical Engineering, Arizona State University, Tempe, AZ 85287)
Characteristics of the data being processed can be used to reduce the power consumption in the data path of a VLSI circuit by exploiting their relationship with transition activity during high-level synthesis. Important relationships between fixed-point, two's complement data characteristics and 0->1 transition activity in static CMOS circuits are presented in this paper. Models for computing transition activity in terms of a new set of transition parameters are developed. Propagation of data characteristics through multiplication and addition functional units is discussed. The use of the relationships and models to analyze and significantly reduce 0->1 transition activity with little computational effort is illustrated with examples.
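As a rough illustration of the quantity this abstract models, the following minimal Python sketch counts 0->1 transitions at each bit position of a stream of fixed-point, two's complement words. It is not the authors' model or their transition parameters; the function names and the choice of word width are assumptions made for the example.

```python
def to_twos_complement(value, width):
    """Return the width-bit two's complement pattern of value as a non-negative int."""
    return value & ((1 << width) - 1)


def rising_transitions_per_bit(samples, width):
    """Count 0->1 transitions at each bit position across consecutive words.

    Index 0 of the returned list is the least significant bit.
    """
    counts = [0] * width
    prev = to_twos_complement(samples[0], width)
    for sample in samples[1:]:
        cur = to_twos_complement(sample, width)
        rising = ~prev & cur  # bits that were 0 in prev and are 1 in cur
        for bit in range(width):
            counts[bit] += (rising >> bit) & 1
        prev = cur
    return counts
```

For example, `rising_transitions_per_bit([1, -1, 1], 4)` returns `[0, 1, 1, 1]`: the step from 0001 to 1111 raises the three upper bits, and the step back to 0001 raises none. Sign changes in small-magnitude data toggle all the sign-extension bits together, which is one reason data characteristics matter for transition activity.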
|
DISPS-1.3
|
Multirate as a Hardware Paradigm
Bruce W Suter (Air Force Research Laboratory),
Kenneth S Stevens (Intel),
Scott R Velazquez (V Company),
Truong Nguyen (Boston University)
Architecture and circuit design are the two most effective means of
reducing power in CMOS VLSI. Mathematical manipulations based on
ideas from multirate signal processing have been applied
to create high-performance, low-power architectures. To illustrate
this approach, two case studies are presented: one concerns the design
of a fast Fourier transform (FFT) device, and the other concerns
the design of analog-to-digital converters.
|
DISPS-1.4
|
Unfolding Probabilistic Data-flow Graphs Under Different Timing Models
Sissades Tongsima,
Timothy W O'Neil,
Edwin Sha (Dept. of Computer Science and Engr., University of Notre Dame)
It is known that in many applications, because of
selection statements such as if-statements, the
computation time of a node can be represented by a
random variable. This paper focuses on iterative
applications (those containing loops) that exhibit such
uncertainties. Such an application can then be
transformed to a probabilistic data-flow graph. A
challenging problem is to derive graph transformation
techniques which can produce a good schedule. This
paper introduces two timing models, the time-invariant
and time-variant models, to characterize the nature of
these applications. Furthermore, for the time-invariant
model, we propose a means of selecting a minimum
rate-optimal unfolding factor which guarantees the
best schedule length. We also propose a good estimate
for choosing an unfolding factor for a graph under
the time-variant model.
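For readers unfamiliar with unfolding, the sketch below shows the standard structural transformation on an ordinary (deterministic) data-flow graph: each edge from u to v with d delays becomes, for every copy index i, an edge from copy i of u to copy (i + d) mod f of v carrying floor((i + d) / f) delays. This is only the textbook graph transformation, not the probabilistic timing models or the unfolding-factor selection proposed in the paper; the edge representation and the function name are assumptions.

```python
def unfold(edges, factor):
    """Unfold a data-flow graph by the given factor.

    edges: list of (src, dst, delay) tuples, where delay >= 0 is the number
    of loop-carried delays on the edge.
    Returns edges between node copies (name, i) for i in range(factor).
    """
    unfolded = []
    for src, dst, delay in edges:
        for i in range(factor):
            j = (i + delay) % factor            # which copy of dst consumes the value
            new_delay = (i + delay) // factor   # delays remaining on the unfolded edge
            unfolded.append(((src, i), (dst, j), new_delay))
    return unfolded
```

Unfolding the two-node loop `[("A", "B", 0), ("B", "A", 1)]` by a factor of 2 yields four edges, only one of which, from `("B", 1)` to `("A", 0)`, still carries a delay. The total number of delays on the copies of each original edge is preserved.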
|
DISPS-1.5
|
Low-power Channel Coding via Dynamic Reconfiguration
Manish Goel,
Naresh R Shanbhag (University of Illinois at Urbana-Champaign)
Presented in this paper are energy-optimum reconfiguration
strategies for channel codecs. These strategies are
derived by solving an optimization problem, which has
energy consumption as the objective function and a
constraint on the bit error-rate (BER). The energy
consumption models for the reconfigurable Reed-Solomon
(RS) codec are derived via gate-level simulation of the
finite field arithmetic modules. These energy models
along with the BER expressions are then employed to
derive the energy-optimum reconfiguration strategies.
The energy savings are computed by comparing the energy
consumption of the reconfigurable codec with that of
the static codec. The energy savings range from 0% to 83%
for channel signal-to-noise ratio (SNR) variations from
7 dB to 10 dB. On average, energy savings of 55% are achieved.
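The shape of such a constrained selection can be sketched in Python under strong simplifying assumptions. The sketch assumes independent symbol errors with probability p, a bounded-distance RS decoder that corrects up to t symbol errors in an n-symbol codeword, a constraint on the codeword error probability as a stand-in for the BER constraint, and energy that grows with t, so that the smallest feasible t is the cheapest. None of these choices, nor the function names, come from the paper, and the paper's energy models are not reproduced here.

```python
from math import comb


def block_error_prob(n, t, p):
    """Probability that more than t of n symbols are in error.

    Assumes independent symbol errors, each with probability p.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(t + 1, n + 1))


def min_correction_capability(n, t_max, p, target):
    """Smallest t in 0..t_max that meets the error target, or None if none does.

    Under the assumption that energy grows with t, this is the
    lowest-energy configuration that satisfies the constraint.
    """
    for t in range(t_max + 1):
        if block_error_prob(n, t, p) <= target:
            return t
    return None
```

Re-running `min_correction_capability` as the channel symbol error probability changes gives a simple reconfiguration rule: a cleaner channel permits a smaller t, while a noisier channel forces a larger one.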
|
DISPS-1.6
|
Closed-Form and Real-Time Wordlength Adaptation
Paul D Fiore (Sanders, A Lockheed Martin Co.),
Li Lee (Massachusetts Institute of Technology)
FPGA and configurable computing-based DSP algorithms have
demonstrated significant performance improvements over software
implementations. This has renewed interest in developing or
mapping DSP algorithms to custom hardware. An algorithm will be
successfully mapped if the intermediate wordlengths can be reduced to
maintain reasonable hardware size. In this paper, we consider linear
hardware cost functions, for which we can derive closed-form
expressions for the reduced wordlengths. We then apply these results
to an adaptive LMS filter, where we adapt not only the tap weights,
but also the wordlengths as a function of the data in real-time.
|
DISPS-1.7
|
Synthesis of DSP Soft Real-Time Multiprocessor Systems-on-Silicon
Darko Kirovski,
Miodrag Potkonjak (Computer Science Department, University of California, Los Angeles)
The recent convergence of application trends (Internet and embedded
applications) and technology trends (reuse and very high integration
levels) has resulted in a strong need for the design of soft real-time
DSP systems on silicon. We developed a new hierarchical modular
approach for synthesis of area-efficient soft real-time DSP
systems on silicon. This synthesis strategy employs a number of
optimization-intensive scheduling, performance monitoring,
and allocation steps. The backbone of the optimization approach
is a novel on-line scheduling algorithm which uses meta-algorithmic
techniques for on-the-fly heuristic selection and parameter tuning.
Resource allocation uses a predetermined lower bound on system
performance to guide a branch-and-bound
search for an area-efficient multiprocessor configuration in which
each processor has local instruction and data caches.
In order to bridge the gap between the profiling, modeling, and
synthesis tools of the two traditionally independent synthesis
domains (architecture and CAD), we developed a new synthesis and
evaluation platform which integrates the existing
modeling, profiling, and simulation tools with the newly developed
system-level synthesis tools. The effectiveness of the approach is
demonstrated on the industrial-strength MediaBench benchmark suite.
|
DISPS-1.8
|
A Generic Methodology for the Software Managing of Caches in Multi-Processors DSP Architectures
Frantz Lohier,
Lionel Lacassagne (EIA/LIS),
Patrick Garda (LIS)
This article introduces a novel software engineering methodology for the real-time execution of low-level image operators on multiprocessor DSP architectures. We detail the results obtained by implementing our approach on the TMS320C80, a shared-memory multiprocessor architecture [1].
We compare our contribution with existing image processing libraries for the C80 [2][3] in terms of genericity, flexibility, and performance.
More specifically, generic mechanisms make it possible to address the requirements of various operators and to expand them using a standard framework.
Our approach is flexible enough to allow the dynamic composition of concurrent and reconfigurable processing chains, thanks to a modular library implementing basic operators.
Processing chains work on various image sizes and with any number of processors.
Above all, our methodology improves performance by enhancing data locality.
|