- •Contents
- •1 Introduction
- •1.1 Objectives
- •1.2 Overview
- •2 Background
- •2.1 Digital Design for DSP Engineers
- •2.1.2 The Field-Programmable Gate Array
- •2.1.3 Arithmetic on FPGAs
- •2.2 DSP for Digital Designers
- •2.3 Computation Graphs
- •2.4 The Multiple Word-Length Paradigm
- •2.5 Summary
- •3 Peak Value Estimation
- •3.1 Analytic Peak Estimation
- •3.1.1 Linear Time-Invariant Systems
- •3.1.2 Data-range Propagation
- •3.2 Simulation-based Peak Estimation
- •3.3 Hybrid Techniques
- •3.4 Summary
- •4 Word-Length Optimization
- •4.1 Error Estimation
- •4.1.1 Word-Length Propagation and Conditioning
- •4.1.2 Linear Time-Invariant Systems
- •4.1.3 Extending to Nonlinear Systems
- •4.2 Area Models
- •4.3.1 Convexity and Monotonicity
- •4.4 Optimization Strategy 1: Heuristic Search
- •4.5 Optimization Strategy 2: Optimum Solutions
- •4.5.1 Word-Length Bounds
- •4.5.2 Adders
- •4.5.3 Forks
- •4.5.4 Gains and Delays
- •4.5.5 MILP Summary
- •4.6 Some Results
- •4.6.1 Linear Time-Invariant Systems
- •4.6.2 Nonlinear Systems
- •4.6.3 Limit-cycles in Multiple Word-Length Implementations
- •4.7 Summary
- •5 Saturation Arithmetic
- •5.1 Overview
- •5.2 Saturation Arithmetic Overheads
- •5.3 Preliminaries
- •5.4 Noise Model
- •5.4.1 Conditioning an Annotated Computation Graph
- •5.4.2 The Saturated Gaussian Distribution
- •5.4.3 Addition of Saturated Gaussians
- •5.4.4 Error Propagation
- •5.4.5 Reducing Bound Slackness
- •5.4.6 Error estimation results
- •5.5 Combined Optimization
- •5.6 Results and Discussion
- •5.6.1 Area Results
- •5.6.2 Clock frequency results
- •5.7 Summary
- •6 Scheduling and Resource Binding
- •6.1 Overview
- •6.2 Motivation and Problem Formulation
- •6.3 Optimum Solutions
- •6.3.1 Resources, Instances and Control Steps
- •6.3.2 ILP Formulation
- •6.4 A Heuristic Approach
- •6.4.1 Overview
- •6.4.2 Word-Length Compatibility Graph
- •6.4.3 Resource Bounds
- •6.4.4 Latency Bounds
- •6.4.5 Scheduling with Incomplete Word-Length Information
- •6.4.6 Combined Binding and Word-Length Selection
- •6.5 Some Results
- •6.6 Summary
- •7 Conclusion
- •7.1 Summary
- •7.2 Future Work
- •A.1 Sets and functions
- •A.2 Vectors and Matrices
- •A.3 Graphs
- •A.4 Miscellaneous
- •A.5 Pseudo-Code
- •References
- •Index
22 3 Peak Value Estimation
3.1.2 Data-range Propagation
If the algorithm under consideration is not linear, or is not time-invariant, then one mechanism for estimating the peak value reached by each signal is to consider the propagation of data ranges through the computation graph. This is only possible for non-recursive algorithms.
Forward Propagation
¨
A naive way of approaching this problem is to examine the binary point position “naturally” resulting from each hardware operator. Such an approach, illustrated below, is an option in the Xilinx system generator tool [HMSS01].
Consider the computation graph shown in Fig. 3.3. If we consider that each input has a range (−1, 1), then we require a binary point location of p = log2 max |(−1, 1)| + 1 = 0 at each input. Let us consider each of the adders in turn. Adder a1 adds two inputs with p = 0, and therefore produces an output with p = max(0, 0)+1 = 1. Adder a2 adds one input with p = 0 and one with p = 1, and therefore produces an output with p = max(0, 1) + 1 = 2. Similarly, the output of a3 has p = 3, and the output of a4 has p = 4. While we have successfully determined a binary point location for each signal that will not lead to overflow, the disadvantage of this approach should be clear. The range of values reachable by the system output is actually 5 (−1, 1) = (−5, 5), so p = log2 max(−5, 5) + 1 = 3 is su cient; p = 4 is an overkill of one MSB.
+ + + +
a1 |
a2 |
a3 |
a4 |
Fig. 3.3. A computation graph representing a string of additions
A solution to this problem that has been used in practice, is to propagate data ranges rather than binary point locations [WP98, BP00]. This approach can be formally stated in terms of interval analysis. Following [BP00],
Definition 3.4. An interval extension, denoted by f(x1, x2, . . . xn), of a real function f (x1, x2, . . . , xn) is defined as any function of the n intervals x1, x2,
. . . , xn that evaluates to the value of f when its arguments are the degenerate intervals x1, x2, . . . , xn, i.e.
f(x1, x2, . . . , xn) = f (x1, x2, . . . , xn) |
(3.13) |
3.1 Analytic Peak Estimation |
23 |
Definition 3.5. If xi yi, for i = 1, 2, . . . , n and f(x1, x2, . . . , xn) |
|
f(y1, y2, . . . , yn), then the interval extension f(X) is said to be inclusion monotonic.
Let us denote by fr (x1, x2, . . . , xn) the range of function f over the given intervals. We may then use the result that fr (x1, x2, . . . , xn) f(x1, x2, . . . , xn) [Moo66] to find an upper-bound on the range of the function.
Let us apply this technique to the example of Fig. 3.3. We may think of each node in the computation graph as implementing a distinct function. For addition, f (x, y) = x + y, and we may define the inclusion monotonic interval extension f((x1, x2), (y1, y2)) = (x1 + y1, x2 + y2). Then the output of adder a1 is a subset of (−2, 2) and thus is assigned p = 1, the output of adder a2 is a subset of (−3, 3) and is thus assigned p = 2, the output of adder a3 is a subset of (−4, 4) and is thus assigned p = 3, and the output of adder a4 is a subset of (−5, 5) and is thus assigned p = 3. For this simple example, the problem of peak-value detection has been solved, and indeed fr = f.
However, such a tight solution is not always possible with data-range propagation. Under circumstances where the DFG contains one or more branches (fork nodes), which later reconverge, such a “local” approach to range propagation can be overly pessimistic. As an example, consider the computation graph representing a complex constant coe cient multiplication shown in Fig. 3.4.
x1[n] |
(−0.6,0.6) |
2.1 |
(−3.42,3.42) |
|
|
+ |
|
(−0.6,0.6) |
(−1.26,1.26) |
||
|
|
|
y1[n] |
(−0.6,0.6) |
(−2.16,2.16) |
|
+(−1.2,1.2) −1.8 (−2.16,2.16)
|
(−2.16,2.16) |
|
|
(−0.6,0.6) |
|
−1 |
|
|
|
|
|
|
(−2.16,2.16) |
|
|
(−0.6,0.6) (−0.6,0.6) −1.6 |
(−0.96,0.96) |
+ |
(−3.12,3.12) |
x2[n] |
|
|
y [n] |
|
|
|
2 |
Fig. 3.4. Range propagation through a computation graph
Each signal has been labelled with a propagated range, assuming that the primary inputs have range (−0.6, 0.6). Under this approach, both outputs