## **Reducing Bus Delay in Submicron Technology Using Coding**

Paul P. Sotiriadis Department of EECS Massachusetts Inst. of Technology Cambridge, MA 02139 pps@mit.edu

Abstract. In this paper we study the delay associated with transmission of data through busses. Previous work in this area has presented models for delay assuming a distributed wire model or a lumped capacitive coupling between wires. In this paper we extend the Elmore delay to account for a distributed model with distributed coupling component and an arbitrary number of lines driven by independent sources. The effect of data patterns is taken into account allowing us to estimate the delay on a sample by sample basis instead of making a worst case assumption.

Using this detailed wire delay model, we propose a technique to speed up the communication through a data bus using coding. The idea is to encode the data being transmitted through the bus with the goal of eliminating certain types of transitions that require a large delay. We show that by using proper encoding techniques, the bus can be sped up by a factor of 2.

## **1. INTRODUCTION**

Although the literature on delay estimation is quite abundant, the interaction patterns of multiple, capacitively coupled and independently driven lines have not been studied. Sakurai *et.al.* in [7] and [8] present an excellent analysis of the case of a single line and of a pair of lines. Interesting results are also available in [3] by Hendel, in [5] and [6] by Kahng *et.al.* McCormick and Allen in [4] also refer to the case of coupled lines.

In this paper the case of a general bus with m lines is examined. Instead of referring to the notions of the *aggressor* line, the *victim* line or the *noise* introduced, we try to answer the following question:

Given the set of present data in the bus  $u^o = [u_1^o, ..., u_m^o]^T$ 

(Figure 1) and the next set of data  $u^N = [u_1^N, ..., u_m^N]^T$ , how much delay does the k-th line, k = 1, 2, ..., m experience?

In this way, the delay in the k-th line is a function  $T_k(u^o, u^N)$  of  $\sim$ 

the vectors  $u^{o}$  and  $u^{N}$ . This approach allows a classification of the, old state - new state, patterns based on the delay functions. Furthermore, this classification leads to the design of coding schemes that can accelerate the data transmission through the bus.

## 2. DELAY ESTIMATION FOR BUSSES

In this section we present an electrical model for submicron busses. Based on this model we approximate the delay function loosely introduced above. Anantha Chandrakasan Department of EECS Massachusetts Inst. of Technology Cambridge, MA 02139 anantha@mtl.mit.edu

### 2.1 Coupled Bus Lines and Drivers

Figure 1 shows a model for the drivers and busses in submicron technologies. The lines are assumed capacitively coupled (distributed R-C).  $c_L$  is the parasitic capacitance per unit length between each line and ground, and  $c_I$  is the interwire capacitance per unit length between adjacent lines. r is the distributed series resistance of the lines per unit length. The drivers are modeled as voltage sources  $\mu_i$  with series resistances  $r_d^i$ .



Figure 1: Lines and Drivers

# 2.2 Elmore Delay of Coupled Lines and Multiple sources

A commonly used approximate measure of the delay of the propagation of a step excitation through a linear system is the Elmore delay [9]. For a system H(s) driven by a step input u(t) and producing an output y(t) the delay T is formally defined as the solution of the equation,

$$\int_{0}^{\infty} (t-T) \cdot \frac{dy}{dt} dt = 0$$
(1)

For the definition to be physically meaningful y(t) has to be increasing and its limit for  $t \to \infty$  must exist. Elmore delay is used

109

as a delay metric even in cases where the monotonicity condition doesn't hold. Assuming that  $y(\infty)$  exists and it is different than y(0), equation (1) implies that,

$$T = \frac{1}{y(\infty) - y(0)} \cdot \int_{0}^{t} t \cdot \frac{dy}{dt} dt$$
(2)

or in the Laplace domain,

$$T = \frac{-1}{y(\infty) - y(0)} \left. \frac{d}{ds} [sY(s)] \right|_{s = 0}$$
(3)

Normalizing the supply voltage,  $V_{dd} = 1$ , equation (3) becomes,

$$T = -[y(\infty) - y(0)] \cdot \frac{d}{ds} [sY(s)] \Big|_{s=0}$$
(4)

In the case of the data bus there is more than one source exciting the network simultaneously. Moreover, since the data is random, the pattern of the driving voltages can be arbitrary. This introduces the need for a more generalized definition of the delay.

To simplify the situation we assume that the source voltages in Figure 1 are steps of the form,

$$u_{k}(t) = \begin{cases} u_{k}^{o} \text{ for } t < 0 \\ \\ u_{k}^{N} \text{ for } t \ge 0 \end{cases}$$
(5)

where  $u_k^o, u_k^N \in \{0, 1\}$  for all k = 1, ..., m. Now let

 $V(x, t) = [V_1(x, t), \dots, V_m(x, t)]^T$  be the voltages of the lines at time t and at distance  $x \in [0, L]$  from the drivers, where L is the physical length of the lines. Then for the voltages at the ends of the lines we have that,  $V_k(L, 0) = u_k^o$  and in the limit as  $t \to \infty$ ,  $V_k(L, \infty) = u_k^N$ . Since the lines of the bus are electrically coupled, the delay at every line is a function of the initial and the final conditions  $u^o = [u_1^o, \dots, u_m^o]^T$  and  $u^N = [u_1^N, \dots, u_m^N]^T$  respectively. We define the delay function of the k-th line as,

$$T_k: \{0, 1\}^m \times \{0, 1\}^m \to [0, \infty)$$
 (6)

such that,

$$T_{k}(u^{o}, u^{N}) = -(u_{k}^{N} - u_{k}^{o}) \cdot \frac{d}{ds}[sV_{k}(L, s)]\Big|_{s=0}$$
(7)

## 2.3 Calculation of the Delay Functions

As mentioned before, the lines of the bus are assumed distributed with uniformly distributed parasitic series resistance/per unit length r, capacitance to ground per unit length  $c_L$  and interwire parasitic capacitance per unit length  $c_I$ .

Now let  $I(x, t) = [I_1(x, t), ..., I_m(x, t)]^T$  be the currents running through the lines at time t and at distance  $x \in [0, L]$  from the

drivers. The equivalent network of the bus in Figure 1 satisfies the following system of partial differential equations,

$$\frac{\partial}{\partial x}V(x,t) + R \cdot I(x,t) = 0$$

$$\frac{\partial}{\partial x}I(x,t) + C \cdot \frac{\partial}{\partial t}V(x,t) = 0$$
(8)

for all  $t \ge 0$  and  $x \in [0, L]$ . The capacitance matrix C corregional sponding to the capacitive coupling in the network (Figure 1) is,/

$$C = \begin{bmatrix} 1+\lambda & -\lambda & 0 & \dots & 0 \\ -\lambda & 1+2\lambda & -\lambda & : & 0 \\ 0 & -\lambda & : & : \\ \vdots & : & : & 1+2\lambda & -\lambda \\ 0 & 0 & \dots & -\lambda & 1+\lambda \end{bmatrix} \cdot c_L \quad (9)$$
  
here  $\lambda = \frac{c_I}{c_L}$ . The resistance matrix  $R$  is  $R = \begin{bmatrix} r & 0 & \dots & 0 \\ 0 & r & \dots & 0 \\ \vdots & \vdots & \vdots \\ 0 & 0 & \dots & r \end{bmatrix}$ .

The network of the lines satisfies the Initial Condition,

$$V(x, 0^{+}) = u^{0}$$
 (10)

for all  $x \in (0, L)$ , where  $u^o = / [u_1^o, ..., u_m^o]^T$ , and the Boundary Conditions,

$$V(0, t) + R_d \cdot I(0, t) = u^N$$
 (11)

and

w

$$(L,t) = 0 \tag{12}$$

for all 
$$t > 0$$
,  $R_d = \begin{cases} r_d^1 \ 0 \ \dots \ 0 \\ 0 \ r_d^2 \ \dots \ 0 \\ \vdots \ \vdots \ \vdots \\ 0 \ 0 \ \dots \ r_d^m \end{cases}$  where  $r_d^i$  is the output resistivity of  $r_d^i$  is the output resistivity of  $r_d^i$ .

tance of the *i-th* driver.

Let V(x, s) and I(x, s) be the Laplace transforms of V(x, t) and I(x, t) with respect to the time variable. Then, from (8) we have,

$$\frac{\partial}{\partial x} \begin{bmatrix} V \\ \tilde{I} \end{bmatrix} = \begin{bmatrix} 0 & -R \\ -Cs & 0 \end{bmatrix} \cdot \begin{bmatrix} V \\ \tilde{I} \end{bmatrix} + \begin{bmatrix} 0 \\ C \cdot V(x, 0^+) \end{bmatrix}$$
(13)

We set

$$H(s) = \begin{bmatrix} 0 & -R \\ -Cs & 0 \end{bmatrix}$$
(14)

and use (13), (10) and (14) to get,

$$\frac{\partial}{\partial x} \begin{bmatrix} \tilde{V} \\ \tilde{I} \end{bmatrix} = H(s) \cdot \begin{bmatrix} \tilde{V} \\ \tilde{I} \end{bmatrix} + \begin{bmatrix} 0 \\ C \cdot u^{o} \end{bmatrix}$$
(15)

In the Laplace domain the boundary conditions (11) and (12) become,

$$\tilde{V}(0,s) + R_d \cdot \tilde{I}(0,s) = \frac{1}{s} \cdot u^N$$
 (16)

and

$$I(L,s) = 0 \tag{17}$$

Let  $e^{H(s) \cdot L}$  be the  $2m \times 2m$  matrix exponential associated with equation (15). From the definition of the exponential we have,

$$e^{H(s) \cdot L} = \sum_{k=0}^{\infty} \frac{L^{k}}{k!} \cdot H^{k}(s)$$
  
=  $\sum_{k=0}^{\infty} \frac{L^{2k}}{(2k)!} \cdot \begin{bmatrix} RC & 0 \\ 0 & CR \end{bmatrix}^{k} \cdot s^{k} +$  (18)  
+  $\sum_{k=0}^{\infty} \frac{L^{2k+1}}{(2k+1)!} \cdot \begin{bmatrix} 0 & -R \\ -Cs & 0 \end{bmatrix} \cdot \begin{bmatrix} RC & 0 \\ 0 & CR \end{bmatrix}^{k} \cdot s^{k}$ 

We can decompose the exponential into  $m \times m$  blocks as,

$$e^{H(s) \cdot L} = \begin{bmatrix} \alpha(s) \ \beta(s) \\ \gamma(s) \ \delta(s) \end{bmatrix}$$
(19)

where  $\alpha,\,\beta,\,\gamma,\,\delta\,$  are analytic matrix functions of the form,

$$\alpha(s) = I + \frac{L^2}{2}RCs + O(s^2)$$
  

$$\beta(s) = -LR - \frac{L^3}{3!}RCRs + O(s^2)$$
  

$$\gamma(s) = -LCs + O(s^2)$$
  

$$\delta(s) = I + \frac{L^2}{2}CRs + O(s^2)$$
(20)

Equations (15) form a system of first order linear differential equations with constant coefficients, so their solution ([1]) evaluated at x = L is given by,

$$\begin{bmatrix} \overline{V}(L,s) \\ \overline{I}(L,s) \end{bmatrix} = e^{H(s) \cdot L} \left\{ \begin{bmatrix} \overline{V}(0,s) \\ \overline{I}(0,s) \end{bmatrix} + H(s)^{-1} \begin{bmatrix} 0 \\ C \cdot u^{o} \end{bmatrix} \right\} - -H(s)^{-1} \begin{bmatrix} 0 \\ C \cdot u^{o} \end{bmatrix}$$
(21)

Note that for  $s \neq 0$  the matrix H(s) is invertible and we have that,

$$(H(s))^{-1} = \begin{bmatrix} 0 & -s^{-1} \cdot C^{-1} \\ -R^{-1} & 0 \end{bmatrix}$$
(22)

Now using (14) and (19), equation (21) simplifies to,

$$\begin{bmatrix} \tilde{V}(L,s)\\ \tilde{I}(L,s) \end{bmatrix} = \begin{bmatrix} \alpha & \beta\\ \gamma & \delta \end{bmatrix} \cdot \left\{ \begin{bmatrix} \tilde{V}(0,s)\\ \tilde{I}(0,s) \end{bmatrix} - \begin{bmatrix} \frac{1}{s}u^{o}\\ 0 \end{bmatrix} \right\} + \begin{bmatrix} \frac{1}{s}u^{o}\\ 0 \end{bmatrix}$$
(23)

From (16), (17) and (23) we have equation (24)

$$\begin{bmatrix} I & R_d \\ \gamma(s) & \delta(s) \end{bmatrix} \cdot \begin{bmatrix} \tilde{V}(0, s) \\ \tilde{I}(0, s) \end{bmatrix} = \frac{1}{s} \begin{bmatrix} u^N \\ \gamma(s) \cdot u^o \end{bmatrix}$$
(24)

which implies that,

\_

$$\begin{bmatrix} \tilde{V}(0,s)\\ \tilde{I}(0,s) \end{bmatrix} = \frac{1}{s} \cdot \begin{bmatrix} u^{N}\\ 0 \end{bmatrix} + \frac{1}{s} \cdot \begin{bmatrix} R_{d}(\delta - \gamma R_{d})^{-1}\gamma\\ -(\delta - \gamma R_{d})^{-1}\gamma \end{bmatrix} \cdot (u^{N} - u^{o}) \quad (25)$$

Replacing (25) into (23) we get,

$$\tilde{V}(L,s) = \frac{1}{s} \{ u^o + [\alpha + (\alpha R_d - \beta)(\delta - \gamma R_d)^{-1} \gamma](u^N - u^o) \}$$

which through (20) implies,

$$\frac{d}{ds}[\tilde{sV}(L,s)]\Big|_{s=0} = -\left(R_d + \frac{L}{2} \cdot R\right)LC(u^N - u^o)$$
(26)

We define the total resistance and capacitance *matrices* of the circuit as,

$$R_T \equiv R_d + \frac{L}{2} \cdot R \tag{27}$$
$$C_T \equiv L \cdot C$$

Then from (7), (26) and (27), it is,

$$T_{k}(u^{o}, u^{N}) = (u_{k}^{N} - u_{k}^{o}) \cdot e_{k}^{T} \cdot R_{T} \cdot C_{T} \cdot (u^{N} - u^{o})$$
(28)

where,  $e_k^T$  is the row vector with 1 in the *i*-th coordinate and 0 everywhere else. Equation (28) can also be written in the vector form,

$$T(u^{o}, u^{N}) = diag(u^{N} - u^{o}) \cdot R_{T} \cdot C_{T} \cdot (u^{N} - u^{o})$$
(29)

If we make the assumption that the output resistances of all the drivers are the same and independent of their logical outputs, i.e.  $r_d^i = r_d$  for every i = 1, ..., m, then (29) is simplified to,

 $r_d$  for every r = r, ..., m, then (29) is simplified to,

$$T(u^{o}, u^{N}) = diag(u^{N} - u^{o}) \cdot C_{T} \cdot (u^{N} - u^{o}) \cdot r_{T}$$
(30)

where  $r_T = r_d + \frac{L}{2} \cdot r$ . Finally, note that in this case, the sum of the delays in the bus is given by the quadratic form:

$$\sum_{k=1}^{m} T_{k}(u^{o}, u^{N}) = (u^{N} - u^{o})^{T} \cdot C_{T} \cdot (u^{N} - u^{o}) \cdot r_{T}$$
(31)

The results were verified with Hspice. The numbers  $0.69 \times T_k(u^o, u^N)$  were within 16% accuracy of the measured 50% propagation delays of the waveforms  $V_k(L, t)$ .

## 3. DELAY AND ENERGY: A RELATION

From ref. [2] we know that the energy drawn from  $V_{dd}$  during the transition  $u^o \rightarrow u^N$  is given by the non-symmetric quadratic form,

$$E_{tr}(u^{o}, u^{N}) = (u^{N})^{T} \cdot C_{T} \cdot (u^{N} - u^{o})$$
(32)

The total energy stored in the capacitances of the lines before the transition is given by,  $E_C(u^o) = \frac{1}{2} \cdot (u^o)^T \cdot C_T \cdot (u^o)$  and after the transition the transition by,  $E_C(u^N) = \frac{1}{2} \cdot (u^N)^T \cdot C_T \cdot (u^N)$ . The energy  $E_{tr}(u^o, u^N)$  drawn from  $V_{dd}$  must equal the change in the energy stored in the capacitances plus the energy  $E_R(u^o, u^N)$  that is dissipated into heat on the resistances. Therefore,

$$E_{R}(u^{o}, u^{N}) = E_{Ir}(u^{o}, u^{N}) - \{E_{C}(u^{N}) - E_{C}(u^{o})\}$$
(33)

From (33) we have that,

$$E_{R}(u^{o}, u^{N}) = \frac{1}{2} \cdot (u^{N} - u^{o})^{T} C_{T} \cdot (u^{N} - u^{o})$$
(34)

Comparing (31) with (34), we get the following relation between the dissipated energy and the sum of the delays,

$$\sum_{k=1}^{m} T_{k}(u^{o}, u^{N}) = 2 \cdot r_{T} \cdot E_{R}(u^{o}, u^{N})$$
(35)

(35) can be written as,

$$\frac{1}{n} \cdot \sum_{k=1}^{m} T_{k}(u^{o}, u^{N}) = 2 \cdot r_{T} \cdot \frac{E_{R}(u^{o}, u^{N})}{m}$$
(36)

and so,

## 4. PROPERTIES OF THE DELAY FUNCTIONS

Following the assumption about the resistances of the drivers, equation (30) can be written more explicitly as,

$$\frac{T_k(u^o, u^N)}{r_T \cdot C_L} = \begin{cases} (1+\lambda)\Delta_1^2 - \lambda\Delta_1\Delta_2 &, k=1\\ (1+2\lambda)\Delta_k^2 - \lambda\Delta_k(\Delta_{k-1} + \Delta_{k+1}), 1 < k < m\\ (1+\lambda)\Delta_m^2 - \lambda\Delta_m\Delta_{m-1} &, k=m \end{cases}$$

where  $\Delta_k$  is the change of the voltage of the *k-th* line, i.e.  $\Delta_k = u_k^N - u_k^o$  and  $C_L = L \cdot c_L$  is the total capacitance between a line and the ground. Since  $u_k^N, u_k^o \in \{0, 1\}$ , it is  $\Delta_k \in \{-1, 0, 1\}$ . Table 1 shows the delay in the *k*-th line as a function of  $\Delta_{k-1}, \Delta_k$  and  $\Delta_{k+1}$ , where k = 2, 3, ..., m-1. We use the upward arrow  $\uparrow$  for the case  $\Delta_i = 1$ , the downward arrow  $\downarrow$  when  $\Delta_i = -1$  and "-" when  $\Delta_i = 0$ . The possible normalized delay values of an intermediate line are, 0, 1,  $1 + \lambda$ ,  $1 + 2\lambda$ ,  $1 + 3\lambda$  and  $1 + 4\lambda$ . Each of these values corresponds to a set of transitions  $[u_{k-1}^o, u_{k}^o, u_{k+1}^o] \rightarrow [u_{k-1}^N, u_{k}^N, u_{k+1}^N]$ .

| $\frac{T_k(u^o, u^N)}{r-C}$ |              |                |                    |  |  |  |  |  |
|-----------------------------|--------------|----------------|--------------------|--|--|--|--|--|
| Line $k-1$                  | Line k       | Line<br>k+1    | Delay of<br>line k |  |  |  |  |  |
| $\Delta_{k-1}$              | $\Delta_k$   | $\Delta_{k+1}$ | $T_k/(r_T C_L)$    |  |  |  |  |  |
| -                           | -            | -              | 0                  |  |  |  |  |  |
| -                           | -            | Ŷ              | 0                  |  |  |  |  |  |
| -                           | -            | $\downarrow$   | 0                  |  |  |  |  |  |
| Ŷ                           | -            | -              | 0                  |  |  |  |  |  |
| <b>↑</b>                    | -            | <b>↑</b>       | 0                  |  |  |  |  |  |
| ↑                           | -            | Ļ              | 0                  |  |  |  |  |  |
| $\downarrow$                | -            | -              | 0                  |  |  |  |  |  |
| $\downarrow$                | -            | <b>↑</b>       | 0                  |  |  |  |  |  |
| ↓                           | · -          | $\downarrow$   | 0                  |  |  |  |  |  |
| ^                           | $\uparrow$   | ↑              | 1                  |  |  |  |  |  |
| $\downarrow$                | Ļ            | Ļ              | 1                  |  |  |  |  |  |
| -                           | <u></u>      | <b>↑</b>       | $1 + \lambda$      |  |  |  |  |  |
| -                           | Ļ            | ↓ I            | $1 + \lambda$      |  |  |  |  |  |
| <br>↑                       | ↑            | -              | $1 + \lambda$      |  |  |  |  |  |
| Ļ                           | Ļ            | -              | $1 + \lambda$      |  |  |  |  |  |
| -                           | $\uparrow$   | -              | $1 + 2\lambda$     |  |  |  |  |  |
| -                           | $\downarrow$ | -              | 1 + 2λ             |  |  |  |  |  |
| <b>↑</b>                    | <b>↑</b>     | Ļ              | $1+2\lambda$       |  |  |  |  |  |
| Ŷ                           | Ļ            | $\downarrow$   | $1+2\lambda$       |  |  |  |  |  |
| $\downarrow$                | <u>↑</u>     | <b>↑</b>       | 1 + 2λ             |  |  |  |  |  |
| $\downarrow$                | ↓            | Ŷ              | 1 + 2λ             |  |  |  |  |  |
| -                           | 1            | ↓              | $1 + 3\lambda$     |  |  |  |  |  |
| -                           | ↓            | <b>↑</b>       | $1+3\lambda$       |  |  |  |  |  |
| Î                           | $\downarrow$ | -              | $1+3\lambda$       |  |  |  |  |  |
| ↓                           | <b>↑</b>     | -              | 1 + 3λ             |  |  |  |  |  |
| <br>↑                       | Ļ            | <b>↑</b>       | 1 + 4λ             |  |  |  |  |  |
| ↓                           | Ŷ            | $\downarrow$   | 1 + 4λ             |  |  |  |  |  |

Table 1: The Delay Function of Intermediate Lines

| $\frac{T_1(u^o, u^N)}{r_T \cdot C_L}$ |                |                    |  |  |  |  |  |
|---------------------------------------|----------------|--------------------|--|--|--|--|--|
| Line 1                                | Line 2         | Delay of<br>line 1 |  |  |  |  |  |
| Δ                                     | Δ <sub>2</sub> | $T_1/(r_T C_L)$    |  |  |  |  |  |
| -                                     | -              | 0                  |  |  |  |  |  |
| -                                     | <b>↑</b>       | 0                  |  |  |  |  |  |
| -                                     | Ļ              | 0                  |  |  |  |  |  |
| <b>↑</b>                              | <b>↑</b>       | 1                  |  |  |  |  |  |
| ↓                                     | ↓              | 1                  |  |  |  |  |  |
| 1                                     | -              | $1 + \lambda$      |  |  |  |  |  |
| ↓                                     | -              | $1 + \lambda$      |  |  |  |  |  |
| <b>↑</b>                              | Ļ              | $1+2\lambda$       |  |  |  |  |  |
| Ļ                                     | <b>↑</b>       | $1+2\lambda$       |  |  |  |  |  |

For the boundary lines l and m we have Table 2. (lines l, 2 can be replaced by  $m, m \cdot l$  respectively).

Table 2: The Delay Function of the Boundary Lines

Here the possible values of the normalized delays are, 0, 1, 1+ $\lambda$ , and 1+2 $\lambda$ . Each of these values corresponds to a set of transitions  $[u_1^o, u_2^o] \rightarrow [u_1^N, u_2^N]$ , (or  $[u_m^o, u_{m-1}^o] \rightarrow [u_m^N, u_{m-1}^N]$ ).

We define the following classes  $D_{00}$ ,  $D_0$ ,  $D_1$ ,  $D_2$ ,  $D_3$  and  $D_4$  of transition patterns,  $u^o \rightarrow u^N$  as,

$$D_{00} = \left\{ all (u^{o}, u^{N}) \text{ such that } \frac{T_{k}(u^{o}, u^{N})}{r_{T} \cdot C_{L}} = 0 \right\}$$

$$D_{r} = \left\{ all (u^{o}, u^{N}) \text{ such that } \frac{T_{k}(u^{o}, u^{N})}{r_{T} \cdot C_{L}} = 1 + r\lambda \right\}$$

$$r = 0, 1, 2, 3, 4$$
(37)

For example, if the bus has only 3 lines, i.e.  $u^0, u^N \in \{0, 1\}^3$  we have the delay pattern of Table 3.

## 5. CODING FOR SPEED

In the traditional operation of data busses, the clock period  $T_c$  is sufficiently large so that all the transitions in the bus have enough time to be completed. In other words it must be that,

$$T_c \ge \eta \cdot (1 + 4\lambda) \tag{38}$$

where  $\eta$  is a technology parameter.

The analysis above suggests that we could use a smaller  $T_c$ , i.e. speed up the bus, if we could avoid time-expensive transitions. For

example if only transitions of the classes  $D_{00}$ ,  $D_0$ ,  $D_1$ ,  $D_2$  were allowed, then inequality (38) could be replaced by inequality (39).

| De<br>Clas                | lay<br>ss of | u <sup>N</sup>        |                       |                 |                       |                       |                       |                       |                  |
|---------------------------|--------------|-----------------------|-----------------------|-----------------|-----------------------|-----------------------|-----------------------|-----------------------|------------------|
| $u^{o} \rightarrow u^{N}$ |              | 000                   | 001                   | 010             | 011                   | 100                   | 101                   | 110                   | 111              |
| u                         | 000          | D <sub>00</sub>       | <i>D</i> <sub>1</sub> | $D_2$           | <i>D</i> <sub>1</sub> | <i>D</i> <sub>1</sub> | <i>D</i> <sub>1</sub> | $D_1$                 | D <sub>0</sub>   |
|                           | 001          | <i>D</i> <sub>1</sub> | D <sub>00</sub>       | D <sub>3</sub>  | D <sub>2</sub>        | $D_1$                 | <i>D</i> <sub>1</sub> | $D_2$                 | $\overline{D}_1$ |
|                           | 010          | <i>D</i> <sub>2</sub> | $D_3$                 | D <sub>00</sub> | <i>D</i> <sub>1</sub> | <i>D</i> <sub>3</sub> | D <sub>4</sub>        | D                     | D <sub>1</sub>   |
|                           | 011          | <i>D</i> <sub>1</sub> | <i>D</i> <sub>2</sub> | $D_1$           | D <sub>00</sub>       | $D_2$                 | <i>D</i> <sub>3</sub> | $D_1$                 | $D_1$            |
|                           | 100          | <i>D</i> <sub>1</sub> | <i>D</i> <sub>1</sub> | D <sub>3</sub>  | D <sub>2</sub>        | D <sub>00</sub>       | $D_1$                 | <i>D</i> <sub>2</sub> | $D_1$            |
|                           | 101          | $D_1$                 | $D_1$                 | D <sub>4</sub>  | D <sub>3</sub>        | $D_1$                 | D <sub>00</sub>       | $D_3$                 | $D_2$            |
|                           | 110          | <i>D</i> <sub>1</sub> | $D_2$                 | $D_1$           | $D_1$                 | $D_2$                 | <i>D</i> <sub>3</sub> | D <sub>00</sub>       | $D_1$            |
|                           | 111          | D <sub>0</sub>        | D <sub>1</sub>        | $D_1$           | $D_1$                 | <i>D</i> <sub>1</sub> | <i>D</i> <sub>2</sub> | <i>D</i> <sub>1</sub> | D <sub>00</sub>  |

Table 3: Delay Classes in  $\{0, 1\}^3$ 

$$T_c \ge \eta \cdot (1+2\lambda) \tag{39}$$

This means that for large values of  $\lambda$  the speed of the bus can almost double. Of course, by not allowing some transitions we automatically reduce the rate of information through the bus.

Here is a concrete example. Suppose for simplicity that the bus has m = 4 lines and let *TR*<sub>2</sub> be the set of all transitions that have (normalized) delay 0, 1,  $1 + \lambda$  or  $1 + 2\lambda$ . By definition, it is *TR*<sub>2</sub> =  $D_{00} \cup D_0 \cup D_1 \cup D_2$ . *TR*<sub>2</sub> is shown in Table 4 where the dots stand for the allowed transitions and the **x**'s for the forbid-den ones.



Table 4: TR2 for 4-bit bus

Although TR2 does not have any regular pattern, all the possible transitions are allowed among the states 0,1,3,6,7,8,9,12,14 and 15. If the set of states is reduced to  $\{0, 1, 3, 6, 7, 8, 9, 12, 14, 15\}$  then the worst case delay is only  $1 + 2\lambda$ . In this case, the number of bits that can be transmitted each time is decreased from four to  $log_2(10)$  which is about 3.3 bits. Also the speed has been increased by a factor of  $\frac{1+4\lambda}{1+2\lambda}$ . This ratio is about 1.85 in 0.18 $\mu$ 

 $1 + 2\lambda$ technology. On the other hand, the number of bits per transition has been decreased by a factor of  $\frac{4}{3.3} \approx 1.21$ . So the net result is

about 1.53 times the initial data rate. The encoder and decoder needed for this example are very simple static maps.

Now, if the set of states is further reduced to  $\{0, 1, 6, 7, 8, 9, 14, 15\}$ , then exactly 3 bits per transition (an integer number of bits) are possible. This makes the encoding - decoding scheme trivial and the net result is about 1.4 times the initial data rate.



Figure 2: Coding for Speed, an Example

### 6. CONCLUSIONS

The functions of the delays of the signals in the lines of general busses were estimated. The properties of these functions were studied and the interaction patterns among the lines causing delay were revealed. This allowed a classification of the data transitions according to the time they need to take place. Finally, the possibility of coding for speed was presented through an example.

#### 7. ACKNOWLEDGMENTS

This paper acknowledges support from the MARCO Focused Research Center on Interconnects which is funded at the Massachusetts Institute of Technology, through a subcontract from the Georgia Institute of Technology. Paul Sotiriadis is partially supported by the Alexander S. Onassis Public Benefit Foundation, the Greek Section of Scholarships and Research.

### 8. REFERENCES

- [1] G. Birkhoff and G.C. Rota, "Ordinary Differential Equations", *New York, Wiley 1978.*
- [2] P. Sotiriadis, A. Wang and A. Chandrakasan, "Transition Pattern Coding: An approach to reduce Energy in Interconnect", ESSCIRC'2000, Stockholm, Sweden.

- [3] C.G. Lin-Hendel, "Accurate Interconnect Modeling for High Frequency LSI/VLSI Circuits and Systems", Computer Design: VLSI in Computers and Processors, 1990. ICCD ' 90.
- [4] S. McCormick and J. Allen, "Waveform Moment Methods for Improved Interconnection Analysis", 27th ACM/IEEE Design Automation Conference, 1990.
- [5] A. Kahng, K. Masuko and S. Muddu, "Delay Models for MCM Interconnects When Response in Non-monotone", Multi-Chip Module Conference, 1997. MCMC ' 97., 1997 IEEE.
- [6] A. Kahng and S. Muddu, "An Analytical Delay Model for RLC Interconnects", *IEEE Trans. on Computer-aided design* of integrated circuits and systems, Vol. 16 NO.12 Dec. 1997.
- [7] T. Sakurai, "Closed-Form Expressions for Interconnect Delay, Coupling and Crosstalk in VLSI's", *IEEE Trans. on Electron Devices*, Vol. 40, No. 1, Jan. 1993.
- [8] T. Sakurai, S. Kobayashi and M. Noda, "Simple Expressions for Interconnection Delay", *IEEE International Symposium* on Circuits and Systems, 1991.
- [9] W. Elmore, "The transient Response of Damped Linear Network with Particular Regard to Wide-band Amplifier", *Journal of Applied Physics, Vol. 19, pp. 55-63, 1948*
- [10] G. Efthivoulidis, personal communication.
- [11] M. Ghausi and J. Kelly, "Introduction to Distributed-Parameter Networks", N. York, Holt, Rinehart and Winston, 1968