Chapter 2
Fundamentals of Adiabatic Logic

2.1 The Charging Process in Adiabatic Logic Compared to Static CMOS

First, the energy dissipation caused by switching a simple CMOS inverter, as shown in Fig. 2.1, is considered. The capacitor \( C \) at the output of the gate represents the input capacitance of the following gates. Depending on the input signal, in steady state either the PMOS device or the NMOS device is on, and the other is off. If an input transition from 1 to 0 occurs, energy is transferred from the voltage source to charge the output capacitor to the voltage \( V_{DD} \). A charge of \( Q = CV_{DD} \) is taken from the voltage source, so an energy of

\[
E_{V_{DD}} = QV_{DD} = CV_{DD}^2
\]  

(2.1)

is withdrawn from the voltage source. The energy stored on the capacitor at the voltage \( V_{DD} \) is equal to

\[
E_C = \frac{1}{2} CV_{DD}^2.
\]  

(2.2)

The difference between the delivered energy and the stored energy is dissipated in the PMOS switch. Now, if the input switches from 0 to 1, in steady-state condition the NMOS channel is on, the PMOS off. Charge stored on the output capacitance is then dissipated via the NMOS device to ground. The energy dissipation of a switching event in a static CMOS gate is given as

\[
E_{CMOS} = \alpha \frac{1}{2} CV_{DD}^2,
\]  

(2.3)

where \( \alpha \) is the switching probability, since a static CMOS gate dissipates no energy (except leakage losses) if no switching event occurs. Different approaches are useful to reduce the energy dissipation in static CMOS. Reducing the number of transitions needed for the computation of a certain task can be done on the algorithmic, structural and circuit level [26]. Reducing the capacitive load \( C \) is strongly limited by the technology and its intrinsic device capacitance, but wiring capacitance can be reduced by choosing a proper architecture and a carefully designed layout. Reducing the supply voltage \( V_{DD} \) is a very powerful method to reduce the power dissipation, but as a downside the performance is degraded. Nevertheless, (2.3) is the lower bound for the dissipation per switching event.

**Fig. 2.1** Schematic of a static CMOS inverter

**Fig. 2.2** An ECRL buffer and an exemplary scheme of the signals in the gate in operation

In contrast, Adiabatic Logic does not abruptly switch from 0 to \( V_{DD} \) (and vice versa); instead, a voltage ramp is used to charge the output and to recover the energy from it. The principle of operating an adiabatic gate is presented for a buffer gate in the Efficient Charge Recovery Logic (ECRL, [19]) in Fig. 2.2. The gate consists of two cross-coupled PMOS devices that are used to store the information. The logic function is constructed via two NMOS devices. Cascaded gates are operated by a four-phase power-clock signal that is presented in Sect. 2.2.2. Input signals for the ECRL gate in Fig. 2.2 are shifted by 90° with respect to the applied power-clock signal.

Now assume, for instance, that input \( \text{in} \) is at logic one and the dual input \( \overline{\text{in}} \) is at zero. Then the NMOS device N1 will conduct and connect \( \overline{\text{out}} \) to ground, while N2 is disabled. As soon as the power-clock \( \phi \), ramped from 0 to \( V_{DD} \), reaches the threshold voltage \( V_{th,p} \) of the PMOS device, P2 will be turned on. Thus the output signal \( \text{out} \) will follow the power-clock \( \phi \). The gate voltage of device P1 is then equal to the supply voltage and its gate-to-source voltage is zero, so this device stays disabled. As soon as \( \phi \) reaches the maximum level \( V_{DD} \), the input signals are ramped down, as the preceding gate recovers its energy at this time. The PMOS devices store the information while both NMOS devices are disabled. Then the power-clock descends from \( V_{DD} \) to 0. While \( \phi \) is above \( V_{th,p} \), charge from the output \( \text{out} \) is recovered by \( \phi \). A certain fraction of energy, \( \frac{1}{2} C_{\text{out}} V_{th,p}^2 \), remains on the corresponding output capacitance and is either dissipated or reused in the next cycle, depending on the succeeding input signals.

Fig. 2.3 Equivalent circuit to determine the losses by adiabatically loading a capacitance

To calculate the energy consumed by charging a capacitance adiabatically, the equivalent circuit in Fig. 2.3 for an adiabatic gate is used.

$R$ is the resistance in the charging path of the circuit, consisting of the on-resistance of the transistors in the charging path and the sheet resistance of the signal line. For the analysis of the energy dissipation, $R$ is assumed to be constant.

The voltage is ramped from 0 to $V_{DD}$ within $T$, slowly enough that $v_C(t)$ is able to follow the signal $v(t)$ almost instantly, so $v_C(t) \approx v(t)$. Therefore, the current into the circuit can be determined by

$$i(t) = C \frac{dv(t)}{dt} = \frac{CV_{DD}}{T}. \quad (2.4)$$

The energy for a charging event is calculated by integrating the power $p(t)$ during the transition time $T$:

$$E = \int_0^T p(t)dt = \int_0^T v(t) \cdot i(t)dt = \int_0^T (v_R(t) + v_C(t)) \cdot i(t)dt. \quad (2.5)$$

The integral of $v_C(t) \cdot i(t)$ over one clock cycle will be zero, as no energy is dissipated in the capacitance. Thus, replacing the voltage $v_R(t)$ in (2.5) with $i(t) \cdot R$ and inserting (2.4) into (2.5) results in

$$E = \int_0^T R \frac{C^2V_{DD}^2}{T^2} dt = \frac{RC}{T} CV_{DD}^2. \quad (2.6)$$

A whole cycle consists of charging and recovering. As the recovery process leads to the same amount of energy dissipation, the overall dissipation in Adiabatic Logic (AL) is

$$E_{AL} = 2 \frac{RC}{T} CV_{DD}^2. \quad (2.7)$$
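The approximation behind (2.6) can be checked numerically. The following is a minimal sketch, not part of the original derivation: it integrates the circuit of Fig. 2.3 with a simple forward-Euler scheme and compares the energy dissipated in $R$ during the ramp with (2.6). The function name `ramp_charge_loss` and the component values (1 kΩ, 10 fF, 1.2 V) are assumptions chosen for illustration only.

```python
def ramp_charge_loss(R, C, V_dd, T, steps=200_000):
    """Energy dissipated in R while a linear ramp 0 -> V_dd of duration T charges C.

    Forward-Euler integration of the series RC circuit; only the interval
    [0, T] of the ramp itself is integrated.
    """
    dt = T / steps
    v_c = 0.0   # capacitor voltage
    e_r = 0.0   # energy dissipated in the resistor
    for k in range(steps):
        v_src = V_dd * (k + 0.5) * dt / T   # ramp voltage at mid-step
        i = (v_src - v_c) / R               # current through R
        e_r += i * i * R * dt
        v_c += i * dt / C
    return e_r


# Assumed example values, not taken from the text.
R, C, V_dd = 1e3, 10e-15, 1.2
tau = R * C
for ratio in (10, 100, 1000):
    T = ratio * tau
    simulated = ramp_charge_loss(R, C, V_dd, T)
    predicted = (R * C / T) * C * V_dd**2   # equation (2.6)
    print(f"T = {ratio:4d} RC: simulated {simulated:.3e} J, (2.6) gives {predicted:.3e} J")
```

Under these assumptions the simulated loss approaches (2.6) as $T$ grows relative to $RC$, which is the regime in which $v_C(t) \approx v(t)$ holds.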

Equation (2.7) shows that the operating speed impacts the energy dissipation. The slower the circuit is charged, the less energy is dissipated. The opportunity to further reduce the consumption by scaling the supply voltage or by reducing the capacitive load also exists in Adiabatic Logic. In contrast to static CMOS, the size of the switch transistor also has an effect on the energy dissipation, as $R$ appears in the equation for the energy dissipation in Adiabatic Logic. If (2.3) and (2.7) are compared, a minimum transition time $T$ can be found above which adiabatic circuits are more energy efficient than static CMOS circuits: $T > 4 \frac{RC}{\alpha}$. In static CMOS, during one cycle the gate's output either stays constant, switches from 0 to 1 or switches from 1 to 0. The activity factor $\alpha$ in the expression $T > 4 \frac{RC}{\alpha}$ reveals that applications with a moderate to high activity factor are suitable for operation with AL. Otherwise static CMOS is superior, as it does not suffer from losses in a steady input state as long as leakage losses are neglected.
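As an illustration of this comparison, the sketch below evaluates (2.3) and (2.7) and the resulting break-even transition time $T = 4RC/\alpha$. The function names and the numeric values are hypothetical and serve only to make the relation concrete.

```python
def e_cmos(alpha, C, V_dd):
    """Static CMOS dissipation per switching event, equation (2.3)."""
    return alpha * 0.5 * C * V_dd**2


def e_al(R, C, V_dd, T):
    """Adiabatic dissipation per charge/recover cycle, equation (2.7)."""
    return 2.0 * (R * C / T) * C * V_dd**2


def break_even_T(R, C, alpha):
    """Transition time at which (2.3) and (2.7) are equal."""
    return 4.0 * R * C / alpha


# Assumed example values, not taken from the text.
R, C, V_dd = 1e3, 10e-15, 1.2
for alpha in (0.05, 0.5, 1.0):
    T_min = break_even_T(R, C, alpha)
    print(f"alpha = {alpha:4.2f}: AL is more efficient for T > {T_min:.2e} s")
```

A low activity factor pushes the break-even point toward long transition times, which is why low-activity applications favor static CMOS in this simple model.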

2.1.1 The Definition of the Energy Saving Factor (ESF)

Comparing static CMOS and Adiabatic Logic with respect to their energy dissipation calls for the definition of the Energy Saving Factor (ESF). It is a measure for how much more energy is used in a static CMOS gate or system with respect to an Adiabatic Logic counterpart. The precise definition of the ESF depends on the considered hierarchical level. If the efficiency of an Adiabatic Logic family shall be compared with respect to static CMOS the ESF compares the losses in a single gate. On system level also the generation of the supply voltage in static CMOS and the power-clock in AL and losses due to layout parasitics have to be included in the calculation for the ESF. A general definition of the ESF is

$$\text{ESF} = \frac{\sum_{CMOS} E}{\sum_{AL} E}. \quad (2.8)$$

All the energy dissipation fractions under consideration have to be summed up. An explanation has to be given at the time where the ESF is used, whether gate level comparison or a comparison on system level is performed.

2.2 An Adiabatic System

Each adiabatic system consists of two main parts: the digital core, made up of adiabatic gates, and the generator of the power-clock signals. Two adiabatic families are used in this work; both are briefly introduced in Sect. 2.2.1. Design considerations resulting from the inherent properties of these Adiabatic Logic families are explained in detail in Sect. 2.5. The power-clock generation is a very important topic in adiabatic systems, as an efficient generation of the four phases making up the power-clock is essential to achieve high overall saving factors. The four-phase power-clock used in the adiabatic families in this work is presented in Sect. 2.2.2.

2.2.1 Introducing Adiabatic Logic Families Used in This Work

Two Adiabatic Logic families are used in the investigations in the presented work. One is the Positive Feedback Adiabatic Logic (PFAL [18]), and the other is the Efficient Charge Recovery Logic (ECRL [19]). Both share the property that they are operated with a four-phase power-clock. PFAL consists of a latch element formed by two cross-coupled inverters to store the output state when the input signals are ramped down. ECRL, based on the Cascode Voltage Switch Logic (CVSL [27]), uses a cross-coupled PMOS pair as latching element. Logic blocks constructed of NMOS transistors only are used for PFAL and ECRL. As both families use identical function blocks, design procedures presented for CVSL [28] can be used for ECRL and also for PFAL. Logic blocks are connected from the power-clock $\phi$ to the output nodes for PFAL, and from the output to GND for ECRL.

As an example, an inverter is sketched for PFAL (Fig. 2.4a) and ECRL (Fig. 2.4b). For more complex gates the logic block transistors $N_F$ and $N_{\overline{F}}$ are replaced by logic function blocks. If, for example, a NAND gate has to be constructed, a series connection of two transistors is used instead of $N_F$, with $A$ and $B$ as inputs. A dual block composed of two parallel transistors, with $\overline{A}$ and $\overline{B}$ as inputs, is connected at the position of $N_{\overline{F}}$.

2.2.2 The Four-Phase Power-Clock

Adiabatic Logic circuits are operated with an oscillating power supply, the so-called power-clock. Depending on the adiabatic family, more than one power-clock signal is used to operate a system consisting of Adiabatic Logic gates. In this work, adiabatic families are employed that use a four-phase power-clock $\phi_0$-$\phi_3$ (Fig. 2.5).

Each power-clock cycle consists of four intervals. In the evaluate (E) interval, the outputs are evaluated from the stable input signals. During the hold (H) interval, outputs are kept stable to supply the subsequent gate with a stable input signal. Energy is recovered in the recover (R) interval. For symmetry reasons a wait (W) interval is inserted, as symmetric signals are easier and more efficient to generate. Data in adiabatic systems is processed in a pipeline fashion and handed over as shown in Fig. 2.5. Valid data words 1, 2, 3 and 4 are sketched in phase $\phi_0$. Data word 1 is transferred during the H interval of $\phi_0$, while $\phi_1$ is in E. It is processed by the logic function of the succeeding gate and is valid at the outputs as $1^*$ for further processing in the next gates. As mentioned before, input signals have to be kept constant during $E$, which results in a $90^\circ$ phase shift between subsequent phases. In a pipeline, subsequent gates have to be connected to the right phases in order to guarantee a transfer of valid input data.
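The hand-over between phases can be made concrete with a small sketch. It assumes idealized intervals of exactly one quarter period each and that $\phi_0$ starts its E interval at $t = 0$; the function name `interval` and the time units are hypothetical.

```python
INTERVALS = ("E", "H", "R", "W")  # evaluate, hold, recover, wait


def interval(phase, t, period):
    """Interval of power-clock phi_<phase> at time t.

    Assumes four equally long intervals and that phi_0 enters E at t = 0;
    each following phase lags the previous one by a quarter period (90 degrees).
    """
    quarter = int((t % period) // (period / 4))
    return INTERVALS[(quarter - phase) % 4]


period = 4.0  # arbitrary time units, one unit per interval
for q in range(4):
    t = q + 0.5  # sample in the middle of each quarter period
    row = "  ".join(f"phi_{p}:{interval(p, t, period)}" for p in range(4))
    print(f"t = {t:3.1f}  {row}")
```

In every row, the phase that is in H is followed by a phase in E, which is the condition for a valid data transfer described above.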

2.3 Loss Mechanisms in Adiabatic Logic

In an ideal adiabatic system losses are expected to follow (2.7), but shrinking devices into the sub-$\mu$m regime and the non-existence of zero-$V_{th}$ transistors lead to additional loss mechanisms. These effects can dominate the energy consumption and also impose a lower bound on the energy dissipation. With ongoing shrinking, leakage currents gain more impact on the overall dissipation of static CMOS gates. One of the dominant leakage currents is the so-called sub-threshold current. It is expressed by [29]

$$I_D = I_{D0}\, e^{\frac{V_{GS}-V_{th}}{nV_T}} \left(1 - e^{-\frac{V_{DS}}{V_T}}\right), \quad (2.9)$$

where $V_T$ is the thermal voltage, $V_{th}$ is the threshold voltage of the device and $V_{GS}$ and $V_{DS}$ are the terminal voltages. As long as $V_{DS}$ is zero, no leakage current will flow. Only once $V_{DS}$ reaches a few multiples of the thermal voltage does the leakage approach its maximum value. Besides that, junction leakage exists, and in state-of-the-art CMOS processes leakage currents also tunnel through the thin gate oxide.
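The dependence of (2.9) on $V_{DS}$ can be illustrated with a short sketch. The values used for $I_{D0}$, $n$ and the thermal voltage (about 25.9 mV near room temperature) are assumptions for illustration and are not taken from the text.

```python
import math


def subthreshold_current(v_gs, v_ds, v_th, i_d0, n=1.5, v_t=0.0259):
    """Sub-threshold drain current according to equation (2.9).

    n and v_t are assumed example values (slope factor, thermal voltage in V).
    """
    return i_d0 * math.exp((v_gs - v_th) / (n * v_t)) * (1.0 - math.exp(-v_ds / v_t))


v_t = 0.0259          # assumed thermal voltage in V
v_th, i_d0 = 0.3, 1e-7  # assumed device parameters
i_max = i_d0 * math.exp((0.0 - v_th) / (1.5 * v_t))  # limit for large V_DS at V_GS = 0
for k in range(0, 6):
    i = subthreshold_current(0.0, k * v_t, v_th, i_d0)
    print(f"V_DS = {k} V_T: {100.0 * i / i_max:5.1f} % of the maximum leakage")
```

Under these assumptions the current is zero at $V_{DS} = 0$ and comes close to its maximum after a few thermal voltages.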

Fig. 2.6 Adiabatic losses $E_{AL}$ are proportional, leakage losses $E_{leak}$ are inversely proportional to the frequency, and the non-adiabatic losses are independent of the frequency. An optimum frequency exists for Adiabatic Logic circuits, as can be seen from the overall losses $\Sigma$

In Adiabatic Logic, during evaluation, hold and recovery, leakage currents flow from the voltage supply to ground, leading to dissipation of charge that cannot be recovered. All leakage mechanisms can be summarized in a mean current $I_{leak}$ that leads to an energy consumption per cycle of

$$E_{leak} = V_{DD} I_{leak} \frac{1}{f}. \quad (2.10)$$

Leakage-related dissipation increases for lower frequencies, as leakage losses are accumulated over a longer time interval.

Discharging a gate in PFAL and ECRL will lead to a residual voltage at the output node that is in the range of the threshold voltage $V_{th,p}$ of the PMOS device. In ECRL, as long as the gate evaluates the same input in the next cycle, the residual charge will be reused; otherwise it is discharged to ground. In PFAL, this charge is dissipated when the output signal changes, as the output is then connected to ground via the NMOS device in the latch in the evaluate interval. If the output state remains the same, the charge is dissipated in the $W$ interval, as the input transistors are turned on and connect the output to the power-clock (which is at ground potential in the $W$ interval). Besides that, in ECRL the output cannot instantly follow the rising power-clock. Only when the power-clock has reached at least $|V_{th,p}|$ is the charging path over the PMOS device opened. Then the output voltage follows the power-clock abruptly, leading to a dynamic loss. All these losses are related to the threshold voltage and lead to a non-adiabatic dissipation of

$$E_{non-adia} = \frac{1}{2} C V_{th,p}^2. \quad (2.11)$$

Non-adiabatic losses are independent of the operating frequency, leading to an offset in the energy dissipation over the whole frequency range. Thus, three loss mechanisms contribute to the overall losses in Adiabatic Logic. Adiabatic losses (2.7) and leakage losses (2.10) depend on the operating frequency $f$, whereas non-adiabatic losses (2.11) do not. Figure 2.6 shows the three loss mechanisms as a function of the frequency; the overall dissipation is obtained by summing all three components.

A minimum dissipation of the energy at a certain frequency is observed. Therefore an optimum frequency exists in Adiabatic Logic, where energy consumed per cycle is minimized.
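The existence of the optimum can be reproduced with a simple model built from (2.7), (2.10) and (2.11). The sketch below assumes that the ramp time equals a quarter of the power-clock period, $T = 1/(4f)$, and uses hypothetical parameter values; neither assumption is taken from the measured data in this chapter.

```python
import math


def loss_components(f, R, C, V_dd, I_leak, V_thp):
    """Return (adiabatic, leakage, non-adiabatic) energy per cycle at frequency f."""
    T = 1.0 / (4.0 * f)                          # assumption: ramp lasts a quarter period
    e_adia = 2.0 * (R * C / T) * C * V_dd**2     # equation (2.7)
    e_leak = V_dd * I_leak / f                   # equation (2.10)
    e_non = 0.5 * C * V_thp**2                   # equation (2.11)
    return e_adia, e_leak, e_non


def optimum_frequency(R, C, V_dd, I_leak):
    """Frequency minimizing a*f + b/f with a = 8 R C^2 V_dd^2 and b = V_dd I_leak."""
    return math.sqrt(I_leak / (8.0 * R * C**2 * V_dd))


# Assumed example values, not taken from the text.
R, C, V_dd, I_leak, V_thp = 1e3, 10e-15, 1.2, 1e-9, 0.3
f_opt = optimum_frequency(R, C, V_dd, I_leak)
for f in (f_opt / 10, f_opt, f_opt * 10):
    total = sum(loss_components(f, R, C, V_dd, I_leak, V_thp))
    print(f"f = {f:.2e} Hz: total loss per cycle {total:.3e} J")
```

In this model the optimum lies where the adiabatic and leakage contributions are equal, and a larger leakage current moves it to higher frequencies.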

2.3.1 Impact of Process Variations on the Losses in Adiabatic Logic

As process variations are a major concern in today's CMOS technologies, circuit designers are confronted with new challenges in designing robust circuits. In Adiabatic Logic, process variations also have an impact on the circuit, mainly on the energy consumption. In static CMOS, functional errors due to process variations can be induced in circuits operated at high speed. If single transistors are too slow or too fast, timing constraints are violated, leading to system failures. A lot of effort is put into methods to deal with these variations. As adiabatic circuits are operated at a relatively low frequency, timing issues are not of concern. But in Adiabatic Logic the variations will impact the energy consumption of the circuit [9, 30].

According to (2.7), the effective charging path resistance $R$, which is composed of the on-resistance of the MOS device charging the output and the resistance of the interconnects, impacts the energy dissipation. As the charging MOS device is operated in the linear region most of the time, the charging path resistance can be estimated via the equation for the drain current in the linear region [31]:

$$I_D = k'_n \frac{W}{L} \left( (V_{GS} - V_{th})V_{DS} - \frac{V_{DS}^2}{2} \right). \tag{2.12}$$

The factor $k'_n$ summarizes the mobility of the majority carriers $\mu$ and the specific oxide capacitance $C_{OX}$. To operate an adiabatic circuit efficiently, the frequency has to be slow enough to allow the output to follow the power-clock such that a very small $V_{DS}$ will appear. The on-resistance $R_{on} = \frac{V_{DS}}{I_D}$ can therefore be approximated by

$$R_{on} = \frac{1}{k'_n \frac{W}{L}} (V_{GS} - V_{th})^{-1}. \tag{2.13}$$

The impact of the threshold voltage $V_{th}$ on the on-resistance $R_{on}$ is determined via (2.13). As the gate overdrive voltage $V_{GS} - V_{th}$ is affected by process variations, an increased or decreased on-resistance is observed, and therefore the energy dissipation will be changed.
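To give a feeling for this sensitivity, the sketch below evaluates the small-$V_{DS}$ limit of (2.12), $R_{on} = 1 / \left(k'_n \frac{W}{L} (V_{GS} - V_{th})\right)$, for a shifted threshold voltage. The parameter values ($k'_n$, $W/L$, the size of the $V_{th}$ shift) are hypothetical and only illustrate the trend.

```python
def r_on(k_n, w_over_l, v_gs, v_th):
    """On-resistance in the linear region for very small V_DS."""
    return 1.0 / (k_n * w_over_l * (v_gs - v_th))


# Assumed example values, not taken from the text.
k_n, w_over_l, v_gs, v_th_nom = 200e-6, 2.0, 1.2, 0.30
r_nom = r_on(k_n, w_over_l, v_gs, v_th_nom)
for dv in (-0.05, 0.0, 0.05):
    r = r_on(k_n, w_over_l, v_gs, v_th_nom + dv)
    # The adiabatic loss (2.7) scales linearly with R, so the same ratio applies to it.
    print(f"V_th shift {dv:+.2f} V: R_on = {r:7.1f} Ohm ({100.0 * (r / r_nom - 1.0):+.1f} %)")
```

Because (2.7) is linear in $R$, the relative change in on-resistance translates directly into the same relative change in adiabatic loss in this simplified picture.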

For the leakage losses, the impact of variations on the current in the sub-threshold region (see (2.9)) is considered. As (2.9) depends exponentially on $V_{th}$, the dissipation caused by leakage currents shows an exponential dependence on a shift in $V_{th}$. A shift in $V_{th}$ also changes the non-adiabatic losses according to (2.11); these losses depend quadratically on variations in the threshold voltage.

Summarizing, the $V_{th}$-shift induced by process variations has the strongest impact in the frequency regime where leakage currents dominate the overall losses in Adiabatic Logic. A shift of the optimum frequency to higher values can be observed if $V_{th}$ is shifted to lower absolute values [10].

In Fig. 2.7, simulation results of a buffer circuit in the Positive Feedback Adiabatic Logic (PFAL, [32]) in a 130 nm low-$V_{th}$ CMOS technology show the impact of the process variations on the energy dissipation versus the frequency. The nominal simulation and the slow and fast corner simulations are plotted. Process variations impact all regimes: the leakage dominated regime in the lower frequency region, the adiabatic regime in the high frequency region, and of course also the non-adiabatic losses, which are independent of the frequency. As leakage currents are more sensitive to parameter variation, the highest deviation is seen in the low frequency range. The slow corner has a raised $V_{th}$ with respect to the nominal value; leakage is therefore reduced, but the on-resistance in the loading path is increased, resulting in higher adiabatic losses. For the fast corner $V_{th}$ is reduced, leading to a reduced on-resistance and therefore to reduced adiabatic losses, but on the downside leakage is increased. The optimum frequency is shifted from 10 MHz in case of the nominal $V_{th}$ to 3 MHz in the slow corner, and to 50 MHz for the fast corner parameters.

Fig. 2.7 Energy dissipation versus frequency for different process corners. In the low frequency regime, the leakage currents are the main contributor to the energy dissipation. These losses are exponentially dependent on variations in $V_{th}$. Adiabatic losses are less impacted by process variations. The optimum operating frequency is shifted to higher frequencies when going from the slow corner to the fast corner.

2.4 Voltage Scaling: A Comparison of Static CMOS and Adiabatic Logic

An easy and powerful way to reduce losses in static CMOS is by reducing the voltage supply $V_{DD}$ [33]. Equation (2.3) reveals a quadratic dependence of the energy dissipation on $V_{DD}$ due to dynamic losses:

$$E_{CMOS} \propto V_{DD}^2. \quad (2.14)$$

The limiting factor for voltage scaling is the propagation delay $t_p$, which increases as the voltage is decreased according to [34]

$$t_p = \left( \frac{V_{th}}{V_{DD}} + \alpha' - \frac{1}{2} \right) t_\tau + \frac{C_L V_{DD}}{2I_{D0}}, \quad (2.15)$$

where $t_\tau$ is the input slope, $\alpha'$ is the velocity saturation parameter, and $I_{D0}$ is the drain current for $V_{GS} = V_{DS} = V_{DD}$. The impact of the input slope decreases with ongoing miniaturization [34], therefore the first term of (2.15) can be neglected for state-of-the-art CMOS technologies. Using $I_D \propto (V_{GS} - V_{th})^{\alpha'}$ in saturation [35], the dependence of the delay on the supply voltage is found:

$$t_p \propto \frac{V_{DD}}{(V_{DD} - V_{th})^{\alpha'}}. \quad (2.16)$$

A trade-off exists between speed and power consumption, therefore the voltage can only be reduced to a level where no timing constraints in the design are violated. The critical path in a static CMOS design determines the maximum degree to which the voltage can be reduced. In designs where only a few critical paths exist, but many paths have a positive slack after reducing the supply voltage, the gain from globally reducing the supply voltage is not satisfying. To make voltage scaling more effective one can try to break up the critical paths to allow further reduction of voltage and thus power, and also using different voltage domains for fast and slow paths could increase the benefits of scaling [36].
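The trade-off can be tabulated directly from the proportionalities (2.14) and (2.16). The following sketch normalizes both quantities to a nominal supply voltage; the values chosen for $V_{th}$, $\alpha'$ and the nominal $V_{DD}$ are assumptions for illustration.

```python
def relative_delay(v_dd, v_th, alpha_sat):
    """Delay according to (2.16), up to a constant factor."""
    return v_dd / (v_dd - v_th) ** alpha_sat


def relative_energy(v_dd):
    """Dynamic energy according to (2.14), up to a constant factor."""
    return v_dd**2


# Assumed example values, not taken from the text.
v_nom, v_th, alpha_sat = 1.2, 0.3, 1.3
for v_dd in (1.2, 1.0, 0.8, 0.6):
    d = relative_delay(v_dd, v_th, alpha_sat) / relative_delay(v_nom, v_th, alpha_sat)
    e = relative_energy(v_dd) / relative_energy(v_nom)
    print(f"V_DD = {v_dd:.1f} V: energy x{e:.2f}, delay x{d:.2f}")
```

With these assumed parameters, halving the supply voltage cuts the dynamic energy to a quarter while the delay grows noticeably, which is the limit set by the critical path.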

Delay is not a concern for Adiabatic Logic circuits, as the maximum possible frequency is far above the optimum frequency for an energy-efficient operation of gates and systems. In the frequency regime where adiabatic losses dominate the energy consumption of Adiabatic Logic, a reduction of the supply voltage is expected to lead to a benefit in energy consumption. At first sight a dependence on $V_{DD}^2$ is observed, but the on-resistance of the transistor in the charging path is also a function of the supply voltage. If the overdrive voltage $V_{GS} - V_{th}$ is reduced by reducing the supply voltage, the resistance is increased. As long as $V_{DD}$ is far above $V_{th}$, the dissipated energy is [30]

$$E_{AL} \propto V_{DD} \left(1 + \frac{V_{th}}{V_{DD}}\right). \quad (2.17)$$

Thus, Adiabatic Logic also gains from voltage scaling, but the ESF on gate level will decrease if voltage reduction is applied:

$$ESF \propto \frac{V_{DD}}{1 + \frac{V_{th}}{V_{DD}}}. \quad (2.18)$$

Leakage losses are also impacted by reducing the supply voltage. As long as the leakage losses are negligible compared to the dynamic losses in static CMOS, the adiabatic circuit is not operated in the leakage dominated regime, and non-adiabatic losses are negligible, the impact of voltage scaling on the ESF can be estimated by (2.18). The lower bound for $V_{DD}$ in static CMOS is mainly limited by timing constraints, including margins for variations in the process and fluctuations in the temperature and supply voltage. Supply voltage reduction in Adiabatic Logic is not limited by timing constraints, but a functional limit for ECRL and PFAL is observed when reducing $V_{DD}$. Minimum supply voltages are given by [9]:

$$\begin{aligned} V_{DD,\text{min}}^{\text{ECRL}} &= \max (V_{th,n}, |V_{th,p}|) \\ V_{DD,\text{min}}^{\text{PFAL}} &= 2V_{th,n} \end{aligned} \quad (2.19)$$

Below this lower bound, circuits constructed from ECRL and PFAL gates malfunction. In ECRL, the NMOS device is responsible for keeping one output node at ground potential, and the PMOS device charges the dual output node. Thus in ECRL the supply voltage has to be higher than the highest absolute threshold voltage value. In the PFAL gate, the output node has to be charged to at least $V_{th,n}$ to make the NMOS device in the latch conductive, which is responsible for keeping the dual output node at ground. The input device's source node is connected to the output node, which is expected to be at least at $V_{th,n}$. Thus the gate of the input device needs a voltage greater than $2V_{th,n}$ to be conducting.

Finally the reduction in the voltage levels will degrade the noise margin for static CMOS as well as for Adiabatic Logic. Energy reduction via supply voltage scaling will thus be a trade-off between energy and robustness of the design.
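The scaling relations (2.17) and (2.18) together with the functional limits (2.19) can be summarized in a few lines. This is a sketch under the same assumptions as the equations themselves ($V_{DD}$ well above $V_{th}$, negligible leakage and non-adiabatic losses); the threshold voltages used are hypothetical example values.

```python
def e_al_relative(v_dd, v_th):
    """Adiabatic energy according to (2.17), up to a constant factor."""
    return v_dd * (1.0 + v_th / v_dd)


def esf_relative(v_dd, v_th):
    """Gate-level ESF according to (2.18), up to a constant factor."""
    return v_dd / (1.0 + v_th / v_dd)


def v_dd_min_ecrl(v_th_n, v_th_p):
    """Functional lower bound for ECRL, equation (2.19)."""
    return max(v_th_n, abs(v_th_p))


def v_dd_min_pfal(v_th_n):
    """Functional lower bound for PFAL, equation (2.19)."""
    return 2.0 * v_th_n


# Assumed example thresholds, not taken from the text.
v_th_n, v_th_p = 0.30, -0.35
print("V_DD,min ECRL:", v_dd_min_ecrl(v_th_n, v_th_p), "V")
print("V_DD,min PFAL:", v_dd_min_pfal(v_th_n), "V")
for v_dd in (1.2, 1.0, 0.8):
    e = e_al_relative(v_dd, v_th_n) / e_al_relative(1.2, v_th_n)
    s = esf_relative(v_dd, v_th_n) / esf_relative(1.2, v_th_n)
    print(f"V_DD = {v_dd:.1f} V: E_AL x{e:.2f}, ESF x{s:.2f}")
```

Both the adiabatic energy and the ESF fall as the supply is lowered, reflecting that static CMOS gains more from voltage scaling than Adiabatic Logic does in this approximation.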

2.5 Properties of Adiabatic Logic and Resultant Design Considerations

Based on the way PFAL and ECRL are constructed and operated, properties exist that need to be considered when designing adiabatic systems. The dual-rail signaling is due to the differential constitution of PFAL and ECRL, whereas delay and inherent micropipelining are implications of the four-phase power-clock.

2.5.1 Dual-Rail Encoded Signals

Differential logic styles like PFAL and ECRL generate dual output signals. But in contrast to differential static CMOS styles like CVSL, differential Adiabatic Logic styles are not always differential in a physical sense. As the power-clock ramps down to 0 each cycle, both outputs will go to 0 during the W interval. Only during the H interval, differential Adiabatic Logic gates are also physically differential.

Although the two outputs out and $\overline{\text{out}}$ are generated, the area consumption due to the transistor count is comparable to static CMOS. Consider a NAND gate, which needs 2 NMOS and 2 PMOS devices in static CMOS. ECRL consists of 4 NMOS and 2 PMOS devices, whereas PFAL uses 2 additional NMOS devices in the latch. But, compared to static CMOS, the dual-rail adiabatic gate performs both a NAND and an AND function, as both signals, out $= \overline{A \& B}$ and $\overline{\text{out}} = A \& B$, are generated. The AND gate in static CMOS needs an additional inverter, consisting of 1 PMOS and 1 NMOS device. Implementing more complex functions will further reduce the overhead introduced by the latch devices. Dual-rail signals can help to simplify functions, and common sub-blocks in functions can be shared for \( F \) and \( \bar{F} \) [28], as demonstrated for an ECRL XOR/XNOR gate in Fig. 2.8.

For the XOR the transistor count for ECRL versus static CMOS is 8 compared to 12. The active gate area is a rough measure for the area consumption. If for the static CMOS gate a symmetric rise and fall time is required, the PMOS devices have to be sized larger than the NMOS devices, due to the reduced mobility of the majority carriers (holes) in the PMOS device.

In Table 2.1 the XOR implementations for static CMOS, PFAL and ECRL are compared. For the active gate area \( A^* \) a ratio \( \frac{W_P}{W_N} = 2 \) has been assumed for all three gates to compensate for the smaller carrier mobility of holes compared to electrons. It can be seen clearly that already for the ratio of the pure transistor counts (\( \Sigma = \#n + \#p \)), \( \frac{\Sigma_{AL}}{\Sigma_{CMOS}} \), where \( AL \) stands for PFAL and ECRL respectively, the XOR gates in Adiabatic Logic are smaller than the corresponding gate in static CMOS if the input inverters in the static CMOS gate are taken into account. The ratio of the active gate areas \( \frac{A^*_{AL}}{A^*_{CMOS}} \) is even better for Adiabatic Logic. Even if the input inverters in the static CMOS gates are not included in the transistor count, the ECRL and PFAL gates are comparable in transistor count and active gate area, as indicated by the values in brackets in Table 2.1.

Fig. 2.8 The ECRL XOR gate (a) without and (b) with reusing transistors in the logic blocks

<table>
<thead>
<tr>
<th></th>
<th>CMOS</th>
<th>PFAL</th>
<th>ECRL</th>
</tr>
</thead>
<tbody>
<tr>
<td>#n</td>
<td>6(4)</td>
<td>8</td>
<td>6</td>
</tr>
<tr>
<td>#p</td>
<td>6(4)</td>
<td>2</td>
<td>2</td>
</tr>
<tr>
<td>\( \frac{\Sigma_{AL}}{\Sigma_{CMOS}} \)</td>
<td></td>
<td>83%(125%)</td>
<td>66%(100%)</td>
</tr>
<tr>
<td>\( \frac{A^*_{AL}}{A^*_{CMOS}} \)</td>
<td></td>
<td>66%(100%)</td>
<td>55%(83%)</td>
</tr>
</tbody>
</table>

Fig. 2.9 The arrangement of static CMOS gates in (a) cannot be directly translated into Adiabatic Logic. Due to the micropipelining, signal A has to be buffered (b) to be synchronous to the output X of the AND gate in phase $\phi_1$

Components used in various arithmetic structures are adders and subtractors. If two numbers \( A \) and \( B \) are subtracted, the subtraction is carried out by adding the 2's complement of $B$ [29]:

$$A - B = A + \overline{B} + 1. \quad (2.20)$$

Dual-rail gates offer inverted outputs, so $\overline{B}$ is generated without additional inverter gates, which saves inverters in larger designs. The latency of Adiabatic Logic systems rises with the number of cascaded gates. Dual-rail signaling makes it possible to skip inverter stages and thus to decrease the number of cascaded gates, which reduces both the energy consumption and the latency in adiabatic systems.
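Equation (2.20) can be verified on fixed-width integers. The sketch below is a plain software illustration, not a model of an adiabatic circuit; the function name and the default word width of 8 bits are assumptions.

```python
def subtract_via_twos_complement(a, b, width=8):
    """Compute (a - b) modulo 2**width as a + ~b + 1, following equation (2.20)."""
    mask = (1 << width) - 1
    b_inverted = ~b & mask   # bitwise complement; on a dual rail this signal is already available
    return (a + b_inverted + 1) & mask


print(subtract_via_twos_complement(13, 5))   # 8
print(subtract_via_twos_complement(5, 13))   # 248, i.e. -8 in 8-bit two's complement
```

The only operation applied to $B$ before the addition is the bitwise inversion, which is exactly the signal a dual-rail gate provides without an extra stage.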

2.5.2 Inherent Pipelining

In Fig. 2.5 the transport of information in an adiabatic circuit is sketched in the power-clock scheme. A cascade of adiabatic gates forms a pipeline. Each gate consists of a storage element and the logic blocks, so a gate acts like a latch in static CMOS with integrated logic functionality. Pipelining is thus inherent in Adiabatic Logic, and in some cases it eases the construction of a system. A critical path does not exist in Adiabatic Logic, as each path consists of one gate only. The power-clock itself enforces that input signals are valid as soon as a gate starts to evaluate its outputs, and it is guaranteed that the succeeding gate starts to evaluate only after its inputs are stable. Setup-time and hold-time failures are therefore excluded by construction in adiabatic circuits. On the other hand, care has to be taken that signals are synchronous at the time when they are processed further. An example (Fig. 2.9) shows the difference between static CMOS design and Adiabatic Logic when two signal paths converge. To synchronize input A and the output signal X of the AND gate, a buffer has to be inserted in the adiabatic implementation of the design example in Fig. 2.9.

Especially if arithmetic units are designed, carefully selecting suitable topologies is of great importance to avoid overhead due to synchronization stages.
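The synchronization requirement can be expressed as a small bookkeeping exercise. The sketch below assumes that every gate adds exactly one pipeline stage and that primary inputs are valid at stage 0; the netlist format, the function names, and the names `B` and `Y` for the second AND input and the converging gate are hypothetical and only mirror the situation of Fig. 2.9.

```python
def stages(netlist):
    """Return the pipeline stage at which each signal becomes valid.

    netlist maps a gate output name to the list of its input names.
    Names that are not keys of netlist are treated as primary inputs (stage 0).
    """
    stage = {}

    def visit(name):
        if name not in stage:
            inputs = netlist.get(name)
            stage[name] = 0 if inputs is None else 1 + max(visit(i) for i in inputs)
        return stage[name]

    for gate in netlist:
        visit(gate)
    return stage


def buffers_needed(netlist):
    """Number of buffer stages to insert on each (input, gate) connection."""
    stage = stages(netlist)
    plan = {}
    for gate, inputs in netlist.items():
        latest = max(stage[i] for i in inputs)
        for i in inputs:
            if stage[i] < latest:
                plan[(i, gate)] = latest - stage[i]
    return plan


# X = AND(A, B); Y combines A with X, so A arrives one stage too early at Y.
netlist = {"X": ["A", "B"], "Y": ["A", "X"]}
print(buffers_needed(netlist))   # {('A', 'Y'): 1}
```

Applied to larger arithmetic structures, the same count gives a rough measure of the synchronization overhead of a candidate topology.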

2.5.3 Delay Considerations in Adiabatic Logic

The delay characterizes the time a signal needs to propagate through a path (path delay) or through a gate. In static CMOS it is crucial for high speed designs to observe critical paths and gate delays in order to avoid timing errors. The delay is determined by the ability of the driving transistors to source or sink current, and also by the capacitance that has to be charged or discharged. It is approximated by a current source (if the transistor is in saturation) and the load capacitance. To operate an adiabatic gate in an energy efficient manner, the voltage drop between the rising/falling transition of the power-clock and the active output node has to be very small ($V_{DS} \approx 0$). Therefore, an operating frequency is chosen that is well below the maximum frequency allowed for a correct function of the gates. Thus, the gate delay in Adiabatic Logic is fixed: the full swing output signal is valid after $\frac{1}{4f}$, where $f$ is the frequency of the power-clock.

2.5.4 The Power Supply Net in Adiabatic Logic: Crosstalk, iR-drop, L\(\frac{di}{dt}\)-drop, Electromigration

**Crosstalk** Adjacent lines A and B (Fig. 2.10) will experience changes in the voltage level if a transition occurs on the neighboring line [29, 37].

The relation of the capacitance between the lines $C_{12}$ and the line capacitances $C_1$ and $C_2$, and the voltage swing of the transition determine the change in the voltage seen by the impacted line. If a voltage transition occurs on line A, and the swing is $\Delta v_A$, the change on line B is

$$\Delta v_B = \frac{C_{12}}{C_2 + C_{12}} \Delta v_A \qquad (2.21)$$

if signal line B is a floating line. Most likely, the disturbed lines in static CMOS and Adiabatic Logic will not be floating. As soon as line B is actively driven (Fig. 2.11), the driver counteracts the deviation due to crosstalk and brings the voltage level on line B back to its original value. In [37] the equation for the deviation is given for the case that line B is connected to a driver:

$$\Delta v_B = \frac{C_{12}}{C_T} \frac{R_1}{\tau_A} \left( 1 - e^{-\frac{\tau_A}{R_1C_T}} \right). \qquad (2.22)$$

Here $C_T = C_1 + C_2$, $\tau_A$ is the transition time of the signal swing $\Delta v_A$ and $R_1$ is the on-resistance of the driver connected to line B.
If the transition time $\tau_A$ of the disturbing signal is increased, the voltage peak induced in the neighboring line is reduced: less charge is transferred per unit time, so the driver has more time to counteract the disturbance. In static CMOS the transition time is determined by the slew rate of the gate driving line A and the capacitances that have to be driven by this gate. In Adiabatic Logic the transition time $\tau_A$ of a line is determined by the power-clock. If the adiabatic gate is operated in the frequency regime with the lowest dissipation per cycle, the transition time will be much longer than in static CMOS. Thus it can be concluded from (2.22) that Adiabatic Logic is less affected by crosstalk-induced voltage deviations.
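
The trend can be checked numerically with a first-order lumped model. The sketch below is an assumption and not necessarily the exact model behind (2.22): line B has a capacitance `c2` to ground, is coupled to line A through `c12`, and is held by a driver with on-resistance `r_drv`, while line A ramps linearly by `dv_a` within the transition time `t_tr`. For very fast transitions this model approaches the floating-line result (2.21). All component values are hypothetical.

```python
import math


def floating_victim_swing(c12, c2, dv_a):
    """Eq. (2.21): capacitive divider for a floating line B."""
    return c12 / (c2 + c12) * dv_a


def driven_victim_peak(c12, c2, r_drv, dv_a, t_tr):
    """Peak deviation on a driven line B for a linear ramp on line A.

    Assumed model: (c2 + c12) * dvB/dt + vB / r_drv = c12 * dvA/dt,
    with dvA/dt = dv_a / t_tr during the ramp. The deviation grows
    during the ramp and peaks at t = t_tr.
    """
    tau_b = r_drv * (c2 + c12)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for small x.
    return r_drv * c12 * dv_a / t_tr * (-math.expm1(-t_tr / tau_b))


# Hypothetical values: 10 fF coupling, 20 fF to ground, 5 kOhm driver, 1 V swing.
C12, C2, R_DRV, DV_A = 10e-15, 20e-15, 5e3, 1.0
print("floating:", floating_victim_swing(C12, C2, DV_A))
for t_tr in (100e-12, 1e-9, 10e-9):
    print(t_tr, driven_victim_peak(C12, C2, R_DRV, DV_A, t_tr))
```

In this model the peak deviation shrinks roughly in proportion to the transition time once the ramp is much slower than the time constant of the driven line.
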

**$iR$-drop** Power supply lines have a non-zero resistance (Fig. 2.12), so a voltage drop occurs when current is drawn from the power supply. In static CMOS circuits, current peaks occur when the registers are clocked, as then many gates switch simultaneously. Average currents also cause $iR$-drop, but since the peak current is expected to dominate over the average, the peaks determine the safety margins required for safe operation of electronic systems in static CMOS. A supply voltage reduced by the $iR$-drop leads to an increased gate delay, so critical paths may fail to process data in time. An inverter that switches from low to high will first draw a current from the power supply that equals the saturation current of the PMOS device:

$$I_{DS,sat} = -\frac{k_p'}{2} \frac{W}{L} (-V_{GS} + V_{th,p})^2 (1 - \lambda V_{DS}). \qquad (2.23)$$

At the beginning of the charging process, the maximum current will be drawn:

$$I_{peak,CMOS} = I_{DS,sat}(V_{DD}) = -\frac{k_p'}{2} \frac{W}{L} (V_{DD} + V_{th,p})^2 (1 - \lambda V_{DD}). \qquad (2.24)$$

Because signals take different paths through the gates of the logic core, the gates do not all switch at the same instant; the current waveform becomes broader and the peak is reduced compared to the case where all gates switch simultaneously. The duration of such peaks relative to the cycle time has a strong impact on the critical path delay. Even if the peak voltage drop is very short, and the gates see the regular $V_{DD}$ for the rest of the cycle, paths with very critical timing can still fail.

Adiabatic Logic circuits operate with relatively small currents (transistors in the linear region with small voltage $|V_{DS}|$), and due to the four-phase power-clock, not all gates switch at once. Each phase has its own power line that only sees the current profile of a single phase. During the evaluate interval, an almost constant current is delivered to the gate. In the recover interval, the charge is ideally recovered at a constant rate during the whole interval. In real ECRL gates there are non-adiabatic effects, during which transistors operate in the saturation region. This happens when the power-clock reaches the threshold voltage of the PMOS device and the gate’s output abruptly rises to the present voltage of the power-clock. In ECRL, this effect determines the maximum peak current:

\[ I_{\text{peak}} = \Delta q / t_x = C_L (|V_{\text{th, p}}| + V_0) / t_x, \qquad (2.25) \]

where \( t_x \) is the time it takes the output to rise from the initial voltage level \( V_0 \) on \( C_L \) to the voltage level \( |V_{\text{th, p}}| \). In (2.25) it is assumed that the charge \( \Delta q \) is transferred by a constant current.

Both current peaks, the one in static CMOS (2.24) as well as the one in (2.25), are saturation currents and thus proportional to the square of the overdrive voltage \( (-V_{GS} + V_{\text{th, p}})^2 \). In static CMOS the maximum overdrive voltage \( V_{DD} + V_{\text{th, p}} \) is applied, cf. (2.24). In ECRL, in contrast, the PMOS device sees only a very small overdrive at the beginning of the evaluate interval, so the peak current is only a small fraction of the peak current in static CMOS. Additionally, Adiabatic Logic circuits are not operated at critical timing, so even a larger \( iR \)-drop will not impair the functionality of AL.
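
The size of this fraction can be estimated from the quadratic dependence on the overdrive voltage alone. The sketch below neglects channel-length modulation and assumes the same device in both cases; the gate-source voltage at the onset of conduction in ECRL and all other numbers are hypothetical.

```python
def overdrive(v_gs, v_th_p):
    """PMOS overdrive as used in (2.23): -V_GS + V_th,p, with V_th,p < 0."""
    return -v_gs + v_th_p


def peak_current_ratio(v_gs_ecrl, v_dd, v_th_p):
    """Ratio of the ECRL peak current to the static CMOS peak current.

    Both currents are taken as proportional to the squared overdrive;
    the static CMOS device sees V_GS = -V_DD at the start of charging.
    """
    ov_ecrl = overdrive(v_gs_ecrl, v_th_p)
    ov_cmos = overdrive(-v_dd, v_th_p)
    return (ov_ecrl / ov_cmos) ** 2


# Hypothetical values: V_DD = 1.2 V, V_th,p = -0.4 V, 50 mV overdrive in ECRL.
print(peak_current_ratio(-0.45, 1.2, -0.4))
```

With these assumed values the ECRL peak is well below one percent of the static CMOS peak.
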

**\( L\frac{di}{dt} \)-drop** If steep current peaks occur in a circuit design, inductance (Fig. 2.12) may play an important role, as a voltage drop of \( \Delta V = L \frac{di}{dt} \) is induced across the inductor. Added on top of the voltage drop due to the line resistance, this further decreases the supply voltage at the circuit and further increases the delay of critical paths. As soon as the inductances are large enough to cause a noticeable voltage drop for the \( \frac{di}{dt} \) slopes seen in the circuit, this has to be accounted for in the safety margin for \( V_{DD} \). In static CMOS, where switching occurs almost instantaneously, steep current slopes are expected. Adiabatic circuits do not draw such high peak currents, so the slopes \( \frac{di}{dt} \) are small compared to those in static CMOS.
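
Both supply-line contributions can be compared with a short calculation. The sketch assumes a current that ramps linearly from zero to its peak, so that \( \frac{di}{dt} \) is simply the peak current divided by the rise time; the line parameters and the two operating points are hypothetical and only meant to illustrate the orders of magnitude involved.

```python
def supply_drop(r_line, l_line, i_peak, t_rise):
    """Return (iR-drop, L di/dt-drop) in volts for a linear current ramp."""
    ir_drop = r_line * i_peak
    ldidt_drop = l_line * i_peak / t_rise
    return ir_drop, ldidt_drop


# Hypothetical supply line: 0.5 Ohm, 0.5 nH.
R_LINE, L_LINE = 0.5, 0.5e-9

# Hypothetical static CMOS peak: 20 mA reached within 100 ps.
print(supply_drop(R_LINE, L_LINE, 20e-3, 100e-12))

# Hypothetical adiabatic case: 1 mA reached within 2 ns.
print(supply_drop(R_LINE, L_LINE, 1e-3, 2e-9))
```
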

**Electromigration** Electromigration is a wear-out process in lines carrying high current densities [29]. The effect is more likely to occur in lines where a strong unidirectional current flows, e.g. power supply lines in static CMOS circuits. Black [38] presents a relationship for the median time to failure (MTF):

\[
\frac{1}{MTF} = AJ^2e^{-\frac{\varphi}{kT}}. \qquad (2.26)
\]

The constant \(A\) (which involves the cross-sectional area of the line), the current density \(J\), the activation energy \(\varphi\), and the temperature \(T\) of the line determine the \(MTF\); \(k\) is the Boltzmann constant. In order to limit electromigration, line widths (or thicknesses) have to be increased to reduce the current density. If the power is supplied in a bidirectional fashion, as in Adiabatic Logic, where charge is provided to the circuit and later recovered into the power supply over the same line, electromigration is strongly reduced [39]. The power supply lines in Adiabatic Logic are therefore less likely to fail due to electromigration and can be sized smaller, which also results in less capacitance of the power supply net. Similarly, in [40] stepwise charging has been proposed for SRAM cells in order to reduce electromigration and hot carrier injection.
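
The sensitivity of (2.26) to current density and temperature can be illustrated by comparing two operating points of the same line, for which the constant \(A\) cancels. The sketch covers only the unidirectional case described by (2.26); the additional benefit of bidirectional current flow is not modeled. The function name and the example values, including the activation energy, are hypothetical.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def mtf_ratio(j_ref, j_new, t_ref, t_new, phi_ev):
    """MTF_new / MTF_ref according to (2.26) for one and the same line.

    j_ref, j_new: current densities (any common unit)
    t_ref, t_new: line temperatures in kelvin
    phi_ev:       activation energy in eV
    """
    density_term = (j_ref / j_new) ** 2
    thermal_term = math.exp(phi_ev / K_B_EV * (1.0 / t_new - 1.0 / t_ref))
    return density_term * thermal_term


# Halving the current density at constant temperature quadruples the MTF.
print(mtf_ratio(2.0, 1.0, 350.0, 350.0, 0.7))  # 4.0
```
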

Due to the properties of Adiabatic Logic and the power-clock, Adiabatic Logic suffers less from crosstalk, \(iR\)-drop, \(L\frac{di}{dt}\)-drop and electromigration. In static CMOS, the high peak currents cause a higher peak \(iR\)-drop and a stronger voltage bounce due to the \(L\frac{di}{dt}\)-drop on the power supply lines, and electromigration is significantly stronger due to the unidirectional current flow and the high current peaks. Adiabatic Logic thus allows for the design of a voltage supply network with fewer constraints than in static CMOS.

2.6 General Simulation Setup

Adiabatic Logic in this work is supplied with a trapezoidal waveform in most of the circuit simulations. To characterize a gate, a simulation environment is established that reproduces the conditions in a real system. In such a system, a gate sees input signals that are shaped by the preceding gates, and its output load is formed by the connected gates. This matters because static CMOS gates dissipate energy depending on the slope of the input signal and on the capacitive load at the output. Therefore, two gates are connected in front of each input of the device under test to shape the input signal, and two gates are connected to each output to provide a realistic load.

In Fig. 2.13 such a simulation setup is displayed. A general Device Under Test (DUT) is fed with \(N\) input signals, each shaped by two driver stages. Idealized signals \(r\) are applied at the front interface of the simulation setup. In static CMOS, the two driver gates provide an input signal with a realistic slope to the DUT and also introduce a realistic imbalance between rising and falling edges. The output signals of adiabatic gates differ from an ideal trapezoidal waveform for several reasons, namely non-adiabatic steps in the voltage, remaining charge on the nodes and the voltage drop over the charging path while the output is charged. The \(M\) outputs of the gate are each connected to two gates in series. The energy of the DUT is measured by integrating the power \(p(t)\) dissipated in the gate. Due to the energy transfer observed in PFAL gates [9], the energy introduced via the inputs or the outputs can also be taken into account by measuring the energy flow via those ports.

**Fig. 2.13** Simulation setup for an $N \times M$ ($N$ inputs, $M$ outputs) static CMOS or Adiabatic Logic gate characterization. The ideal input signals $r_1, \ldots, r_N$ are converted to realistic signals by two inverter/buffer gates. Connecting the outputs of the Device Under Test (DUT) to two further inverter/buffer gates allows for determining the energy dissipation with a realistic load.

If not stated otherwise, gates are characterized with such a setup. Also for the simulation of larger systems, signal shaping is used to provide a realistic signal to the inputs.
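
The energy measurement used in this setup, integrating the power \(p(t)\) over time, can be sketched for sampled simulator output as follows. The function assumes that voltage and current at a port of the DUT are available as equally long lists of samples; the trapezoidal rule and all names are illustrative choices and not a description of the tool flow used in this work.

```python
def dissipated_energy(t, v, i):
    """Integrate p(t) = v(t) * i(t) over the sampled interval (trapezoidal rule).

    t: sample times in seconds (monotonically increasing)
    v: port voltage samples in volts
    i: port current samples in amperes
    Returns the energy in joules.
    """
    if not (len(t) == len(v) == len(i)):
        raise ValueError("t, v and i must have the same length")
    energy = 0.0
    for k in range(1, len(t)):
        p_prev = v[k - 1] * i[k - 1]
        p_curr = v[k] * i[k]
        energy += 0.5 * (p_prev + p_curr) * (t[k] - t[k - 1])
    return energy


# Sanity check with constant power: 1 V * 2 A over 3 s.
print(dissipated_energy([0, 1, 2, 3], [1, 1, 1, 1], [2, 2, 2, 2]))  # 6.0
```

For an adiabatic gate the integration interval would typically span a full power-clock cycle, so that recovered energy enters with a negative sign and only the net dissipation remains. Applying the same function to input and output ports gives the energy flow via those ports.
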
Adiabatic Logic
Future Trend and System Level Perspective
Teichmann, P.
2012, XVIII, 166 p., Hardcover
ISBN: 978-94-007-2344-3