Circuit Techniques for On-Chip Clocking and Synchronization

Behzad Mesgarzadeh
Abstract

Today’s microprocessors, with millions of transistors, perform highly complex computing at multi-gigahertz clock frequencies. The ever-increasing chip size and speed call for new methodologies in clock distribution networks. Conventional global synchronization techniques exhibit many drawbacks in advanced VLSI chips such as high-speed microprocessors. A significant percentage of the total power consumption in a microprocessor is dissipated in the clock distribution network. Also, as chip dimensions increase, clock skew management becomes very challenging within the conventional methodology. Long interconnect delays limit the maximum clock frequency and become a bottleneck for future microprocessor design. In such a situation, new alternative techniques for synchronization in systems-on-chip are needed.

This thesis presents new alternatives to traditional clocking and synchronization methods, in which speed and power consumption bottlenecks are addressed. For this purpose, two new techniques based on mesochronous synchronization and resonant clocking are investigated. The mesochronous synchronization technique deals with remedies for skew and delay management. Using this technique, a clock frequency of up to 5 GHz for on-chip communication is achievable in a 0.18-µm CMOS process. Resonant clocking, on the other hand, addresses the significant power dissipation in the clock network. This method shows great potential for power saving in very large-scale integrated circuits. According to measurements, a 2.3X power saving in the clock distribution network is achieved in a 130-nm CMOS process. In resonant clocking, the oscillator plays a crucial role as the clock generator. Therefore, oscillators and possible techniques for jitter and phase noise reduction in clock generators have also been investigated in this research. For this purpose, a study of the injection locking phenomenon in ring oscillators is presented. This phenomenon can be used as a jitter suppression mechanism in oscillators. Also, a new implementation of DLL-based clock generators using ring oscillators is presented in a 130-nm CMOS process. The measurements show that this structure operates in the frequency range of 100 MHz-1.5 GHz and consumes less power and area than previously reported structures. Finally, a new implementation of a 1.8-GHz quadrature oscillator with a wide tuning range is presented. Quadrature oscillators can potentially be used as future clock generators where a multi-phase clock is needed.
Preface

This licentiate thesis presents my research during the period from May 2004 through January 2006, at the Division of Electronic Devices, Department of Electrical Engineering, Linköping University, Sweden. The following publications are included in this thesis:


- Martin Hansson, Behzad Mesgarzadeh and Atila Alvandpour, “1.56-GHz On-Chip Resonant Clocking with 2.3X Clock Power-Saving in 130-nm CMOS”, manuscript to be submitted.

- Behzad Mesgarzadeh and Atila Alvandpour, “A 24-mW, 0.02-mm$^2$, 1.5-GHz DLL-Based Frequency Multiplier in 130-nm CMOS”, manuscript to be submitted.

The following publications related to this research are not included in the thesis:


The following papers present other research topics, which I have been involved in, during my study:


**Abbreviations**

<table>
<thead>
<tr>
<th>Abbreviation</th>
<th>Full Form</th>
</tr>
</thead>
<tbody>
<tr>
<td>CMOS</td>
<td>Complementary Metal Oxide Semiconductor</td>
</tr>
<tr>
<td>CP</td>
<td>Charge Pump</td>
</tr>
<tr>
<td>DLL</td>
<td>Delay-Locked Loop</td>
</tr>
<tr>
<td>IC</td>
<td>Integrated Circuit</td>
</tr>
<tr>
<td>LF</td>
<td>Loop Filter</td>
</tr>
<tr>
<td>LPF</td>
<td>Low-Pass Filter</td>
</tr>
<tr>
<td>PLL</td>
<td>Phase-Locked Loop</td>
</tr>
<tr>
<td>SoC</td>
<td>System-on-Chip</td>
</tr>
<tr>
<td>VCO</td>
<td>Voltage-Controlled Oscillator</td>
</tr>
<tr>
<td>VCDL</td>
<td>Voltage-Controlled Delay Line</td>
</tr>
<tr>
<td>VLSI</td>
<td>Very Large Scale Integrated Circuits</td>
</tr>
</tbody>
</table>
Acknowledgments

I would like to thank the following people:

- Professor Atila Alvandpour, for his great supervision. Without his support, guidance and encouragement, this research would not have been completed efficiently.
- Professor Christer Svensson, who gave me completely new perspectives on my research with his never-ending knowledge.
- Martin Hansson, who was always my best consultant in my research. He was the first person who forced me to speak Swedish at the university, although it was really difficult to manage technical conversations in Swedish with my limited knowledge of the language.
- Arta Alvandpour, not only for his technical support related to tools, but also for his help in fixing many different problems which sometimes were not even directly related to my research.
- Anna Folkeson, for her help in administrative issues.
- Dr. Kh. Hadidi in Urmia Semiconductors, who was my first teacher in IC design research area. I will never forget his great personality.
- Assistant Professor Per Löwenborg and Dr. Henrik Ohlsson, for chip design summer 2004.
- My dearest friend Jalal Maleki, for his valuable comments in proofreading of this thesis.
- All past and present members of the Division of Electronic Devices, especially Lic. Eng. Stefan Andersson, Dr. Peter Caputa, Henrik Fredriksson, Dr. Darius Jakonis, Dr. Kalle Folkesson, Rashad Ramzan, Naveed Ahsan, Associate Professor Jerzy Dabrowski, Christian Kullberg, Timmy Sundström and Saeed Tahmasbi, for creating such a great research environment.
- All of my nice friends in Sweden, who have made it possible for me to succeed in my steps and not to be discouraged by the problems that immigrants typically face in a foreign country, especially Professor Mariam Kamkar and the Maleki, Farboudi and Houshangi families.
- My family in Iran, especially my fantastic parents, for their patience and great support. It is hard to express how grateful I am to them.
- Finally, Shanai, my best friend and consultant in my life, who agreed to accompany me on the tough path of immigration and to be far from her family. I am proud of sharing my life with her. This thesis is dedicated to her.

Behzad Mesgarzadeh
February 2006
Contents

Abstract
Preface
Abbreviations
Acknowledgments

I Introduction

1 Introduction
  1.1 Moore’s Law and Microelectronics
  1.2 Scaling Trends and Future Challenges
  1.3 Motivations and Scope of Thesis
  1.4 References

II Oscillators and Clock Generation

2 Oscillators
  2.1 Basic Considerations
  2.2 Ring Oscillators
  2.3 LC Oscillators
  2.4 On-Chip Inductors
    2.4.1 Inductance Value
    2.4.2 Quality Factor and Resonance Frequency
  2.5 Phase Noise
  2.6 Contribution of This Thesis
  2.7 References

3 Frequency Multiplication
  3.1 PLL
  3.2 DLL
  3.3 Clock Multipliers
    3.3.1 PLL-Based
    3.3.2 DLL-Based
  3.4 Contribution of This Thesis
  3.5 References

III Clock Distribution

4 Synchronization and Clocking
  4.1 Global Synchronization
  4.2 Mesochronous Clocking
  4.3 Resonant Clocking
    4.3.1 Power Dissipation
    4.3.2 Quality Factor
    4.3.3 Mixing Phenomenon
  4.4 Contribution of This Thesis
  4.5 References

IV Papers

5 Paper 1
A New Mesochronous Clocking Scheme for Synchronization in SoC
  5.1 Introduction
  5.2 Forbidden Zone
  5.3 Proposed Scheme
    5.3.1 DLL-Based Frequency Doubler
    5.3.2 Edge Decision Unit
  5.4 Simulation Results
  5.5 Conclusion
  5.6 References
6 Paper 2
A Study of Injection Locking in Ring Oscillators
  6.1 Introduction
  6.2 Ring Oscillators
  6.3 Injection Locking
  6.4 Phase Noise and Jitter Reduction
    6.4.1 Phase Noise
    6.4.2 Jitter
  6.5 Simulation Results
  6.6 Conclusions
  6.7 References

7 Paper 3
1.56-GHz On-Chip Resonant Clocking with 2.3X Clock Power-Saving in 130-nm CMOS
  7.1 Introduction
  7.2 LC-Tank Resonant Clocking
  7.3 Measurement Results
  7.4 Conclusions
  7.5 Acknowledgments
  7.6 References

8 Paper 4
A 24-mW, 0.02-mm², 1.5-GHz DLL-Based Frequency Multiplier in 130-nm CMOS
  8.1 Introduction
  8.2 Frequency Multiplier Description
  8.3 Experimental Results
  8.4 References

9 Paper 5
A Wide-Tuning Range 1.8-GHz Quadrature VCO Utilizing Coupled Ring Oscillators
  9.1 Introduction
  9.2 General Considerations
  9.3 Coupled Ring Oscillators
  9.4 LC Tank-Based Filtering
  9.5 Tuning Range
  9.6 Test Chip Design
  9.7 Simulation Results
  9.8 Conclusions
  9.9 References

**Appendix**

MOS Transistor Equations
Part I

Introduction
Chapter 1

Introduction

In the late 1950s, the first integrated circuits (ICs) based on semiconductor properties were developed. In the mid-1960s, CMOS devices were introduced, initiating a revolution in the semiconductor industry. In 40 years, the technology of IC production has evolved from producing simple chips with a few components to fabricating microprocessors comprising billions of transistors. Microelectronics has undoubtedly had a significant impact on human lifestyle during its evolution.

1.1 Moore’s Law and Microelectronics

On 19 April 1965, Intel co-founder Gordon E. Moore published his famous paper in *Electronics* magazine [1], predicting that the number of integrated components would double every year. This prediction was based on the change in the number of integrated components during 1962-1965. In 1975, Moore amended his law to state that the number of transistors would double about every 24 months. As shown in Figure 1.1, after 40 years the number of transistors in CPUs manufactured by Intel still follows the so-called Moore's law. The scaling property of CMOS technology, which causes this exponential growth in the number of transistors, gives high flexibility and performance, and increases the integration density per area. On the other hand, this exponential growth creates new design problems in the new large-scale integrated circuits. For example, in modern microprocessors, because of the large chip dimensions, the clock distribution network is one of the most crucial parts, in which clock skew and power consumption management have become more challenging.

![Figure 1.1: Moore’s law in Intel microprocessors.](image)

### 1.2 Scaling Trends and Future Challenges

Technology scaling is expected to continue for at least the next ten years, with a great impact on the integration density, speed and performance of integrated circuits. Table 1.1 shows the scaling trends from 2004 to 2016 published by the International Technology Roadmap for Semiconductors (ITRS) [2].

<table>
<thead>
<tr>
<th>Year</th>
<th>2004</th>
<th>2007</th>
<th>2010</th>
<th>2013</th>
<th>2016</th>
</tr>
</thead>
<tbody>
<tr>
<td>Technology node (nm)</td>
<td>90</td>
<td>65</td>
<td>45</td>
<td>32</td>
<td>22</td>
</tr>
<tr>
<td>Nominal $V_{dd}$ (V)</td>
<td>1.2</td>
<td>1.1</td>
<td>1.0</td>
<td>0.9</td>
<td>0.8</td>
</tr>
<tr>
<td>Saturation $V_T$ (V)</td>
<td>0.2</td>
<td>0.18</td>
<td>0.15</td>
<td>0.11</td>
<td>0.1</td>
</tr>
<tr>
<td>Gate leakage (A/cm$^2$)</td>
<td>450</td>
<td>930</td>
<td>1900</td>
<td>7700</td>
<td>19000</td>
</tr>
<tr>
<td>Peak $f_T$ (GHz)</td>
<td>120</td>
<td>200</td>
<td>280</td>
<td>400</td>
<td>700</td>
</tr>
</tbody>
</table>

**Table 1.1: Predicted scaling trend according to ITRS.**

According to this prediction, scaling will continue at least until 2016, when a feature size of 22 nm will be used. Obviously, new challenges will arise along this trend. The leakage problem is one of the most serious challenges

1.3 Motivations and Scope of Thesis

In modern microprocessors, synchronization and clock distribution are the most critical and challenging tasks. The performance and efficiency of a large-scale, high-speed processor are directly related to the strategy of on-chip synchronization and clocking. Traditional global synchronization suffers from several drawbacks. In this synchronization style, a significant part of the total power consumption of the processor is dissipated in the clock distribution network. Timing skew is another issue, which is challenging to manage in large globally synchronous systems. This thesis introduces new circuit-level solutions to overcome these problems in conventional synchronization methodologies. Paper 1 and Paper 3 present new alternatives to the conventional globally synchronous clocking strategy. In Paper 1, a mesochronous clocking-based solution is presented, in which functional blocks can communicate at high data rates without the need for a globally synchronous scheme [3]. Using this clocking technique, a clock frequency of up to 5 GHz is achievable for on-chip communication in a 0.18-µm CMOS process. In Paper 3, a successful implementation of 1.56-GHz resonant clocking in a 130-nm CMOS process is presented [4]. In this strategy, all buffers needed for conventional global synchronization are removed and the clock load is driven directly by an on-chip LC oscillator. This strategy demonstrates a high potential for power saving. According to measurements, a 2.3X power saving is achieved in the clock distribution network compared to conventional clocking.

The rest of the papers in this thesis present research on oscillators and clock generators, which can be used in clock distribution networks. In Paper 2, the phenomenon of injection locking is formulated for ring oscillators [5]. This phenomenon shows great potential for jitter and phase noise reduction in on-chip oscillators. Paper 4 presents a new DLL-based frequency multiplication technique, in which a DLL controls a ring oscillator to perform frequency multiplication [6]. According to the measurement results, the implementation of this structure in a 130-nm CMOS process operates in the frequency range of 100 MHz-1.5 GHz with smaller area and lower power consumption than previously reported structures. Finally, in Paper 5, a new implementation of quadrature LC oscillators utilizing coupled ring oscillators is presented [7]. The proposed oscillator oscillates at 1.8 GHz and has a wide tuning range. This kind of oscillator can be an interesting alternative for future clock generators where different clock phases are required.

The organization of this thesis is as follows. In Chapter 2, the basics of oscillators and their possible on-chip implementations are discussed. In Chapter 3, techniques for frequency multiplication in clock generators are presented. Chapter 4 is dedicated to synchronization and clocking techniques, and three different methodologies are compared. In Chapters 5-9, the papers are presented.

1.4 References


[4] M. Hansson, B. Mesgarzadeh and A. Alvandpour, “1.56-GHz On-Chip Resonant Clocking with 2.3X Clock Power-Saving in 130-nm CMOS”, *manuscript to be submitted*.


[6] B. Mesgarzadeh and A. Alvandpour, “A 24-mW, 0.02-mm², 1.5-GHz DLL-Based Frequency Multiplier in 130-nm CMOS”, *manuscript to be submitted*.

Part II

Oscillators and Clock Generation
Chapter 2

Oscillators

Oscillators are crucial components in many electronic circuits and can be integrated on-chip for a variety of applications. In conventional clock distribution networks in microprocessors, a voltage-controlled oscillator (VCO) is typically part of a phase-locked loop (PLL) that generates the system clock. In this chapter, an overview of the basic considerations in oscillatory systems is presented first, and then possible implementations of on-chip CMOS oscillators are discussed.

2.1 Basic Considerations

Under certain conditions, a feedback system can oscillate. To gain more insight, we consider the unity-gain negative feedback system shown in Figure 2.1.

![Figure 2.1: Unity-gain negative feedback system.](image-url)
The closed-loop transfer function of this system in the frequency-domain can be written as

\[ \frac{Y(s)}{X(s)} = \frac{H(s)}{1 + H(s)}. \]  

(2.1)

In Eq. 2.1, if \( H(j\omega_0) = -1 \) for \( s = j\omega_0 \), then the closed-loop gain at \( \omega = \omega_0 \) approaches infinity. Under this condition, in an electrical circuit with such feedback, the noise component at \( \omega = \omega_0 \) is amplified by the circuit, resulting in an oscillation at \( \omega = \omega_0 \) [1]. In practice, the output amplitude does not become infinite, because limiting mechanisms always exist that cause the output of the oscillator to saturate. The loop gain of the oscillator circuit, \( |H(j\omega_0)| \), must be unity or greater to start the oscillation. Otherwise the noise component is suppressed instead of amplified, and oscillation does not start. According to the discussion above, two conditions are necessary but not sufficient for a negative-feedback circuit to oscillate [2]:

\[ |H(j\omega_0)| \geq 1 \]  

(2.2)

\[ \angle H(j\omega_0) = 180^\circ. \]  

(2.3)

These two conditions are called the “Barkhausen criteria”. In on-chip implementations, in order to ensure oscillation in the presence of temperature and process variations, the loop gain should be chosen to be greater than 2-3 [1]. Since the negative feedback provides a 180° phase shift, according to Eq. 2.3 a total phase shift of 360° around the loop is needed for oscillation. In CMOS technology, oscillators are typically implemented in two different forms, known as “ring oscillators” and “LC oscillators”. In the next sections, a brief overview of these two oscillator categories is presented.
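
The following minimal Python sketch illustrates Eq. 2.1 and the two conditions in Eqs. 2.2 and 2.3 numerically. The function names, the phase tolerance and the sample values of \( H(j\omega_0) \) are assumptions made for illustration only; they are not taken from the thesis.

```python
import cmath
import math


def closed_loop_gain(h: complex) -> complex:
    """Closed-loop gain Y/X = H / (1 + H) from Eq. 2.1."""
    return h / (1 + h)


def meets_barkhausen(h: complex, phase_tol_deg: float = 1.0) -> bool:
    """Check Eqs. 2.2 and 2.3 for a single value H(j*w0)."""
    gain_ok = abs(h) >= 1.0
    phase_deg = math.degrees(cmath.phase(h))  # result lies in [-180, 180]
    phase_ok = abs(abs(phase_deg) - 180.0) <= phase_tol_deg
    return gain_ok and phase_ok


# As H(j*w0) approaches -1, the closed-loop gain grows without bound.
for eps in (1e-1, 1e-3, 1e-6):
    h = complex(-1.0 + eps, 0.0)
    print(f"H = {h.real:+.6f}  ->  |Y/X| = {abs(closed_loop_gain(h)):.3e}")

print(meets_barkhausen(complex(-2.5, 0.0)))  # gain 2.5, phase 180 degrees
print(meets_barkhausen(complex(-0.5, 0.0)))  # gain below unity
print(meets_barkhausen(complex(0.0, 2.0)))   # phase is only 90 degrees
```

Because the conditions are necessary but not sufficient, a `True` result here only says that oscillation is possible at that frequency, not that it is guaranteed.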

### 2.2 Ring Oscillators

According to the discussion in the previous section, implementing an oscillator requires a proper circuit-level implementation of \( H(s) \). Also, since a loop gain greater than unity is needed, the circuit should be an amplifier capable of creating the needed phase shift. An inverter is a candidate for implementing \( H(s) \) because it is by nature an amplifier that creates a phase shift between its input and output. A simple implementation of an inverter is a single-stage common-source amplifier, as shown in Figure 2.2. When the input voltage level is *high*, the NMOS transistor is on and the load capacitance is discharged to reach a *low* output level, while for a *low* input, the load capacitance is charged through the resistance $R_D$ to reach a *high* output level. In the frequency domain, assuming that the dominant pole occurs at the output node, this circuit can be considered a single-pole system. In such a system, the maximum phase shift is 90°. This means the circuit does not have sufficient phase shift to be used as an implementation of $H(s)$. Cascading two inverters provides a 180° phase shift, but since the resulting output is not an inversion of the input, the total phase shift around the loop will be 180° instead of 360°. Thus, at least three cascaded inverter stages are needed in the implementation of $H(s)$ to form an oscillator.

![Figure 2.2: Common source amplifier.](image)

![Figure 2.3: Frequency response of common source amplifier.](image)

The number of inverter stages in a ring oscillator specifies the oscillation frequency of the oscillator. An $N$-stage ring oscillator is shown in Figure 2.4. In this circuit the oscillation frequency is

$$f_{osc} = \frac{1}{2 \cdot N \cdot t_d}$$

(2.4)

where $t_d$ is the propagation delay of an inverter stage driving an identical inverter.
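
As a quick numerical illustration of Eq. 2.4, the sketch below evaluates the oscillation frequency for a few stage counts. The 50 ps stage delay is a hypothetical value, and the restriction to an odd number of stages is an assumption that applies to a single-ended inverter ring.

```python
def ring_oscillator_frequency(num_stages: int, stage_delay_s: float) -> float:
    """Oscillation frequency from Eq. 2.4: f_osc = 1 / (2 * N * t_d)."""
    # Assumption: single-ended inverter ring, so N must be odd and >= 3.
    if num_stages < 3 or num_stages % 2 == 0:
        raise ValueError("a single-ended inverter ring needs an odd N >= 3")
    if stage_delay_s <= 0:
        raise ValueError("stage delay must be positive")
    return 1.0 / (2.0 * num_stages * stage_delay_s)


T_D = 50e-12  # hypothetical stage delay of 50 ps
for n in (3, 5, 7):
    f_osc = ring_oscillator_frequency(n, T_D)
    print(f"N = {n}: f_osc = {f_osc / 1e9:.2f} GHz")
```

The factor of two in the denominator reflects that a transition must travel around the ring twice to complete one full period.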

Modeling each inverter stage as a first-order system with a pole at $\omega = \omega_p$, the transfer function of an $N$-stage ring oscillator is

$$H(s) = -\frac{A^N}{(1 + s / \omega_p)^N}$$

(2.5)

where $A$ is the voltage gain of an inverter stage.
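
As a worked step that is not part of the original derivation, Eq. 2.5 can be combined with the conditions in Eqs. 2.2 and 2.3. The sketch assumes $N$ identical first-order stages and reads the minus sign in Eq. 2.5 as the 180° contributed by the inverting loop, so that the $N$ poles must supply the remaining 180° of frequency-dependent phase shift:

$$N \arctan\!\left(\frac{\omega_0}{\omega_p}\right) = 180^\circ \;\;\Rightarrow\;\; \omega_0 = \omega_p \tan\!\left(\frac{180^\circ}{N}\right)$$

$$\frac{A^N}{\left(1 + \omega_0^2 / \omega_p^2\right)^{N/2}} \geq 1 \;\;\Rightarrow\;\; A \geq \sqrt{1 + \tan^2\!\left(\frac{180^\circ}{N}\right)} = \frac{1}{\cos\!\left(180^\circ / N\right)}$$

For $N = 3$, each stage contributes 60°, which gives $\omega_0 = \sqrt{3}\,\omega_p$ and a minimum stage gain of $A = 2$. Under this simplified single-pole model, a larger $N$ lowers the gain required per stage.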

### 2.3 LC Oscillators

Another possible implementation of on-chip oscillators is based on the properties of RLC circuits. Figure 2.5 shows a parallel RLC circuit in which the capacitance and inductance are ideal components without any resistive loss. The equivalent impedance of this circuit is frequency dependent and is given by

$$|Z_{eq}(j\omega)|^2 = \frac{R^2 L^2 \omega^2}{L^2 \omega^2 + R^2 (1 - LC \omega^2)^2}.$$ 

(2.6)

In this circuit, at a frequency of $\omega = 1/\sqrt{LC}$ the impedances of the inductor and the capacitor cancel each other. In this situation, the circuit is purely resistive and the total phase shift is $0^\circ$.
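
The short sketch below evaluates Eq. 2.6 around the resonance frequency to show that the impedance magnitude peaks at $R$ when $\omega = 1/\sqrt{LC}$. The component values are hypothetical and chosen only for illustration.

```python
import math


def rlc_impedance_magnitude(omega: float, r: float, l: float, c: float) -> float:
    """|Z_eq(j*omega)| of a parallel RLC circuit, from Eq. 2.6."""
    num = (r * l * omega) ** 2
    den = (l * omega) ** 2 + (r * (1.0 - l * c * omega**2)) ** 2
    return math.sqrt(num / den)


# Hypothetical component values, not taken from the thesis.
R, L, C = 500.0, 2e-9, 1e-12          # ohm, henry, farad
omega_0 = 1.0 / math.sqrt(L * C)      # resonance frequency in rad/s

print(f"f_0 = {omega_0 / (2 * math.pi) / 1e9:.2f} GHz")
for scale in (0.5, 0.9, 1.0, 1.1, 2.0):
    z = rlc_impedance_magnitude(scale * omega_0, R, L, C)
    print(f"omega = {scale:.1f} * omega_0  ->  |Z_eq| = {z:7.1f} ohm")
```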

![Figure 2.5: RLC circuit.](image)

In practice, an inductor is not an ideal component and it suffers from a series resistance. Using proper transformations, this resistance can be transformed into a parallel one [1]. In order to oscillate, the RLC circuit should be used in a feedback loop with a total phase shift of $360^\circ$. Using the RLC circuit as the load of the common-source amplifier shown in Figure 2.2 and placing two such cascaded amplifiers inside a feedback loop creates a total phase shift of $360^\circ$ around the loop. In this case, choosing a proper voltage gain for the amplifiers, as discussed earlier, guarantees the oscillation. This structure, which is called a “cross-coupled LC oscillator”, is shown in Figure 2.6. As mentioned earlier, the resistance $R$ is the transformed series resistance of the inductor.

![Figure 2.6: Two cascaded common source amplifiers.](image)

In the circuit shown in Figure 2.6, the cross-coupled transistors behave as a negative resistance. Forming another cross-coupled structure using PMOS transistors, as shown in Figure 2.7, increases the total gain of the amplifiers and increases the chance of oscillation for the same amount of supply current [3]. However, PMOS transistors add more parasitics to the RLC circuit. This structure is known as the “complementary cross-coupled oscillator”.

There are other implementations of LC oscillators (e.g. the Colpitts oscillator), which are not discussed here, but the concept is the same for all of them: the RLC circuit should be in a feedback loop with sufficient gain and 360° of phase shift. In on-chip implementations of LC oscillators, inductor design is one of the most important tasks. In the next section, an overview of on-chip inductor design is presented.

2.4 On-Chip Inductors

Fully integrated radio frequency circuits need on-chip implementation of inductors. On-chip inductors can be implemented using metal wires available in the process technology. The most important parameters of on-chip inductors are the quality factor (Q), self-resonance frequency and the area. Usually on-chip inductors are implemented as spiral structures as shown in Figure 2.8. In this section some basic concepts about on-chip spiral inductors will be discussed.

2.4.1 Inductance Value

Maxwell’s equations can be used to calculate the accurate value of the inductance for a given spiral structure. However, these equations are very complicated for numerical calculations. A very accurate numerical solution may be obtained using 3D finite element simulators, but these simulators require long run times. In the literature, various methods for calculating the spiral inductor value have been introduced [4]-[6].

![Figure 2.8: A rectangular spiral inductor.](image)

A closed-form formula for square spiral inductors, which has less than 10% error for inductors in the range of 5 to 50 nH, is [7]

\[
L = 1.3 \times 10^{-7} \frac{A_{m}^{5/3}}{A_{tot}^{1/6} \cdot W^{1.75} \cdot (W + G)^{0.25}}
\]

(2.7)

where \(A_{m}\) is the metal area, \(A_{tot}\) is the total inductor area (\(\approx S^2\) in Figure 2.8), \(W\) is the line width and \(G\) is the line spacing. All quantities are in base SI units, i.e. lengths in meters and the inductance in henries.
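
A minimal sketch of Eq. 2.7 in Python is given below. The spiral dimensions are hypothetical, and the metal area is approximated as total wire length times line width, which is an assumption of this example and not a rule stated in the text.

```python
def spiral_inductance(metal_area_m2: float, total_area_m2: float,
                      width_m: float, spacing_m: float) -> float:
    """Square-spiral inductance in henries, from Eq. 2.7 (SI units)."""
    return (1.3e-7 * metal_area_m2 ** (5.0 / 3.0)
            / (total_area_m2 ** (1.0 / 6.0)
               * width_m ** 1.75
               * (width_m + spacing_m) ** 0.25))


# Hypothetical geometry, chosen only for illustration.
S = 300e-6            # outer side length in meters
W = 10e-6             # line width in meters
G = 2e-6              # line spacing in meters
WIRE_LENGTH = 5e-3    # assumed total wire length in meters

metal_area = WIRE_LENGTH * W   # assumption: metal area = length * width
total_area = S**2

inductance = spiral_inductance(metal_area, total_area, W, G)
print(f"L = {inductance * 1e9:.1f} nH")
```

Since the formula is quoted as accurate only for inductors between 5 and 50 nH, results outside that range should be treated with caution.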

### 2.4.2 Quality Factor and Resonance Frequency

The quality factor of an inductor \((Q)\) is defined as

\[
Q = 2\pi \cdot \frac{E_{S}}{E_{L}}
\]

(2.8)

where \(E_{S}\) and \(E_{L}\) are the energy stored and the energy dissipated per cycle, respectively [8]. This equation is a general definition of the quality factor, regardless of what stores or dissipates the energy. For an inductor, only the energy stored in the magnetic field is of interest, and \( E_S \) is equal to the difference between the peak magnetic and electric energies [9]. When the peak magnetic and electric energies are equal, the inductor is in self-resonance and therefore \( Q \) reduces to zero at that frequency. An on-chip inductor is a three-port element including the substrate. This means there are couplings between the on-chip inductor and the substrate on which the inductor is implemented. Taking these couplings into account, a more detailed definition of the quality factor of an inductor is [9]

\[
Q = \frac{\omega L_s}{R_s} \left[ \frac{R_p}{R_p + ((\omega L_s / R_s)^2 + 1) \cdot R_s} \right] \left[ 1 - \frac{R_s^2 (C_s + C_p)}{L_s} - \omega^2 L_s (C_s + C_p) \right] \tag{2.9}
\]

where \( L_s \) and \( R_s \) are the inductance and series resistance values, respectively. \( C_s \) is the capacitance due to the overlap between the spiral and the center-tap underpass. \( R_p \) and \( C_p \) are the frequency-dependent resistance and capacitance that model the substrate coupling [9]. Equation 2.9 has three distinct factors: the first factor, \(\omega L_s / R_s\), is a linear function of frequency, the second is the substrate loss factor and the third is the self-resonance factor. Equating the self-resonance factor to zero gives the self-resonance frequency of the inductor. According to Eq. 2.9, the quality factor of an inductor does not increase linearly with frequency; instead, it starts to decrease above a certain frequency, as shown in Figure 2.9.
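
The sketch below evaluates Eq. 2.9 over frequency and solves the self-resonance factor for its zero. All element values are hypothetical, and \( R_p \) and \( C_p \) are treated as constants for simplicity, although the text describes them as frequency dependent.

```python
import math


def inductor_q(omega, l_s, r_s, c_s, r_p, c_p):
    """Quality factor from Eq. 2.9 as a product of its three factors."""
    ideal = omega * l_s / r_s
    substrate_loss = r_p / (r_p + (ideal**2 + 1.0) * r_s)
    c_total = c_s + c_p
    self_resonance = 1.0 - r_s**2 * c_total / l_s - omega**2 * l_s * c_total
    return ideal * substrate_loss * self_resonance


def self_resonance_omega(l_s, r_s, c_s, c_p):
    """Angular frequency at which the self-resonance factor is zero."""
    c_total = c_s + c_p
    return math.sqrt((1.0 - r_s**2 * c_total / l_s) / (l_s * c_total))


# Hypothetical values: 5 nH, 5 ohm, 30 fF, 2 kohm, 150 fF.
# Simplification: R_P and C_P are held constant over frequency.
L_S, R_S, C_S, R_P, C_P = 5e-9, 5.0, 30e-15, 2e3, 150e-15

w_sr = self_resonance_omega(L_S, R_S, C_S, C_P)
grid = [w_sr * k / 200.0 for k in range(1, 201)]
w_peak = max(grid, key=lambda w: inductor_q(w, L_S, R_S, C_S, R_P, C_P))

print(f"self-resonance at {w_sr / (2 * math.pi) / 1e9:.2f} GHz")
print(f"Q peaks near {w_peak / (2 * math.pi) / 1e9:.2f} GHz, "
      f"Q_max = {inductor_q(w_peak, L_S, R_S, C_S, R_P, C_P):.2f}")
```

The coarse frequency grid is enough to locate the peak for illustration; a finer grid or a numerical optimizer would give a more precise estimate.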

![Figure 2.9: Frequency behavior of \( Q \).](image)

There are techniques to increase the maximum achievable \( Q \) value and the frequency at which \( Q_{\text{max}} \) occurs [9]-[11].

2.5 Phase Noise

The spectrum of an ideal oscillator is an impulse at the operating frequency. However, in practice, the spectrum exhibits “skirts” around the center frequency, as shown in Figure 2.10. To quantify the phase noise of an oscillator, a unit bandwidth at an offset of $\Delta \omega$ from the carrier is considered, and the noise power in this bandwidth is divided by the carrier power. There are many studies aiming to quantify and formulate the phase noise of oscillators. Some of them formulate it in the time domain [12], [13], while there are formulations in the frequency domain as well [14], [15]. One of the oldest models for oscillator phase noise was derived by Leeson, resulting in the following equation [14]

$$ L(\Delta\omega) = \frac{1}{4Q^2} \left( \frac{\omega_0}{\Delta\omega} \right)^2 \tag{2.10} $$

where $L(\Delta\omega)$ is the phase noise at an offset of $\Delta\omega$ from the carrier frequency, and $Q$ and $\omega_0$ are the quality factor of the oscillator and the carrier frequency, respectively. There are different definitions of $Q$ for an oscillator. According to [15] the most practical one, which is applicable to a variety of oscillatory behaviors, is

$$ Q = \frac{\omega_0}{2} \frac{d\Phi}{d\omega} \tag{2.11} $$

where $\omega_0$ is the carrier frequency and $\Phi$ is the phase of the open-loop transfer function of the oscillator.
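
As an illustration of Eqs. 2.10 and 2.11, the sketch below evaluates the phase slope numerically and uses the resulting $Q$ in Eq. 2.10. It assumes a parallel RLC tank as a stand-in for the open-loop transfer function, takes the magnitude of the phase slope, and uses hypothetical component values; the function names are illustrative only.

```python
import cmath
import math

def leeson_noise_shaping(q, omega_0, delta_omega):
    """Right-hand side of Eq. 2.10 as a dimensionless ratio."""
    return (omega_0 / delta_omega) ** 2 / (4.0 * q ** 2)

def q_from_phase_slope(phase, omega_0, rel_step=1e-6):
    """Eq. 2.11 with a central-difference derivative; the magnitude is used."""
    d_omega = omega_0 * rel_step
    slope = (phase(omega_0 + d_omega) - phase(omega_0 - d_omega)) / (2.0 * d_omega)
    return 0.5 * omega_0 * abs(slope)

def parallel_rlc_phase(r, l, c):
    """Return a function giving the phase of a parallel RLC impedance."""
    def phase(omega):
        admittance = 1.0 / r + 1j * (omega * c - 1.0 / (omega * l))
        return cmath.phase(1.0 / admittance)
    return phase

# Hypothetical tank values (assumptions for illustration).
R, L, C = 500.0, 1e-9, 1e-12
omega_0 = 1.0 / math.sqrt(L * C)

q = q_from_phase_slope(parallel_rlc_phase(R, L, C), omega_0)
print(f"Q from phase slope: {q:.3f}, Q from R*sqrt(C/L): {R * math.sqrt(C / L):.3f}")

# Doubling Q lowers the value of Eq. 2.10 by a factor of four (about 6 dB).
delta_omega = omega_0 * 1e-3
ratio = leeson_noise_shaping(q, omega_0, delta_omega) / leeson_noise_shaping(2 * q, omega_0, delta_omega)
print(f"improvement when Q doubles: {10 * math.log10(ratio):.2f} dB")
```

For this second-order tank the phase-slope definition reproduces the familiar $Q = R\sqrt{C/L}$, which is one reason the definition in Eq. 2.11 is convenient.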

Hajimiri [16] provides a model of phase noise that explains the mechanism by which noise sources convert to phase noise. For each oscillator, an Impulse Sensitivity Function (ISF) is defined and, based on this function, its phase noise is quantitatively predicted. According to Hajimiri’s model, the impact of any noise source on the oscillator phase noise varies across the oscillation period, that is, it is time-variant. This property is reflected in the ISF definition.

Figure 2.10: Spectrum of (a) ideal oscillator and (b) real oscillator.

2.6 Contribution of This Thesis

The oscillator design research in this thesis aims at a clear understanding of the difficulties and possibilities of using an oscillator as the clock generator in on-chip resonant clocking. In this technique, which is discussed in chapter 4, an oscillator drives the distributed load directly without any intermediate buffers [17], [18].

In our oscillator-related research, the injection locking phenomenon is discussed and formulated for ring oscillators [19]. This phenomenon can be useful for phase noise and jitter reduction in oscillator-based clock generators. Measurements show significant jitter suppression when this technique is applied in an oscillatory system. Also, a quadrature oscillator design based on coupled ring oscillators is presented [20]. In the proposed structure, two coupled ring oscillators generate quadrature outputs. Applying the LC filtering technique and the variable inductance concept gives better phase noise and a wider tuning range, respectively. Besides RF applications, quadrature oscillators can be employed in future clock generators where different clock phases are needed.

2.7 References


[18] M. Hansson, B. Mesgarzadeh and A. Alvandpour, “1.56-GHz On-Chip Resonant Clocking with 2.3X Clock Power-Saving in 130-nm CMOS”, manuscript to be submitted.


Chapter 3

Frequency Multiplication

Frequency multiplication is a crucial task in most clock generators in high-performance microprocessors. Typically a phase-locked loop (PLL) is employed for clock multiplication. Because of the trade-offs involved in PLL-based clock multiplier design, delay-locked loops (DLLs) have become a popular alternative for clock multiplication. In this chapter, first a brief description of the PLL and DLL structures is presented and then some frequency multiplication techniques based on these two elements are discussed.

3.1 PLL

A PLL is a feedback system that receives a clock as input, produces another clock as output and compares the two. When the input and output clocks have the same frequency and their phase difference is constant with time, we say the PLL is locked. A simple PLL structure is shown in Figure 3.1. In this structure a phase detector (PD) compares the input and output signals and produces a signal reflecting their phase difference, a low-pass filter (LPF) averages the PD output and finally a voltage-controlled oscillator (VCO) generates the output clock based on the filtered difference.

The simplest PD can be implemented as an XOR gate and the simplest LPF as an RC circuit. However, a closer look at the PLL dynamics shows that this simple implementation suffers from several drawbacks. One of the most serious is the “lock acquisition” problem [1]. If at start-up the VCO operates at a frequency far from the input frequency, the loop may not lock. This problem, which has been studied and formulated mathematically [2], can be solved by using a frequency detector beside the phase detector, a technique called “aided acquisition” [1]. Combining the phase and frequency detectors leads to the concept of the charge-pump PLL. A block diagram of the charge-pump PLL structure is shown in Figure 3.2.

Figure 3.1: A simple PLL.

Figure 3.2: Charge-pump PLL block diagram.

A linear model of this structure, shown in Figure 3.3, gives the second-order closed-loop transfer function

\[ H(s) = \frac{K_{CP}K_{VCO}}{s^2 + K_{CP}K_{VCO}} \tag{3.1} \]

where $K_{VCO}$ is the gain of voltage-controlled oscillator and $K_{CP}$ is a constant which is determined by the charge-pump current and low-pass filter. Assuming $I_1=I_2=I_P$ in Figure 3.2, $K_{CP}$ equals

$$K_{CP} = \frac{I_P}{2\pi C_P}. \tag{3.2}$$

![Diagram of charge-pump PLL](image)

**Figure 3.3:** A linear model of charge-pump PLL in frequency domain.

According to Eq. 3.1, the closed-loop transfer function has two purely imaginary poles, at $s = \pm j\sqrt{K_{CP}K_{VCO}}$, and therefore the loop is unstable. For stabilization, a zero can be added to reduce the phase shift to less than 180° at the gain crossover [1]. This is done by adding a resistance ($R_P$) in series with $C_P$. The closed-loop transfer function of the charge-pump PLL after this modification can be written as

$$H(s) = \frac{K_{CP} K_{VCO} (R_P C_P s + 1)}{s^2 + K_{CP} K_{VCO} R_P C_P s + K_{CP} K_{VCO}} \tag{3.3}$$

where $K_{VCO}$ and $K_{CP}$ are defined as above.

According to Eq. 3.3, the parameters of the charge-pump, loop filter and VCO should be selected carefully to have a stable PLL. The stability of the second-order charge-pump PLL has been studied previously, suggesting certain criteria on different parameters [3], [4]. Also to suppress the ripple, a second capacitor is added from the output of the charge pump to ground. This capacitor adds one more pole to the transfer function, creating a third order system requiring more study of stability issues [3]. Many extensive studies on analyzing, modeling and applications of PLL show the importance of PLL in the modern integrated circuit technologies [5].
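
The effect of the series resistance on stability can be checked with a short calculation. The sketch below is a minimal example, assuming the characteristic polynomial $s^2 + K R_P C_P s + K$ with $K = K_{CP}K_{VCO}$, which follows from closing a unity-feedback loop around the open-loop gain $K(R_P C_P s + 1)/s^2$. All component values and function names are hypothetical.

```python
import cmath
import math

def loop_constant(i_p, c_p, k_vco):
    """K = K_CP * K_VCO, with K_CP taken from Eq. 3.2."""
    return i_p / (2.0 * math.pi * c_p) * k_vco

def closed_loop_poles(k, r_p, c_p):
    """Roots of s^2 + k*r_p*c_p*s + k = 0."""
    b = k * r_p * c_p
    root = cmath.sqrt(b * b - 4.0 * k)
    return (-b + root) / 2.0, (-b - root) / 2.0

def damping_factor(k, r_p, c_p):
    """Damping factor of the second-order loop: (r_p * c_p / 2) * sqrt(k)."""
    return 0.5 * r_p * c_p * math.sqrt(k)

# Hypothetical values: 100 uA pump current, 10 pF capacitor,
# VCO gain of 500 MHz/V expressed in rad/s/V.
I_P, C_P, K_VCO = 100e-6, 10e-12, 2.0 * math.pi * 500e6
K = loop_constant(I_P, C_P, K_VCO)

poles_without_r = closed_loop_poles(K, 0.0, C_P)   # case of Eq. 3.1
poles_with_r = closed_loop_poles(K, 2e3, C_P)      # with a 2 kOhm series resistor
zeta = damping_factor(K, 2e3, C_P)

print("poles without R_P:", poles_without_r)
print("poles with R_P:   ", poles_with_r)
print(f"damping factor with R_P: {zeta:.3f}")
```

Without the resistor the poles lie on the imaginary axis. With the resistor they move into the left half-plane, and the damping factor grows in proportion to $R_P$.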

### 3.2 DLL

A DLL is a variant of the PLL in which the input clock is compared with a delayed version of itself [6], [7]. In a DLL, the VCO of the PLL is replaced by a voltage-controlled delay line (VCDL). Once the loop has adjusted the delay to an integer multiple of the clock period, the phase difference between input and output is zero and we say the DLL is locked. A block diagram of a DLL is shown in Figure 3.4. In this structure, the VCDL, which consists of a number of cascaded delay elements, is controlled by the output of the charge pump (CP) after filtering. A phase detector (PD) is used for phase comparison between the input and output clocks. A 4-stage implementation of the VCDL and its waveforms when the DLL is locked are shown in Figure 3.5.

![Figure 3.4: DLL block diagram.](image)

![Figure 3.5: VCDL (a) 4-stage implementation and (b) waveforms under lock condition.](image)

As shown in Figure 3.5, a DLL can generate several equally spaced phases of the input clock.

Using the same method as in the previous section, a frequency-domain model of the DLL is shown in Figure 3.6. As shown in this figure, the transfer function of the VCDL is simply its gain. Unlike a VCO, the VCDL adds no pole of its own, so the dynamics of the loop are set by the LPF alone, which is an interesting property of the DLL. Assuming a single capacitor (\( C_P \)) as the LPF, the closed-loop transfer function of the DLL is

\[ H(s) = \frac{K_{CP}K_{VCDL}}{s + K_{CP}K_{VCDL}} \tag{3.4} \]

where \( K_{VCDL} \) is the gain of VCDL and \( K_{CP} \) is a constant given by Eq. 3.2.

According to Eq. 3.4, the DLL is a first-order system and is therefore stable. This property makes the DLL very attractive and popular. As we will discuss in the next sections, several studies have compared PLLs and DLLs, and in many cases DLLs have proved to be suitable alternatives to PLLs.
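
The first-order behavior of Eq. 3.4 can be illustrated with its unit-step response, \( 1 - e^{-Kt} \) with \( K = K_{CP}K_{VCDL} \). The sketch below assumes a hypothetical value for \( K \); the function names are illustrative.

```python
import math

def dll_step_response(k_loop, t):
    """Unit-step response of H(s) = K / (s + K) at time t."""
    return 1.0 - math.exp(-k_loop * t)

def settling_time(k_loop, tolerance=0.01):
    """Time for the step response to come within `tolerance` of its final value."""
    return math.log(1.0 / tolerance) / k_loop

K_LOOP = 2.0e7  # hypothetical K_CP * K_VCDL in 1/s

t_settle = settling_time(K_LOOP)
print(f"time constant: {1.0 / K_LOOP * 1e9:.1f} ns")
print(f"1 % settling time: {t_settle * 1e9:.1f} ns")
print(f"response at settling time: {dll_step_response(K_LOOP, t_settle):.4f}")
```

The single real pole at \( s = -K \) gives a monotonic response for any positive loop gain, which is why no stabilizing zero is needed.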

3.3 Clock Multipliers

As mentioned earlier, typically frequency multiplication is performed using PLLs and DLLs. Different trade-offs should be considered in order to design a robust and precise frequency multiplier for clock generation purpose. In this section these two different strategies (PLL-based and DLL-based frequency multiplication) will be discussed.

3.3.1 PLL-Based

A PLL can be employed to multiply a reference clock by a specified number. Figure 3.7 depicts the concept of frequency multiplication based on a PLL. The output frequency is divided by \( M \) in the feedback loop and the result is compared with the reference frequency. The PFD compares \( f_{out}/M \) with \( f_{ref} \), and when the PLL locks, \( f_{out}/M \) is equal to \( f_{ref} \). This means that the VCO oscillates at a frequency \( M \) times higher than the reference clock, performing frequency multiplication by $M$.

![PLL-based frequency multiplication](image)

**Figure 3.7: PLL-based frequency multiplication.**

Adding a divider by $N$ at the reference input of the PFD makes multiplication by a rational number ($M/N$) possible as well. It is also possible to control the division factors with suitable logic circuits to build a PLL-based frequency synthesizer.
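
The lock condition \( f_{out}/M = f_{ref}/N \) fixes the output frequency. A minimal sketch with exact rational arithmetic follows; the function name and the example frequencies are assumptions for illustration.

```python
from fractions import Fraction

def pll_output_frequency(f_ref, m, n=1):
    """Locked output frequency for feedback divider m and reference divider n."""
    return Fraction(m, n) * f_ref

f_ref = 100_000_000  # 100 MHz reference (hypothetical)

f_integer = pll_output_frequency(f_ref, m=15)        # integer multiplication
f_rational = pll_output_frequency(f_ref, m=3, n=2)   # rational multiplication

print(f"M = 15:        {float(f_integer) / 1e6:.1f} MHz")
print(f"M/N = 3/2:     {float(f_rational) / 1e6:.1f} MHz")

# In lock, both PFD inputs run at the same frequency.
print("PFD inputs equal:", f_rational / 3 == Fraction(f_ref, 2))
```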

### 3.3.2 DLL-Based

PLL-based clock synthesis suffers from some drawbacks. A PLL is a higher-order system with several stability issues, so its design process is much more time-consuming than that of a first-order system like a DLL. Jitter accumulation is another drawback of PLLs: since the jitter from the VCO circulates in the feedback loop, it accumulates over several clock cycles [8], [9]. These drawbacks are good motivations for replacing the PLL by a DLL for clock synthesis [10]-[13]. There is no unique technique for DLL-based frequency multiplication, but typically the idea is to take the different phases produced by the VCDL and combine them using digital logic, so that one input transition creates several output transitions. The reported state-of-the-art DLL-based frequency multipliers can only multiply the frequency by $N$ (an integer) [10], [12] or by $N/2$ (increments of 0.5) [11], [13]. Another drawback of the DLL-based structure is that the additional large parasitics limit the operating frequency range [11].
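
One simple way to combine delayed phases into a faster clock is to XOR them. The sketch below is only an illustration of the general idea and not the circuit of [10]-[13]: it assumes an ideal 50 % duty-cycle reference, $M$ phases spaced by $T/(2M)$, and integer time steps. All names are hypothetical.

```python
def reference_clock(t, period):
    """50 % duty-cycle square wave: 1 during the first half of each period."""
    return 1 if (t % period) < period // 2 else 0

def multiplied_clock(t, period, m):
    """XOR of m copies of the reference, each delayed by period / (2 * m)."""
    out = 0
    for k in range(m):
        out ^= reference_clock(t - k * period // (2 * m), period)
    return out

def count_rising_edges(samples):
    """Number of 0 -> 1 transitions in a list of samples."""
    return sum(1 for a, b in zip(samples, samples[1:]) if a == 0 and b == 1)

PERIOD = 1200  # reference period in arbitrary integer time steps

for m in (2, 3, 4):
    # Sample one full reference period, including its end point.
    samples = [multiplied_clock(t, PERIOD, m) for t in range(PERIOD + 1)]
    print(f"M = {m}: {count_rising_edges(samples)} rising edges per reference period")
```

Each delayed copy toggles at a different multiple of $T/(2M)$, so the XOR output toggles $2M$ times per reference period and its frequency is $M$ times the reference frequency.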

### 3.4 Contribution of This Thesis

According to the discussion above, although the PLL and the DLL each have their own advantages, both suffer from some drawbacks when used in clock generators. Combining the advantages of both in a frequency multiplier design can therefore improve the overall performance and reduce the design complexity. In paper 4, a combined structure is presented, which uses both PLL and DLL features to perform frequency multiplication [14]. The idea is to have a first-order loop, which is typically easy to design, control a VCO that operates outside the loop for multiplication. This implementation adds flexibility, saves area and power and eases the design process. The proposed structure, which is implemented in a 130-nm CMOS process, operates in the frequency range of 100 MHz-1.5 GHz. The comparisons show area and power savings compared to previously reported DLL-based structures [14].

### 3.5 References


[14] B. Mesgarzadeh and A. Alvandpour, “A 24-mW, 0.02-mm$^2$, 1.5-GHz DLL-Based Frequency Multiplier in 130-nm CMOS”, manuscript to be submitted.
Part III

Clock Distribution
Chapter 4

Synchronization and Clocking

In today’s large-scale and high-speed digital integrated circuits, clocking and synchronization are crucial tasks in many respects. Almost all modern microprocessors need a proper synchronization strategy to perform their different tasks. In high-speed on-chip communication, as chip dimensions increase, clock skew and power consumption cause serious problems. In this chapter three different strategies for clock distribution are presented. First, conventional global synchronization is discussed. Since skew management and power consumption are two challenging issues in this scheme, two other strategies that overcome these problems, called “mesochronous clocking” and “resonant clocking”, are presented.

4.1 Global Synchronization

Global synchronization is the traditional way to keep all functional blocks inside the chip synchronous with a reference clock. An H-tree implementation of such a system is shown in Figure 4.1. A master clock should be delivered to all blocks at the same clock phase, which means that it should experience the same delay to every leaf of the tree. In this system, clocked I/O ports may malfunction if a data-read failure occurs due to clock skew [1]. To reduce the skew, wide metal wires are needed, which increases the power consumption [1], [2].

![Diagram of synchronization and clocking](image)

**Figure 4.1: Global synchronization (H-tree implementation).**

In the system shown in Figure 4.1, data transfer between two nonadjacent blocks requires a sequence of short data transfers between adjacent blocks. For long-distance data transfer the total delay therefore increases, and the maximum clock frequency, which is limited by the total delay, decreases. On the other hand, buffer stages are needed in order to increase the driving capability of the reference clock. These buffers are power-hungry elements of the clock distribution network and increase the power consumption needed for synchronization. A significant part of the total power consumption in modern microprocessors is dissipated in the clock distribution network [3]-[5]. In order to get more insight into the clock network power consumption, we can divide the clock distribution network shown in Figure 4.1 into global and local clock distribution [6]. The global distribution includes all intermediate buffers and wires that are needed to drive the final load in the leaves. The local distribution includes all clock loads (gates and latches) and all wires that connect the last-stage buffers to the load. To get the minimum clock skew through the buffers, $m$ stages of buffers with equal stage effort are used, following the concept of logical effort [8]. This means that all stages have the same tapering factor, $n$. Since the buffers drive the load in parallel, we can combine them in a simplified form as shown in Figure 4.2 [6].

In Figure 4.2, $C_G$ is the capacitance of the global clock distribution and $C_L$ is the load, which is driven locally by the last buffer stages. The total capacitance in the clock distribution network ($C_T$) is then

\[ C_T = C_L + C_G = C_L + \sum_{i=1}^{m} \frac{C_L}{n^i} = C_L \cdot \frac{1 - (1/n^{m+1})}{1 - (1/n)} \approx C_L \cdot \frac{n}{n-1}. \tag{4.1} \]

This approximation is accurate enough because $n$ is typically about 3, and even for a small number of stages the term $1/n^{m+1}$ is negligible. According to Eq. 4.1, the power dissipated in the clock distribution network can be estimated as

\[ P_C = C_T V_{dd}^2 f = \frac{n}{n-1} C_L V_{dd}^2 f \tag{4.2} \]

where $V_{dd}$ is the power supply voltage and $f$ is the clock frequency. Equation 4.2 gives an estimation of clock network power consumption and we will discuss more about that later, when introducing the concept of resonant clocking.
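
To make Eqs. 4.1 and 4.2 concrete, the sketch below compares the exact sum with the approximation and estimates the clock power. The load capacitance, supply voltage and frequency are hypothetical values chosen for illustration, and the function names are not taken from [6].

```python
def total_clock_capacitance(c_load, n, m):
    """Exact sum in Eq. 4.1: load plus m buffer stages with tapering factor n."""
    return c_load + sum(c_load / n ** i for i in range(1, m + 1))

def approx_clock_capacitance(c_load, n):
    """Approximation in Eq. 4.1, valid when 1 / n**(m + 1) is negligible."""
    return c_load * n / (n - 1)

def clock_power(c_load, n, vdd, f):
    """Clock network power according to Eq. 4.2."""
    return approx_clock_capacitance(c_load, n) * vdd ** 2 * f

C_LOAD, N, M = 100e-12, 3, 4   # hypothetical: 100 pF load, 4 stages, tapering 3
VDD, F_CLK = 1.2, 1.5e9        # hypothetical supply voltage and clock frequency

exact = total_clock_capacitance(C_LOAD, N, M)
approx = approx_clock_capacitance(C_LOAD, N)

print(f"C_T exact:  {exact * 1e12:.2f} pF")
print(f"C_T approx: {approx * 1e12:.2f} pF")
print(f"relative error: {(approx - exact) / exact * 100:.2f} %")
print(f"share of C_L in C_T: {C_LOAD / approx:.3f}")
print(f"P_C: {clock_power(C_LOAD, N, VDD, F_CLK) * 1e3:.1f} mW")
```

For $n = 3$ the approximation gives $C_T = 1.5\,C_L$, so the local load accounts for two thirds of the switched capacitance.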

**4.2 Mesochronous Clocking**

As discussed in the previous section, in the global synchronization strategy the maximum clock frequency is limited by the delay needed for data communication between blocks, especially when the chip size increases. In addition, clock skew management becomes more challenging and can cause data-read failures in clocked I/Os. To remedy these problems, an alternative to global synchronization has been proposed in which clock distribution is integrated into the data communication buses. This strategy is called “mesochronous clocking”, where the term mesochronous refers to clocks with the same frequency but different phases [1], [2], [7]. To get more insight, a mesochronous communication scheme is depicted in Figure 4.3. In this strategy, the clock is distributed using a signal called “strobe”, which accompanies the data links between the blocks. Each block has its own local clock or, alternatively, it can use the strobe as its local clock. Since the delays of the data transfers between the blocks are unknown, the phase relation between the local clock of each block and the incoming data is unknown, and special techniques should be used to prevent failures such as metastability during data read [1], [2], [9]. Since clock and data are distributed in the same manner, the advantage of this strategy over the globally synchronous method is that the maximum clock frequency is not limited by data transfer delays.

![Figure 4.3: Mesochronous clocking.](image)

### 4.3 Resonant Clocking

Power consumption in the clock distribution network is a significant part of the total power consumption in modern microprocessors [3]-[5]. Therefore any technique for power saving in the clock distribution network can have a great impact on the total power consumption of very large-scale integrated circuits. According to Eq. 4.2, for a tapering factor of 3, 2/3 of the clock power is dissipated in the local clock distribution. In the global clock-tree synchronization scheme, this power can be reduced only by aggressive clock gating; the load capacitance itself cannot be reduced because it is fixed by the load. There are some techniques for saving power in the global clock distribution [10], [11]. Although they slow the growth of the clock power dissipation, they are limited by the fixed clock load. Resonant clocking is an interesting alternative, which can remedy this bottleneck. This strategy directly addresses the power dissipation in the local clock load by using it as the capacitor in an LC tank. This means that all intermediate buffers are removed and the LC oscillator drives the load directly. A simplified model of resonant clock distribution is shown in Figure 4.4.

4.3.1 Power Dissipation

In order to obtain an estimate of the power saving, we assume that the output of the LC oscillator is a sinusoid with an amplitude and DC level of $V_{dd}/2$, providing a clock swing between 0 and $V_{dd}$. The average power dissipation in the RLC circuit at resonance is

$$P_R = \frac{3V_R^2}{2R_P} \tag{4.3}$$

where $V_R$ is the amplitude of the sinusoidal oscillator output and $R_P$ is the parallel resistance in the tank. Replacing $V_R$ with $V_{dd}/2$ and assuming a quality factor of $Q$ for the tank ($Q=2\pi fR_P C_L$) gives [6]

$$P_R = \frac{3\pi}{4Q} V_{dd}^2 fC_L \tag{4.4}$$

in which $f$ is the resonance frequency. Using Eq. 4.2 and Eq. 4.4 results in

$$\frac{P_R}{P_C} = \frac{3\pi(n-1)}{4Qn}. \tag{4.5}$$

Equation 4.5 is an important result, which compares the power dissipation of resonant clocking with that of the conventional scheme. According to this result, for a tapering factor of 3, a quality factor greater than $\pi/2$ is needed for resonant clocking to save power compared to the conventional buffer-driven globally synchronous scheme. As mentioned in chapter 2, different techniques have been proposed to increase the quality factor of on-chip inductors. Alternatively, using off-chip bonding-wire inductance with a higher quality factor [12] can increase the power saving in resonant clocking-based synchronization.
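
The break-even condition can be read directly from Eq. 4.5. The sketch below evaluates the power ratio exactly as stated in Eq. 4.5 for a few quality factors; the function names and the chosen values of $Q$ are assumptions for illustration.

```python
import math

def resonant_to_conventional_ratio(q, n):
    """P_R / P_C as given by Eq. 4.5."""
    return 3.0 * math.pi * (n - 1) / (4.0 * q * n)

def break_even_q(n):
    """Quality factor at which Eq. 4.5 equals one."""
    return 3.0 * math.pi * (n - 1) / (4.0 * n)

N_TAPER = 3

print(f"break-even Q for n = {N_TAPER}: {break_even_q(N_TAPER):.3f}")
for q in (2.0, 5.0, 10.0):   # hypothetical tank quality factors
    ratio = resonant_to_conventional_ratio(q, N_TAPER)
    print(f"Q = {q:4.1f}: P_R / P_C = {ratio:.3f}")
```

Since the ratio is inversely proportional to $Q$, every doubling of the tank quality factor halves the resonant clock power relative to the conventional scheme.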

### 4.3.2 Quality Factor

The quality factor of the tank is an important factor which determines how much power saving is achievable using resonant clocking scheme. The quality factor of the tank can be defined as

$$Q_T = Q_L \parallel Q_C \tag{4.6}$$

where $Q_L$ is the quality factor of the inductor, $Q_C$ is the quality factor of the capacitor and $\parallel$ denotes the parallel combination, $Q_T = Q_L Q_C / (Q_L + Q_C)$. The quality factor of the inductor is discussed in chapter 2. For a capacitor with a parallel resistance of $R_C$, the quality factor can be defined as

$$Q_C = CR_C \omega. \tag{4.7}$$

Typically for LC oscillators $Q_C$ is high enough for its effect to be ignored, and the quality factor of the tank is limited by the inductor. In resonant clocking, this is unfortunately not the case. The resistance of the metal wires that connect the load to the tank lowers the quality factor of the capacitance. As a numerical example, if the resonance frequency is 2 GHz and the total load is 15 pF, a wire resistance of 1 Ω results in a quality factor of about 5-6 for the capacitor. Therefore, extra attention should be paid to the design of the clock distribution wires. Using upper metal layers for the wires and utilizing schemes such as a grid can decrease the wire resistance and consequently increase the quality factor of the capacitor.
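
The numerical example can be reproduced with a short calculation. The sketch below assumes that the wire resistance appears in series with the load capacitance, for which the quality factor is $1/(\omega R C)$, the series counterpart of Eq. 4.7. The inductor quality factor used for Eq. 4.6 is a hypothetical value.

```python
import math

def capacitor_q_series(f, c, r_series):
    """Quality factor of a capacitor with a series resistance: 1 / (omega * R * C)."""
    return 1.0 / (2.0 * math.pi * f * c * r_series)

def tank_q(q_l, q_c):
    """Parallel combination of Eq. 4.6: Q_L * Q_C / (Q_L + Q_C)."""
    return q_l * q_c / (q_l + q_c)

F_RES, C_LOAD, R_WIRE = 2e9, 15e-12, 1.0   # values of the numerical example
Q_INDUCTOR = 10.0                          # hypothetical inductor quality factor

q_c = capacitor_q_series(F_RES, C_LOAD, R_WIRE)
print(f"Q_C = {q_c:.2f}")
print(f"Q_T = {tank_q(Q_INDUCTOR, q_c):.2f}")
print(f"Q_C with half the wire resistance = {capacitor_q_series(F_RES, C_LOAD, R_WIRE / 2):.2f}")
```

Under this assumption the capacitor quality factor is about 5.3, so it limits the tank as strongly as a typical on-chip inductor would, and halving the wire resistance doubles it.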

### 4.3.3 Mixing Phenomenon

In resonant clocking the capacitance of the tank is provided by the load capacitance, which is time-variant. The value of the load capacitance is data-dependent, and the rate at which it varies depends on the data activity. Assume that a number of flip-flops are connected to the tank as load and that their parasitic capacitance contributes to the total tank capacitance. When the data pattern at the input of the flip-flops changes, the capacitance seen by the tank changes, and with it the natural frequency of the oscillator. In this case, as shown in Figure 4.5, we can assume that the load consists of a constant capacitance plus a time-variant part. The instantaneous frequency is

\[ \omega_i = \frac{1}{\sqrt{L(C_0 + C_L(t))}} = \frac{\omega_0}{\sqrt{1 + C_L(t) / C_0}}. \tag{4.8} \]

For small variations of the time-variant part \((C_L(t) \ll C_0)\), Eq. 4.8 is approximated as

\[ \omega_i \approx \omega_0 \left(1 - \frac{C_L(t)}{2C_0}\right). \tag{4.9} \]

If \(C_L(t) = A \cos(\omega_m t)\) then the instantaneous frequency is

\[ \omega_i \approx \omega_0 \left(1 - \frac{A}{2C_0} \cos(\omega_m t)\right). \tag{4.10} \]
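
The accuracy of the small-variation approximation can be checked numerically. The sketch below compares Eq. 4.8 with Eq. 4.10 over one modulation period, assuming hypothetical values for \(C_0\), \(A\), \(\omega_0\) and \(\omega_m\); the function names are illustrative.

```python
import math

def exact_frequency(omega_0, c_0, c_var):
    """Instantaneous frequency according to Eq. 4.8."""
    return omega_0 / math.sqrt(1.0 + c_var / c_0)

def linearised_frequency(omega_0, c_0, c_var):
    """First-order approximation according to Eq. 4.9."""
    return omega_0 * (1.0 - c_var / (2.0 * c_0))

# Hypothetical values: 15 pF constant load with a 2 % sinusoidal variation.
C_0, A = 15e-12, 0.3e-12
OMEGA_0 = 2.0 * math.pi * 1.5e9
OMEGA_M = 2.0 * math.pi * 100e6

max_rel_error = 0.0
steps = 200
for k in range(steps + 1):
    t = k / steps * (2.0 * math.pi / OMEGA_M)      # one modulation period
    c_var = A * math.cos(OMEGA_M * t)              # C_L(t) as in Eq. 4.10
    exact = exact_frequency(OMEGA_0, C_0, c_var)
    approx = linearised_frequency(OMEGA_0, C_0, c_var)
    max_rel_error = max(max_rel_error, abs(approx - exact) / exact)

peak_deviation = OMEGA_0 * A / (2.0 * C_0)
print(f"peak frequency deviation: {peak_deviation / (2 * math.pi) / 1e6:.1f} MHz")
print(f"maximum relative error of the approximation: {max_rel_error:.2e}")
```

For a 2 % capacitance variation the linearised expression deviates from Eq. 4.8 by roughly one part in ten thousand, while the frequency itself swings by about 1 % of \(\omega_0\).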

Figure 4.5: Simplified model of resonant clocking with time-variant load.

According to Eq. 4.10, the natural frequency of the oscillator is modulated at \(\omega_m\). The measured output spectrum of the LC oscillator in the resonant clock distribution network shows this mixing phenomenon, which is due to the time-variant nature of the load capacitance, as shown in Figure 4.6. Depending on the data frequency at the input of the flip-flops, the spacing between the sidebands in the output spectrum changes. This causes different values of clock jitter in the clock distribution network, depending on the data activity [6], [13]. Jitter peaking occurs at a data rate of about one-half of the clock frequency [6], [13]. At such a data rate, many sidebands are combined close to the center frequency and the phase noise increases rapidly. The same situation occurs if the data rate is chosen close to the resonance frequency.

4.4 Contribution of This Thesis

As mentioned previously, the ultimate purpose of this research is to present successful demonstrations of new alternatives to conventional global synchronization. To this end, paper 1 presents a new mesochronous scheme in which the problem of metastability failure is solved using a completely digital implementation with robust behavior [9]. On the other hand, new research directions demonstrating the capabilities of resonant clocking have been initiated, but previously reported experiments have not succeeded in high-frequency demonstration [6]. Paper 3 presents a successful demonstration of 1.56-GHz resonant clock distribution, which is so far the fastest reported on-chip LC-tank energy-recovery clocking without any intermediate clock buffers and the first successful experiment studying the impact of resonant clocking on flip-flop and data-path power consumption [13]. As future work in this field, possible ways of clock gating and efficient techniques for jitter reduction (for example using the injection locking phenomenon) can be mentioned.

4.5 References



Part V

Appendix
MOS Transistor Equations

A deep-submicron MOS transistor has four different operating regions: subthreshold, linear, saturation and velocity saturation. The current equations for these regions can be written as follows.

**Subthreshold:** $V_{GS} < V_T$

$$I_{DS} = I_0 e^{\frac{V_{GS}}{kT/q}} (1 - e^{\frac{-V_{DS}}{kT/q}}) \quad (A.1)$$

**Linear:** $V_{GS} > V_T$ and $\min(V_{GS} - V_T, V_{DS}, V_{DSAT}) = V_{DS}$

$$I_{DS} = k'_n \left( \frac{W}{L} \right) \left[ (V_{GS} - V_T) V_{DS} - \frac{V_{DS}^2}{2} \right] \quad (A.2)$$

**Saturation:** $V_{GS} > V_T$ and $\min(V_{GS} - V_T, V_{DS}, V_{DSAT}) = V_{GS} - V_T$

$$I_{DS} = \frac{k'_n}{2} \left( \frac{W}{L} \right) (V_{GS} - V_T)^2 (1 + \lambda V_{DS}) \quad (A.3)$$

**Velocity Saturation:** $V_{GS} > V_T$ and $\min(V_{GS} - V_T, V_{DS}, V_{DSAT}) = V_{DSAT}$

$$I_{DS} = k_n' \left( \frac{W}{L} \right) \left[ (V_{GS} - V_T)V_{DSAT} - \frac{V_{DSAT}^2}{2} \right] \cdot (1 + \lambda V_{DS}) \quad (A.4)$$

In the presence of the body effect (a voltage difference between source and bulk), $V_T$ in Eqs. A.1-A.4 is calculated as

$$V_T = V_{T0} + \gamma (\sqrt{|-2\phi_F + V_{SB}|} - \sqrt{|-2\phi_F|}) \quad (A.5)$$

where $V_{T0}$ is the threshold voltage at $V_{SB} = 0$, $\phi_F$ is the Fermi potential and $\gamma$ is the body-effect constant.
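
The region selection in Eqs. A.1-A.5 can be written as a small function. The sketch below is a minimal example, assuming the quadratic term of Eq. A.2 sits inside the $W/L$ product as in Eq. A.4, and using hypothetical device parameters that do not belong to any process discussed in this thesis. The value of $I_0$ is likewise an arbitrary assumption.

```python
import math

THERMAL_VOLTAGE = 0.0259  # kT/q near room temperature, in volts

def threshold_voltage(vt0, gamma, phi_f, vsb):
    """Eq. A.5: threshold voltage including the body effect."""
    return vt0 + gamma * (math.sqrt(abs(-2.0 * phi_f + vsb)) - math.sqrt(abs(-2.0 * phi_f)))

def drain_current(vgs, vds, vt, k_n, w_over_l, vdsat, lam, i0):
    """Return (region, I_DS) following Eqs. A.1-A.4."""
    if vgs < vt:
        # Eq. A.1: subthreshold conduction
        i_ds = i0 * math.exp(vgs / THERMAL_VOLTAGE) * (1.0 - math.exp(-vds / THERMAL_VOLTAGE))
        return "subthreshold", i_ds
    vgt = vgs - vt
    vmin = min(vgt, vds, vdsat)
    if vmin == vds:
        # Eq. A.2: linear region
        return "linear", k_n * w_over_l * (vgt * vds - vds ** 2 / 2.0)
    if vmin == vgt:
        # Eq. A.3: saturation
        return "saturation", 0.5 * k_n * w_over_l * vgt ** 2 * (1.0 + lam * vds)
    # Eq. A.4: velocity saturation
    i_ds = k_n * w_over_l * (vgt * vdsat - vdsat ** 2 / 2.0) * (1.0 + lam * vds)
    return "velocity saturation", i_ds

# Hypothetical NMOS parameters (assumptions for illustration only).
VT0, GAMMA, PHI_F = 0.43, 0.4, -0.3
K_N, W_OVER_L, VDSAT, LAMBDA, I0 = 115e-6, 1.5, 0.63, 0.06, 1e-15

vt = threshold_voltage(VT0, GAMMA, PHI_F, vsb=0.0)
for vgs, vds in ((0.2, 1.0), (1.0, 0.1), (0.8, 2.5), (2.5, 2.5)):
    region, i_ds = drain_current(vgs, vds, vt, K_N, W_OVER_L, VDSAT, LAMBDA, I0)
    print(f"V_GS = {vgs} V, V_DS = {vds} V: {region}, I_DS = {i_ds:.3e} A")
```

The four bias points were picked so that each lands in a different operating region. A nonzero $V_{SB}$ passed to `threshold_voltage` raises $V_T$ and shifts the boundaries accordingly.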