Page 1: [IEEE 2014 IEEE 29th International Conference on Microelectronics (MIEL) - Belgrade, Serbia (2014.5.12-2014.5.14)] 2014 29th International Conference on Microelectronics Proceedings

978-1-4799-5296-0/14/$31.00 © 2014 IEEE

PROC. 29th INTERNATIONAL CONFERENCE ON MICROELECTRONICS (MIEL 2014), BELGRADE, SERBIA, 12-14 MAY, 2014

The 2-Phase On-demand Delayed Clock Generator Circuit

S. Poriazis

Abstract - The phased clock signals are useful to synchronize

the individual modules within a multiphase digital system and

satisfy the complexity of their clock timing requirement. The

capability of the on-demand adjustment of the phased clocking

pattern can be embedded to the circuit that generates the

associated clocks by shifting in time their active clock edges. A

delay insertion technique is presented that can implement this

adjustment capability. The designed two-phase circuit, targeting

an FPGA device, generates the phased clock signals and has

control inputs that define the timing and the positioning of delays.

These clocks can directly drive the idle system modules to reduce

their power consumption during the specified delay periods.

I. INTRODUCTION

In a sequential circuit clock signals can be distributed

by a clock tree or network. The clock skew scheduling

assigns different clock latencies to the registers of a

sequential circuit for minimizing the cycle period, by

borrowing time from paths with slacks and applying it to

critical paths. Phase shifts between clock domains can be

considered for the physical implementation of the

multidomain clock. In [1], an optimal algorithm is formulated
that solves the multidomain clock skew scheduling (MDCSS)
problem for a small number of domains. This algorithm
incrementally decreases the clock period T, starting from an
initial value, while dynamically maintaining the corresponding
shortest-path trees. A clocking strategy that utilizes
phase-shifted clocks is given in [2], which focuses on
combinational circuits. In that work, the assigned clock
periods and phase shifts depend on the slack values.

Clock gating is a design technique to save power

consumption. The impact of clock gating on clock skew

scheduling has been studied by [3]. This work attempts to

perform co-synthesis of data paths and clock control paths

for gated clock designs via the simultaneous application of

clock skew scheduling and delay insertion. A delay insertion
may be implemented either by buffer insertion, which can
increase the power consumption, or by gate downsizing, which
can decrease it. Power-reducing techniques can be added to

flip-flops in order to save the power dissipated on the clock

tree. Clock-gating is one of the major techniques [5]. For a

large digital system, clock-gating technique is used to

reduce the power consumed on idle circuitry in the design.

For a clock-gated system, the internal clock controls the

gated circuits. However, the combination of clock-gating

and double edge-triggered techniques can create an

asynchronous sampling under certain circumstances,

evidenced by the output changing between clock edges.

Specialized gating circuits can be used to attempt to filter

out the asynchronous data sampling in order to remove the

asynchronous transitions.

Clock-gating is also considered by [6] in order to

reduce the power consumption of an FPGA design. Clock

gating involves two main aspects, the first one is to select

those blocks to be frozen during certain periods of time,

and the second one is to generate all the necessary control

signals to gate the required clocks. Problems may arise if the
control signals are synchronized with the clock to be gated:
the controlled block could behave aberrantly and its internal
state could change unexpectedly. Another problem could arise
if the control signals are synchronized with the original
clock and the gated clock arrives simultaneously with them,
in which case setup and hold violations could occur.

An alternative to the above clock-gating technique is

presented in this paper and is based on the delay insertion

concept being applied to the timing pattern of the set of

phased clocking signals that synchronize the individual

modules of a multiphase digital system. The phased clock

signals have been introduced by [7] to synchronize the

operation of the multiphase model of a digital system. The

timing activity of the phased clock signals can be slowed

down incrementally by the amount of delays that are being

inserted into them, either during a single time instance or

during a series of separate time instances per signal. This

adjustment of clocking activity is used to reduce the power

being consumed on idle circuitry in the design. It is noted

that the frequency of the phased clock signals remains the

same and only their timing activity is adjusted on-demand

to the power consumption requirements of the system. As a
result of this adjustment, particular clock edges can be
shifted in time during the delays. This technique avoids the
above-mentioned problems of clock gating, since the routing of
the phased clock signals is not interrupted by any gating logic
and these clocks directly drive the clock inputs of each
sequential module within the system. It is also noted that the
duration of each inserted delay is equal to a multiple of half
of the phased clock period of the output signals. In addition,
the insertion of delays causes no asynchronous behavior at the
output of the circuit that generates these clock signals. This
is achieved by the presented clock generator circuit, which is
designed to operate internally in a two-phase mode with respect
to the original clock signal.

Dr. Serafim Poriazis is with the R&D Department of Phasetronic
Laboratories, www.phasetroniclab.com.

In this paper, the design of the two-phase clock

generator circuit is presented which produces a set of

phased-clock signals with a clocking pattern that can be

adjusted on-demand by the appropriate control inputs.

Within this clocking pattern appropriate clock edges can be

shifted in time during the specified delay periods. In

particular, in section II the analysis of the delay insertion

technique is given which is embedded to this circuit. In

section III the basic block of the 2PDCLK circuit operation

is described. Finally in section IV the simulation results are

given for this circuit. The 2PDCLK circuit was

successfully designed by using an FPGA design platform.

II. ANALYSIS OF THE DELAY INSERTION TECHNIQUE

Consider a set of k phased clock signals and their active
timing edges over the duration P, called the phased period, of
k periods of T, where T is the clock period of the originating
clock signal CLK. In total there are 2*k clock edges within P.
The period T of CLK has two phases, identified by a and b, for
the values of logic 1 (including the active rising edge) and
logic 0 (including the active falling edge) respectively. Each
phased clock signal PDCLK of the clock generator circuit
delivers one rising clock edge and one falling clock edge
(2 active clock edges in total) during P. The phased period P
of each phased clock signal PDCLK(i) remains the same and
equals k*T, where the index i takes the values from 1 to k, and
each signal is shifted in time according to its index,
following the phase pattern of CLK. The basic circuit block
under consideration has two clock signals at its output,
denoted PDCLK(1) and PDCLK(2). For this basic circuit, k is
equal to 2 and the phased period P of PDCLK(1) and PDCLK(2) is
equal to 2*T. This block can be used to build a multilevel
binary tree-like circuit structure and form circuits that
generate output clocks PDCLK(1) to PDCLK(k) with k = 2**n,
where n is the number of levels of the binary structure. In
such a structure the value of k can be 2, 4, 8, 16, ... , 2**n,
with n taking an integer value. The duration of the clock pulse
of each PDCLK is equal to k*T/2 for a duty cycle of 50%.
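As a quick numerical check of the relations above, the following Python sketch computes the timing quantities for an n-level structure. The function name and the returned field names are illustrative and do not come from the paper. The block count 2**n - 1 is an assumption (a complete binary tree of basic blocks); it agrees with the three-block, two-level example discussed later in this section, but the paper does not state it as a general formula.

```python
def phased_clock_parameters(n, T):
    """Timing quantities for an n-level tree of 2-phase basic blocks.

    n: number of levels of the binary structure (n >= 1)
    T: period of the originating clock CLK, in any time unit
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    k = 2 ** n  # number of phased clock outputs PDCLK(1..k)
    return {
        "k": k,
        "phased_period": k * T,       # P = k*T
        "pulse_width": k * T / 2,     # 50% duty cycle
        "edges_per_period": 2 * k,    # k rising + k falling edges within P
        "basic_blocks": 2 ** n - 1,   # assumption: complete binary tree
    }


if __name__ == "__main__":
    # Two-level structure driven by a CLK period of 10 time units.
    print(phased_clock_parameters(2, 10.0))
```

For n=2 this gives k=4, P=4*T and a pulse width of 2*T, which are the values used in the four-output example of this section.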

The phased clocking pattern of the PDCLK signals
during the period P can deliver fewer than the total of 2*k
clock edges, depending on the number of delays being
inserted (that is, an adjustment of the clocking activity of k

rising edges and k falling edges). This clocking adjustment

can be incremental allowing the clocking activity to slow

down or to speed up according to the logic values of the

controlling circuit inputs DLY. We can shift in time

individual clock edges of each phased clock signal PDCLK

at any timing instance by inserting into the circuit operation

a delay of activity, noted by D, during which there is no

clocking edge appearing at the corresponding output of the

presented circuit. Thus, the appropriate rising or falling

clock edges of the corresponding phased clock signal

PDCLK are shifted in time. These delays can be applied in
sequence to the outputs of the circuit according to the
system requirements. The application of a delay D is

externally controlled by the circuit input DLY(k) that

corresponds to the phased clock signal PDCLK(k) under

consideration. These control inputs are internally
synchronized by the phases a or b of the originating clock
signal CLK, such that the resulting delay is always applied in
synchronism with CLK, thus avoiding transient pulses at the
outputs PDCLK of the circuit. The resulting delay at the output
PDCLK has a duration that is a multiple of P/2.

TABLE I
MATRIX FOR THE K=4 PHASED CLOCK PATTERN DELAY INSERTION DESCRIPTION

Normal operation (no delay inserted)

        T1        T2        T3        T4
        T1a  T1b  T2a  T2b  T3a  T3b  T4a  T4b
PDCLK1  1    1    1    1    0    0    0    0
PDCLK2  0    1    1    1    1    0    0    0
PDCLK3  0    0    1    1    1    1    0    0
PDCLK4  0    0    0    1    1    1    1    0

DELAY at level 2, D(2,1)

        T1        T2        T3        T4
        T1a  T1b  T2a  T2b  T3a  T3b  T4a  T4b
PDCLK1  1    1    1    1    0    0    0    0
PDCLK2  0    0    0    0    0    1    1    1
PDCLK3  0    0    1    1    1    1    0    0
PDCLK4  0    0    0    1    1    1    1    0

A rectangular two-dimensional matrix X of size k*k can be used
to map the phased clocking pattern of the signals PDCLK of the
presented clock generator circuit. Each row of the matrix
corresponds to a phased clock signal PDCLK(i), with the row
index i taking integer values from 1 up to k. Each column of
the matrix corresponds to the duration of one clock period T(j)
within the phased period P, with the column index j taking
values from 1 up to k. Each column is divided into two
sub-columns, one for each phase of the originating clock signal
CLK: a sub-column a for phase a and a sub-column b for phase b
of CLK, denoted j(a) and j(b) for column j, each of length T/2.
The position of a delay D(i,j), applied by the control signal
DLY(i), is given within this matrix by the row index i of the
phased clock signal PDCLK(i) for which the delay is inserted
and by the column index j of the timing instance within the
phased clock period P at which the delay is applied. Each delay
D(i,j) noted in this matrix is visualized as a block of length
T whose position is identified by the values of the indices i
and j, and which includes the clock edge during T(j) that is to
be shifted in time. A delay D(i,j) can take in total 2*k
individual positions on this matrix during the period P. For
example, consider a clock generator circuit with four clock
outputs PDCLK(i), where i=1,2,3,4. In total 8 clock edges are
delivered at the circuit outputs (4 rising and 4 falling edges)
during the period P of length 4*T. The duty cycle of 50%
results in a phased clock pulse of duration 2*T. This circuit
is composed of a two-level binary tree-like structure, where
n=2, and three basic circuit blocks are utilized, one for the
first level and two for the second level. The top section of
Table I shows the normal mode of operation of the 2-level
2PDCLK circuit without any delay inserted into the signals
PDCLK. For the positioning combination D(2,1), a delay D
applied to the circuit for signal PDCLK2 at time T1, the bottom
section of Table I shows the corresponding adjustment of the
clocking activity of PDCLK2, that is, a delay insertion with a
time shift of the rising edge in T1 by 2*T to the right along
row 2 (starting at T1b and ending at T3a).
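The matrix description and the D(2,1) example can be reproduced with a short Python sketch. It is an illustration under stated assumptions, not the author's implementation: each row is sampled on the 2*k half periods of P, a delay is only accepted in a column where the selected signal actually has a clock edge, and one delay unit holds the previous logic value for P/2 (k half periods) before the pattern resumes. The function names and the `multiple` parameter are hypothetical.

```python
def normal_pattern(k):
    """Rows PDCLK(1..k) sampled on the half periods T1a, T1b, ..., Tka, Tkb."""
    # Row i (0-based) is high for k consecutive half periods starting at slot i.
    return [[1 if i <= s < i + k else 0 for s in range(2 * k)] for i in range(k)]


def edge_slot(row, j):
    """Half-period slot inside column T(j) where the row changes value, or None."""
    for s in (2 * (j - 1), 2 * (j - 1) + 1):
        if row[s] != row[s - 1]:  # slot -1 wraps to the end of the previous period
            return s
    return None


def insert_delay(pattern, i, j, multiple=1):
    """Apply delay D(i,j): shift the edge of PDCLK(i) found in column T(j)."""
    k = len(pattern)
    row = pattern[i - 1]
    s = edge_slot(row, j)
    if s is None:
        raise ValueError(f"PDCLK{i} has no clock edge during T{j}")
    hold = multiple * k                       # assumed duration: multiple * P/2
    shifted = row[:s] + [row[s - 1]] * hold + row[s:]
    result = [r[:] for r in pattern]
    result[i - 1] = shifted[: 2 * k]          # keep only the window of one period P
    return result


if __name__ == "__main__":
    for name, row in zip(("PDCLK1", "PDCLK2", "PDCLK3", "PDCLK4"),
                         insert_delay(normal_pattern(4), 2, 1)):
        print(name, *row)
```

Under these assumptions each row has exactly two accepted delay positions (one per edge), which is one way to read the statement that a delay can take 2*k positions on the matrix.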

III. THE 2PDCLK CIRCUIT OPERATION

The fundamental block, which generates the 2-phased clock
signals PDCLK1 and PDCLK2 under the control of two delay
insertion signals DLY1 and DLY2, is called the 2-Phase
On-Demand Delayed Clock Generator circuit and is denoted
2PDCLK. This circuit is driven by the clock signal CLK of
frequency f, and internally both clock phases are utilized for
synchronization. The block diagram is shown in Figure 1, which
shows the above signals together with an additional RESET
signal for initializing the circuit. The two phased clocks at
the output of the circuit have a frequency equal to f/2, with
signal PDCLK1 leading PDCLK2 by a phase difference of T/2,
where T is the period of the clock signal CLK. The pulse width
of each output signal PDCLK1 and PDCLK2 is equal to T, with a
duty cycle of 50 percent.

When the delay insertion input signal DLY1 is active

high, it controls the output signal PDCLK1 by shifting on-

demand the active clock edges at the output. The signal

PDCLK1 maintains its logic value that was appearing at

the time before the application of the delay. This logic

value can be either 0 or 1 depending on the previous

logic status of PDCLK1 and is kept stable for as long as the

DLY1 is active. This is actually a controlled time-shifting

operation on the clock signal PDCLK1. At the inactive

state of DLY1, the output signal PDCLK1 returns to its

normal operation that is a normal phased clocking pattern

of frequency f/2. Similarly, the delay insertion input signal

DLY2 controls the output signal PDCLK2.

Fig. 1. The 2PDCLK circuit block

Internally, the 2PDCLK circuit operates on both phases of CLK,
each of length T/2, and incorporates the two-phase mode of
operation. When control commands are applied externally to the
two inputs DLY1 and DLY2, the circuit maintains the logic
values of these inputs in synchronism with CLK and adjusts the
clock edge timeline of PDCLK1 and PDCLK2 accordingly. The idle
system modules are directly driven by

the clock outputs PDCLK1 and PDCLK2 of the presented

circuit such that no clock-gating logic is required to save

their power consumption. This circuit can be extended by

building binary tree-like multilevel structures and

generating additional phased clock signals PDCLKs with

the same capabilities. Such structures can be used to drive a

larger number of individual system modules while keeping

the clocking adjustment capabilities of the presented block

circuit.
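The hold-and-resume behavior described in this section can be sketched as a behavioral model in Python. This is an illustration under assumptions, not the author's VHDL description: each output is modelled as a toggle that is enabled while its DLY input is low, PDCLK1 updates on phase a (rising edge of CLK) and PDCLK2 on phase b (falling edge), and RESET is assumed to clear both outputs to 0. The function name simulate_2pdclk and the list-based stimulus format are hypothetical.

```python
def simulate_2pdclk(dly1, dly2):
    """Return the traces of PDCLK1 and PDCLK2, one sample per half period of CLK.

    dly1, dly2: equal-length sequences of 0/1, one value per half period.
    Even indices are phase a (CLK high), odd indices are phase b (CLK low).
    """
    if len(dly1) != len(dly2):
        raise ValueError("dly1 and dly2 must have the same length")
    pdclk1 = pdclk2 = 0  # assumed state after RESET
    trace1, trace2 = [], []
    for step, (d1, d2) in enumerate(zip(dly1, dly2)):
        if step % 2 == 0:
            # Phase a: PDCLK1 toggles unless DLY1 holds it at its previous value.
            if not d1:
                pdclk1 ^= 1
        else:
            # Phase b: PDCLK2 toggles unless DLY2 holds it at its previous value.
            if not d2:
                pdclk2 ^= 1
        trace1.append(pdclk1)
        trace2.append(pdclk2)
    return trace1, trace2


if __name__ == "__main__":
    idle = [0] * 8
    print(simulate_2pdclk(idle, idle))                      # normal operation
    print(simulate_2pdclk([0, 0, 1, 1, 1, 1, 0, 0], idle))  # DLY1 asserted
```

With no delay asserted, both outputs have period 2*T and PDCLK2 lags PDCLK1 by T/2. Asserting DLY1 in the second call freezes PDCLK1 at its current value and leaves PDCLK2 untouched, so the next edge of PDCLK1 appears only after DLY1 is released.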

IV. THE VHDL SIMULATION AND SYNTHESIS RESULTS

The VHDL test bench simulation results for the

2PDCLK circuit block are given in Figure 2 where the

clocking pattern of signals PDCLK1 and PDCLK2 is given

for the asserted logic states of the delay insertion control

signals DLY1 & DLY2. In Figure 2(a) the circuit starts
operating in its normal mode by producing the phased clocking
pattern at the PDCLK1 and PDCLK2 output signals. When DLY1 is
set active (its value equals 1), the value of output PDCLK1 is
set to 0 without any clocking activity. During this time the
second input DLY2 is kept inactive (its value remains 0) and
PDCLK2 is not affected, keeping its normal clocking activity.
When DLY1 returns to its inactive state (its value equals 0),
the output PDCLK1 returns to its normal clocking operation. In
Figure 2(b) we have

similarly the signal DLY2 set to active high and thus

inserting delays at the output signal PDCLK2. Finally in

Figure 2(c) we have both signals DLY1 and DLY2 set to

active high. As a result the active clock edges of both

output signals PDCLK1 and PDCLK2 are shifted in time

by the delay duration being inserted for as long as the

controlling signals are active. When the control inputs

DLY1 and DLY2 are set inactive, the outputs PDCLK1 &

PDCLK2 of the circuit return to their normal operation.

This circuit was successfully designed, simulated and
synthesized using a commercial design platform, targeting an
FPGA device of the Xilinx Spartan series, from the
corresponding VHDL description.

V. CONCLUSION

The design aspects of the 2PDCLK circuit are being

considered in this paper. The operation of this circuit is

based on the generation of a set of phased clock signals that

have a clocking pattern of a repeating phased sequence of

rising and falling edges. The presented circuit is capable of
adjusting the clocking activity of its outputs by inserting
appropriate delays. During these delays the clock edges are

shifted in time while the corresponding phased clock

signals keep their previous logic values. The delay

insertion technique is analyzed and visualized by a

dedicated matrix. The VHDL simulation results, targeting

an FPGA device, are given for the designed 2PDCLK

circuit and verify the presented circuit operation.

REFERENCES

[1] L. Li, Y. Lu and H. Zhou, "Optimal and Efficient Algorithms for Multidomain Clock Skew Scheduling", in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. PP, Issue 99, 2013, pp. 1.

[2] R. Hyman, N. Ranganathan, T. Bingel and D. T. Vo, "A Clock Control Strategy for Peak Power and RMS Current Reduction Using Path Clustering", in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 21, Issue 2, February 2013, pp. 259-269.

[3] W. Tu, S. Huang and C. Cheng, "Co-Synthesis of Data Paths and Clock Control Paths for Minimum-Period Clock Gating", in Proceedings of DATE – Design, Automation & Test in Europe Conference & Exhibition, 2013, pp. 1831-1836.

[4] S. Poriazis, "The Two-Phase Twisted-Ring Counter Circuit", in Proceedings of the 2002 IEEE International Symposium on Circuits and Systems, ISCAS'02, Phoenix, Arizona, USA, vol. IV, 26-29 May 2002, pp. 858-861.

[5] X. Wang and W. H. Robinson, "Asynchronous Data Sampling Within Clock-Gated Double Edge-Triggered Flip-Flops", in IEEE Transactions on Circuits and Systems, Vol. 60, Issue 9, September 2013, pp. 2401-2411.

[6] L. Mengibar-Pozo, M.G. Lorenz, C. Lopez and L. Entrena, "Low-Power Design in Aerospace Circuits: A Case Study", in IEEE A&E Systems Magazine, December 2013, pp. 46-52.

[7] R&D Department at Phasetronic Laboratories, www.phasetroniclab.com

Fig. 2. The 2PDCLK operation at cycle repetition instances (VHDL simulation results)

(a) delay insertion at output PDCLK1 by control signal DLY1

(b) delay insertion at output PDCLK2 by control signal DLY2

(c) delay insertion at both outputs PDCLK1 & PDCLK2 by control signals DLY1 & DLY2