978-1-4799-5296-0/14/$31.00 © 2014 IEEE
PROC. 29th INTERNATIONAL CONFERENCE ON MICROELECTRONICS (MIEL 2014), BELGRADE, SERBIA, 12-14 MAY, 2014
The 2-Phase On-demand Delayed Clock Generator Circuit
S. Poriazis
Abstract - Phased clock signals are useful for synchronizing
the individual modules of a multiphase digital system and for
meeting their complex clock timing requirements. The
capability of adjusting the phased clocking pattern on demand
can be embedded in the circuit that generates the associated
clocks by shifting their active clock edges in time. A delay
insertion technique that implements this adjustment capability
is presented. The designed two-phase circuit, targeting an
FPGA device, generates the phased clock signals and has
control inputs that define the timing and positioning of the
delays. These clocks can directly drive idle system modules to
reduce their power consumption during the specified delay
periods.
I. INTRODUCTION
In a sequential circuit, clock signals can be distributed
by a clock tree or network. Clock skew scheduling
assigns different clock latencies to the registers of a
sequential circuit in order to minimize the cycle period, by
borrowing time from paths with slack and applying it to
critical paths. Phase shifts between clock domains can be
considered for the physical implementation of a
multidomain clock. In [1] an optimal algorithm is
formulated that solves the multidomain clock skew
scheduling (MDCSS) problem for a small number of
domains. This algorithm incrementally decreases the clock
period T, starting from an initial value, while dynamically
maintaining the corresponding shortest path trees. A
clocking strategy that utilizes phase-shifted clocks is given
in [2], which focuses on combinational circuits. In that
work the assigned clock periods and phase shifts depend
on the slack values.
Clock gating is a design technique for reducing power
consumption. The impact of clock gating on clock skew
scheduling has been studied in [3]. That work attempts to
perform co-synthesis of data paths and clock control paths
for gated clock designs via the simultaneous application of
clock skew scheduling and delay insertion. Delay
insertion may be implemented by buffer insertion, which
can increase the power consumption, or alternatively by
gate downsizing, which can decrease the power
consumption. Power-reducing techniques can be added to
flip-flops in order to save the power dissipated on the clock
tree. Clock gating is one of the major techniques [5]. In a
large digital system, clock gating is used to
reduce the power consumed by idle circuitry in the design.
In a clock-gated system, the internal clock controls the
gated circuits. However, the combination of clock gating
and double edge-triggered techniques can create
asynchronous sampling under certain circumstances,
evidenced by the output changing between clock edges.
Specialized gating circuits can be used to filter
out the asynchronous data sampling in order to remove the
asynchronous transitions.
Clock gating is also considered in [6] in order to
reduce the power consumption of an FPGA design. Clock
gating involves two main aspects: the first is to select
the blocks to be frozen during certain periods of time,
and the second is to generate all the control signals
necessary to gate the required clocks. Problems may
arise if the control signals are synchronized with the
clock to be gated: the controlled block could behave
aberrantly and its internal states could change unexpectedly.
Another problem arises if the control signals
are synchronized with the original clock and the gated
clock arrives simultaneously, in which case setup and hold
violations could occur.
An alternative to the above clock-gating technique is
presented in this paper. It is based on the delay insertion
concept applied to the timing pattern of the set of
phased clocking signals that synchronize the individual
modules of a multiphase digital system. The phased clock
signals have been introduced by [7] to synchronize the
operation of the multiphase model of a digital system. The
timing activity of the phased clock signals can be slowed
down incrementally by the number of delays that are
inserted into them, either at a single time instance or
at a series of separate time instances per signal. This
adjustment of clocking activity is used to reduce the power
consumed by idle circuitry in the design. It is noted
that the frequency of the phased clock signals remains the
same and only their timing activity is adjusted on demand
to the power consumption requirements of the system. As a
result of this adjustment, particular clock edges are
shifted in time during the delays. This technique avoids
the above-mentioned problems of clock gating, since the
routing of the phased clock signals is not interrupted by
any gating logic and these clocks directly drive the clock
inputs of each sequential module within the system. It is
also noted that the duration of each inserted delay is
equal to a multiple of half the phased clock period of
the output signals. In addition, the insertion of delays
causes no asynchronous behavior at the output of the circuit
that generates these clock signals. This is achieved by the
presented clock generator circuit, which is designed to
operate internally in a two-phase mode with respect to the
original clock signal being used.

Dr. Serafim Poriazis is with the R&D Department of
Phasetronic Laboratories at www.phasetroniclab.com and the
email address is: [email protected]
In this paper, the design of the two-phase clock
generator circuit is presented. The circuit produces a set of
phased clock signals with a clocking pattern that can be
adjusted on demand by the appropriate control inputs.
Within this clocking pattern, selected clock edges can be
shifted in time during the specified delay periods. In
particular, section II gives the analysis of the delay insertion
technique that is embedded in this circuit. Section III
describes the operation of the basic block of the 2PDCLK
circuit. Finally, section IV gives the simulation results
for this circuit. The 2PDCLK circuit was
successfully designed using an FPGA design platform.
II. ANALYSIS OF THE DELAY INSERTION TECHNIQUE
Consider a set of k phased clock signals and their
active timing edges over the duration P, called the
phased period, which spans k periods T, where T is the clock
period of the originating clock signal CLK. In total there are
2*k clock edges within P. It is noted that the period T of
CLK has two phases, identified by a and b, for the
logic values 1 (including the active rising edge) and 0
(including the active falling edge) respectively. Each
phased clock signal PDCLK of the clock generator circuit
delivers one rising clock edge and one falling clock
edge (2 active clock edges in total) during P. The phased
period P is the same for every phased clock signal and
equals k*T. Each signal PDCLK(i), with the index i taking
the values from 1 to k, is shifted in time according
to its index, following the phase pattern of CLK. The
basic circuit block being considered has two clock signals
at its output, denoted PDCLK(1) and PDCLK(2).
For this basic circuit k is equal to 2 and the phased
period P of PDCLK(1) and PDCLK(2) is equal to 2*T.
This block can be used to
build a multilevel binary tree-like circuit structure and form
circuits that generate output clocks PDCLK(i) with
i = 1 to k, where k = 2**n and n is the number of levels of
the binary structure. In such a structure the value of k can
be as follows: 2, 4, 8, 16, ... , 2**n, with n taking an integer
value. The duration of the clock pulse of each PDCLK is
equal to k*T/2 for a duty cycle of 50%.
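
The timing relations above can be illustrated with a short Python sketch that tabulates the normal phased clocking pattern on a grid of CLK half periods. It is an illustration only, not the author's circuit description: the function name phased_pattern is hypothetical, and the rule that PDCLK(i) trails PDCLK(1) by (i - 1) half periods of CLK is an assumption inferred from Table I and from the T/2 lead of PDCLK1 over PDCLK2 stated in section III.

```python
def phased_pattern(k):
    """Return k rows of 2*k samples, one sample per phase (a, b) of CLK.

    Assumption (inferred from Table I, not given as a formula in the text):
    PDCLK(i) is a 50% duty-cycle wave of period P = k*T that is shifted
    right by (i - 1) half periods of CLK.
    """
    half_periods = 2 * k   # P = k*T covers 2*k phases of CLK
    pulse = k              # pulse width k*T/2 equals k half periods
    rows = []
    for i in range(1, k + 1):
        shift = i - 1
        row = [1 if (s - shift) % half_periods < pulse else 0
               for s in range(half_periods)]
        rows.append(row)
    return rows


# Print the k = 4 pattern, one row per PDCLK(i).
for index, row in enumerate(phased_pattern(4), start=1):
    print(f"PDCLK{index}", *row)
```

For k = 4 the printed rows can be compared against the top section of Table I, and for k = 2 the two rows correspond to the outputs of the basic block.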
The phased clocking pattern of the PDCLK signals
during the period P can deliver fewer than the total of 2*k
clock edges, depending on the number of delays being
inserted (that is, an adjustment of the clocking activity of k
rising edges and k falling edges). This clocking adjustment
can be incremental, allowing the clocking activity to slow
down or to speed up according to the logic values of the
controlling circuit inputs DLY. Individual clock edges of
each phased clock signal PDCLK can be shifted in time
at any timing instance by inserting into the circuit operation
a delay of activity, denoted D, during which no
clock edge appears at the corresponding output of the
presented circuit. Thus, the appropriate rising or falling
clock edges of the corresponding phased clock signal
PDCLK are shifted in time. These delays can be applied in
sequence to the outputs of the circuit according to the
system requirements. The application of a delay D is
externally controlled by the circuit input DLY(k) that
corresponds to the phased clock signal PDCLK(k) under
consideration. These control inputs are internally
synchronized by the phases a or b of the originating
clock signal CLK, such that the resulting delay is always
applied in synchronism with the clock signal CLK, thus
avoiding the appearance of transient pulses at the
outputs PDCLK of the circuit. The resulting delay at the
output PDCLK has a duration of multiples of P/2.

TABLE I
MATRIX FOR THE K=4 PHASED CLOCK PATTERN DELAY INSERTION
DESCRIPTION

Normal operation (no delay inserted)
            T1        T2        T3        T4
          T1a T1b   T2a T2b   T3a T3b   T4a T4b
PDCLK1     1   1     1   1     0   0     0   0
PDCLK2     0   1     1   1     1   0     0   0
PDCLK3     0   0     1   1     1   1     0   0
PDCLK4     0   0     0   1     1   1     1   0

DELAY at level 2: D(2,1)
            T1        T2        T3        T4
          T1a T1b   T2a T2b   T3a T3b   T4a T4b
PDCLK1     1   1     1   1     0   0     0   0
PDCLK2     0   0     0   0     0   1     1   1
PDCLK3     0   0     1   1     1   1     0   0
PDCLK4     0   0     0   1     1   1     1   0

A two-dimensional matrix X of size k*k
can be used to map the phased clocking pattern of the
signals PDCLK of the presented clock generator circuit.
Each row of the matrix corresponds to a phased clock
signal PDCLK(i), with the row index i taking integer values
from 1 up to k. Each column of the matrix corresponds to
the duration of one clock period T(j) of the phased period
P, with the column index j taking values from 1 up to k.
Each column is divided into two sub columns, one for each
phase of the originating clock signal CLK, that is, we have
a sub column a for the phase a and a sub column b for
the phase b of CLK, denoted j(a) and j(b) for column j,
each of length T/2. The coordinates of the positioning of a
delay D(i,j) by the control signal DLY(i) within this matrix
use the row index i for the phased clock signal PDCLK(i)
into which this delay is inserted and the column index j for
the timing instance within the phased clock period P at which
this delay is applied. It is noted that each delay D(i,j)
denoted in this matrix is visualized as a block of length T
whose position is identified by the row index i and by the
column index j, which includes the clock edge during T(j)
that is to be shifted in time. A delay D(i,j)
can take in total 2*k individual possible positions on this
matrix during the period P. For example, we can consider a
clock generator circuit with four clock outputs PDCLK(i)
where i=1,2,3,4. In total 8 clock edges are
delivered at the circuit outputs (4 rising and 4 falling
edges) during the period P of length 4*T. The duty cycle of
50% results in a phased clock pulse of duration equal to
2*T. This circuit is composed of a two-level binary
tree-like structure where n=2, and three basic circuit
blocks are utilized, one for the first level and two for the
second level. The top section of Table I visualizes the
normal mode of operation of the 2-level 2PDCLK circuit
without any delay being inserted into the signals PDCLK.
For the positioning combination D(2,1) of a delay D
applied to the circuit for signal PDCLK2 at time T1, the
corresponding adjustment of the clocking
activity of PDCLK2 is visualized in the bottom section of
Table I, that is, a delay insertion with a time shift of the
rising edge of T1 by 2*T to the right along row 2
(starting at T1b and ending at T3a).
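
The D(2,1) example can be reproduced with a small Python sketch that operates directly on the rows of Table I. It is a hedged illustration under stated assumptions rather than the circuit's actual logic: the names TABLE_I and insert_delay are hypothetical, the delay is assumed to start at the sub column of T(j) that contains an edge of PDCLK(i), the output is assumed to hold its pre-edge value for a multiple of P/2, and the pattern is assumed to resume afterwards shifted to the right. Only the single period P shown in the matrix is modeled.

```python
# Normal-mode rows of Table I (sub columns T1a, T1b, ..., T4a, T4b).
TABLE_I = {
    1: [1, 1, 1, 1, 0, 0, 0, 0],
    2: [0, 1, 1, 1, 1, 0, 0, 0],
    3: [0, 0, 1, 1, 1, 1, 0, 0],
    4: [0, 0, 0, 1, 1, 1, 1, 0],
}


def insert_delay(pattern, i, j, multiples=1):
    """Return a copy of `pattern` with the delay D(i, j) applied to row i.

    Assumptions (hypothetical model): the delay lasts multiples * P/2,
    starts at the edge of PDCLK(i) inside T(j), and holds the value that
    was present just before that edge.
    """
    k = len(pattern)
    row = pattern[i]
    # Locate the phase (a or b) of T(j) where PDCLK(i) changes value.
    # row[s - 1] wraps to the last sample for s == 0 (periodic pattern).
    candidates = [2 * (j - 1), 2 * (j - 1) + 1]
    edge = next((s for s in candidates if row[s] != row[s - 1]), None)
    if edge is None:
        raise ValueError(f"PDCLK{i} has no clock edge during T{j}")
    hold = multiples * k          # P/2 equals k half periods of CLK
    held_value = row[edge - 1]    # value kept stable during the delay
    delayed = row[:edge] + [held_value] * hold + row[edge:]
    result = dict(pattern)
    result[i] = delayed[:len(row)]  # keep only the displayed period P
    return result


for index, row in insert_delay(TABLE_I, 2, 1).items():
    print(f"PDCLK{index}", *row)
```

With i = 2 and j = 1 the second row becomes 0 0 0 0 0 1 1 1, matching the bottom section of Table I, while the other rows are unchanged. A position without an edge, such as D(2,2) in this pattern, is rejected by the sketch.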
III. THE 2PDCLK CIRCUIT OPERATION
The fundamental block, which generates the two phased
clock signals PDCLK1 and PDCLK2 under the control of
two delay insertion signals DLY1 and DLY2, is called the
2-Phase On-Demand Delayed Clock Generator circuit and
is denoted 2PDCLK. This circuit is driven by the clock
signal CLK of frequency f, and internally both clock phases
are utilized for synchronization. The block diagram is
shown in Figure 1, where the above signals are shown and
an additional RESET signal is added for initializing the
circuit. The two phased clocks at the output of the circuit
have a frequency equal to f/2, with signal PDCLK1 leading
PDCLK2 by a T/2 phase difference, where T is the period
of the clock signal CLK. The pulse width of each output
signal PDCLK1 and PDCLK2 is equal to T with a duty
cycle of 50 percent.
When the delay insertion input signal DLY1 is active
high, it controls the output signal PDCLK1 by shifting the
active clock edges at the output on demand. The signal
PDCLK1 maintains the logic value that it had
immediately before the delay was applied. This logic
value can be either 0 or 1, depending on the previous
logic status of PDCLK1, and is kept stable for as long as
DLY1 is active. This is in effect a controlled time-shifting
operation on the clock signal PDCLK1. When DLY1 is
inactive, the output signal PDCLK1 returns to its
normal operation, that is, a normal phased clocking pattern
of frequency f/2. Similarly, the delay insertion input signal
DLY2 controls the output signal PDCLK2.
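
The hold-and-resume behavior described above can be sketched as a behavioral model in Python, stepped once per phase (half period) of CLK. This is a hypothetical model for illustration, not the author's VHDL description: the function name simulate_2pdclk, the 4-state phase counter per output and the initial phases after RESET are all assumptions. The sketch also does not model how the circuit quantizes the delay to multiples of P/2, so the example stimulus keeps DLY1 active for exactly P/2 = T.

```python
def simulate_2pdclk(dly1, dly2):
    """Return (PDCLK1, PDCLK2) sample lists, one sample per CLK phase.

    dly1, dly2: sequences of 0/1 samples of DLY1 and DLY2 per CLK phase.
    Assumptions (hypothetical model): each output follows a 4-state phase
    counter that advances on every CLK phase unless its DLY input is 1;
    after RESET, PDCLK2 trails PDCLK1 by one phase (T/2).
    """
    wave = [1, 1, 0, 0]      # period 2*T, pulse width T, 50% duty cycle
    phase1, phase2 = 0, 3    # PDCLK1 leads PDCLK2 by T/2
    out1, out2 = [], []
    for d1, d2 in zip(dly1, dly2):
        out1.append(wave[phase1])
        out2.append(wave[phase2])
        if not d1:           # DLY1 inactive: PDCLK1 keeps clocking
            phase1 = (phase1 + 1) % 4
        if not d2:           # DLY2 inactive: PDCLK2 keeps clocking
            phase2 = (phase2 + 1) % 4
    return out1, out2


# DLY1 active for two CLK phases (T = P/2); DLY2 stays inactive.
pdclk1, pdclk2 = simulate_2pdclk([0, 0, 1, 1, 0, 0, 0, 0], [0] * 8)
print("PDCLK1", *pdclk1)
print("PDCLK2", *pdclk2)
```

In this run PDCLK1 holds the value it had when DLY1 became active and its next rising edge appears two CLK phases later than in normal operation, while PDCLK2 is unaffected.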
Internally, the 2PDCLK circuit operates at both
phases of CLK, each one of length T/2, and incorporates
the two-phase mode of operation. When control orders are
applied externally to the two inputs
DLY1 and DLY2, this circuit is capable of
maintaining the logic values of these inputs in synchronism
with CLK and adjusting
the clock edge timeline of PDCLK1 and
PDCLK2 accordingly. The idle system modules are directly
driven by the clock outputs PDCLK1 and PDCLK2 of the
presented circuit, such that no clock-gating logic is required
to reduce their power consumption. This circuit can be
extended by building binary tree-like multilevel structures and
generating additional phased clock signals PDCLK with
the same capabilities. Such structures can be used to drive a
larger number of individual system modules while keeping
the clocking adjustment capabilities of the presented block
circuit.

Fig. 1. The 2PDCLK circuit block
IV. THE VHDL SIMULATION AND SYNTHESIS RESULTS
The VHDL test bench simulation results for the
2PDCLK circuit block are given in Figure 2, where the
clocking pattern of signals PDCLK1 and PDCLK2 is shown
for the asserted logic states of the delay insertion control
signals DLY1 and DLY2. In Figure 2(a) the circuit starts
operating in its normal mode by producing the phased
clocking pattern at the PDCLK1 and PDCLK2 output signals.
When DLY1 is set active (its value equals 1), the
output PDCLK1 is held at 0 without any
clocking activity. During this time the second input DLY2
is kept inactive (its value remains 0) and PDCLK2
is not affected, keeping its normal clocking activity.
When DLY1 returns to its inactive state (its value
equals 0), the output PDCLK1 returns to its
normal clocking operation. In Figure 2(b) the signal
DLY2 is similarly set to active high, thus
inserting delays at the output signal PDCLK2. Finally, in
Figure 2(c) both signals DLY1 and DLY2 are set to
active high. As a result the active clock edges of both
output signals PDCLK1 and PDCLK2 are shifted in time
by the inserted delay duration, for as long as the
controlling signals are active. When the control inputs
DLY1 and DLY2 are set inactive, the outputs PDCLK1 and
PDCLK2 of the circuit return to their normal operation.
This circuit was successfully designed, simulated and
synthesized from the corresponding VHDL description
using a commercial design platform,
targeting an FPGA device of the Xilinx Spartan series.
V. CONCLUSION
The design aspects of the 2PDCLK circuit have been
considered in this paper. The operation of this circuit is
based on the generation of a set of phased clock signals that
have a clocking pattern of a repeating phased sequence of
rising and falling edges. The presented circuit is capable of
adjusting the clocking activity of its outputs by inserting
appropriate delays. During these delays the clock edges are
shifted in time while the corresponding phased clock
signals keep their previous logic values. The delay
insertion technique is analyzed and visualized by a
dedicated matrix. The VHDL simulation results, targeting
an FPGA device, are given for the designed 2PDCLK
circuit and verify the presented circuit operation.
REFERENCES
[1] L. Li, Y. Lu and H. Zhou, "Optimal and Efficient Algorithms
for Multidomain Clock Skew Scheduling", in IEEE Transactions
on Very Large Scale Integration (VLSI) Systems, Volume: PP,
Issue: 99, 2013, pp.1.
[2] R. Hyman, N. Ranganathan, T. Bingel and D. T. Vo, "A
Clock Control Strategy for Peak Power and RMS Current
Reduction Using Path Clustering", in IEEE Transactions on Very
Large Scale Integration (VLSI) Systems, Vol: 21, Issue: 2,
February 2013, pp.259-269.
[3] W. Tu, S. Huang, and C. Cheng, "Co-Synthesis of Data Paths
and Clock Control Paths for Minimum-Period Clock Gating", in
Proceedings of DATE - Design, Automation & Test in Europe
Conference & Exhibition, 2013, pp.1831-1836.
[4] S. Poriazis, "The Two-Phase Twisted-Ring Counter Circuit",
in Proceedings of the 2002 IEEE International Symposium on
Circuits and Systems, ISCAS'02, Phoenix, Arizona, USA, vol. IV,
26-29 May 2002, pp. 858-861.
[5] X. Wang and W. H. Robinson, "Asynchronous Data
Sampling Within Clock-Gated Double Edge-Triggered
Flip-Flops", in IEEE Transactions on Circuits and Systems, Vol.60,
Issue 9, September 2013, pp.2401-2411.
[6] L. Mengibar-Pozo, M.G. Lorenz, C. Lopez and L. Entrena,
"Low-Power Design in Aerospace Circuits: A Case Study", in
IEEE A&E Systems Magazine, December 2013, pp.46-52.
[7] R&D Department at Phasetronic Laboratories,
www.phasetroniclab.com
Fig. 2. The 2PDCLK operation at cycle repetition instances (VHDL simulation results)
(a) delay insertion at output PDCLK1 by control signal DLY1
(b) delay insertion at output PDCLK2 by control signal DLY2
(c) delay insertion at both outputs PDCLK1 & PDCLK2 by control signals DLY1 & DLY2