bulk ieee projects in vlsi ,bulk ieee projects, ieee 2015-16 vlsi projects in chennai, 2015-16 vlsi...

Upload: praveen-kumar

Post on 19-Jul-2015

Category:

Education

TRANSCRIPT

Page 1: BULK IEEE PROJECTS IN VLSI ,BULK IEEE PROJECTS, IEEE 2015-16 VLSI PROJECTS IN CHENNAI, 2015-16 VLSI PROJECTS IN PONDICHERRY,BULK IEEE PROJECTS FOR VLSI ,IEEE MATLAB PROJECTS IN PONDICHERRY,VLSI

NEXGEN TECHNOLOGY

www.nexgenproject.com

No: 66,4th cross, Venkata nagar, Near SBI ATM, Pondicherry.

Email Id: [email protected] Mobile: 9751442511, 9791938249, Telephone: 0413-2211159

VLSI PROJECTS 2015

Sno. Topic Abstract Year

1. VLSI2015_01 A Low-Cost Low-Power All-Digital Spread-Spectrum Clock Generator

In this brief, a low-cost low-power all-digital spread-spectrum clock generator (ADSSCG) is presented. The proposed ADSSCG can provide an accurate programmable spreading ratio with process, voltage, and temperature variations. To maintain the frequency stability while performing triangular modulation, the fast-relocked mechanism is proposed. The proposed fast-relocked ADSSCG is implemented in a standard performance 90-nm CMOS process, and the active area is 200 µm × 200 µm. The experimental results show that the electromagnetic interference reduction is 14.61 dB with a 0.5% spreading ratio and 19.69 dB with a 2% spreading ratio at 270 MHz. The power consumption is 443 µW at 270 MHz with a 1.0 V power supply.

2015

2. VLSI2015_02 A Combined SDC-SDF Architecture for Normal I/O Pipelined Radix-2 FFT

We present an efficient combined single-path delay commutator-feedback (SDC-SDF) radix-2 pipelined fast Fourier transform architecture, which includes log2 N − 1 SDC stages and 1 SDF stage. The SDC processing engine is proposed to achieve 100% hardware resource utilization by sharing the common arithmetic resources, both adders and multipliers, in a time-multiplexed approach. Thus, the required number of complex multipliers is reduced to log4 N − 0.5, compared with log2 N − 1 for the other radix-2 SDC/SDF architectures. In addition, the proposed architecture requires a roughly minimal number of complex adders, log2 N + 1, and a complex delay memory of 2N + 1.5 log2 N − 1.5.

2015
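The resource expressions quoted in this abstract can be evaluated for a concrete transform size. The following is a minimal sketch that only plugs N into those expressions; the function name `fft_resource_counts` and the dictionary keys are illustrative choices, not names from the paper.

```python
def fft_resource_counts(n):
    """Evaluate the resource expressions quoted in the abstract for an N-point FFT.

    Hypothetical helper: it only substitutes N into the stated formulas.
    """
    if n < 4 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 4")
    log2_n = n.bit_length() - 1          # exact log2 for a power of two
    return {
        "sdc_stages": log2_n - 1,
        "sdf_stages": 1,
        # log4 N equals (log2 N) / 2
        "complex_multipliers_proposed": log2_n / 2 - 0.5,
        "complex_multipliers_other": log2_n - 1,
        "complex_adders": log2_n + 1,
        "complex_delay_memory": 2 * n + 1.5 * log2_n - 1.5,
    }


if __name__ == "__main__":
    for name, value in fft_resource_counts(1024).items():
        print(f"{name}: {value}")
```

For N = 1024 the expressions give 4.5 complex multipliers for the combined architecture against 9 for the other radix-2 SDC/SDF designs.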

3. VLSI2015_03 A Class of SEC-DED-DAEC Codes Derived From Orthogonal Latin Square Codes

Radiation-induced soft errors are a major reliability concern for memories. To ensure that memory contents are not corrupted, single error correction double error detection (SEC-DED) codes are commonly used. However, in advanced technology nodes, soft errors frequently affect more than one memory bit. Since SEC-DED codes cannot correct multiple errors, they are often combined with interleaving. Interleaving, however, impacts memory design and performance and cannot always be used in small memories. This limitation has spurred interest in codes that can correct adjacent bit errors. In particular, several SEC-DED double adjacent error correction (SEC-DED-DAEC) codes have recently been proposed. Implementing DAEC has a cost, as it impacts the decoder complexity and delay. Another issue is that most of the new SEC-DED-DAEC codes miscorrect some double nonadjacent bit errors. In this brief, a new class of SEC-DED-DAEC codes is derived from orthogonal Latin square codes. The new codes significantly reduce the decoding complexity and delay. In addition, the codes do not miscorrect any double nonadjacent bit errors. The main disadvantage of the new codes is that they require a larger number of parity check bits. Therefore, they can be useful when decoding delay or complexity is critical or when miscorrection of double nonadjacent bit errors is not acceptable. The proposed codes have been implemented in Hardware Description Language and compared with some of the existing SEC-DED-DAEC codes.

2015

4. VLSI2015_04 Design of Efficient Content Addressable Memories in High-Performance FinFET Technology

Content addressable memories (CAMs) enable high-speed parallel search operations in table lookup-based applications, such as Internet routers and processor caches. Traditional CAM design has always suffered from the high dynamic power consumption associated with its large and active parallel hardware. However, deeply scaled technology nodes, with multigate devices replacing planar MOSFETs, are expected to bring new tradeoffs to CAM design. FinFET, a vertical-channel gate-wraparound double-gate device, has emerged as the best alternative to planar MOSFET. In this brief, for the first time, we explore the design space of symmetric and asymmetric gate-workfunction FinFET CAMs. We propose several design alternatives and evaluate them in terms of their dc and transient metrics for different mismatch probabilities using technology computer-aided design simulations with 22-nm FinFET devices. We also propose two orthogonal layout styles for CAM design and show that one of them (vertical-search line) outperforms the other (vertical-match line) in terms of total power (22.3%) and search delay (5.8%).

2015

5. VLSI2015_05 A New Efficiency-Improvement Low-Ripple Charge-Pump Boost Converter Using Adaptive Slope Generator With Hysteresis Voltage Comparison Techniques

A new efficiency-improvement low-ripple charge-pump boost converter using an adaptive slope generator with hysteresis voltage comparison techniques is proposed in this paper. The proposed converter can reduce output voltage ripple because its inductor is connected to the output. It adopts a new control architecture, a self-adaptive slope generator with hysteresis comparison technology, to shorten the transient response. The proposed boost converter has been fabricated in a TSMC 0.35-µm CMOS 2P4M process, with a total chip area of 1.49 mm × 1.49 mm. Its maximum output current is 260 mA when the output voltage is 3.6 V. When the supply voltage is 3.3 V, the output voltage can be 3.6–5.1 V. The maximum efficiency is 90.99% and the minimum output ripple is 10.8 mV. Finally, the theoretical analysis is verified to be correct by the experimental results.

2015

6. VLSI2015_06 A 0.25-V 28-nW 58-dB Dynamic Range Asynchronous Delta Sigma Modulator in 130-nm Digital CMOS Process

In this paper, we present a single-bit clock-less asynchronous delta–sigma modulator (ADSM) operating at just 0.25 V power supply. Several circuit approaches were employed to enable such low-voltage operation and maintain high performance. One approach involved utilizing bulk-driven transistors in the subthreshold region with a transconductance-enhancement topology. Another approach was to employ a distributed transistor layout structure to mitigate the effect of low output impedance due to halo drain implants employed in today’s digital CMOS process. The ADSM achieved a characteristic center frequency of 630 Hz. It had an effective signal-to-noise-plus-distortion ratio (SNDR) of 58 dB, or an effective number of bits (ENOB) of 9 b, and just 28-nW power dissipation. A detailed analytical model capturing the effect of non-idealities of the individual circuit components is also presented for the first time, in close agreement with experimental results.

2015

7. VLSI2015_07 Range Unlimited Delay-Interleaving and -Recycling Clock Skew Compensation and Duty-Cycle Correction Circuit

A clock skew-compensation and duty-cycle correction circuit (CSADC) is used as the second-level clock distributing circuit to align a system global clock while maintaining a 50% duty cycle. A power-efficient, range-unlimited, and accuracy-enhanced CSADC, designed mainly with a new delay-interleaving and -recycling technique that mitigates operating frequency limitations while keeping overhead costs low, is proposed in this paper. Our preliminary research results prove the feasibility of the proposed technique and show that the operating frequency ranges from 110 MHz to 1.75 GHz, with the corrected duty cycle varying from 51.2% to 48.9% based on 0.18-µm CMOS technology. Meanwhile, the lock-in time, static phase error, and power consumption are, respectively, 26 clock cycles, 4.2 ps, and 5.58 mW at 1.75 GHz.

2015

8. VLSI2015_08 Obfuscating DSP Circuits via High-Level Transformations

This paper presents a novel approach to design obfuscated circuits for digital signal processing (DSP) applications using high-level transformations, a key-based obfuscating finite-state machine (FSM), and a reconfigurator. The goal is to design DSP circuits that are harder to reverse engineer. High-level transformations of iterative data-flow graphs have been exploited for area-speed-power tradeoffs. This is the first attempt to develop a design flow to apply high-level transformations that not only meet these tradeoffs but also simultaneously obfuscate the architectures both structurally and functionally. Several modes of operation are introduced for obfuscation where the outputs are meaningful from a signal processing point of view, but are functionally incorrect. Examples of such modes include a third-order digital filter that can also implement a sixth-order or ninth-order filter in a time-multiplexed manner. The latter two modes are meaningful but represent functionally incorrect modes. Multiple meaningful modes can be exploited to reconfigure the filter order for different applications. Other modes may correspond to nonmeaningful modes. A correct key input to an FSM activates a reconfigurator. The configure data controls various modes of the circuit operation. Functional obfuscation is accomplished by requiring use of the correct initialization key and configure data. A wrong initialization key fails to enable the reconfigurator, and wrong configure data activates either a meaningful but nonfunctional mode or a nonmeaningful mode. The probability of activating the correct mode is significantly reduced, leading to an obfuscated DSP circuit. Structural obfuscation is also achieved by the proposed methodology via high-level transformations. Experimental results show that the overhead of the proposed methodology is small, while a strong obfuscation is attained. For example, the area overhead for a (3l)th-order IIR filter benchmark is only 17.7% with a 128-bit configuration key, where 1 ≤ l ≤ 8, i.e., the order of this filter should be a multiple of 3, and can vary from 3 to 24.

2015

9. VLSI2015_09 Accelerating Scalar Conversion for Koblitz Curve Cryptoprocessors on Hardware Platforms

Koblitz curves are a class of computationally efficient elliptic curves where scalar multiplications can be accelerated using τNAF representations of scalars. However, conversion from an integer scalar to a short τNAF is a costly operation. In this paper, we improve the recently proposed scalar conversion scheme based on division by τ². We apply two levels of optimizations in the scalar conversion architecture. First, we reduce the number of long integer subtractions during the scalar conversion. This optimization reduces the computation cost and also simplifies the critical paths present in the conversion architecture. Then we implement pipelines in the architecture. The pipeline splitting increases the operating frequency without increasing the number of cycles. We have provided detailed experimental results to support the claims made in this paper.

2015
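To make the term τNAF concrete, the sketch below implements the classical digit-by-digit τ-adic NAF expansion (Solinas's basic algorithm), where τ satisfies τ² = μτ − 2 with μ = ±1 depending on the curve. It is not the optimized division-by-τ² scheme of the paper and it omits the scalar reduction step that produces a short τNAF; the function names are illustrative.

```python
def tau_naf(r0, r1, mu):
    """Return the tau-adic NAF digits (least significant first) of r0 + r1*tau.

    mu is +1 or -1 and fixes the relation tau^2 = mu*tau - 2.
    """
    if mu not in (1, -1):
        raise ValueError("mu must be +1 or -1")
    digits = []
    while r0 != 0 or r1 != 0:
        if r0 % 2:
            # Pick u in {+1, -1} so that the next digit is forced to zero.
            u = 2 - ((r0 - 2 * r1) % 4)
            r0 -= u
        else:
            u = 0
        digits.append(u)
        # Exact division of (r0 + r1*tau) by tau, valid because r0 is now even.
        half = r0 // 2
        r0, r1 = r1 + mu * half, -half
    return digits


def evaluate_tau_digits(digits, mu):
    """Recombine digits into (a, b), meaning a + b*tau, by Horner's rule."""
    a, b = 0, 0
    for u in reversed(digits):
        # (a + b*tau) * tau = -2b + (a + mu*b)*tau, then add the digit u.
        a, b = -2 * b + u, a + mu * b
    return a, b


if __name__ == "__main__":
    d = tau_naf(3, 0, 1)
    print(d, evaluate_tau_digits(d, 1))
```

Every nonzero digit is followed by a zero, which is what lets a Koblitz-curve scalar multiplication skip roughly two thirds of the point additions.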

10. VLSI2015_10 Design of Self-Timed Reconfigurable Controllers for Parallel Synchronization via Wagging

Synchronization is an important issue in modern system design as systems-on-chips integrate more diverse technologies, operating voltages, and clock frequencies on a single substrate. This paper presents a methodology for the design and implementation of a self-timed reconfigurable control device suitable for a parallel cascaded flip-flop synchronizer based on a principle known as wagging, through the application of distributed feedback graphs. By modifying the endpoint adjacency of a common behavior graph via one-hot codes, several configurable modes can be implemented in a single design specification, thereby facilitating direct control over the synchronization time and the mean-time between failures of the parallel master-slave latches in the synchronizer. Therefore, the resulting implementation is resistant to process non-idealities, which are present in physical design layouts. This paper includes a discussion of the reconfiguration protocol, and implementations of both a sequential token ring control device, and an interrupt subsystem necessary for reconfiguration, all simulated in UMC 90-nm technology. The interrupt subsystem demonstrates operating frequencies between 505 and 818 MHz per module, with average power consumptions between 70.7 and 90.0 µW in the typical-typical case under a corner analysis.

2015

11. VLSI2015_11 Level-Converting Retention Flip-Flop for Reducing Standby Power in ZigBee SoCs

In this paper, we propose a level-converting retention flip-flop (RFF) for ZigBee systems-on-chips (SoCs). The proposed RFF allows the voltage regulator that generates the core supply voltage (VDD,core) to be turned off in the standby mode, and it thus reduces the standby power of the ZigBee SoCs. The logic states are retained in a slave latch composed of thick-oxide transistors using an I/O supply voltage (VDD,IO) that is always turned on. Level-up conversion from VDD,core to VDD,IO is achieved by an embedded nMOS pass-transistor level-conversion scheme that uses a low-only signal-transmitting technique. By embedding a retention latch and level-up converter into the data-to-output path of the proposed RFF, the RFF resolves the problems of the static RAM-based RFF, such as large dc current and low readability caused by threshold drop. The proposed RFF also does not require additional control signals for power mode transitioning. Using 0.13-µm process technology, we implemented an RFF with VDD,core and VDD,IO of 1.2 and 2.5 V, respectively. The maximum operating frequency is 300 MHz. The active energy of the RFF is 191.70 fJ, and its standby power is 350.25 pW.

2015

12. VLSI2015_12 All Digital Energy Sensing for Minimum Energy Tracking

Minimizing energy consumption is of utmost importance in an energy-starved system with relaxed performance requirements. This brief presents a digital energy sensing method that requires neither a constant voltage reference nor a time reference. An energy minimizing loop uses this to find the minimum energy point and sets the supply voltage between 0.2 and 0.5 V. Energy savings up to 1 275% over existing minimum energy tracking techniques in the literature are achieved.

2015

13. VLSI2015_13 Recursive Approach to the Design of a Parallel Self-Timed Adder

This brief presents a parallel single-rail self-timed adder. It is based on a recursive formulation for performing multibit binary addition. The operation is parallel for those bits that do not need any carry chain propagation. Thus, the design attains logarithmic performance over random operand conditions without any special speedup circuitry or look-ahead schema. A practical implementation is provided along with a completion detection unit. The implementation is regular and does not have any practical limitations of high fanouts. A high fan-in gate is required, but this is unavoidable for asynchronous logic and is managed by connecting the transistors in parallel. Simulations have been performed using an industry-standard toolkit that verify the practicality and superiority of the proposed approach over existing asynchronous adders.

2015
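A recursive formulation of binary addition can be illustrated in software. The sketch below uses the textbook half-adder recursion (sum = XOR, carry = AND shifted left, repeated until no carry remains); the abstract does not spell out the authors' exact equations, so this is an assumed stand-in, and the `width` parameter and function name are illustrative.

```python
def recursive_add(a, b, width=8):
    """Add two unsigned integers modulo 2**width by iterating half-adders.

    Returns (sum, iterations). All bit positions are updated in parallel in
    each iteration; the loop ends when no carry is left anywhere, which plays
    the role of completion detection in a self-timed circuit.
    """
    mask = (1 << width) - 1
    s = (a ^ b) & mask                 # bitwise sum without carries
    c = ((a & b) << 1) & mask          # carries generated by each bit pair
    iterations = 0
    while c:
        s, c = s ^ c, ((s & c) << 1) & mask
        iterations += 1
    return s, iterations


if __name__ == "__main__":
    print(recursive_add(5, 3))         # short carry chain
    print(recursive_add(255, 1))       # carry ripples through every bit
```

The iteration count tracks the longest carry chain of the operands rather than the word width, which is why the average case over random operands is much shorter than the worst case.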

14. VLSI2015_14 Novel Reconfigurable Hardware Architecture for Polynomial Matrix Multiplications

In this paper, we introduce a novel reconfigurable hardware architecture for computing the polynomial matrix multiplication (PMM) of polynomial matrices and/or polynomial vectors. The proposed algorithm exploits an extension of the fast convolution technique to multiple-input multiple-output systems. The proposed architecture is the first one devoted to the hardware implementation of PMM. Hardware implementation of the algorithm is achieved via a highly pipelined, partly systolic field-programmable gate array (FPGA) architecture. The architecture, which is scalable in terms of the order of the input polynomial matrices, has been designed using the Xilinx System Generator tool. We verify the algorithmic accuracy of the architecture through FPGA-in-the-loop hardware co-simulations. The application to sensor array signal processing is highlighted, in terms of strong de-correlation. The results are presented to demonstrate the accuracy and capability of the architecture. The results verify that the proposed solution gives low execution times while limiting the number of required FPGA resources.

2015
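As a reference for what PMM computes, the following sketch multiplies two polynomial matrices by direct coefficient convolution. Each matrix entry is a list of coefficients, lowest order first. This is the plain definition only; the paper's algorithm uses a fast convolution technique and an FPGA pipeline, neither of which is modeled here, and all names are illustrative.

```python
def _trim(p):
    """Drop trailing zero coefficients, keeping at least one."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p


def poly_mul(p, q):
    """Multiply two polynomials by direct convolution of their coefficients."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return _trim(out)


def poly_add(p, q):
    """Add two polynomials of possibly different lengths."""
    n = max(len(p), len(q))
    return _trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                  for i in range(n)])


def poly_matmul(A, B):
    """C[i][j] = sum over k of A[i][k] * B[k][j], with polynomial entries."""
    rows, inner, cols = len(A), len(B), len(B[0])
    C = [[[0] for _ in range(cols)] for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                C[i][j] = poly_add(C[i][j], poly_mul(A[i][k], B[k][j]))
    return C


if __name__ == "__main__":
    # A = [[1, z], [0, 1]] squared gives [[1, 2z], [0, 1]].
    A = [[[1], [0, 1]], [[0], [1]]]
    print(poly_matmul(A, A))
```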

15. VLSI2015_15 Implementation of Subthreshold Adiabatic Logic for Ultralow-Power Application

The behavior of adiabatic logic circuits in the weak inversion or subthreshold regime is analyzed in depth for the first time in the literature to make great improvement in ultralow-power circuit design. This novel approach is efficacious in low-speed operations where power consumption and longevity are the pivotal concerns instead of performance. The schematic and layout of a 4-bit carry look-ahead adder (CLA) have been implemented to show the workability of the proposed logic. The effect of temperature and process parameter variations on the subthreshold adiabatic logic-based 4-bit CLA has also been addressed separately. Postlayout simulations show that subthreshold adiabatic units can save significant energy compared with a logically equivalent static CMOS implementation.

2015

16. VLSI2015_16 FPGA-Based Bit Error Rate Performance Measurement of Wireless Systems

This paper presents the bit error rate (BER) performance validation of digital baseband communication systems on a field-programmable gate array (FPGA). The proposed BER tester (BERT) integrates fundamental baseband signal processing modules of a typical wireless communication system along with a realistic fading channel simulator and an accurate Gaussian noise generator onto a single FPGA to provide an accelerated and repeatable test environment in a laboratory setting. Using a developed graphical user interface, the error rate performance of single- and multiple-antenna systems over a wide range of parameters can be rapidly evaluated. The FPGA-based BERT should reduce the need for time-consuming software-based simulations, hence increasing the productivity. This FPGA-based solution is significantly more cost effective than conventional performance measurements made using expensive commercially available test equipment and channel simulators.

2015
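The kind of software-based simulation that such a tester is meant to replace can be sketched in a few lines. The example below counts bit errors for BPSK over an additive white Gaussian noise channel and compares the result with the closed-form BER, 0.5·erfc(√(Eb/N0)). The modulation, the noise-only channel (no fading), and all names and parameter values are assumptions made for illustration; none come from the paper.

```python
import math
import random


def bpsk_ber_theory(ebn0_db):
    """Closed-form BER of coherent BPSK over AWGN."""
    ebn0 = 10 ** (ebn0_db / 10)
    return 0.5 * math.erfc(math.sqrt(ebn0))


def bpsk_ber_simulated(ebn0_db, n_bits=100_000, seed=1):
    """Monte Carlo BER estimate: send random bits, add noise, count errors."""
    rng = random.Random(seed)
    ebn0 = 10 ** (ebn0_db / 10)
    sigma = math.sqrt(1.0 / (2.0 * ebn0))    # unit bit energy, variance N0/2
    errors = 0
    for _ in range(n_bits):
        bit = rng.getrandbits(1)
        symbol = 1.0 if bit else -1.0
        received = symbol + rng.gauss(0.0, sigma)
        if (received > 0.0) != bool(bit):
            errors += 1
    return errors / n_bits


if __name__ == "__main__":
    for snr_db in (0, 2, 4):
        print(snr_db, bpsk_ber_theory(snr_db), bpsk_ber_simulated(snr_db))
```

Resolving very low error rates this way needs a very large number of simulated bits, which is the motivation for moving the whole loop into hardware.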

17. VLSI2015_17 Algorithm and Architecture Design of the H.265/HEVC Intra Encoder

Improved video coding techniques introduced in the H.265/HEVC standard allow video encoders to achieve better compression efficiencies. On the other hand, the increased complexity requires a new design methodology able to face challenges associated with ever higher spatio-temporal resolutions. The paper presents a computationally scalable algorithm and its hardware architecture, able to support intra encoding up to the 2160p@30fps resolution. The scalability allows a tradeoff between the throughput and the compression efficiency. In particular, the encoder is able to check a variable number of candidate modes. The rate estimation based on bin counting and the distortion estimation in the transform domain simplify the rate-distortion analysis and enable the evaluation of a great number of candidate intra modes. The encoder preselects candidate modes by the processing of 8×8 predictions computed from original samples. The preselection shares hardware resources used for the processing of predictions generated from reconstructed samples. To support intra 4×4 modes for the 2160p@30fps resolution, the encoder incorporates a separate reconstruction loop. The processing of blocks with different sizes is interleaved to compensate for the delay of reconstruction loops. Implementation results show that the encoder utilizes 1086k gates and 52 kB of on-chip memory in TSMC 90-nm technology. The main reconstruction loop can operate at 400 MHz, whereas the remaining modules work at 200 MHz. For 2160p@30fps videos, the average BD-Rate is 5.46% compared to the HM software.

2015

18. VLSI2015_18 Pre-Encoded Multipliers Based on Non-Redundant Radix-4 Signed-Digit Encoding

In this paper, we introduce an architecture of pre-encoded multipliers for Digital Signal Processing applications based on off-line encoding of coefficients. To this end, the Non-Redundant radix-4 Signed-Digit (NR4SD) encoding technique, which uses the digit values {−1, 0, +1, +2} or {−2, −1, 0, +1}, is proposed, leading to a multiplier design with a less complex partial products implementation. Extensive experimental analysis verifies that the proposed pre-encoded NR4SD multipliers, including the coefficients memory, are more area and power efficient than the conventional Modified Booth scheme.

2015
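The two digit sets named in the abstract can be explored with a small arithmetic sketch. The function below rewrites an integer as radix-4 digits drawn from {−1, 0, +1, +2} or {−2, −1, 0, +1}. It works on unbounded Python integers and is only meant to show that each value has exactly one such representation; it is not the bit-level two's-complement encoder circuit of the paper, and the names are illustrative.

```python
def nr4sd_digits(value, negative_set=False):
    """Radix-4 signed digits of value, least significant digit first.

    negative_set=False uses digits {-1, 0, +1, +2};
    negative_set=True  uses digits {-2, -1, 0, +1}.
    """
    highest = 1 if negative_set else 2   # largest digit allowed in the set
    digits = []
    while value != 0:
        d = value % 4                    # Python keeps this in 0..3
        if d > highest:
            d -= 4                       # fold into the signed digit set
        digits.append(d)
        value = (value - d) // 4         # exact: value - d is a multiple of 4
    return digits or [0]


def from_digits(digits):
    """Recombine radix-4 digits into the integer they represent."""
    return sum(d * 4 ** i for i, d in enumerate(digits))


if __name__ == "__main__":
    print(nr4sd_digits(7))                      # digits from {-1, 0, 1, 2}
    print(nr4sd_digits(7, negative_set=True))   # digits from {-2, -1, 0, 1}
```

Because each digit set has only four values and no redundancy, a coefficient encoded off-line needs no more digit positions than its radix-4 length requires, and each partial product is one of four simple multiples of the multiplicand.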

19. VLSI2015_19 A High-Performance FIR Filter Architecture for Fixed and Reconfigurable Applications

Transpose form finite-impulse response (FIR) filters are inherently pipelined and support the multiple constant multiplications (MCM) technique, which results in significant saving of computation. However, the transpose form configuration does not directly support block processing, unlike the direct form configuration. In this paper, we explore the possibility of realizing a block FIR filter in transpose form configuration for area-delay efficient realization of large order FIR filters for both fixed and reconfigurable applications. Based on a detailed computational analysis of the transpose form configuration of the FIR filter, we have derived a flow graph for a transpose form block FIR filter with optimized register complexity. A generalized block formulation is presented for the transpose form FIR filter. We have derived a general multiplier-based architecture for the proposed transpose form block filter for reconfigurable applications. A low-complexity design using the MCM scheme is also presented for the block implementation of fixed FIR filters. The proposed structure involves significantly less area delay product (ADP) and less energy per sample (EPS) than the existing block implementation of the direct-form structure for medium or large filter lengths, while for short-length filters, the block implementation of the direct-form FIR structure has less ADP and less EPS than the proposed structure. Application-specific integrated circuit synthesis results show that the proposed structure for block size 4 and filter length 64 involves 42% less ADP and 40% less EPS than the best available FIR filter structure proposed for reconfigurable applications. For the same filter length and the same block size, the proposed structure involves 13% less ADP and 12.8% less EPS than that of the existing direct-form block FIR structure.

2015
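The difference between the direct form and the transpose form can be seen in a sample-by-sample software model. In the sketch below, the direct form keeps a delay line of past inputs, while the transpose form multiplies the current input by every coefficient at once and accumulates through a chain of registers. Both are textbook structures; the block formulation and register optimization of the paper are not modeled, and the function names are illustrative.

```python
def fir_direct(h, x):
    """Direct form: delay the inputs, then multiply and sum."""
    delay = [0] * len(h)
    out = []
    for sample in x:
        delay = [sample] + delay[:-1]            # shift the input delay line
        out.append(sum(c * d for c, d in zip(h, delay)))
    return out


def fir_transpose(h, x):
    """Transpose form: multiply first, then delay the partial sums."""
    regs = [0] * (len(h) - 1)                    # partial-sum registers
    out = []
    for sample in x:
        # The same input feeds every multiplier, which is what the
        # multiple constant multiplication (MCM) technique exploits.
        products = [c * sample for c in h]
        out.append(products[0] + (regs[0] if regs else 0))
        # Each register takes its tap product plus the register to its right.
        regs = [products[k + 1] + (regs[k + 1] if k + 1 < len(regs) else 0)
                for k in range(len(regs))]
    return out


if __name__ == "__main__":
    h = [1, 2, 3]
    x = [1, 0, 0, 4]
    print(fir_direct(h, x), fir_transpose(h, x))
```

Both functions return the same output sequence; the structures differ only in where the delays sit, which is what changes the register count and the critical path in hardware.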

20. VLSI2015_20 A Novel Photosensitive Tunneling Transistor for Near-Infrared Sensing Applications: Design, Modeling, and Simulation

In this paper, a novel device structure, operating on the principle of band-to-band tunneling, has been designed for near-infrared (1–1.5 µm) multispectral optical sensing applications. A drain current model based on the line tunneling approach has been developed to illustrate the device operation. The results of the model are compared with the simulated data for devices with similar dimension and structure, indicating good accuracy of the developed model. Spectral response of the device is studied by estimating the relative values of its transfer as well as output characteristics, and also by measuring the variation of threshold voltage, VT, and ON-state current, ION. VT and ION are found to be sensitive to wavelength variations at moderate gate doping levels. VT is found to increase by ∼40 mV and ION decreases by 35% for a change of illumination wavelength from 1 to 1.5 µm at a gate doping of 1 × 10^18 cm^−3. Peak spectral sensitivity at an illumination intensity of 0.75 W/cm^2 is found to be 318.38, 2.02 × 10^3, and 672.2 corresponding to the change in wavelength from (1–1.2 µm), (1.2–1.45 µm), and (1.45–1.5 µm), respectively.

2015

21. VLSI2015_21 High-Throughput LDPC-

Decoder Architecture

Using Efficient Comparison Techniques & Dynamic

Multi-Frame Processing

Schedule

This paper presents architecture of block-level-parallel

layered decoder for irregular LDPC code. It can be

reconfigured to support various block lengths and code rates of IEEE 802.11n (WiFi) wireless-communication standard.

We have proposed efficient comparison techniques for both

column and row layered schedule and rejection-based high-

speed circuits to compute the two minimum values from

multiple inputs required for row layered processing of hardware-friendly min-sum decoding algorithm. The results

show good speed with lower area as compared to state-of-

the-art circuits. Additionally, this work proposes dynamic

multi-frame processing schedule which efficiently utilizes the

layered-LDPC decoding with minimum pipeline stages. The suggested LDPC-decoder architecture has been synthesized

and post-layout simulated in 90 nm-CMOS process. This

decoder occupies 5.19 area and supports multiple code rates

like 1/2, 2/3, 3/4 & 5/6 as well as block lengths of 648, 1296

& 1944. At a clock frequency of 336 MHz, the proposed

LDPC decoder achieves a better throughput of 5.13 Gbps and an energy efficiency of 0.01 nJ/bit/iteration, as compared

to similar state-of-the-art works.

2015
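
The row-layered min-sum update mentioned above needs the two smallest magnitudes among the check-node inputs. The following is a minimal software sketch of that two-minimum search, not the rejection-based comparison circuit of the paper; the function names `two_minimums` and `check_node_magnitudes` are hypothetical.

```python
def two_minimums(values):
    """Return (min1, min2, index_of_min1) for a list of magnitudes.

    Single-pass software model for illustration only; it does not
    reproduce the hardware comparison structure from the abstract.
    """
    if len(values) < 2:
        raise ValueError("need at least two inputs")
    min1 = min2 = float("inf")
    idx1 = -1
    for i, v in enumerate(values):
        if v < min1:
            # New smallest value: the old smallest becomes the second smallest.
            min1, min2, idx1 = v, min1, i
        elif v < min2:
            min2 = v
    return min1, min2, idx1


def check_node_magnitudes(values):
    """Min-sum output magnitude per edge: the minimum over all other inputs."""
    min1, min2, idx1 = two_minimums(values)
    return [min2 if i == idx1 else min1 for i in range(len(values))]
```

Only the edge that supplied the smallest input receives the second minimum; every other edge receives the first, which is why two values are sufficient.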

22. VLSI2015_22 A New Parallel VLSI

Architecture for Real-time

Electrical Capacitance

Tomography

This paper presents a fixed-point reconfigurable parallel

VLSI hardware architecture for real-time Electrical

Capacitance Tomography (ECT). It is modular and consists

of a front-end module which performs precise capacitance

measurements in a time multiplexed manner using Capacitance to Digital Converter (CDC) technique. Another

FPGA module performs the inverse steps of the tomography

algorithm. Dual-port built-in memory banks store the

sensitivity matrix, the actual value of the capacitances, and

the actual image. A two-dimensional (2D) core of multiple processing elements (PEs) intercommunicates

with these memory banks via parallel buses. A hardware-

software co-design methodology was conducted using

commercially available tools in order to concurrently tune the

algorithms and hardware parameters. Hence, the hardware was designed down to the bit-level in order to reduce both the

hardware cost and power consumption, while satisfying real-

time constraint. Quantization errors were assessed against the

image quality and bit-level simulations demonstrate the

correctness of the design. Further simulations indicate that the proposed architecture achieves a speed-up of up to three

orders of magnitude over the software version when the

reconstruction algorithm runs on a 2.53 GHz Pentium

processor or TI's Delfino TMS320F32837 DSP processor.

More specifically, a throughput of 17.241 Kframes/sec for both the Linear-Back Projection (LBP) and modified

Landweber algorithms and 8.475 Kframes/sec for the

Landweber algorithm with 200 iterations could be achieved.

This performance was achieved using an array of [2×2] ×

[2×2] processing units. This satisfies the real-time constraint of many industrial applications. To the best of the authors’

knowledge, this is the first embedded system which explores

the intrinsic parallelism which is available in modern FPGA

for ECT.

2015
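
The two reconstruction algorithms named above can be sketched in a few lines. This is a plain floating-point illustration under stated assumptions, not the fixed-point parallel architecture of the paper: the image is started from an unnormalised linear back projection g = Sᵀc and refined with the standard Landweber update g ← g + α·Sᵀ(c − S·g). The names `s`, `c`, `alpha` and all function names are hypothetical.

```python
def mat_vec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    """Return the transpose of matrix m."""
    return [list(col) for col in zip(*m)]


def linear_back_projection(s, c):
    """Unnormalised LBP image: g = S^T c."""
    return mat_vec(transpose(s), c)


def landweber(s, c, alpha, iterations):
    """Landweber iteration: g <- g + alpha * S^T (c - S g)."""
    st = transpose(s)
    g = linear_back_projection(s, c)  # LBP image as the starting point
    for _ in range(iterations):
        residual = [ci - pi for ci, pi in zip(c, mat_vec(s, g))]
        correction = mat_vec(st, residual)
        g = [gi + alpha * di for gi, di in zip(g, correction)]
    return g
```

The step size `alpha` must be small enough for the iteration to converge (below 2 divided by the largest squared singular value of `s`); practical ECT implementations usually also normalise and clip the image, which is omitted here.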

23. VLSI2015_23 Graph-Based Transistor

Network Generation

Method for Supergate Design

Transistor network optimization represents an effective way

of improving VLSI circuits. This paper proposes a novel

method to automatically generate networks with minimal transistor count, starting from an irredundant sum-of-

products expression as the input. The method is able to

deliver both series–parallel (SP) and non-SP switch

arrangements, improving speed, power dissipation, and area

of CMOS gates. Experimental results demonstrate expected gains in comparison with related approaches.

2015

24. VLSI2015_24 A Relative Imaging CMOS Image Sensor for High

Dynamic Range and High

Frame-Rate Machine

Vision Imaging Applications

This paper proposes an unconventional image acquisition scheme for machine vision applications, based on detecting

ratios of illumination (pixel) intensities. Detecting relative

ratios enables capturing the scene features and patterns

almost independently from the local scene illumination,

resulting in potentially extremely high dynamic range. Moreover, detecting signal ratios using a fully differential

circuit optimally suits the intrinsic nature of VLSI design. A

scalable and compact hardware implementation is proposed

as a proof-of-concept towards relative image acquisition. The

proposed photo-current ratio-detecting pixels completely bypass the need of conventional photo-current integration

which enables high frame-rate operation of up to 24000

frames-per-second (fps). The pulse-width modulated output

of the proposed pixel is captured by compact column-parallel

readout circuits based on digital counters. The developed 32×32 pixel array prototype CMOS image sensor consumes

4 mW of power operating at a nominal 9765 fps frame rate,

and 6.8 mW of power operating at a maximum 24000 fps. The

presented prototype design is fully scalable towards newer

CMOS fabrication nodes and higher sensor resolution.

2015

25. VLSI2015_25 Low-Cost High-

Performance VLSI

Architecture for Montgomery Modular

Multiplication

This paper proposes a simple and efficient Montgomery

multiplication algorithm such that the low-cost and high-

performance Montgomery modular multiplier can be implemented accordingly. The proposed multiplier receives

and outputs the data with binary representation and uses only

one-level carry-save adder (CSA) to avoid the carry

propagation at each addition operation. This CSA is also used

to perform operand pre-computation and format conversion from the carry save format to the binary representation,

leading to a low hardware cost and short critical path delay at

the expense of extra clock cycles for completing one modular

multiplication. To overcome the weakness, a configurable CSA (CCSA), which could be one full-adder or two serial

half-adders, is proposed to reduce the extra clock cycles for

operand pre-computation and format conversion by half. In

addition, a mechanism that can detect and skip the

unnecessary carry-save addition operations in the one-level CCSA architecture while maintaining the short critical path

delay is developed. As a result, the extra clock cycles for

operand pre-computation and format conversion can be

hidden and high throughput can be obtained. Experimental

results show that the proposed Montgomery modular multiplier can achieve higher performance and significant

area–time product improvement when compared with

previous designs.

2015
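
For reference, the operation this multiplier accelerates can be modelled in plain integer arithmetic. The sketch below is the textbook radix-2 (bit-serial) Montgomery multiplication, not the CSA/CCSA datapath proposed in the paper; the function names and the choice R = 2^k with k the bit length of the modulus are assumptions made for illustration.

```python
def montgomery_multiply(a, b, n, k):
    """Return a * b * 2^(-k) mod n for odd n < 2^k and 0 <= a, b < n."""
    s = 0
    for i in range(k):
        if (a >> i) & 1:
            s += b          # add the multiplicand when bit i of a is set
        if s & 1:
            s += n          # make s even so the halving below is exact
        s >>= 1
    if s >= n:              # s < n + b here, so one subtraction is enough
        s -= n
    return s


def mod_multiply(a, b, n):
    """Compute a * b mod n (odd n) by going through the Montgomery domain."""
    k = n.bit_length()
    r2 = pow(2, 2 * k, n)                       # R^2 mod n, with R = 2^k
    a_m = montgomery_multiply(a, r2, n, k)      # a * R mod n
    b_m = montgomery_multiply(b, r2, n, k)      # b * R mod n
    p_m = montgomery_multiply(a_m, b_m, n, k)   # a * b * R mod n
    return montgomery_multiply(p_m, 1, n, k)    # leave the Montgomery domain
```

Each loop iteration contains one or two additions and a shift; carry-save adders such as the one in the abstract are used in hardware so that these additions do not need full carry propagation.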

26. VLSI2015_26 Fully Pipelined Low-Cost

and High-Quality Color

Demosaicking VLSI Design

for Real-Time Video

Applications

This paper presents a fully pipelined color demosaicking

design. To improve the quality of reconstructed images, a

linear deviation compensation scheme was created to

increase the correlation between the interpolated and

neighboring pixels. Furthermore, immediately interpolated green color pixels are first to be used in hardware-oriented

color demosaicking algorithms, which efficiently promoted

the quality of the reconstructed image. A boundary detector

and boundary mirror machine were added to improve the

quality of pixels located in boundaries. In addition, a

hardware sharing technique was used to reduce the hardware

costs of three interpolators. The VLSI architecture in this work contains only 4.97 K gates, and the core area is

60,229 µm² when synthesized in a 0.18-µm CMOS process.

The design operates at 200 MHz while

consuming 4.76 mW. Compared with previous low-

complexity designs, this work offers low cost, low power consumption, and high performance.

2015

27. VLSI2015_27 A Novel Area-Efficient

VLSI Architecture for Recursion Computation in

LTE Turbo Decoders

Long Term Evolution (LTE) aims to achieve

peak data rates in excess of 300 Mb/s for next-generation wireless communication systems. Turbo codes, the specified

channel coding scheme in LTE, suffer from low decoding

throughput due to their iterative decoding algorithm. One

efficient approach to achieve a promising throughput is to use

multiple Maximum a-Posteriori (MAP) cores in parallel, resulting in a large area overhead. The two computationally

challenging units in an MAP core are α and β recursion units.

Although several methods have been proposed to shorten the

critical path of these recursion units, their area-efficient

architecture with minimum silicon area is still missing. In this paper, a novel relation existing between α and

β metrics is introduced, leading to a novel add-compare-

select (ACS) architecture. The proposed technique can be

applied to both the precise approximation of log-MAP and

max-log MAP ACS architectures. The proposed ACS design, implemented in a 0.13 µm CMOS technology and

customized for the LTE standard, results in at most 18.1%

less area compared to the reported designs to-date while

maintaining the same throughput level.

2015

28. VLSI2015_28 Comparative Performance

Analysis of

the Dielectrically Modulated Full-Gate and Short-Gate

Tunnel

FET-Based Biosensors

In this paper, a short-gate tunneling-field-effect-transistor

(SG-TFET) structure has been investigated for the

dielectrically modulated biosensing applications in comparison with a full-gate tunneling-field-effect-transistor

structure of similar dimensions. This paper explores the

underlying physics of these architectures and estimates their

comparative sensing performance. The sensing performance

has been evaluated for both the charged and charge-neutral biomolecules using extensive device-level simulation, and

the effects of the biomolecule dielectric constant and charge

density are also studied. In SG-TFET architecture, the

reduction of the gate length enhances its drain control over

the band-to-band tunneling process and this has been exploited for the detection, resulting in superior drain current

sensitivity for biomolecule conjugation. The gate and drain

biasing conditions show dominant impact on the sensitivity

enhancement in the short-gate biosensors. Therefore, the gate

and drain bias are identified as the effective design parameters for the efficiency optimization.

2015

29. VLSI2015_29 An Efficient Constant Multiplier Architecture

Based on Vertical-

Horizontal Binary Common

Sub-expression Elimination Algorithm for

Reconfigurable FIR Filter

Synthesis

This paper proposes an efficient constant multiplier architecture based on the vertical-horizontal binary common sub-expression

elimination (VHBCSE) algorithm for designing a

reconfigurable finite impulse response (FIR) filter whose

coefficients can dynamically change in real time. To design an efficient reconfigurable FIR filter, according to the

proposed VHBCSE algorithm, a 2-bit binary common sub-

expression elimination (BCSE) algorithm is first applied

vertically across adjacent coefficients on the 2-D space of the

coefficient matrix, followed by a variable-bit BCSE algorithm applied horizontally within each coefficient. This

technique is capable of reducing the average probability of

use or the switching activity of the multiplier block adders by

6.2% and 19.6% as compared to that of two existing 2-bit and

3-bit BCSE algorithms respectively. ASIC implementation results of FIR filters using this multiplier show that the

proposed VHBCSE algorithm is also successful in reducing

the average power consumption by 32% and 52% along with

an improvement in the area power product (APP) by 25%

and 66% compared to those of the 2-bit and 3-bit BCSE algorithms respectively. As regards the implementation of the

FIR filter, improvements of 13% and 28% in area delay

product (ADP) and 76.1% and 77.8% in power delay product

(PDP) for the proposed VHBCSE algorithm have been

achieved over those of the earlier multiple constant multiplication (MCM) algorithms, viz. faithfully rounded

truncated multiple constant multiplication/accumulation

(MCMAT) and multi-root binary partition graph (MBPG)

respectively. The efficiency shown by comparing

the FPGA and ASIC implementations of the reconfigurable FIR filter designed using the VHBCSE-algorithm-based constant

multiplier establishes the suitability of the proposed

algorithm for efficient fixed-point reconfigurable FIR filter

synthesis.

2015

30. VLSI2015_30 VLSI-Assisted Nonrigid

Registration Using

Modified Demons Algorithm

Increasing demand for high-speed portable modules for

multimedia applications has motivated the development of

hardware-based solutions for image processing applications. Most of the nonrigid image registration algorithms are found

to be unsuitable for hardware implementation because of

their nonlinearity and computationally intensive nature. In

this paper, an algorithm for nonrigid image registration based

on Demons approximation is proposed. The algorithm has been simulated in MATLAB and results show a 15%

improvement in peak signal-to-noise ratio with a 17%

reduction in registration time for 256 × 256 image over the

original Demons algorithm. The proposed algorithm is

synthesized in Virtex6-xc6vlx760-2-ff1760 and maximum synthesized frequency is found to be 174 MHz. The proposed

architecture provides the low cost, high-speed solution for the

registration process, which is also helpful for making a

portable system.

2015

31. VLSI2015_31 Fine-Grained Access

Management in

Reconfigurable Scan

Networks

Modern VLSI designs incorporate a high amount of

instrumentation that supports post-silicon validation and

debug, volume test and diagnosis, as well as in-field system

monitoring and maintenance. Reconfigurable scan

architectures, as allowed by the novel IEEE Std 1149.1-2013

(JTAG) and IEEE Std 1687-2014 (IJTAG), emerge as a scalable mechanism for access to such on-chip instruments.

While the on-chip instrumentation is crucial for meeting

quality, dependability, and time-to-market goals, it is prone

to abuse and threatens system safety and security. A secure

access management method is mandatory to assure that critical instruments be accessible to authorized entities only.

This work presents a novel protection method for fine-

grained access management in complex reconfigurable scan

networks based on a challenge-response authentication

protocol. The target scan network is extended with an authorization instrument and Secure Segment Insertion Bits

(S2IB) that together control the accessibility of individual

instruments. To the best of the authors’ knowledge, this is the

first fine-grained access management scheme that scales well

with the number of protected instruments and offers a high level of security. Compared with recent state-of-the-art

techniques, this scheme is more favorable with respect to

implementation cost, performance overhead, and provided

security level.

2015

32. VLSI2015_32 A High-Throughput VLSI

Architecture for Hard and

Soft SC-FDMA MIMO

Detectors

This paper introduces a novel low-complexity multiple-input

multiple-output (MIMO) detector tailored for single-carrier

frequency division-multiple access (SC-FDMA) systems,

suitable for efficient hardware implementations. The proposed detector starts with an initial estimate of the

transmitted signal based on a minimum mean square error

(MMSE) detector. Subsequently, it recognizes less reliable

symbols for which more candidates in the constellation are browsed to improve the initial estimate. Efficient high-

throughput VLSI architecture is also introduced achieving a

superior performance compared to the conventional MMSE

detectors with less than 28% added complexity. The

performance of the proposed design is close to the existing maximum likelihood post-detection processing (ML-PDP)

scheme, while resulting in a significantly lower complexity,

i.e., and times fewer Euclidean distance (ED) calculations in

the 16-QAM and 64-QAM schemes, respectively. The

proposed design for the 16-QAM scheme is fabricated in a 0.13-µm CMOS technology and fully tested, achieving a 1.332

Gbps throughput, reporting the first fabricated design for SC-

FDMA MIMO detectors to-date. A soft version of the

proposed architecture is also introduced, which is customized

for coded systems.

2015

33. VLSI2015_33 Partially Parallel Encoder

Architecture for Long Polar Codes

Due to the channel achieving property, the polar code has

become one of the most favorable error-correcting codes. As the polar code achieves the property asymptotically,

however, it should be long enough to have a good error-

correcting performance. Although the previous fully parallel

encoder is intuitive and easy to implement, it is not suitable

for long polar codes because of the huge hardware

complexity required. In this brief, we analyze the encoding

process from the viewpoint of very-large-scale integration

implementation and propose a new efficient encoder architecture that is adequate for long polar codes and

effective in alleviating the hardware complexity. As the

proposed encoder allows high-throughput encoding with

small hardware complexity, it can be systematically applied

to the design of any polar code and to any level of parallelism.

2015
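
As background for the encoding process discussed above, a polar codeword is x = u·F^{⊗n} over GF(2) with F = [[1, 0], [1, 1]]. The sketch below evaluates that product with the usual XOR butterfly stages, one stage after another; it models the function computed by the fully parallel encoder, not the partially parallel (folded) architecture of the brief. The function name is hypothetical and no bit-reversal permutation is applied, which is an assumption.

```python
def polar_encode(u):
    """Encode a list of bits u (length a power of two) as x = u * F^(kron n) over GF(2)."""
    n = len(u)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    x = list(u)
    step = 1
    while step < n:
        # One butterfly stage: the upper element of each pair absorbs the lower one.
        for start in range(0, n, 2 * step):
            for i in range(start, start + step):
                x[i] ^= x[i + step]
        step *= 2
    return x
```

For a code of length N this performs log2(N) stages of N/2 XORs each, which is the hardware cost that a partially parallel design trades against throughput.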

34. VLSI2015_34 Novel Block-Formulation

and Area-Delay-Efficient Reconfigurable Interpolation

Filter Architecture for

Multi-Standard SDR

Applications

A poly-phase based interpolation filter computation involves

an input-matrix and coefficient-matrix of size each, where is the up-sampling factor and , is the filter length. The input-

matrix and the coefficient-matrix resizes when changes. An

analysis of interpolation filter computation for different up-

sampling factors is made in this paper to identify redundant

computations and remove them by reusing partial results. Reuse of partial results eliminates the necessity of

matrix resizing in interpolation filter computation. A novel

block-formulation is presented to share the partial results for

parallel computation of filter outputs of different up-sampling

factors. Using the proposed block formulation, a parallel multiplier-based reconfigurable architecture is derived for

interpolation filter. The most remarkable aspect of the

proposed architecture is that it does not require

reconfiguration to compute filter outputs of an interpolation

filter for different up-sampling factors. The proposed structure has a regular data flow and no overhead complexity for

its reconfigurable feature unlike the existing structures.

Besides, the proposed structure has significantly less register

complexity than the existing structure and its register complexity is independent of the block-size. Moreover, the

proposed structure can support higher input-sampling

frequency than the existing structure. ASIC synthesis result

shows that the proposed structure for block-size 4,

filter length 32, and up-sampling factor 8, involves 13.6 times more area and offers 245 times higher maximum input-

sampling frequency compared with the existing multiplier-

less structure. It involves 18.6 times less area-delay-product

(ADP) and 9.5 times less energy per output (EPO) than the

existing multiplier-less structure.

2015

35. VLSI2015_35 One Minimum Only Trellis

Decoder for Non-Binary

Low-Density Parity-Check Codes

A one minimum only decoder for Trellis-EMS (OMO

T-EMS) and for Trellis-Min-max (OMO T-MM) is proposed

in this paper. In this novel approach, we avoid computing the second minimum in messages of the check node processor,

and propose efficient estimators to infer the second minimum

value. By doing so, we greatly reduce the complexity and at

the same time improve latency and throughput of the derived

architectures compared to the existing implementations of EMS and Min-max decoders. This solution has been applied

to various NB-LDPC codes constructed over different Galois

fields and with different degree distributions showing in all

cases negligible performance loss compared to the ideal EMS

and Min-max algorithms. In addition, two complete decoders

for OMO T-EMS and OMO T-MM were implemented for

the (837,726) NB-LDPC code over GF(32) for comparison purposes. A 90 nm CMOS process was applied, achieving a

throughput of 711 Mbps and 818 Mbps respectively at a

clock frequency of 250 MHz, with an area of 19.02 and 16.10

after place and route. To the best knowledge of the authors,

the proposed decoders have higher throughput and area-time efficiency than any other solution for high-rate NB-LDPC

codes with high Galois field order.

2015

36. VLSI2015_36 A Low-Cost Hardware

Architecture for Illumination

Adjustment in Real-Time

Applications

For real-time surveillance and safety applications in

intelligent transportation systems, high-speed processing for

image enhancement is necessary and must be considered. In

this paper, we propose a fast and efficient illumination

adjustment algorithm that is suitable for low-cost very large scale integration implementation. Experimental results show

that the proposed method requires the least number of

operations and achieves comparable visual quality as

compared with previous techniques. To further meet the

requirement of real-time image/video applications, the 16-stage pipelined hardware architecture of our method is

implemented as an intellectual property core. Our design

yields a processing rate of about 200 MHz by using TSMC

0.13-μm technology. Since it can process one pixel per clock

cycle, for an image with a resolution of QSXGA (2560 × 2048), it requires about 27 ms to process one frame that is

suitable for real-time applications. In some low-cost

intelligent imaging systems, the processing rate can be

slowed down, and our hardware core can run at very low

power consumption.

2015
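
The frame-time figure quoted above follows from the pixel count and the clock rate. The short calculation below is a sketch under the assumption that exactly one pixel is produced per clock cycle and that pipeline fill latency and blanking intervals are ignored; the function name is hypothetical.

```python
def frame_time_ms(width, height, clock_hz, pixels_per_clock=1):
    """Time to stream one frame through the core, in milliseconds."""
    cycles = (width * height) / pixels_per_clock
    return 1000.0 * cycles / clock_hz


# QSXGA frame at 200 MHz: 2560 * 2048 = 5,242,880 pixels.
qsxga_ms = frame_time_ms(2560, 2048, 200e6)
```

Under these assumptions the result is roughly 26.2 ms, consistent with the "about 27 ms" stated in the abstract once overheads are allowed for.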

37. VLSI2015_37 A 2.5-Gb/s DLL-Based

Burst-Mode Clock and Data Recovery

Circuit With 4×

Oversampling

In this brief, a delay-locked loop (DLL)-based burst-mode

clock and data recovery (BMCDR) circuit using a 4× oversampling technique is realized for passive optical

network. With the help of DLL to track the input phase, the

proposed circuit can recover the burst mode data in a short

acquisition time and achieve large jitter tolerance. In

addition, a 2.5-GHz four-phase clock generator is embedded in the chip. Implemented with a 0.18-µm CMOS technology,

experiment shows that the acquisition time can be

accomplished in the time of 31 bits. For incoming 2.5-Gb/s input

data of a 2³¹−1 pseudorandom binary sequence, the retimed

data has a root-mean-square jitter of 8.557 ps and a peak-to-peak jitter of 32.0 ps, and the measured bit error rate is less

than 10⁻¹⁰. The area of the whole chip is 1.4 × 1.4 mm²,

where the BMCDR circuit core occupies 0.81 × 0.325 mm².

The total power consumption is 130 mW from a 1.8 V supply

voltage.

2015

38. VLSI2015_38 Aging-Aware Reliable

Multiplier Design With Adaptive Hold Logic

Digital multipliers are among the most critical

arithmetic functional units. The overall performance of these systems depends on the throughput of the multiplier.

Meanwhile, the negative bias temperature instability effect

occurs when a pMOS transistor is under negative bias (Vgs =

−Vdd), increasing the threshold voltage of the pMOS

transistor, and reducing multiplier speed. A similar phenomenon, positive bias temperature instability, occurs

when an nMOS transistor is under positive bias. Both effects

degrade transistor speed, and in the long term, the system

may fail due to timing violations. Therefore, it is important to

design reliable high-performance multipliers. In this paper, we propose an aging-aware multiplier design

with a novel adaptive hold logic (AHL) circuit. The

multiplier is able to provide higher throughput through the

variable latency and can adjust the AHL circuit to mitigate

performance degradation that is due to the aging effect. Moreover, the proposed architecture can be applied to a

column- or row-bypassing multiplier. The experimental

results show that our proposed architecture with 16 × 16 and

32 × 32 column-bypassing multipliers can attain up to

62.88% and 76.28% performance improvement, respectively, compared with 16 × 16 and 32 × 32 fixed-latency column-

bypassing multipliers. Furthermore, our proposed

architecture with 16 × 16 and 32 × 32 row-bypassing

multipliers can achieve up to 80.17% and 69.40%

performance improvement as compared with 16 × 16 and 32 × 32 fixed-latency row-bypassing multipliers.

2015

39. VLSI2015_39 Reverse Converter Design

via Parallel-Prefix Adders: Novel Components,

Methodology, and

Implementations

In this brief, the implementation of residue number system

reverse converters based on well-known regular and modular parallel prefix adders is analyzed. The VLSI implementation

results show a significant delay reduction and area × time²

improvements, all this at the cost of higher power

consumption, which is the main reason preventing the use of parallel-prefix adders to achieve high-speed reverse

converters in today's systems. Hence, to solve the high

power consumption problem, novel specific hybrid parallel-

prefix-based adder components that provide better tradeoff

between delay and power consumption are herein presented to design reverse converters. A methodology is also

described to design reverse converters based on different

kinds of prefix adders. This methodology helps the designer

to adjust the performance of the reverse converter based on

the target application and existing constraints.

2015
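
A reverse converter maps a set of residues back to the binary integer they represent. The sketch below shows that function using the general Chinese Remainder Theorem, not the adder-based hardware structures analyzed in the brief; the function names and the example moduli set {2^n − 1, 2^n, 2^n + 1} with n = 3 are assumptions chosen for illustration. It relies on `math.prod` and three-argument `pow` with a negative exponent, both available from Python 3.8.

```python
from math import prod


def forward_convert(x, moduli):
    """Binary integer -> residues."""
    return [x % m for m in moduli]


def reverse_convert(residues, moduli):
    """Residues -> binary integer via the Chinese Remainder Theorem.

    The moduli must be pairwise coprime.
    """
    m_total = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        m_i = m_total // m
        x += r * m_i * pow(m_i, -1, m)  # pow(..., -1, m) is the modular inverse
    return x % m_total


MODULI = (7, 8, 9)  # {2^n - 1, 2^n, 2^n + 1} for n = 3; dynamic range 504
```

The final reduction modulo the full dynamic range is the wide modular addition that dominates converter delay, which is where fast adder structures such as parallel-prefix adders come in.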

40. VLSI2015_40 Fully Reused VLSI

Architecture of

FM0/Manchester Encoding Using SOLS

Technique for DSRC

Applications

The dedicated short-range communication (DSRC)

is an emerging technique to push the intelligent transportation

system into our daily life. The DSRC standards generally adopt FM0 and Manchester codes to reach dc-balance,

enhancing the signal reliability. Nevertheless, the coding-

diversity between the FM0 and Manchester codes seriously

limits the potential to design a fully reused VLSI architecture

for both. In this paper, the similarity-oriented logic simplification (SOLS) technique is proposed to overcome

this limitation. The SOLS technique improves the hardware

utilization rate from 57.14% to 100% for both FM0 and

Manchester encodings. The performance of this paper is

evaluated on the post-layout simulation in Taiwan

Semiconductor Manufacturing Company (TSMC) 0.18-µm

1P6M CMOS technology. The maximum operation frequency is 2 GHz and 900 MHz for Manchester and FM0

encodings, respectively. The power consumption is 1.58 mW

at 2 GHz for Manchester encoding and 1.14 mW at 900 MHz

for FM0 encoding. The core circuit area is 65.98 × 30.43

µm². The encoding capability of this paper can fully support the DSRC standards of America, Europe, and Japan.

2015
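
The two line codes can be described behaviorally as follows. This is a software sketch only and does not model the SOLS hardware sharing; each input bit is emitted as a pair of half-bit levels. Two conventions are assumed here and may differ from a given standard: Manchester output is the data bit XOR a clock that is 1 in the first half-bit and 0 in the second, and FM0 toggles at every bit boundary with an extra mid-bit toggle for a 0. The function names are hypothetical.

```python
def manchester_encode(bits):
    """Manchester code as bit XOR clock, with clock = 1 then 0 within each bit."""
    return [(b ^ 1, b ^ 0) for b in bits]


def fm0_encode(bits, initial_level=1):
    """FM0 code: toggle at every bit boundary, plus a mid-bit toggle for a 0.

    initial_level is the line level just before the first bit.
    """
    level = initial_level
    out = []
    for b in bits:
        level ^= 1              # transition at the start of every bit
        first_half = level
        if b == 0:
            level ^= 1          # additional transition in the middle of a 0
        out.append((first_half, level))
    return out
```

Manchester output depends only on the current bit, whereas FM0 output also depends on the previous line level; this difference in structure is the coding diversity that the abstract says limits hardware reuse.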

VLSI PROJECTS 2014

SN PROJECT CODE PROJECT TOPIC YEAR

1

NVLSI1449

Topic: Argo: A Time-Elastic Time-Division-Multiplexed NOC using Asynchronous

Routers

Abstract: In this paper we explore the use of asynchronous routers in a time-division-

multiplexed (TDM) network-on-chip (NOC), Argo, that is being developed for a multi-

processor platform for hard real-time systems. TDM inherently requires a common time

reference, and existing TDM-based NOC designs are either synchronous or mesochronous. We use asynchronous routers to achieve a simpler, smaller, and more robust, self-timed

design. Our design exploits the fact that pipelined asynchronous circuits also behave as

ripple FIFOs. Thus, it avoids the need for explicit synchronization FIFOs between the

routers. Argo has interesting elastic timing properties that allow it to tolerate skew between

the network interfaces (NIs). The paper presents the Argo NOC architecture and provides a quantitative analysis of its ability to absorb skew between the NIs. Using a signal transition

graph model and realistic component delays derived from a 65 nm CMOS implementation, a

worst case analysis shows that a typical design can tolerate a skew of 1-5 cycles (depending

on FIFO depths and NI clock frequency). Simulation results of a 2×2 NOC confirm this.

2014

2

NVLSI1448

Topic: High Performance BIST PLL Approach for VCO Testing

Abstract: RF and mixed-signal IC testing is becoming an important issue that affects both the time-to-market and product cost of many modern electronic systems. This paper focuses

on one mixed-signal IC, the phase-locked loop (PLL). A novel BIST (Built-In Self-

Test) approach is developed for RF PLLs; it is particularly applied to testing the VCO block.

The proposed BIST scheme does not break the loop to include a test circuit at the PLL design

stage, which is achieved with minimal degradation of the PLL characteristics. The key advantage of this technique is that it uses an internal test signal for evaluating the test procedure. The

presented architecture uses the existing elements for measuring and testing in order to reduce

the area overhead of the BIST scheme, solves the analog node loading problem, and improves

the test accessibility. The test output generated is a purely digital signal. The BIST method

enables the detection of catastrophic and many parametric faults affecting the VCO by measuring its oscillation frequency response. To evaluate the effectiveness of the proposed BIST

approach, fault simulation was performed; the results indicate that the BIST structure achieves a

high fault coverage of 100%.

2014

3

NVLSI1447

Topic: Performance Evaluation of Column-Scaled LDPC Codes Under Fading Channel

Conditions

Abstract: Column scaling of LDPC codes is generally done to reduce the decoding complexity without degradation in bit error rate. It has been shown that CS-LDPC codes constructed this way

perform better than the existing regular and deterministic LDPC codes in

terms of Bit Error Rate (BER). In this paper, ability of Column Scaled LDPC codes in

reaching the best performance is evaluated for different fading channel conditions such as

Additive White Gaussian Noise (AWGN), Rayleigh and Rician channels. Diagonal elements (that are not equal to zero) are distributed randomly in order to generate CS-LDPC. In

Column Scaled Low Density Parity Check codes, the non-binary parity check matrix H is

derived using a Galois field (finite field) polynomial, and the non-binary H matrix is then

converted to a binary H matrix. CS-LDPC codes have parity-check matrices that are composed

of binary and diagonal matrices, which eases implementation. Analysis results show that the CS-LDPC scheme improves Bit Error Rate and Frame Error Rate for

AWGN and Rician channels but gives no significant improvement for Rayleigh channels.

2014

4

NVLSI1446

Topic: A Low-Cost Platform for Voice Monitoring

Abstract: A low-cost platform is proposed in this paper that has been conceived to monitor the vocal activity of people who use the voice as a professional tool. The platform includes a wearable data-logger and a processing program that extracts the vocal parameters from the recorded signal. The data-logger is equipped with a contact microphone that is attached to the jugular notch of the person under monitoring, thus sensing the skin acceleration level due to the vibration of the vocal folds. The microphone output is conditioned through custom circuitry and then sent to a cheap micro-controller based board, which stores the raw samples onto a micro SD card. The off-line processing provides an estimation of Sound Pressure Level (SPL), fundamental frequency (F0) and Time Dose (Dt), which are the parameters that seem most suitable for the identification of vocal disorders and the prevention of improper use of the voice. For the estimated parameters, suitable calibration procedures are implemented and their effectiveness is shown through specifically conceived experimental tests. Experimental results are shown that refer to the calibration of the device and its normal use during monitoring intervals of several hours. A comparison with a commercial device is also reported.

2014

5

NVLSI1445

Topic: Implementation of High Speed Low Power Combinational and Sequential

Circuits using Reversible logic

Abstract: Reversible logic has presented itself as a prominent technology which plays an

imperative role in Quantum Computing. Quantum computing devices theoretically operate

at ultra-high speed and consume very little power. The research in this paper aims

to use reversible logic to break the conventional speed-power trade-off, thereby getting a step closer to realizing quantum computing devices. To validate this

research, various combinational and sequential circuits are implemented such as a 4-bit

Ripple-carry Adder, (8-bit X 8-bit) Wallace Tree Multiplier, and the Control Unit of an

8-bit GCD processor using reversible gates. The power and speed parameters for the circuits

are reported and compared with their conventional non-reversible counterparts. The comparative study shows that circuits employing reversible logic are faster

and more power efficient. The designs presented in this paper were simulated using Xilinx 9.2

software.

2014

6

NVLSI1444

Topic: A LOW POWER BIST SCHEME BASED ON BLOCK ENCODING

Abstract: With the development of integrated circuit manufacturing technology, low power

test has become a focus of concern in the testing field. This paper proposes a new low power BIST (built-in self-test) scheme based on block encoding, which first exploits a block

re-encoding method to optimize the test cubes, and then applies a low power test based on LFSR

(linear feedback shift register) reseeding. According to the compatibility of flags,

the scheme proposes a flag-based grouping algorithm to divide and reorder the test cubes

in the test cube set. Experimental results show that the scheme not only obtains a better test compression ratio and test data storage, but also reduces the test power consumption

effectively. Key words: LFSR reseeding; test data compression; low power test; test cube

block.

2014

7

NVLSI1443

Topic: Designing of FPGA Based High Performance 32 Bit FFT Processor With

BIST

Abstract: The design and implementation of a 32-bit, 64-point pipelined FFT processor is presented in this paper. The FFT processor is to be implemented on a Field Programmable Gate Array (FPGA). The aim is to reduce the number of cycles required for computation. The FFT architecture has two pipelines: one in the execution of the complex multiplication of the butterfly unit and the other in the RAM unit. A novel, simple address mapping scheme is proposed for this architecture. The twiddle factors are not stored in ROM; they are generated and accessed directly. The Built-In Self-Test (BIST) provided allows the design to test itself.

2014

8

NVLSI1442

Topic: Low Power and High Performance Achievement Using Constant Delay

Logic Style

Abstract: High performance with energy efficiency is one of the most important goals

in the design of VLSI circuits. To achieve this, a new CMOS logic family,

constant delay (CD) logic, is used. CD logic has contention, C-Q delay and D-Q delay

modes. In CD logic, the D-Q delay mode offers a distinct characteristic where the

output is pre-calculated before getting the inputs from the previous stage. This logic provides a performance improvement over static and dynamic logic styles in multistage

circuit blocks. In accordance with the logic type, the CD logic style is suitable for

implementing difficult logic expressions such as addition. The three modes of CD logic

are designed, simulated and synthesized. A full adder is also designed, simulated and

synthesized at transistor level using static, dynamic and CD logic styles in Tanner EDA. The synthesized results of the full adder demonstrate that

the full adder using the CD logic style has lower delay, which enhances performance, but

consumes more power than the other two logic styles. Low power is likely to be a key

objective in VLSI circuit design. To achieve this, the low power techniques of clock gating

and supply voltage scaling are also used in the full adder and a 4-bit ripple carry adder using the CD logic style.

2014

9

NVLSI1441

Topic: Reconfigurable Edge Detection Processor Using Xilinx Platform Studio

Abstract: In this paper we propose a technique for software implementation of Edge

detection which serves as a preprocessing step for many image processing algorithms such as

image enhancement, image segmentation, tracking and image and video coding. The Edge Detection is one of the key stages in image processing and object recognition. Edge detection

is a basic operation in image processing which refers to the process of identifying and

locating sharp discontinuities in an image. The discontinuities are abrupt changes in pixel

intensity which characterize boundaries of objects in a scene. It plays a major role in many

algorithms used for segmentation and tracking. This paper presents an edge detection algorithm that results in significantly reduced memory requirements, decreased latency and

increased throughput with no loss in edge detection performance using the MicroBlaze

Processor. This edge detection algorithm is based on MATLAB simulation and FPGA

implementation through serial communication using Xilinx Platform Studio.

2014

10

NVLSI1440

Topic: Reconfigurable System-On-Chip Design Using FPGA

Abstract: System-on-Chip (SoC) design integrates processors, memory, and a variety of IPs in a single design. Due to FPGA capabilities and high time-to-market pressures, complex SoC designs are increasingly targeted to FPGAs. Traditionally, cores in FPGAs are connected using AXI and PLB bus-based architectures. FPGA devices provide embedded systems development with new alternatives for creating hardware-accelerated applications. The availability of embedded processor subsystems in FPGAs opens the door to a myriad of applications. The reconfigurable System-on-Chip architecture includes a MicroBlaze soft-core processor that integrates peripherals over the PLB and OPB buses and provides access to memory, PS/2 and VGA IP cores. A new peripheral-based arithmetic application is designed; the keyboard module is a custom hardware module that accepts input from a PS/2 serial keyboard and outputs character data to the VGA input memory. VHDL is used in ISE for custom logic design. A SystemC and VHDL co-synthesis scenario provides a way of checking the interoperability of hardware modules with different functionality within a single design. Both designs are synthesizable, implemented in a single bitstream, and configured to the FPGA. Two-level functionality is observed for the configured bitstream on the FPGA hardware; design modeling was done using SystemC and VHDL co-synthesis. This paper presents an evaluation of design methods and concepts of reconfigurable architecture, which provide many options for system designers. Co-synthesis was done using either top-down or bottom-up design methodologies. The implementation was targeted at a Spartan-3E FPGA board.

2014

11

NVLSI1439

Topic: Performance Analysis of a Space-Frequency Block Coded OFDM Wireless

Communication System with MSK and GMSK Modulation

Abstract: Space frequency block codes (SFBC) are very efficient in overcoming the

effect of frequency-selective fading channels in a wireless communication system. In this

paper, bit error rate performance analysis is carried out for a SFBC-OFDM system with

MSK and GMSK modulation schemes. Results are evaluated numerically for SISO and MIMO communication links. It is shown that, for a fixed bit error rate the improvement in

SNR for both SFBC coded MSK and GMSK modulation is noticeable. Also the receiver

sensitivity is evaluated for system BER. It is shown that sensitivity improves for the change

in code rate but remains nearly same for the same combination of transmit and receive

antennas both for SFBC coded MSK and GMSK modulation scheme.

2014

12

NVLSI1438

Topic: Modified Wallace Tree Multiplier using Efficient Square Root Carry Select

Adder

Abstract: A multiplier is one of the key hardware blocks in most digital and high

performance systems such as FIR filters, microprocessors and digital signal processors.

A system’s performance is generally determined by the performance of the multiplier,

because the multiplier is generally the slowest element in the whole system and also occupies the most area. The Carry Select Adder (CSLA) provides a good

compromise between cost and performance in carry propagation adder design. A Square

Root Carry Select Adder using RCAs has been introduced, but it incurs some speed penalty.

Moreover, the conventional CSLA is still area-consuming due to its dual ripple carry adder

structure. In a Wallace multiplier, the partial products are reduced as soon as possible, and in the proposed work a carry select adder is used in the final carry propagation path. In

this paper, modification is done at gate level to reduce area and power consumption. The

Modified Square Root Carry Select Adder (MCSLA) is designed using Common Boolean

Logic and compared with the respective regular CSLA architectures, and this MCSLA is

implemented in a Wallace Tree Multiplier. This work gives a reduced area compared to a normal Wallace tree multiplier. Finally, an area efficient Wallace tree multiplier is designed

using a common Boolean logic based square root carry select adder.

2014
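
The carry-select idea described in this abstract can be sketched behaviourally. The model below is the conventional dual ripple-carry CSLA (the baseline the abstract calls area-consuming), not the Common Boolean Logic variant proposed in the paper. The 16-bit width and the block sizes (2, 2, 3, 4, 5) are illustrative assumptions, as are all function names.

```python
def ripple_carry_add(a_bits, b_bits, carry_in):
    """Add two equal-length, LSB-first bit lists; return (sum_bits, carry_out)."""
    sum_bits = []
    carry = carry_in
    for a, b in zip(a_bits, b_bits):
        sum_bits.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return sum_bits, carry


def carry_select_add(a, b, width=16, block_sizes=(2, 2, 3, 4, 5)):
    """Add two unsigned integers with a conventional square-root CSLA model.

    block_sizes is an illustrative grouping (an assumption, not taken from
    the paper); it must sum to width.
    """
    if sum(block_sizes) != width:
        raise ValueError("block sizes must sum to the adder width")
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    result_bits = []
    carry = 0
    position = 0
    for size in block_sizes:
        a_block = a_bits[position:position + size]
        b_block = b_bits[position:position + size]
        # In hardware both candidates are computed in parallel by two RCAs,
        # which is the source of the area overhead.
        sum_if_0, carry_if_0 = ripple_carry_add(a_block, b_block, 0)
        sum_if_1, carry_if_1 = ripple_carry_add(a_block, b_block, 1)
        # The incoming carry only drives the multiplexer select.
        if carry:
            result_bits.extend(sum_if_1)
            carry = carry_if_1
        else:
            result_bits.extend(sum_if_0)
            carry = carry_if_0
        position += size
    total = sum(bit << i for i, bit in enumerate(result_bits))
    return total, carry
```

Calling `carry_select_add(a, b)` returns the 16-bit sum and the carry out. Each block computes two candidate sums, which is why area-reduction schemes target the duplicated adder.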

13

NVLSI1437

Topic: Design of an Energy Efficient, High Speed, Low Power Full Subtractor Using

GDI Technique

Abstract: This paper proposes the design of an energy efficient, high speed and low power full subtractor using the Gate Diffusion Input (GDI) technique. The entire design has been performed in 150 nm technology. In comparison with full subtractors employing conventional CMOS transistors, transmission gates and Complementary Pass-Transistor Logic (CPL), respectively, there is a considerable reduction in average power consumption (Pavg), delay time and Power Delay Product (PDP). Pavg is as low as 13.96 nW while the delay time is 18.02 ps, giving a PDP as low as 2.51 × 10^-19 J for a 1 V power supply. In addition, there is a significant reduction in transistor count compared to traditional full subtractors employing CMOS transistors, transmission gates and CPL, implying a smaller area. The simulation of the proposed design has been carried out in Tanner SPICE and the layout has been designed in Microwind.

2014
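
The quoted power-delay product can be checked from the two other figures in the abstract, since PDP is the product of average power and delay. The short sketch below does only that arithmetic; the function and variable names are illustrative.

```python
def power_delay_product(average_power_w, delay_s):
    """Return the power-delay product in joules (watts times seconds)."""
    return average_power_w * delay_s


p_avg = 13.96e-9   # 13.96 nW, as quoted in the abstract
delay = 18.02e-12  # 18.02 ps, as quoted in the abstract
pdp = power_delay_product(p_avg, delay)
print(f"PDP = {pdp:.3e} J")
```

The result is roughly 2.5 × 10^-19 J, which is consistent with the quoted PDP up to rounding.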

14

NVLSI1436

Topic: Design and Implementation of Area Efficient, Low Power AMBA-APB Bridge

for SoC

Abstract: In this paper, we present the design of Advanced Peripheral Bus (APB) controller

(or APB Bridge). A UART is used as the APB slave in the design. A linear feedback shift

register (LFSR) module has been included in the UART design for data security. We have

also compared an APB Bridge design compatible with the AMBA Specification (Rev 2.0) against one compatible with the AMBA 3 APB Specification (v1.0) in terms of

power and area. The APB Bridge designed to AMBA 3 APB saves

6% power and 10% area over the one designed to AMBA 2 APB.

2014

15

NVLSI1435

Topic: Designing a Learning Platform for the Implementation of Serial Standards

using ARM Microcontroller LPC2148

Abstract: In embedded system design, managing communication among various bus

interfaces and attaching multiple systems with different interfacing protocols to a main

processor is one of the challenging tasks. Popular serial interfacing protocols include

USB, I2C, SPI/SSP, CAN and UART, which are used for communication between integrated circuits

and on-board peripherals at low to medium data transfer speeds. This paper presents a platform that implements several of the above serial protocols

provided by a low power 32-bit ARM RISC processor, the LPC2148, with suitable

examples, including hardware and software details. This platform is also useful for

students of different disciplines to work with different serial protocols, which helps

them in interfacing sensors, memory ICs, analog subsystems and so on. It also aims to provide students with hands-on experience and practice in embedded systems while

minimizing the prerequisite knowledge.

2014

16

NVLSI1434

Topic: RGB Based KMB Image Compression Technique

Abstract: With the increased requirement of bandwidth in digital media, the compression

of an image is an important issue. However the various image compression technologies

which are still in use such as JPEG/PNG/DCT offer an efficient way for the

compression/extraction of an image and ease data transmission. The technique used here is more helpful in reducing the bandwidth of an image

and in improving its availability, reliability, and transmission rates. In this

technique, an image compression domain algorithm aims at high performance in terms

of image effectiveness.

2014

17

NVLSI1433

Topic: Built-In Self-Test for Analog-to-Digital Converters in SoC Applications

Abstract: This paper presents a built-in self-test (BIST) architecture for testing high speed analog-to-digital converters (ADCs) with sampling rates in excess of 1 GHz. A methodology

for performing mixed-mode BIST simulations in SoC applications is proposed along with

hardware for performing on-chip BIST. The architecture presented

utilizes an on-chip ROM and allows for the generation of test signals with single frequency

as well as multiple-frequency signals. The issues associated with BIST signal generation for low voltage ADCs are also discussed. Simulations revealed that the SFDR of the

sinusoidal signal generated from the BIST hardware was 25.28 dB with a frequency of

312.5 MHz and 19.88 dB with a frequency of 416.67 MHz.

2014

18

NVLSI1432

Topic: A SoC Design and Implementation of H.264 Video Encoding System Based on

FPGA

Abstract: A SoC design of H.264 Video Encoding system is implemented based on FPGA

in this paper. The intra prediction algorithm and baseline profile are selected, and the H.264 encoder

algorithm is designed as an IP core and embedded in the SoC through the AMBA AXI

interconnect bus. The SoC is implemented on a Xilinx Zynq-7000 FPGA and each

functional module is simulated in ModelSim and tested within the SoC platform. Compared with existing H.264 video encoding systems based on ARM or DSP, results indicate that

this special SoC can fully show its advantages in speed and flexibility.

The implemented system also meets the required rate for processing HD-1080

format video sequences.

2014

19

NVLSI1431

Topic: Design and Analysis of a Simple D Flip-Flop Based Sequential Logic Circuits for

QCA Implementation

Abstract: Quantum-dot Cellular Automata (QCA) is one of the emerging computing

paradigms. Its advantages such as smaller size, lower power consumption and faster speed are very attractive. QCA performs highly dense computing that could be realized in a variety

of material systems. It is presently being investigated as an alternative to CMOS VLSI. In

conventional digital systems the information is transferred from one place to another by

means of electrical current, whereas QCA cells transfer information by propagating a

polarization state. This paper presents the detailed design and simulation of simple D flip-flop based sequential logic circuits, such as shift register, ring counter and modulo-n counter

circuits for quantum-dot cellular automata. The proposed designs are based on the D-type

flip-flop (DFF) device. A QCA binary wire with four clocking zones can be used to

implement a DFF. The aim is to maximize the circuit density and focus on a layout that is

minimal in its use of cells.

2014

20

NVLSI1430

Topic: Multiple-Clock Multiple-Edge-Triggered Multiple-Bit Flip-flops for Two-Phase

Handshaking Asynchronous Circuits

Abstract: This paper proposes multiple-clock multiple-edge triggered multiple-bit flip-flops

for designing simple and straightforward asynchronous control circuits of the two-phase

handshaking protocol. The proposed flip-flops have multiple clocks and multiple data inputs,

and each data input can be stored in the flip-flop at both the rising edge and the falling edge

of the corresponding clock. They can be applied in the asynchronous design of the two-phase handshaking protocol not only for synthesizing simple control circuits, but also for obtaining

robust circuits. The performance of the proposed flip-flops has been evaluated using the

PTM 22nm HP device parameters.

2014

21

NVLSI1429

Topic: Efficient Design of Sparse FIR Filters with Optimized Filter Length

Abstract: A large number of experiments have demonstrated that for an FIR filter the

sparsity of filter coefficients is highly related to its filter order. However, traditional sparse FIR filter design methods focus on how to increase the number of zero-valued coefficients,

but overlook the impact of filter orders on design performance. As an attempt to jointly

optimize filter length and sparsity of an FIR filter, a novel method is proposed in this paper

to design sparse linear-phase FIR filters. With peak error constraints, the objective function

of the design problem is formulated as a combination of the sparsity of filter coefficients and a measure of the effective filter order. The design problem is then recast as a weighted

l0-norm optimization problem, which is solved by an efficient numerical method based on

the iterative-reweighted-least-squares (IRLS) algorithms. Experimental results illustrate that

the proposed method can efficiently reduce the effective filter order while

enhancing the sparsity of an FIR filter.

2014

22

NVLSI1428

Topic: A novel approach to realize Built-in-self-test(BIST) enabled UART using VHDL

Abstract: Testing of VLSI chips is becoming more complex day by day due to the rapid advancement of nanotechnology. Both front-end and back-end

engineers are therefore trying to develop systems with full testability, with the aim of

reducing product failures and missed market opportunities. BIST is a design technique that

allows a system to test itself automatically at the cost of a slightly larger system size. In this paper, the

simulated performance achieved by a BIST-enabled UART architecture written in VHDL is enough to compensate for the extra hardware needed by the BIST architecture. This

technique generates random test patterns automatically, so it can provide a shorter test time

than externally applied test patterns and helps to achieve higher productivity

in the end.

2014

23

NVLSI1427

Topic: Architecture for Monitoring SET Propagation in 16-bit Sklansky Adder

Abstract: We propose a measurement architecture that allows tracing the generation and

propagation of single event transients in a combinational target circuit that will be subjected to radiation in an experimental study. We choose the Sklansky adder as a target circuit, since

it exhibits both properties we are interested in, namely different amounts of fanout and a

carry propagation chain. The problem of devising a suitable on-chip measurement

infrastructure lies in the partly contradictory requirements, like constrained area, radiation

tolerance and good resolution of the location and propagation path of particle hits. Our proposed architecture is based on linear feedback shift registers that can be used as lean and

robust counter implementations. These counters are attached at selected locations within the

target adder circuit, and we show by means of a simulation study as well as a fault dictionary

that this architecture indeed comes up to our expectations.

2014
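
The use of a linear feedback shift register as a lean counter, mentioned in this abstract, can be illustrated with a small model. This is a minimal sketch and not the paper's circuit: the 4-bit width, the feedback polynomial x^4 + x^3 + 1, the seed value and all function names are assumptions chosen for illustration. Each counted event advances the LFSR by one step, and the final state is decoded offline through a lookup table.

```python
def lfsr_step(state, width=4, taps=(4, 3)):
    """Advance a Fibonacci LFSR by one event.

    taps are 1-based bit positions that are XORed to form the feedback bit;
    (4, 3) corresponds to the polynomial x^4 + x^3 + 1 (an assumed choice).
    """
    feedback = 0
    for tap in taps:
        feedback ^= (state >> (tap - 1)) & 1
    return ((state << 1) | feedback) & ((1 << width) - 1)


def build_decode_table(seed=1, width=4, taps=(4, 3)):
    """Map each reachable LFSR state to the number of events that produced it."""
    table = {}
    state = seed
    count = 0
    while state not in table:
        table[state] = count
        state = lfsr_step(state, width, taps)
        count += 1
    return table


def count_events(num_events, seed=1, width=4, taps=(4, 3)):
    """Return the LFSR state after num_events pulses, starting from seed."""
    state = seed
    for _ in range(num_events):
        state = lfsr_step(state, width, taps)
    return state


decode = build_decode_table()
final_state = count_events(9)
print(decode[final_state])  # number of events recovered from the raw state
```

Compared with a binary counter, the next-state logic here is a single XOR per step, at the cost of a decode step afterwards. The all-zero state is never reached from a non-zero seed, so a 4-bit register of this kind counts modulo 15 rather than 16.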

24

NVLSI1426

Topic: High Performance Low Swing Clock Tree Synthesis with Custom D Flip-Flop

Design

Abstract: Low swing clocking is a low power design methodology that scales the clock voltage to decrease power consumption of the clock distribution networks, with an expected

degradation in the performance. In this work, a novel low swing clock tree synthesis

methodology is combined with a custom low swing clock-aware D flip-flop (DFF) design.

The low swing clocking serves to reduce the power dissipation whereas the custom low

swing-aware DFF serves to preserve the performance of the IC. Experiments on the three largest circuits of the ISCAS’89 benchmarks operating at 1 GHz in the

32nm technology show that the proposed methodology can achieve an average of 16% power

savings in the clock tree compared to its full swing counterpart, while satisfying the same

clock skew (50ps) and slew (150ps) constraints at the worst case corner of operation.

Moreover, the clock-to-output delay of the low swing DFF does not increase compared to traditional full swing DFF, while consuming only 1% more power.

2014

25

NVLSI1425

Topic: Securing RObust Header Compression (ROHC)

Abstract: The desire of the cellular and wireless industry to converge on an all-IP infrastructure, fueled by the increased usage of mobile applications on smart phones and VoIP applications, has pushed research into maximizing bandwidth efficiency amidst a shrinking allocation of RF spectrum. One method of providing increased bandwidth efficiency (especially with the desire to move to IPv6) is the use of RObust Header Compression (ROHC-RFC5225) to compress headers from the network layer and above into small identifiers before sending packets to the link layer. ROHCv1 and ROHCv2 have been adopted and are in the roadmaps for usage on High Speed Packet Access (HSPA), Long Term Evolution (LTE) and Evolution Data Optimized (EVDO) mobile phone networks. Although significant bandwidth savings can be achieved using ROHC, the stateful nature of the protocol leads to potential compromises. In this paper, we examine three attacks on the ROHC protocol that result in denial of service and packet interception, and their effect on networks that use ROHC to compress and decompress IP headers. Additionally, we propose three simple methods to mitigate the attacks.

2013

26

NVLSI1424

Topic: Shift Register Design Using Two Bit Flip-Flop

Abstract: The concept of multi-bit flip-flops has proved to be an effective way of

processing multiple bits simultaneously. In this paper we propose a way of using the multi-bit

flip-flop technique in designing various digital circuits. By sharing the inverters in the

flip-flops, the total number of inverters can be reduced in a multi-bit flip-flop. Here, we have

designed a shift register, which is an important memory element in digital systems, using a 2-bit flip-flop. Experimental results reveal that our approach is very efficient and can be

effortlessly incorporated in modern VLSI circuit designs.

2014

27

NVLSI1423

Topic: Design and Estimation of delay, power and area for Parallel prefix adders

Abstract: In Very Large Scale Integration (VLSI) designs, parallel prefix adders (PPAs) offer

better delay performance. This paper investigates four types of PPA (Kogge Stone

Adder (KSA), Spanning Tree Adder (STA), Brent Kung Adder (BKA) and Sparse Kogge Stone Adder (SKA)). Additionally, the Ripple Carry Adder (RCA), Carry Look-ahead Adder

(CLA) and Carry Skip Adder (CSA) are also investigated. These adders are described in the

Verilog Hardware Description Language (HDL) using the Xilinx Integrated Software

Environment (ISE) 13.2 Design Suite. The designs are implemented on Xilinx Virtex 5

Field Programmable Gate Arrays (FPGAs), delays are measured using an Agilent 1692A logic analyzer, and the delay, power and area of all these adders are investigated and

compared.

2014
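
To make the parallel prefix idea concrete, the following is a behavioural sketch of a Kogge-Stone adder, one of the four PPA types named above. It is an illustrative software model using word-wide bit operations, not the Verilog from the paper; the 16-bit default width and the function name are assumptions.

```python
def kogge_stone_add(a, b, width=16):
    """Add two unsigned integers with Kogge-Stone prefix carry computation.

    Returns (sum modulo 2**width, carry_out). Every bit position is handled
    at once with word-wide operations, mirroring the log2(width) prefix
    stages of the hardware structure.
    """
    mask = (1 << width) - 1
    generate = a & b & mask       # bit i creates a carry on its own
    propagate = (a ^ b) & mask    # bit i passes an incoming carry along
    g, p = generate, propagate
    distance = 1
    while distance < width:
        # Combine each group with the group `distance` bits below it.
        g = (g | (p & (g << distance))) & mask
        p = (p & (p << distance)) & mask
        distance *= 2
    # After the prefix stages, bit i of g is the carry out of bit i,
    # so the carry into bit i is bit i-1 of g.
    carries_in = (g << 1) & mask
    total = propagate ^ carries_in
    carry_out = (g >> (width - 1)) & 1
    return total, carry_out
```

For a 16-bit operand the loop runs four times (distances 1, 2, 4 and 8), whereas a ripple carry adder needs the carry to pass through all 16 bit positions in sequence.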

28

NVLSI1422

Topic: Design of a 4-bit Adder using Reversible Logic in Quantum-Dot Cellular

Automata (QCA)

Abstract: Both quantum-dot cellular automata (QCA) and reversible logic are emerging

technologies that are promising alternatives to overcoming the scaling and heat dissipation issues, respectively, in the current CMOS designs. Here, the fundamentals of QCA and

reversible logic are studied; the feasibility of incorporating reversible logic in QCA designs

is also demonstrated. Based on two existing designs, an improved version of the reversible

gates, namely the Feynman Gate and the Toffoli Gate, were implemented in QCA

technology using QCADesigner. The proposed design of the QCA-based Feynman Gate is faster by ½ cycle as compared to the existing design; while the proposed Toffoli Gate has the

same latency as the existing design but can readily be cascaded into a more complex

design. A 4-bit ripple carry adder in QCA is then designed using the proposed Feynman and

Toffoli gates to realize a reversible QCA full adder. This 4-bit QCA adder with reversible

logic consists of 2030 QCA cells, has a latency of 7 clock cycles and 8 garbage outputs.

2014
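
The logical behaviour of the two reversible gates named in this abstract, and one way of combining them into a ripple carry adder, can be sketched as follows. This models only the Boolean function, not the QCA layout or clocking, and the particular gate arrangement (two Toffoli and two Feynman gates per full adder with one zero-initialised ancilla line) is an assumed textbook-style construction rather than the paper's design. All function names are illustrative.

```python
def feynman(a, b):
    """Feynman (CNOT) gate: (a, b) -> (a, a XOR b)."""
    return a, a ^ b


def toffoli(a, b, c):
    """Toffoli (CCNOT) gate: (a, b, c) -> (a, b, c XOR (a AND b))."""
    return a, b, c ^ (a & b)


def reversible_full_adder(a, b, carry_in):
    """Full adder built from two Toffoli and two Feynman gates.

    Uses one ancilla line initialised to 0. Returns (sum, carry_out, garbage),
    where garbage holds the two output lines that carry no useful result.
    """
    ancilla = 0
    a, b, ancilla = toffoli(a, b, ancilla)                # ancilla = a AND b
    a, b = feynman(a, b)                                  # b = a XOR b
    b, carry_in, ancilla = toffoli(b, carry_in, ancilla)  # ancilla = carry out
    b, carry_in = feynman(b, carry_in)                    # carry_in = sum bit
    return carry_in, ancilla, (a, b)


def reversible_ripple_add_4bit(x, y):
    """Add two 4-bit values; return (sum, carry_out, number_of_garbage_outputs)."""
    carry = 0
    total = 0
    garbage_count = 0
    for i in range(4):
        bit_sum, carry, garbage = reversible_full_adder(
            (x >> i) & 1, (y >> i) & 1, carry
        )
        total |= bit_sum << i
        garbage_count += len(garbage)
    return total, carry, garbage_count
```

In this arrangement each full adder leaves two garbage lines, so the 4-bit adder has eight, which happens to match the garbage output count quoted in the abstract.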

29

NVLSI1421

Topic: Background Subtraction Algorithm for Moving Object Detection in FPGA

Abstract: Currently, both the market and the academic communities have required

applications based on image and video processing with several real-time constraints. On the

other hand, detection of moving objects is a very important task in mobile robotics and

surveillance applications. In order to achieve an alternative design that allows for rapid development of real time motion detection systems, this paper proposes a hardware

architecture for motion detection based on the background subtraction algorithm, which is

implemented on FPGAs (Field Programmable Gate Arrays). For achieving this, the

following steps are executed: (a) a background image (in gray-level format) is stored in an

external SRAM memory, (b) a low-pass filter is applied to both the stored and current images, (c) a subtraction operation between both images is obtained, and (d) a morphological

filter is applied over the resulting image. Afterward, the gravity center of the object is

calculated and sent to a PC (via RS-232 interface). Both the practical results of the motion

detection system and synthesis results have demonstrated the feasibility of FPGAs for

implementing the proposed algorithms on an FPGA based hardware platform. The implemented system provides one processed pixel per FPGA’s clock cycle (after the latency

time) and speed-ups the software implementation (using the real-time xPC TargetOS from

MathWorks) by a factor of 32.

2014
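Steps (a) to (d) and the center-of-gravity calculation in the entry above can be sketched in software. The following is a minimal pure-Python model, assuming a 3x3 mean filter as the low-pass filter, a 3x3 erosion as the morphological filter, and a fixed threshold of 30; none of these choices is stated in the abstract, and all function names are illustrative.

```python
def box_blur(img):
    """3x3 mean filter (assumed low-pass); border pixels use the neighbors that exist."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = sum(vals) // len(vals)
    return out


def erode(mask):
    """3x3 binary erosion: a pixel survives only if its whole neighborhood is set."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = int(all(mask[j][i]
                                for j in (y - 1, y, y + 1)
                                for i in (x - 1, x, x + 1)))
    return out


def detect(background, frame, threshold=30):
    """Return the (x, y) centroid of the moving region, or None if nothing moved."""
    bg, cur = box_blur(background), box_blur(frame)           # step (b)
    mask = [[int(abs(c - b) > threshold) for c, b in zip(cr, br)]
            for cr, br in zip(cur, bg)]                       # step (c)
    mask = erode(mask)                                        # step (d)
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not pts:
        return None
    return (sum(p[0] for p in pts) / len(pts),
            sum(p[1] for p in pts) / len(pts))


background = [[0] * 10 for _ in range(10)]
frame = [[200 if 3 <= x <= 7 and 2 <= y <= 6 else 0 for x in range(10)]
         for y in range(10)]
print(detect(background, frame))
```

A hardware pipeline would stream pixels through line buffers instead of holding whole frames in lists; the model only mirrors the order of operations.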

30

NVLSI1420

Topic: An Area- and Energy-Efficient FIFO Design Using Error-Reduced Data

Compression and Near-Threshold Operation for Image/Video Applications

Abstract: Many image/video processing algorithms require FIFO for filtering. The FIFO

size is proportional to the length of the filters and input data width, causing large area and

power consumption. We propose an energy- and area-efficient FIFO design for image/video applications through FIFO with error-reduced data compression (FERDC) and near-threshold operation. At the architecture level, the FERDC technique reduces the size and power consumption of the FIFO by exploiting the spatial correlation between neighboring pixels and performing error-reduced data compression together with quantization to minimize the mean square error (MSE). At the circuit level, near-threshold operation is adopted to achieve further power reduction while maintaining the required performance. The proposed FIFO has been implemented in a 0.18-µm CMOS process technology. The implementation covers different FIFO lengths, including

128, 256, 512, and 1024. The experimental results show that the proposed FIFO operating at

0.5 V and 28.57 MHz achieves up to 99%, 65%, and 34.91% reduction in dynamic power,

leakage power, and area, respectively, with a small MSE of 2.76, compared with the

conventional FIFO design. The proposed FIFO can be applied to a wide range of image/video signal processing applications to achieve high area and energy efficiency.

2014

31

NVLSI1419

Topic: Design and Implementation of High Throughput and Area Efficient Hard

Decision Viterbi Decoder in 65nm Technology

Abstract: This paper presents a high-throughput (1 Gbps), moderate-area hard-decision state-parallel Viterbi decoder for constraint length K=3, code rate R=1/2, and four states (N=4). The Add Compare Select (ACS) unit in the path metric unit is designed to reduce the ACS loop delay by using a Modified Carry Look Ahead Adder and a Digital Comparator. We also consider the design of a Survivor Memory Unit (SMU) that combines the advantages of both the Register Exchange method and the Trace Back method to reduce the decoding latency and total area of the Viterbi decoder. The proposed Viterbi decoder design is described in Verilog HDL and implemented in a standard-cell ASIC flow using Synopsys EDA tools. The design operation is verified by decoding one million bits. The behavior of the decoder is verified using the Synopsys simulator, and the design is synthesized using Synopsys Design Compiler with a 65nm CMOS technology library. The proposed decoder operates at 250MHz with a supply voltage of 1.32V over an operating temperature range of -40°C to 125°C. The ACS architecture achieves a 67.07% reduction in latency compared to the conventional ACS architecture and achieves 1.235 Gbps throughput. The results show that the Viterbi decoder architecture achieves a 73.03% to 92.46% improvement in area compared to other architectures. This reduction in latency and area finds application in high data rate communication.

2014
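To make the K=3, R=1/2, four-state configuration concrete, here is a small behavioral sketch of a hard-decision Viterbi decoder. It assumes the generator polynomials (7, 5) in octal, which the abstract does not specify, and it stores survivors as whole paths copied on every step (similar in spirit to register exchange) rather than modeling the paper's hybrid SMU. All names are illustrative.

```python
G = (0b111, 0b101)  # assumed generator polynomials (7, 5 octal); not from the abstract


def parity(v):
    return bin(v).count("1") & 1


def encode(bits):
    """Rate-1/2, K=3 convolutional encoder starting in state 0."""
    state, out = 0, []
    for b in bits:
        reg = (b << 2) | state          # newest bit followed by the two previous bits
        out += [parity(reg & g) for g in G]
        state = reg >> 1
    return out


def viterbi_decode(received):
    """Hard-decision decoder over the four-state trellis (Hamming branch metrics)."""
    inf = float("inf")
    metrics = [0, inf, inf, inf]        # the encoder is known to start in state 0
    paths = [[], [], [], []]
    for i in range(0, len(received), 2):
        pair = received[i:i + 2]
        new_metrics, new_paths = [inf] * 4, [None] * 4
        for state in range(4):
            if metrics[state] == inf:
                continue
            for b in (0, 1):
                reg = (b << 2) | state
                expected = [parity(reg & g) for g in G]
                nxt = reg >> 1
                # Add: accumulate the Hamming distance of this branch.
                m = metrics[state] + sum(e != r for e, r in zip(expected, pair))
                # Compare-select: keep the better of the paths entering nxt.
                if m < new_metrics[nxt]:
                    new_metrics[nxt] = m
                    new_paths[nxt] = paths[state] + [b]
        metrics, paths = new_metrics, new_paths
    best = min(range(4), key=lambda s: metrics[s])
    return paths[best]


message = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
coded = encode(message)
coded[3] ^= 1                           # inject a single channel bit error
print(viterbi_decode(coded) == message)
```

The inner add, compare, and select steps correspond to the ACS loop that the entry identifies as the latency bottleneck.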

32

NVLSI1418

Topic: Exact BER Performance Analysis of Link Adaptive Relaying with Non-coherent

BFSK Modulation

Abstract: Link adaptive relaying (LAR) is one of the most popular techniques for mitigating error propagation in decode-and-forward (DF) based cooperative wireless networks; it employs soft power scaling at the relay nodes. Frequency shift keying (FSK), in turn, is a prominent technique for eliminating the need for channel estimation by training sequences, which increases system complexity and reduces the transmission rate in proportion to the number of users in the network. In this paper, the performance of an LAR scheme with non-coherent binary FSK (BFSK) signaling is investigated by deriving exact closed-form bit error rate expressions in Rayleigh fading channels.

2014
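As background for the entry above, the standard single-link result for non-coherent BFSK can be checked numerically: the bit error rate is 0.5·exp(−γ/2) in AWGN, and averaging over Rayleigh fading with mean SNR γ̄ gives 1/(2 + γ̄). The sketch below covers only this direct link; it is not the paper's LAR expression, and the function names are illustrative. SNR values are linear, not in dB.

```python
import math


def ber_bfsk_awgn(snr):
    """Non-coherent BFSK bit error rate in AWGN for a linear SNR per bit."""
    return 0.5 * math.exp(-snr / 2.0)


def ber_bfsk_rayleigh(avg_snr):
    """Closed form after averaging over Rayleigh fading with mean SNR avg_snr."""
    return 1.0 / (2.0 + avg_snr)


def ber_bfsk_rayleigh_numeric(avg_snr, steps=200_000, upper=100.0):
    """Midpoint-rule average of the AWGN BER over the exponential SNR density."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        g = (k + 0.5) * h
        pdf = math.exp(-g / avg_snr) / avg_snr
        total += ber_bfsk_awgn(g) * pdf * h
    return total


print(ber_bfsk_rayleigh(10.0), ber_bfsk_rayleigh_numeric(10.0))
```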

33

NVLSI1417

Topic: Efficient Integer DCT Architectures for HEVC

Abstract: In this paper, we present area- and power-efficient architectures for the implementation of integer discrete cosine transform (DCT) of different lengths to be used in

High Efficiency Video Coding (HEVC). We show that an efficient constant matrix

multiplication scheme can be used to derive parallel architectures for 1-D integer DCT of

different lengths. We also show that the proposed structure could be reusable for DCT of

lengths 4, 8, 16, and 32 with a throughput of 32 DCT coefficients per cycle irrespective of the transform size. Moreover, the proposed architecture could be pruned to reduce the

complexity of implementation substantially with only a marginal effect on the coding

performance. We propose power-efficient structures for folded and full-parallel

implementations of 2-D DCT. From the synthesis results, it is found that the proposed

architecture involves nearly 14% less area-delay product (ADP) and 19% less energy per sample (EPS) compared to the direct implementation of the reference algorithm, on average,

for integer DCT of lengths 4, 8, 16, and 32. Also, an additional 19% saving in ADP and 20%

saving in EPS can be achieved by the proposed pruning algorithm with nearly the same

throughput rate. The proposed architecture is found to support ultrahigh definition

7680×4320 at 60 frames/s video, which is one of the applications of HEVC.

2014
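The constant matrix multiplication idea in the entry above can be illustrated on the smallest case. The sketch below uses the 4-point HEVC core transform matrix and compares a direct matrix product with an even-odd butterfly whose constant multiplications are written as shifts and adds. It is an illustration of the general technique, not the paper's architecture; the scaling and rounding stages of HEVC are omitted, and the helper names are illustrative.

```python
# 4-point HEVC core transform matrix (integer DCT approximation).
C4 = [
    [64, 64, 64, 64],
    [83, 36, -36, -83],
    [64, -64, -64, 64],
    [36, -83, 83, -36],
]


def dct4_direct(x):
    """Reference: plain matrix-vector product (16 multiplications)."""
    return [sum(c * v for c, v in zip(row, x)) for row in C4]


def mul64(v):
    return v << 6


def mul83(v):
    return (v << 6) + (v << 4) + (v << 1) + v   # 83 = 64 + 16 + 2 + 1


def mul36(v):
    return (v << 5) + (v << 2)                  # 36 = 32 + 4


def dct4_butterfly(x):
    """Even-odd decomposition with shift-and-add constant multiplications."""
    e0, e1 = x[0] + x[3], x[1] + x[2]           # even part
    o0, o1 = x[0] - x[3], x[1] - x[2]           # odd part
    return [
        mul64(e0 + e1),
        mul83(o0) + mul36(o1),
        mul64(e0 - e1),
        mul36(o0) - mul83(o1),
    ]


sample = [10, -3, 7, 255]
print(dct4_direct(sample) == dct4_butterfly(sample))
```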

34

NVLSI1416

Topic: Critical-Path Analysis and Low-Complexity Implementation of the LMS Adaptive Algorithm

Abstract: This paper presents a precise analysis of the critical path of the least-mean-square (LMS) adaptive filter for deriving its architectures for high-speed and low-complexity implementation. It is shown that the direct-form LMS adaptive filter has nearly the same critical path as its transpose-form counterpart, but provides much faster convergence and lower register complexity. From the critical-path evaluation, it is further shown that no pipelining is required for implementing a direct-form LMS adaptive filter in most practical cases, and that the filter can be realized with a very small adaptation delay in cases where a very high sampling rate is required. Based on these findings, this paper proposes three structures of the LMS adaptive filter: (i) Design 1 with no adaptation delay, (ii) Design 2 with only one adaptation delay, and (iii) Design 3 with two adaptation delays. Design 1 involves the minimum area and the minimum energy per sample (EPS). The best of the existing direct-form structures requires 80.4% more area and 41.9% more EPS than Design 1. Designs 2 and 3 involve slightly more EPS than Design 1 but offer nearly twice and thrice the maximum usable frequency (MUF) at a cost of 55.0% and 60.6% more area, respectively.

2014
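The role of the adaptation delay can be seen in a floating-point model of the direct-form LMS filter. This is a behavioral sketch only, assuming the usual delayed-LMS update w(n+1) = w(n) + mu·e(n−D)·x(n−D); the step size, the test system, and all names are illustrative and are not taken from the paper.

```python
import random


def fir(x, h):
    """Reference FIR filter used to generate the desired signal."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def lms(x, d, taps, mu, delay=0):
    """Direct-form LMS; `delay` is the number of adaptation delays (0, 1, 2, ...)."""
    w = [0.0] * taps
    buf = [0.0] * taps
    pending = []                      # (error, input vector) pairs awaiting use
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]         # shift the newest sample into the delay line
        y = sum(wi * bi for wi, bi in zip(w, buf))
        e = dn - y
        errors.append(e)
        pending.append((e, buf))
        if len(pending) > delay:      # apply the update that is `delay` samples old
            e_old, buf_old = pending.pop(0)
            w = [wi + mu * e_old * bi for wi, bi in zip(w, buf_old)]
    return w, errors


rng = random.Random(1)
x = [rng.uniform(-1.0, 1.0) for _ in range(3000)]
h_true = [0.5, -0.3, 0.2]             # hypothetical unknown system to identify
d = fir(x, h_true)
for delay in (0, 1, 2):
    w, _ = lms(x, d, taps=3, mu=0.05, delay=delay)
    print(delay, [round(wi, 3) for wi in w])
```

With a small step size, all three delay settings converge to the same weights; the designs in the entry differ in hardware cost and achievable clock rate, which this model does not capture.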

35

NVLSI1415

Topic: An Optimized Modified Booth Recoder for Efficient Design of the Add-Multiply Operator

Abstract: Complex arithmetic operations are widely used in Digital Signal Processing (DSP) applications. In this work, we focus on optimizing the design of the fused Add-Multiply (FAM) operator for increasing performance. We investigate techniques to implement the direct recoding of the sum of two numbers in its Modified Booth (MB) form.

We introduce a structured and efficient recoding technique and explore three different

schemes by incorporating them in FAM designs. Comparing them with the FAM designs

which use existing recoding schemes, the proposed technique yields considerable reductions

in terms of critical delay, hardware complexity and power consumption of the FAM unit.

2014
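For readers unfamiliar with the Modified Booth form, the sketch below recodes a two's-complement value into radix-4 digits in {−2, −1, 0, 1, 2} and uses them to compute (A + B)·X. Note the simplification: the sum A + B is formed first and then recoded, which is the conventional approach. The entry's contribution is recoding the sum directly from A and B, and that technique is not reproduced here. Function names and the 8-bit operand width are assumptions.

```python
def mb_recode(value, bits):
    """Radix-4 Modified Booth digits (least significant first) of a signed value.

    `bits` must be even and large enough to hold `value` in two's complement.
    """
    u = value & ((1 << bits) - 1)     # two's-complement bit pattern
    digits, prev = [], 0              # prev is the bit just below the current pair
    for j in range(0, bits, 2):
        b0 = (u >> j) & 1
        b1 = (u >> (j + 1)) & 1
        digits.append(-2 * b1 + b0 + prev)
        prev = b1
    return digits


def mb_value(digits):
    """Rebuild the integer represented by a list of radix-4 digits."""
    return sum(dg * 4 ** j for j, dg in enumerate(digits))


def fused_add_multiply(a, b, x, bits=8):
    """(a + b) * x using MB digits of the sum; a and b are signed `bits`-bit values."""
    digits = mb_recode(a + b, bits + 2)   # two extra bits hold the carry and keep the width even
    return sum(dg * x * 4 ** j for j, dg in enumerate(digits))


print(mb_recode(6, 4), fused_add_multiply(100, 27, -5))
```

Each digit selects one of 0, ±X, or ±2X as a partial product, so an n-bit operand needs about n/2 partial products instead of n.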

36

NVLSI1414

Topic: Improved 8-Point Approximate DCT for Image and Video Compression Requiring Only 14 Additions

Abstract: The multimedia market's need for video processing systems with low energy consumption, such as HEVC, has led to extensive development of fast algorithms for the efficient approximation of 2-D DCT transforms. The DCT is employed in a multitude of

compression standards due to its remarkable energy compaction properties. Multiplier-free approximate DCT transforms have been proposed that offer superior compression

performance at very low circuit complexity. Such approximations can be realized in digital

VLSI hardware using additions and subtractions only, leading to significant reductions in

chip area and power consumption compared to conventional DCTs and integer transforms. In

this paper, we introduce a novel 8-point DCT approximation that requires only 14 addition operations and no multiplications. The proposed transform possesses low computational

complexity and is compared to state-of-the-art DCT approximations in terms of both

algorithm complexity and peak signal-to-noise ratio. The proposed DCT approximation is a

candidate for reconfigurable video standards such as HEVC. The proposed transform and

several other DCT approximations are mapped to systolic-array digital architectures and physically realized as digital prototype circuits using FPGA technology and mapped to 45

nm CMOS technology.

2014

37

NVLSI1413

Topic: A Bit-Serial Pipelined Architecture for High-Performance DHT Computation in

Quantum-Dot Cellular Automata

Abstract: In this brief, we consider quantum-dot cellular automata (QCA) realization of the

discrete Hadamard transform (DHT). An analysis of a full-parallel solution based on efficient

multibit addition in QCA is first presented. We show that this leads to large area as well as delay. We then propose a bit-serial pipelined architecture for QCA-based DHT. The

proposed architecture is based on a new one-bit adder–subtractor requiring only six majority gates and a feedback latch that requires only one majority gate and limited wiring. The approach leads to a reduction in area-delay-cycle product of 74% and 91% (over a full-parallel solution) for wordlengths of 4 and 8, respectively. Results of simulations in QCADesigner are also presented.

2014
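The discrete Hadamard transform needs only additions and subtractions, which is why an adder-subtractor is the core cell of the architecture above. A minimal word-level sketch of the fast transform in natural (Hadamard) ordering follows; it models the arithmetic only, not the bit-serial QCA pipeline, and the function name is illustrative.

```python
def fwht(values):
    """Fast Walsh-Hadamard transform; the input length must be a power of two."""
    x = list(values)
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b   # one add and one subtract per butterfly
        h *= 2
    return x


print(fwht([1, 0, 0, 0]))
```

Applying the transform twice returns the input scaled by the length, which gives a quick self-check.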

38

NVLSI1412

Topic: An Efficient Non-Linear Cost Compression Algorithm for Multi Level Cell

Memory

Abstract: This paper defines a non-linear cost compression problem, proposes an efficient

algorithm, and applies it to a real application of multi level cell memory to minimize energy

consumption and latency. The non-linear cost compression problem extends the traditional cost compression problem by allowing a non-linear cost function of symbol frequencies, whereas the traditional problem uses a weighted linear combination of symbol frequencies. To solve the non-linear cost compression problem efficiently, we propose an approach based on encoding symbol frequencies. We first compute the frequencies of encoding symbols that minimize a cost function. To achieve the computed frequencies in a cost-compressed

message, we deploy existing size-decompression algorithms. The proposed algorithm is optimal and as fast as the existing size compression algorithms. Our experimental results

show that it reduces the energy consumption and latency by 70 percent for a text file in multi

level cell memory. Furthermore, it increases the lifetime of endurance limited memory.

2014

39

NVLSI1411

Topic: Lossless Image Compression using Fast Arithmetic Operation

Abstract: This paper presents a lossless image compression coder and decoder based on fast arithmetic operations. The proposed method uses only simple adders and subtractors to reduce pixel values, so it requires very little run-time memory and very little time to encode and decode an image. The decompressed image is exactly equal to the original image, so the method is purely lossless. The method is compared with arithmetic-operation-based predictive lossless image compression, using compression and decompression time and compression ratio as quantitative parameters. Because encoding and decoding take little time, the method is well suited to real-time implementation of an image codec.

2014

40

NVLSI1410

Topic: Design of Efficient Binary Comparators in Quantum-Dot Cellular Automata

Abstract: Quantum-dot cellular automata (QCA) are an attractive emerging technology

suitable for the development of ultradense low-power high-performance digital circuits.

Efficient solutions have recently been proposed for several arithmetic circuits, such as

adders, multipliers, and comparators. Nevertheless, since the design of digital circuits in

QCA still poses several challenges, novel implementation strategies and methodologies are highly desirable. This paper proposes a new design approach oriented to the implementation

of binary comparators in QCA. New formulations of basic logic equations required to

perform the comparison function are proposed. The new strategy has been exploited in the

design of two different comparator architectures and for several operand word lengths. With

respect to existing counterparts, the comparators proposed here exhibit significantly higher speed and reduced overall area.

2014

41

NVLSI1409

Topic: A Low-Power and Portable Spread Spectrum Clock Generator for SoC Applications

Abstract: In this paper, a novel portable and all-digital spread spectrum clock generator

(ADSSCG) suitable for system-on-chip (SoC) applications with low power consumption is presented. The proposed ADSSCG can provide flexible spreading ratios through the proposed rescheduling division triangular modulation (RDTM). Thus, it can provide different EMI attenuation performance for various system applications. Furthermore, the proposed ADSSCG employs a low-power digitally controlled oscillator (DCO) to reduce overall power consumption significantly. Measurement results show that the power consumption of the proposed ADSSCG is 1.2 mW at 54 MHz, and that it provides a 9.5 dB EMI reduction with a 1% spreading ratio. In addition, the proposed ADSSCG has a very small chip area compared with conventional SSCGs, which often require large on-chip loop filter capacitors. Finally, the proposed ADSSCG is implemented only with standard cells, making it easily portable to different processes and very suitable for SoC applications.

2014

42

NVLSI1408

Topic: Design Flow for Flip-Flop Grouping in Data-Driven Clock Gating

Abstract: Clock gating is a predominant technique used for power saving. It is observed that

the commonly used synthesis based gating still leaves a large amount of redundant clock

pulses. Data-driven gating aims to disable these. To reduce the hardware overhead involved, flip-flops (FFs) are grouped so that they share a common clock enabling signal. The question of which group size maximizes the power savings was answered in a previous paper.

Here we answer the question of which FFs should be placed in a group to maximize the

power reduction. We propose a practical solution based on the toggling activity correlations

of FFs and their physical position proximity constraints in the layout. Our data-driven clock gating is integrated into an Electronic Design Automation (EDA) commercial backend

design flow, achieving total power reduction of 15%–20% for various

types of large-scale state-of-the-art industrial and academic designs in 40 and 65 nanometer

process technologies. These savings are achieved on top of the savings obtained by clock

gating synthesis performed by commercial EDA tools, and gating manually inserted into the register transfer level design.

2014
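Why the choice of group members matters can be shown with a toy model of data-driven gating: a group's shared enable fires whenever any of its flip-flops is about to change, and every flip-flop in that group then receives a clock pulse. The sketch below counts delivered pulses for a hypothetical four-FF trace under two groupings. It models toggling correlation only; the physical proximity constraints mentioned in the entry are not modeled, and the trace and names are invented for illustration.

```python
def clock_pulses(trace, groups):
    """Count clock pulses delivered to FFs under group-level data-driven gating.

    trace  : list of register states, one tuple of bits per clock cycle
    groups : list of lists of FF indices sharing one clock-enable signal
    """
    pulses = 0
    for prev, nxt in zip(trace, trace[1:]):
        for group in groups:
            # The shared enable is the OR of (D XOR Q) over the group's FFs.
            if any(prev[i] != nxt[i] for i in group):
                pulses += len(group)
    return pulses


# Hypothetical trace: FFs 0 and 1 toggle together, FFs 2 and 3 toggle together.
trace = [
    (0, 0, 0, 0),
    (1, 1, 0, 0),
    (1, 1, 1, 1),
    (0, 0, 1, 1),
    (0, 0, 1, 1),
]
ungated = 4 * (len(trace) - 1)
correlated = clock_pulses(trace, [[0, 1], [2, 3]])
uncorrelated = clock_pulses(trace, [[0, 2], [1, 3]])
print(ungated, correlated, uncorrelated)
```

Grouping flip-flops with correlated toggling leaves fewer redundant pulses than grouping unrelated ones, at the same group size and the same gating hardware.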

43

NVLSI1407

Topic: Input Vector Monitoring Concurrent BIST Architecture Using SRAM Cells

Abstract: Input vector monitoring concurrent built-in self test (BIST) schemes perform

testing during the normal operation of the circuit without imposing a need to set the circuit

offline to perform the test. These schemes are evaluated based on the hardware overhead and

the concurrent test latency (CTL), i.e., the time required for the test to complete while the circuit operates normally. In this brief, we present a novel input vector monitoring concurrent

BIST scheme, which is based on the idea of monitoring a set (called window) of vectors

reaching the circuit inputs during normal operation, and the use of a static-RAM like

structure to store the relative locations of the vectors that reach the circuit inputs in the

examined window; the proposed scheme is shown to perform significantly better than previously proposed schemes with respect to the hardware overhead and CTL tradeoff.

2014

44

NVLSI1406

Topic: Jitter of Delay-Locked Loops Due to PFD

Abstract: In this paper, the jitter of delay-locked loops (DLLs) due to uncertainties in the phase frequency detector (PFD) is calculated. First, time-domain equations of the DLL are introduced. These equations are the key to obtaining a closed-form equation for the jitter of the DLL in the presence of a noisy PFD. Jitter equations at the output of all stages are calculated theoretically. A DLL is designed in 0.18-µm CMOS technology to validate the

obtained equations.

2014

45

NVLSI1405

Topic: Area-Delay Efficient Binary Adders in QCA

Abstract: As transistors decrease in size, more and more of them can be accommodated in a single die, thus increasing chip computational capabilities. However, transistors cannot get much smaller than their current size. The quantum-dot cellular automata (QCA) approach represents one of the possible solutions for overcoming this physical limit, even though the design of logic modules in QCA is not always straightforward.

2014

46

NVLSI1404

Topic: Reconfigurable CORDIC-Based Low-Power DCT Architecture Based on Data

Priority

Abstract: This paper presents a low-power coordinate rotation digital computer (CORDIC)-based reconfigurable discrete cosine transform (DCT) architecture. The main idea of this paper is based on the fact that not all computations in the DCT are equally important in generating the frequency-domain outputs. Because the DCT coefficients differ in importance, the number of CORDIC iterations can be changed dynamically to trade off image quality for power consumption efficiently. Thus, the computational energy can be significantly reduced without seriously compromising the image quality. The proposed CORDIC-based 2-D DCT architecture is implemented in a 0.13-µm CMOS process, and the experimental results show that our reconfigurable DCT achieves power savings ranging from 22.9% to 52.2% over the CORDIC-based Loeffler

DCT at the cost of minor image quality degradations.

2014
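The trade-off between CORDIC iteration count and accuracy, which the entry above exploits, can be seen in a floating-point model of the rotation mode. This is a generic CORDIC sketch: the paper's rule for assigning iteration counts by data priority is not shown, and the function name and the π/8 test angle are illustrative choices.

```python
import math


def cordic_rotate(x, y, angle, iterations):
    """Rotate (x, y) by `angle` radians using `iterations` shift-and-add steps."""
    z = angle
    gain = 1.0
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0
        shift = 2.0 ** -i                 # a right shift by i bits in hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * math.atan(shift)         # taken from a small lookup table in hardware
        gain *= math.sqrt(1.0 + shift * shift)
    return x / gain, y / gain             # compensate the accumulated CORDIC gain


target = (math.cos(math.pi / 8), math.sin(math.pi / 8))
for n in (4, 8, 16):
    cx, cy = cordic_rotate(1.0, 0.0, math.pi / 8, n)
    print(n, math.hypot(cx - target[0], cy - target[1]))
```

Fewer iterations mean fewer add and shift operations per rotation at the price of a larger angle error, which is the knob a reconfigurable design can turn per coefficient.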

47

NVLSI1403

Topic: FPGA-Based Bit Error Rate Performance Measurement of Wireless Systems

Abstract: This paper presents the bit error rate (BER) performance validation of digital

baseband communication systems on a field-programmable gate array (FPGA). The

proposed BER tester (BERT) integrates fundamental baseband signal processing modules of

a typical wireless communication system along with a realistic fading channel simulator and an accurate Gaussian noise generator onto a single FPGA to provide an accelerated and

repeatable test environment in a laboratory setting. Using a developed graphical user

interface, the error rate performance of single- and multiple-antenna systems over a wide

range of parameters can be rapidly evaluated. The FPGA-based BERT should reduce the

need for time-consuming software-based simulations, thereby increasing productivity. This FPGA-based solution is significantly more cost-effective than conventional performance

measurements made using expensive commercially available test equipment and channel

simulators.

2014

48

NVLSI1402

Topic: A Combined SDC-SDF Architecture for Normal I/O Pipelined Radix-2 FFT

Abstract: We present an efficient combined single-path delay commutator-feedback (SDC-SDF) radix-2 pipelined fast Fourier transform architecture, which includes log2(N) − 1 SDC stages and 1 SDF stage. The SDC processing engine is proposed to achieve 100% hardware resource utilization by sharing the common arithmetic resources, including both adders and multipliers, in a time-multiplexed approach. Thus, the required number of complex multipliers is reduced to log4(N) − 0.5, compared with log2(N) − 1 for the other radix-2 SDC/SDF architectures. In addition, the proposed architecture requires a roughly minimum number of complex adders, log2(N) + 1, and complex delay memory, 2N + 1.5 log2(N) − 1.5.

2014
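The resource formulas quoted in the abstract above are easy to tabulate for a given transform size. The helper below simply evaluates them as stated; the function name and dictionary keys are illustrative. Fractional results are kept as given by the formulas.

```python
def sdc_sdf_resources(n):
    """Evaluate the abstract's resource formulas for an N-point FFT (N a power of two)."""
    if n < 4 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 4")
    s = n.bit_length() - 1                      # log2(N)
    return {
        "sdc_stages": s - 1,                    # log2(N) - 1
        "sdf_stages": 1,
        "complex_multipliers": s / 2 - 0.5,     # log4(N) - 0.5
        "other_radix2_multipliers": s - 1,      # log2(N) - 1, for comparison
        "complex_adders": s + 1,                # log2(N) + 1
        "delay_memory": 2 * n + 1.5 * s - 1.5,  # 2N + 1.5 log2(N) - 1.5
    }


print(sdc_sdf_resources(1024))
```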

49

NVLSI1401

Topic: Bit-Level Optimization of Adder-Trees for Multiple Constant Multiplications for Efficient FIR Filter Implementation

Abstract: The multiple constant multiplications (MCM) scheme is widely used for implementing transposed direct-form FIR filters. While the research focus of MCM has been on more effective common subexpression elimination, the optimization of adder-trees, which sum up the computed subexpressions for each coefficient, has largely been neglected. In this paper, we identify the resource minimization problem in the scheduling of adder-tree operations for the MCM block and present a mixed integer programming (MIP) based algorithm for more efficient MCM-based implementation of FIR filters. Experimental results show that up to 15% reduction in area and 11.6% reduction in power (with averages of 8.46% and 5.96%, respectively) can be achieved on top of an already optimized adder/subtractor network of the MCM block.

2014

50

NVLSI1400

Topic: A Look-Ahead Clock Gating Based on Auto-Gated Flip-Flops

Abstract: Clock gating is very useful for reducing the power consumed by digital systems.

Three gating methods are known. The most popular is synthesis-based, deriving clock

enabling signals based on the logic of the underlying system. It unfortunately leaves the

majority of the clock pulses driving the flip-flops (FFs) redundant. A data-driven method

stops most of those and yields higher power savings, but its implementation is complex and application dependent. A third method called auto-gated FFs (AGFF) is simple but

yields relatively small power savings. This paper presents a novel method called Look-Ahead Clock Gating (LACG), which combines all three. LACG computes the clock

enabling signals of each FF one cycle ahead of time, based on the present cycle data of those

FFs on which it depends. It avoids the tight timing constraints of AGFF and data-driven by allotting a full clock cycle for the computation of the enabling signals and their propagation.

A closed-form model characterizing the power saving per FF is presented. It is based on

data-to-clock toggling probabilities, capacitance parameters and FFs’ fan-in. The model

implies a breakeven curve, dividing the FFs space into two regions of positive and negative

gating return on investment. While the majority of the FFs fall in the positive region and hence should be gated, those falling in the negative region should not. Experimentation on

industry-scale data showed 22.6% reduction of the clock power, translated to 12.5% power

reduction of the entire system.

2014