Presented by-
ASHISH MAURYA(2015 VLSI-13)
AN EFFICIENT FIXED CODEBOOK SEARCH METHOD FOR G.729 SPEECH CODEC
ABV-Indian Institute of Information Technology and Management Gwalior, Morena Link Road, Gwalior, Madhya Pradesh, INDIA - 474015.
May 13, 2016
CONTENTS
I. Speech coding
   i. Block Diagram and Description
II. Speech coder
III. G.729???
IV. Algebraic Code Excited Linear Prediction (ACELP)
   i. Block Diagram
   ii. Fixed Codebook Structure
   iii. Process Flowgraph
V. Proposed Search Method
   i. IFPR Search Method
   ii. RCM Search Method
   iii. Combined Search Method
      - Process Flowgraph
VI. Conclusion & Scope
VII. Reference Paper
SPEECH CODING
Speech coding is a procedure for representing a digitized speech signal with as few bits as possible while maintaining a reasonable level of speech quality. Owing to the increasing demand for speech communication, speech coding technology has received growing interest from the research, standardization, and business communities.
BLOCK DIAGRAM
BLOCK DESCRIPTION
Speech source: A continuous-time analog speech signal source.
Filter: Selects the required frequency band by filtering out unwanted signals.
Sampler: Converts the continuous-time signal to a discrete-time signal. Sampling is done at 8 kHz, which satisfies the Nyquist criterion for speech band-limited to below 4 kHz.
ADC: The discrete-time signal is quantized by a quantizer to obtain a digital signal.
Source encoder: Encodes the digitized signal to reduce the bit rate.
BLOCK DESCRIPTION (CONTD..)
Channel encoder: Provides error protection to the bit-stream before transmission over the communication channel.
Channel decoder: Processes the error-protected data to recover the encoded data.
Source decoder: Generates the digital speech signal at the original bit rate.
DAC: Converts the digital speech signal to a continuous-time analog signal.
SPEECH CODER
The encoder/decoder structure shown in the block diagram is known as a speech coder.
The input speech is encoded to produce a low-rate bit-stream.
This bit-stream is input to the decoder, which constructs an approximation of the original signal.
SPEECH CODER (CONTD..)
Encoding
- Derive the filter coefficients from the speech frame.
- Derive the scale factor from the speech frame.
- Transmit the filter coefficients and scale factor to the decoder.
Decoding
- Generate a white noise sequence.
- Multiply the white noise samples by the scale factor.
- Construct the filter using the coefficients from the encoder and filter the scaled white noise sequence.
- The output speech is the output of the filter.
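The encoding and decoding steps above can be sketched in a few lines. This is a minimal illustration, not the G.729 procedure: it assumes the filter coefficients are linear-prediction coefficients obtained with the autocorrelation method and the Levinson-Durbin recursion, and that the scale factor is the RMS of the prediction residual. The function names (`encode`, `decode`, and the helpers) are hypothetical.

```python
import math
import random


def autocorrelation(frame, order):
    """Return r[0..order], the autocorrelation of one speech frame."""
    return [sum(frame[n] * frame[n - k] for n in range(k, len(frame)))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Return predictor coefficients a[1..order] and the residual energy."""
    a = [0.0] * (order + 1)
    err = r[0]  # assumes a non-silent frame, so r[0] > 0
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        err *= 1.0 - k * k
    return a[1:], err


def encode(frame, order=10):
    """Encoder: derive filter coefficients and scale factor from the frame."""
    r = autocorrelation(frame, order)
    coeffs, err = levinson_durbin(r, order)
    gain = math.sqrt(max(err, 0.0) / len(frame))  # assumed scale factor: residual RMS
    return coeffs, gain


def decode(coeffs, gain, num_samples, seed=0):
    """Decoder: scale white noise and pass it through the all-pole filter."""
    rng = random.Random(seed)
    out = []
    for n in range(num_samples):
        excitation = gain * rng.gauss(0.0, 1.0)
        prediction = sum(a * out[n - 1 - j]
                         for j, a in enumerate(coeffs) if n - 1 - j >= 0)
        out.append(excitation + prediction)
    return out
```

Only `coeffs` and `gain` cross from encoder to decoder, which is where the bit-rate saving comes from: a handful of parameters per frame instead of every sample.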
G.729 ???
G.729 is an ITU-T speech coding standard.
Its most important use is in VoIP.
It compresses the speech signal from 64 kbps to 8 kbps.
It uses the CS-ACELP algorithm.
CS-ACELP stands for conjugate-structure algebraic code-excited linear prediction.
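The bit rates above can be checked with a little arithmetic. The sketch assumes the 64 kbps figure refers to 8-bit PCM sampled at 8 kHz, and it uses the G.729 frame layout of 80 bits per 10 ms frame; the constant names are illustrative only.

```python
SAMPLE_RATE_HZ = 8000        # sampling rate used by the sampler
PCM_BITS_PER_SAMPLE = 8      # assumption: 8-bit PCM input
FRAME_MS = 10                # G.729 frame length in milliseconds
BITS_PER_FRAME = 80          # bits G.729 spends on one frame

pcm_rate = SAMPLE_RATE_HZ * PCM_BITS_PER_SAMPLE         # bits per second before coding
frames_per_second = 1000 // FRAME_MS
g729_rate = BITS_PER_FRAME * frames_per_second          # bits per second after coding
samples_per_frame = SAMPLE_RATE_HZ * FRAME_MS // 1000   # speech samples in one frame
compression_ratio = pcm_rate / g729_rate

print(pcm_rate, g729_rate, samples_per_frame, compression_ratio)
```

Each 80-sample frame is therefore represented by 80 bits, an average of one bit per speech sample.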
Algebraic CELP or ACELP
Algebraic CELP, or ACELP, is an attempt to reduce the computational cost of standard CELP coders.
The term "algebraic" refers to the use of simple algebra or mathematical rules, namely addition and shifting, to create the excitation codevectors.
The advantage of this method is that the codebook does not need to be physically stored, resulting in significant memory savings.
ACELP Encoder Block Diagram
[Figure: ACELP encoder block diagram, including the fixed codebook (CB) search block]
STRUCTURE OF FIXED CODEBOOK
The fixed codebook is based on an algebraic codebook structure using an interleaved single-pulse permutation (ISPP) design.
The 40-sample ACELP fixed codebook is made up of five interleaved single-pulse permutation codes (5 tracks), hence the name ISPP.
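The interleaved layout can be made concrete by listing the positions on each track. The sketch below follows the G.729 layout, in which pulses m0, m1 and m2 each use one track of eight positions and pulse m3 may fall on either of the two remaining tracks; the helper names are hypothetical.

```python
SUBFRAME_LEN = 40   # samples per subframe
NUM_TRACKS = 5      # interleaved tracks


def track_positions(track):
    """Positions of one interleaved track: track, track + 5, track + 10, ..."""
    return list(range(track, SUBFRAME_LEN, NUM_TRACKS))


# Allowed positions for each of the four pulses.
PULSE_POSITIONS = {
    "m0": track_positions(0),
    "m1": track_positions(1),
    "m2": track_positions(2),
    "m3": sorted(track_positions(3) + track_positions(4)),
}


def build_codevector(positions, signs):
    """Place one signed unit pulse at each given position; all other samples are zero."""
    codevector = [0] * SUBFRAME_LEN
    for pos, sign in zip(positions, signs):
        codevector[pos] += sign
    return codevector


for pulse, positions in PULSE_POSITIONS.items():
    print(pulse, positions)
```

Because the positions follow from a simple rule, a codevector can be rebuilt from its pulse positions and signs alone, which is why no stored table is needed.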
STRUCTURE OF FIXED CODEBOOK (CONTD..)
Each pulse has an amplitude of either +1 or –1.
The sign of each pulse requires 1 bit.
For m0, m1, and m2, 3 bits are needed for the position, while for m3, 4 bits are required.
A total of 17 bits (13 position bits plus 4 sign bits) is needed to index the whole codebook.
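The 17-bit total follows directly from the number of allowed positions per pulse. A short worked check, assuming 8 positions for each of m0, m1 and m2 and 16 for m3 (variable names are illustrative):

```python
# Number of allowed positions for each pulse.
position_counts = {"m0": 8, "m1": 8, "m2": 8, "m3": 16}

# Bits needed to index the positions: 8 -> 3 bits, 16 -> 4 bits.
position_bits = {pulse: (count - 1).bit_length()
                 for pulse, count in position_counts.items()}

sign_bits = len(position_counts)               # one sign bit per pulse
total_position_bits = sum(position_bits.values())
total_bits = total_position_bits + sign_bits
codebook_size = 2 ** total_bits                # number of distinct codevectors

print(position_bits, total_position_bits, sign_bits, total_bits, codebook_size)
```

An exhaustive search would have to score every one of these codevectors, which is what motivates the reduced searches discussed in the following slides.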
PROCESS FLOW GRAPH
Maximizing term:
$$\frac{C_k^2}{E_k} = \frac{\left(\sum_{n=0}^{39} d(n)\,c_k(n)\right)^2}{c_k^t\,\Phi\,c_k}$$
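The maximizing term is the ratio of the squared correlation C_k to the energy E_k of the filtered codevector. A minimal sketch of how it might be evaluated for one codevector follows; it takes d(n) (the correlation between the target signal and the impulse response) and Φ (the correlation matrix of the impulse response) as given inputs, and the function name is hypothetical.

```python
def search_criterion(codevector, d, phi):
    """Return C_k^2 / E_k for one codevector c_k.

    codevector: list of 40 values in {-1, 0, +1}
    d:          list of 40 correlation values d(n)
    phi:        40 x 40 matrix (list of lists)
    """
    # Correlation term: C_k = sum over n of d(n) * c_k(n)
    corr = sum(dn * cn for dn, cn in zip(d, codevector))
    # Energy term: E_k = c_k^t * Phi * c_k, evaluated over the non-zero pulses only
    pulses = [i for i, cn in enumerate(codevector) if cn != 0]
    energy = sum(codevector[i] * phi[i][j] * codevector[j]
                 for i in pulses for j in pulses)
    return corr * corr / energy
```

Since only four samples of the codevector are non-zero, the energy term needs just sixteen entries of Φ per candidate, which keeps each individual search cheap.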
PROPOSED SEARCH METHOD
The proposed method combines the reduced candidate mechanism (RCM) with iteration-free pulse replacement (IFPR). RCM provides the individual pulse contribution in each track.
IFPR performs the pulse replacement by searching over the sorted top N pulses.
This method requires a search load of about 7.5% of that of G.729A.
IFPR SEARCH METHOD
In the IFPR method, the pulse contributions are first evaluated for every track, and new pulses are then sought by making several pulse replacements at a time. The search criterion is maximized over all combinations in which pulses of the initial codevector are replaced by the most significant pulses of every track.
Replacing the pulses of the initial codevector with the most significant pulses of every track has an overall search complexity of 48.
RCM SEARCH METHOD
The number of candidate pulses in each track is reduced in order to lower the search complexity.
As the first step, the pulses in each track are sorted by contribution in descending order; the top N pulses are then chosen as the candidate pulses for a full search.
In this way, the search needs to be performed only N^4 times to find the optimal pulse combination.
The best combination of the candidates is then selected.
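The RCM steps above can be sketched as follows. The slides do not give the contribution measure, so this sketch assumes the contribution of a pulse at position n is |d(n)|; it also presets each pulse sign to the sign of d(n), as G.729 does, and scores candidates with the C_k^2 / E_k criterion. All function names are hypothetical.

```python
from itertools import product

# Allowed positions for the four pulses (the last pulse spans two tracks).
TRACKS = [
    list(range(0, 40, 5)),
    list(range(1, 40, 5)),
    list(range(2, 40, 5)),
    sorted(list(range(3, 40, 5)) + list(range(4, 40, 5))),
]


def top_candidates(d, track, n):
    """Sort a track by contribution (ASSUMPTION: |d(n)|) and keep the top n."""
    return sorted(track, key=lambda pos: abs(d[pos]), reverse=True)[:n]


def criterion(positions, d, phi):
    """C^2 / E for pulses at `positions`, each signed like d at its position."""
    signs = [1 if d[p] >= 0 else -1 for p in positions]
    corr = sum(s * d[p] for s, p in zip(signs, positions))
    energy = sum(si * sj * phi[pi][pj]
                 for si, pi in zip(signs, positions)
                 for sj, pj in zip(signs, positions))
    return corr * corr / energy


def rcm_search(d, phi, n):
    """Full search over the top-n candidates of every track: n**4 searches."""
    candidates = [top_candidates(d, track, n) for track in TRACKS]
    best, best_score, searches = None, -1.0, 0
    for combo in product(*candidates):
        searches += 1
        score = criterion(combo, d, phi)
        if score > best_score:
            best, best_score = combo, score
    return best, best_score, searches
```

With N = 2 the loop runs 16 times; with N = 8 it runs 4096 times, which is why the candidate reduction matters.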
COMBINED SEARCH METHOD
COMPARISON TABLE
PROCESS FLOW GRAPH
The individual contribution of each pulse is evaluated, and the pulses are sorted by contribution within their associated track.
The pulse with the global maximum contribution, named G1, is located among the top-1 pulses of all the tracks.
G1 is presumed to be one of the four optimal pulses.
PROCESS FLOW GRAPH (CONTD..)
The value of N is then chosen, and RCM searches the remaining three tracks over their top N pulses. The search terminates as soon as the combination of optimal pulses is found.
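A rough sketch of the combined flow: fix G1, then search the top N candidates of the remaining three tracks, giving N^3 searches. As before, the contribution measure |d(n)| and the sign presetting are assumptions made for illustration, and the function names are hypothetical.

```python
from itertools import product

# Allowed positions for the four pulses (the last pulse spans two tracks).
TRACKS = [
    list(range(0, 40, 5)),
    list(range(1, 40, 5)),
    list(range(2, 40, 5)),
    sorted(list(range(3, 40, 5)) + list(range(4, 40, 5))),
]


def criterion(positions, d, phi):
    """C^2 / E for pulses at `positions`, each signed like d at its position."""
    signs = [1 if d[p] >= 0 else -1 for p in positions]
    corr = sum(s * d[p] for s, p in zip(signs, positions))
    energy = sum(si * sj * phi[pi][pj]
                 for si, pi in zip(signs, positions)
                 for sj, pj in zip(signs, positions))
    return corr * corr / energy


def combined_search(d, phi, n):
    """Fix G1, then search the top-n pulses of the other three tracks."""
    # Sort every track by contribution (ASSUMPTION: contribution = |d(n)|).
    ranked = [sorted(track, key=lambda p: abs(d[p]), reverse=True)
              for track in TRACKS]
    # G1: the largest contribution among the top-1 pulses of all tracks.
    g1_track = max(range(len(TRACKS)), key=lambda t: abs(d[ranked[t][0]]))
    g1 = ranked[g1_track][0]
    candidates = [ranked[t][:n] for t in range(len(TRACKS)) if t != g1_track]
    best, best_score, searches = None, -1.0, 0
    for combo in product(*candidates):
        positions = list(combo)
        positions.insert(g1_track, g1)  # put G1 back in its own track slot
        searches += 1
        score = criterion(positions, d, phi)
        if score > best_score:
            best, best_score = tuple(positions), score
    return best, best_score, searches
```

For N = 2 this performs 2^3 = 8 searches, compared with 2^4 = 16 for RCM alone at the same N.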
CONCLUSION & SCOPE
A pulse with a high contribution is more likely to serve as one of the optimal pulses in the associated track.
This proposal requires eight searches in the case of N = 2, i.e., a search load of about 2.5% of that in G.729A, with similar reductions for other values of N.
The improved G.729A speech codec can be used to improve Voice over Internet Protocol (VoIP) performance on smartphones.
As a consequence of the reduced computational load, the energy-efficiency requirement for an extended operation time is met.
REFERENCE PAPER
C. Y. Yeh, "An efficient fixed codebook search for G.729 speech codec derived from RCM-based search algorithm," Signal and Information Processing (ChinaSIP), 2014 IEEE China Summit & International Conference on, Xi'an, 2014, pp. 76-79.
THANK YOU