Real-time Non-intrusive Speech Quality Estimation for VoIP
Adil Raja
Wireless Access Research Group, University of Limerick
Outline
• Research Milestones
• Theoretical Aspects
• Evaluation Platforms
• Conclusion
Research Milestones
• Problem Statement.
• Objectives.
• Related Work.
• Current Status of the Project.
• Future Work.
Problem Statement
• Lack of real-time, non-intrusive speech quality estimation at mid-network points.
Objectives
• To develop a real-time, non-intrusive speech quality estimation model for VoIP networks.
• Particular emphasis is on the effectiveness of the model at “mid-network” points.
• The model should assess the overall speech quality by evaluating:
  - Transport layer metrics.
  - Speech layer metrics.
• Effective implementation of a perceptual model is crucial.
Related Work
• Standards – E-Model, P.563 {ITU-T}.
• Industry.
  - PsyVoIP for gateways {Psytechnics}.
  - VQmon/EP {Telchemy}
  - 3SQM {OPTICOM}
  - PSM {Psytechnics}
• Theoretical Research.
  - Transport layer assessments.
  - Perceptual Models.
  - Cognitive Models.
Current Status of the Project
• Transport layer metrics can be captured using RTP packets and RTCP reports.
• The metrics include:
  - Packet loss – from RTP packets
  - Jitter – from RTP packets
  - Round-trip delay – from RTCP-SR/RR reports
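The RTP-based measurements listed above can be sketched as follows. This is a minimal sketch, assuming the interarrival jitter estimator and the sequence-number loss count described in RFC 3550; the class name `RtpStats` and its fields are hypothetical, and sequence-number wrap-around is ignored for brevity.

```python
class RtpStats:
    """Running packet-loss and jitter estimate for one RTP stream (hypothetical helper)."""

    def __init__(self):
        self.first_seq = None      # first sequence number seen
        self.max_seq = None        # highest sequence number seen
        self.received = 0          # packets actually received
        self.jitter = 0.0          # interarrival jitter, in timestamp units
        self._prev_transit = None  # transit time of the previous packet

    def update(self, seq, rtp_ts, arrival_ts):
        """seq: RTP sequence number; rtp_ts and arrival_ts use the same clock units."""
        if self.first_seq is None:
            self.first_seq = self.max_seq = seq
        else:
            self.max_seq = max(self.max_seq, seq)
        self.received += 1
        transit = arrival_ts - rtp_ts
        if self._prev_transit is not None:
            d = abs(transit - self._prev_transit)
            self.jitter += (d - self.jitter) / 16.0  # RFC 3550 smoothing with gain 1/16
        self._prev_transit = transit

    def lost(self):
        """Packets expected from the sequence-number range minus packets received."""
        expected = self.max_seq - self.first_seq + 1
        return expected - self.received
```

Calling `update` once per observed RTP packet keeps both values current, so they can be read at any point during the call.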
Current Status of the Project
• A perceptual model based on Perceptual Linear Prediction (and MFCC) has been ported to the IXP2400 XScale processor.
• SOM_PAK has been ported to the IXP2400 XScale processor.
• MicroEngine code for buffering packets in SRAM has been written.
• The overall model design is based on a single VoIP call.
Future Work
• Integration of the speech layer model with the transport layer model.
• Testing under various packet delay and loss scenarios.
• Evaluation of the model for low bit-rate codecs.
• Scalability testing for multiple VoIP calls.
Theoretical Aspects
• Packet Loss and Jitter Evaluation.
• Effect of Packet Loss Distribution.
• Unordered and Missing Packets.
• Computational Lag.
• Methodology.
• Perceptual Evaluation of Low Bit-Rate Vocoders.
• Self Organizing Maps.
• Hidden Markov Models.
Packet Loss and Jitter Evaluation
• Performance on mid-network points.
[Figure: an IXP2400 NPU placed between ENDPOINT-A and ENDPOINT-B. RTP packets are used to compute the values of jitter and packet loss; RTCP-SR and RTCP-RR packets are used to compute round-trip delay.]
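The round-trip delay computation from RTCP-SR/RR packets can be illustrated with the formula RFC 3550 gives for the sender of an SR: the arrival time of the receiver report, minus its "last SR" (LSR) field, minus its "delay since last SR" (DLSR) field. This is a minimal sketch; the function name `rtt_seconds` is hypothetical, and how a mid-network monitor would obtain the arrival time is not specified in the slides.

```python
def rtt_seconds(arrival, lsr, dlsr):
    """Round-trip time from one RTCP receiver report block.

    All arguments are 32-bit values in units of 1/65536 s:
    arrival - middle 32 bits of the NTP time at which the report was seen
    lsr     - 'last SR timestamp' field of the report block
    dlsr    - 'delay since last SR' field of the report block
    """
    delay = (arrival - lsr - dlsr) & 0xFFFFFFFF  # modulo 2**32 handles wrap-around
    return delay / 65536.0
```

With the worked values from RFC 3550 (arrival `0xB7108000`, LSR `0xB7052000`, DLSR `0x00054000`) the function returns 6.125 seconds.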
Packet Loss and Jitter Evaluation
[Figure: network diagram of two computers connected through several routers.]
Packet Loss and Jitter Evaluation
• Reasons
  - Routing table updates.
  - Traffic engineering.
Packet Loss and Jitter Evaluation
• To capture packet loss and jitter from RTCP-SR/RR packets.
• Other advantages.
  - RTCP-SR/RR reports give the fraction of packets lost over a certain interval of time.
  - This provides the mean loss rate for a call in the current time frame as opposed to the overall loss rate.
  - Some computation is offloaded from the IXP2400.
  - End-to-end transport layer metrics as opposed to end-to-mid-network-point metrics.
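Reading these values out of a report could look like the sketch below. It assumes the 24-byte report block layout defined in RFC 3550 (SSRC, 8-bit fraction lost, 24-bit cumulative loss, extended highest sequence number, jitter, LSR, DLSR); the function name `parse_report_block` and the dictionary keys are hypothetical.

```python
import struct

def parse_report_block(block):
    """Decode one 24-byte RTCP report block (layout per RFC 3550)."""
    ssrc, frac, cum, ext_seq, jitter, lsr, dlsr = struct.unpack("!IB3sIIII", block)
    return {
        "ssrc": ssrc,
        "fraction_lost": frac / 256.0,  # loss since the previous report, as a fraction
        "cumulative_lost": int.from_bytes(cum, "big", signed=True),  # signed 24-bit count
        "ext_highest_seq": ext_seq,
        "jitter": jitter,               # in RTP timestamp units
        "lsr": lsr,
        "dlsr": dlsr,
    }
```

The `fraction_lost` field is what gives the per-interval loss rate; `cumulative_lost` gives the overall count for the call so far.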
Effect of Packet Loss Distribution
• Most models assess the impact of packet loss on speech quality in terms of mean loss rate.
• Packet loss is bursty in nature.
• Packet loss location has a variable effect on the quality of speech. {H. Schulzrinne}.
• The impact of packet loss distribution should be used as a QoS metric.
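To illustrate why the mean loss rate alone does not describe the loss distribution, the sketch below computes both the mean loss rate and the mean burst length of a loss trace. The function name `loss_burst_stats` and the choice of mean burst length as the distribution metric are assumptions made for illustration; the slides do not name a specific metric.

```python
def loss_burst_stats(received):
    """received: sequence of booleans, True if the packet arrived.

    Returns (mean loss rate, mean length of consecutive-loss bursts).
    """
    bursts, run = [], 0
    for ok in received:
        if ok:
            if run:
                bursts.append(run)  # a burst of losses just ended
            run = 0
        else:
            run += 1
    if run:
        bursts.append(run)          # trace ended inside a burst
    lost = sum(bursts)
    loss_rate = lost / len(received) if received else 0.0
    mean_burst = lost / len(bursts) if bursts else 0.0
    return loss_rate, mean_burst
```

Two traces with the same loss rate can differ widely in burst length: alternating losses give a mean burst of 1, while the same number of losses in one run gives a mean burst equal to the run length.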
Unordered and Missing Packets
• Packets arrive out of order.
• Some packets are lost and some take alternative paths.
• These factors can have adverse effects when the acoustic back-end is an HMM (for instance).
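One way to present frames to a sequence-sensitive back-end is to restore sequence order and mark the gaps explicitly. The sketch below is an assumption about how this might be done, not the method used in the project; the function name `reorder` is hypothetical.

```python
def reorder(arrivals, first_seq, count):
    """Place (seq, payload) arrivals into sequence order.

    Returns a list of `count` slots starting at `first_seq`; slots whose
    packet never arrived stay None so a later stage can conceal or skip them.
    """
    slots = [None] * count
    for seq, payload in arrivals:
        idx = (seq - first_seq) & 0xFFFF  # RTP sequence numbers are 16 bits wide
        if idx < count:                   # ignore packets outside the window
            slots[idx] = payload
    return slots
```

For example, arrivals with sequence numbers 12, 10 and 13 in a window starting at 10 come out in order with an empty slot at position 11.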
Computational Lag
[Figure: timeline from T0 to TN comparing speech layer processing with transport layer processing.]
• The perceptual model reports the results of past samples.
• The computational lag between the speech layer model and the perceptual model increases as time progresses.
• Some samples have to be skipped to overcome this lag.
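The growth of the lag and the effect of skipping can be shown with a toy simulation. This is a sketch under stated assumptions: frames arrive every `frame_ms`, processing one frame costs `proc_ms`, and a frame is skipped whenever the backlog exceeds `max_lag_ms`. The function name, the parameters and the skipping policy are all hypothetical.

```python
def simulate_lag(frame_ms, proc_ms, n_frames, max_lag_ms):
    """Return (frames processed, frames skipped) under a simple skip policy."""
    lag = 0.0
    processed = skipped = 0
    for _ in range(n_frames):
        if lag > max_lag_ms:
            # Skipping costs no compute time, so one frame period of backlog is recovered.
            skipped += 1
            lag = max(0.0, lag - frame_ms)
        else:
            # Processing slower than real time adds to the backlog.
            processed += 1
            lag = max(0.0, lag + proc_ms - frame_ms)
    return processed, skipped
```

With 20 ms frames and 30 ms of processing per frame, roughly one frame in three has to be skipped to keep the lag bounded.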
Methodology
• Transport Layer Model
  - Jitter, Loss, Delay.
• Speech Layer Model
  - Perceptual Model.
    - Perceptual Linear Prediction.
    - Mel Frequency Cepstral Coefficients.
    - Bark Spectral Distortion.
  - Code-book of Clean Speech Feature Vectors
    - Self-organizing Maps – Vector Quantization.
    - Hidden Markov Models – Probabilistic.
Methodology
[Figure: IXP2400 block diagram. Components shown: Intel XScale core, packet receive/transmit MEs, packet processing MEs, SHaC (scratchpad memory, hash unit, CAP), PCI controller (PCI, 64 bit, 33/66 MHz; optional host CPU and PCI bus devices), media switch fabric interface (SPI4, CSIX; external media devices), SRAM controllers 0 and 1 (QDR SRAM), DRAM controller 0 (DDR DRAM).]
• Packet receive/transmit MEs: these MEs receive the packets from the MSF interface and forward them to the DRAM controller on reception, and do the opposite for transmission.
• Packet processing MEs: parse various header fields of VoIP packets, calculate packet-based QoS metrics and place the results in SRAM. They buffer the speech frames in SRAM at addresses known to the perceptual model.
• Intel XScale core: the perceptual model calculates distortions due to encoding and bit errors and places the result in SRAM. A further module calculates the objective score from all the values accumulated in SRAM: ObjectiveScore = S(s,c,e)
Methodology
• At a given time a number of (contiguous) packets are buffered to be input to the perceptual model.
• Statistical Analysis.
Methodology
• Optimum number of packets to be buffered?
• Optimum buffering interval?
• The overall speech quality is a function of both auditory distance and transport layer distortions.
Methodology
• Assessment of the model for a single VoIP call scenario.
• G.711 is the preferred codec.
• Simulate packet loss rate, packet loss distribution, delay and jitter (fine tuning).
• Analysis of low bit-rate codecs.
• Scale the model for multiple VoIP calls.
• IXP2400 NPU is the target hardware platform.
Perceptual Evaluation of Low Bit-Rate Vocoders
• Real-time speech quality estimation for low bit-rate codecs (G.729, G.723.1) without decoding the frames.
• {Carmen Peláez-Moreno, Ascensión Gallardo-Antolín, and Fernando} perform speech recognition by only extracting the LP coefficients.
SOM
SOM Training
SOM
• What is the average quantization error (QE)?
• Auditory Distance = Distortion + QE.
• How to deal with QE?
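The average quantization error of a trained codebook can be estimated as the mean distance between each feature vector and its best-matching codebook vector. The sketch below assumes Euclidean distance; the function names `best_matching_unit` and `average_qe` are hypothetical and do not come from SOM_PAK.

```python
import math

def best_matching_unit(codebook, x):
    """Return (index, distance) of the codebook vector closest to x."""
    dists = [math.dist(w, x) for w in codebook]
    i = min(range(len(dists)), key=dists.__getitem__)
    return i, dists[i]

def average_qe(codebook, vectors):
    """Mean distance from each vector to its best-matching codebook vector."""
    return sum(best_matching_unit(codebook, v)[1] for v in vectors) / len(vectors)
```

Measured on clean speech vectors, this average indicates how much of an observed auditory distance is due to quantization alone. Using it as a baseline to subtract is one conceivable way to deal with QE, though the slides leave that question open.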
SOM – Quantization Error
• SOM discretizes data.
SOM – Data Distribution
{Timo Kostiainen}
Growing Hierarchical SOM
[Figure: GHSOM hierarchy showing Layer 0, Layer 1 and Layer 2.]
GHSOM - Advantages
• A desired level of granularity in discriminating input data is achievable.
• Horizontal expansion.
• Vertical expansion.
• As the SOM is hierarchical, the search time is reduced.
• What if a distorted signal of class A has a lower AD with class B?
$mqe_i < \tau_2 \cdot mqe_0$, $MQE_m < \tau_1 \cdot mqe_0$
Hidden Markov Models
• Auditory scores based on a logically connected sequence of feature vectors.
• λ = (A, B, π)
• A – transition probability matrix from one phonemic class to the next.
• B – emission probability of a phonemic vector.
• π – initial state probability.
• Parameters are learnt during training.
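How a model λ = (A, B, π) scores a sequence can be sketched with the standard forward algorithm. For simplicity this sketch assumes a discrete HMM with observations given as symbol indices; the function name `forward_likelihood` is hypothetical, and how the likelihood would be mapped to an auditory score is not specified in the slides.

```python
def forward_likelihood(A, B, pi, obs):
    """P(obs | lambda) for a discrete HMM lambda = (A, B, pi).

    A[i][j] - probability of moving from state i to state j
    B[i][k] - probability of emitting symbol k in state i
    pi[i]   - probability of starting in state i
    obs     - non-empty list of symbol indices
    """
    n = len(pi)
    # Initialisation: start in each state and emit the first symbol.
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    # Induction: propagate through the transitions, then emit the next symbol.
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o] for j in range(n)]
    return sum(alpha)
```

For long sequences a practical implementation would work with scaled or log probabilities to avoid underflow.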
Hidden Markov Models
• A suitably trained HMM can be used to find auditory distance.
• Continuous HMM.
• Reliable results.
Evaluation Platforms
• Cell Broadband Engine Processor Architecture.
• Programming the Cell.
• Some Concerns
Cell Broadband Processor
Cell BE Processor …
Sr No  Feature                                   Qty
1      Power Processing Element (PPE)            1
2      Synergistic Processing Elements (SPEs)    8
3      Element Interconnect Bus (EIB)            1
4      Direct Memory Access Controller (DMAC)    1
5      Rambus XDR memory controllers             2
6      Rambus FlexIO interface
7      PCI Express x 4
8      256 GFLOPS (single precision at 4 GHz)
9      25 GFLOPS (double precision at 4 GHz)
Mercury Computer Systems
Cell Technology Evaluation System (CTES)
Programming the Cell
• The primary language is C; C++ is also supported to some extent.
• Programming Models
  - Job Queue – the PPE schedules the jobs for the SPEs.
  - Self-multitasking of SPEs – kernel and scheduling are distributed across the SPEs.
  - Stream Processing – the SPEs use shared memory for all tasks.
• Development Platforms
  - Cell BE Engine SDK (alpha version) {IBM} – full system simulator.
  - Yellow Dog Linux {Mercury Computer Systems}.
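The job queue model can be illustrated by analogy with ordinary threads: one dispatcher (playing the PPE role) fills a queue and a pool of workers (playing the SPE role) drains it. This is only an analogy written with Python's standard library, not Cell SDK code; the function name `run_job_queue` is hypothetical.

```python
import queue
import threading

def run_job_queue(jobs, n_workers=8):
    """Dispatcher (PPE role) fills a queue; workers (SPE role) drain it.

    jobs: list of (function, argument) pairs. Returns results in job order.
    """
    q = queue.Queue()
    results = [None] * len(jobs)

    def worker():
        while True:
            item = q.get()
            if item is None:           # sentinel: no more work
                break
            idx, (func, arg) = item
            results[idx] = func(arg)   # each worker writes only its own slot

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for item in enumerate(jobs):       # the dispatcher schedules the jobs
        q.put(item)
    for _ in threads:                  # one sentinel per worker
        q.put(None)
    for t in threads:
        t.join()
    return results
```

The default of eight workers mirrors the eight SPEs of the Cell; each job must be self-contained, since a worker sees only the function and argument handed to it.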
Some Concerns
• Software design is key to effective performance of the Cell.
• Multithreaded execution – key to effective execution.
• The code has to be vectorisable and parallelisable.
• To port code to the SPEs it has to be partitioned from the rest of the code so that it is fully self-contained.
• Hardware abstraction.
• Learning curve effect.
  - Support from IBM.
  - Credibility of SDK/APIs.
  - Comments from Peter Seebach.
• Cost
Alternatives
• XScale.
• Offloading of compute-intensive tasks to another processor using a gigabit port.
• PCI with Pentium 4.
• PCI with a suitable graphics card.
Gigabit Alternative …
[Figure: IP network, IXP2400 NPU and workstation.]
Conclusions
• Preliminary work on the model is complete.
• Packet loss distribution.
• Evaluation of low bit-rate codecs.
• Evaluation platform.
• Overall research goals
  - Real-time non-intrusive VoIP quality assessment model.
  - Perceptual distortion measures.
  - Model training.
Thank You for Your Time