
VIDEO SPLICING AND FUZZY RATE CONTROL IN IP MULTI-PROTOCOL ENCAPSULATOR FOR TUNE-IN TIME REDUCTION IN IP DATACASTING (IPDC) OVER DVB-H

¹Mehdi Rezaei, ²Miska M. Hannuksela, ³Moncef Gabbouj

¹,³Institute of Signal Processing, Tampere University of Technology, ²Nokia Research Center

ABSTRACT

A novel video splicing and rate control method is proposed which minimizes the tune-in time in IPDC over DVB-H. DVB-H uses a time-sliced transmission scheme to reduce the power consumption used for radio reception. One of the significant factors in tune-in time is the time from the start of media decoding to the start of correct output from decoding, which would be minimized when a time-slice is started with a random access point picture such as an instantaneous decoding refresh (IDR) picture in H.264/AVC. In IPDC over DVB-H, the encapsulation to time-slices is performed independently of encoding in a network element called IP encapsulator. At the time of encoding, time-slice boundaries are not known exactly, and it is impossible to govern the location of IDR pictures relative to time-slices. It is proposed that an additional stream consisting of IDR pictures only is transmitted to the IP encapsulator, which replaces pictures in a normal bitstream with IDR pictures according to time-slice boundaries in order to achieve the minimum tune-in time. It has to be ensured that the "spliced" bit stream resulting from the operation of the IP encapsulator complies with the Hypothetical Reference Decoder (HRD) specification of H.264/AVC. A video rate control system utilizing a fuzzy controller is proposed to satisfy the HRD requirements for the spliced bit stream. Simulation results show that the proposed splicing method and rate control system can provide standard bit streams with good average quality of decoded video and with minimum tune-in time.

Index Terms - Digital Video Broadcasting-Handheld (DVB-H), IP Datacasting, Mobile TV

1. INTRODUCTION

DVB-H (Digital Video Broadcasting for Handheld terminals) is an ETSI standard specification for bringing broadcast services to battery-powered handheld receivers [1]. DVB-H is largely based on the successful DVB-T specification for digital terrestrial television, adding to it a number of features designed to take account of the limited battery life of small handheld devices, and the particular environments in which such receivers must operate.

In a conventional IPDC system over DVB-H, a content encoder receives a source signal and encodes it into a coded media bit stream. The coded media bit stream is transferred to a server, which is typically a normal IP multicast server. The server is connected to an IP Multi-Protocol Encapsulator. The server packetizes the coded media bit stream into RTP packets, and the IP encapsulator encapsulates IP packets into Multi-Protocol Encapsulation (MPE) sections, which are further encapsulated into MPEG-2 Transport Stream packets. The IP encapsulator optionally uses MPE Forward Error Correction (MPE-FEC) based on Reed-Solomon codes. An IPDC system over DVB-H further includes a radio transmitter, which is not essential for the operation of the proposed splicing and rate control system and is not discussed further.

To reduce the power consumption in handheld terminals, the service data is time-sliced and then sent into the channel as bursts at a significantly higher bit rate than the bit rate of the audio-visual service. Time-slicing enables a receiver to stay active only a fraction of the time, while receiving bursts of a requested service. Finally, the system includes one or more recipients, typically capable of receiving, demodulating, decapsulating, decoding, and rendering the transmitted signal, resulting in an uncompressed media stream.

Tune-in time in DVB-H refers to the time between the start of the reception of a broadcast signal and the start of the media rendering. The tune-in time for newly-joined recipients consists of several parts, mainly: the delay until the start of the desired time-slice; the reception duration of a complete time-slice or MPE-FEC frame; the delay to compensate for the size variation of MPE-FEC frames and media frames; the delay to compensate for the synchronization between the associated streams (e.g. audio and video) of the streaming session; and the delay until a media decoder is refreshed by a random access point to produce correct output samples.

One of the critical factors in tune-in time is the time until a media decoder is refreshed, which can be minimized if each MPE-FEC frame starts with a random access point such as an IDR picture in H.264/AVC. It should be noted that if decoding started, immediately when the time-slice is received, from an IDR picture that is not at the beginning of the time-slice, the input buffer for decoding would drain before the arrival of the next time-slice and there would be a gap in video playback.

1-4244-0481-9/06/$20.00 ©2006 IEEE 3041 ICIP 2006

In IPDC over DVB-H, the content encoding and the encapsulation to MPE-FEC frames are implemented independently, and it is difficult to set the desired location of IDR pictures relative to the boundaries of MPE-FEC frames. Moreover, very frequent IDR pictures in the coded video bit stream reduce the compression efficiency considerably. A method for fast channel zapping in set-top box applications has been presented in [2], in which an auxiliary bit stream including frequent low-quality IDR pictures is sent to the receiver in parallel to the main bit stream. When the channel changes, the receiver replaces an inter picture from the main bit stream with an IDR picture from the auxiliary bit stream. Although this method can decrease the tune-in time in IPDC over DVB-H, it cannot minimize the tune-in time. Furthermore, the auxiliary bit stream consumes transmission bandwidth. Moreover, the receiver needs some modifications to switch between two streams. Finally, the employed low-quality IDR pictures degrade the quality of the following pictures up to the next normal IDR picture.

Figure 1: Block diagram of proposed splicing and rate control method

In our previous work [3], we proposed a video splicing method which minimizes the decoder refresh time in IPDC over DVB-H without any increase in bandwidth or modification of the receiver. In the proposed splicing method, it has to be ensured that the standard compliancy of the resulting bit stream is maintained. In this paper, we propose a video rate control system to guarantee the standard compliancy of the spliced bit stream. A short review of the proposed splicing method is presented in Section 2 of this paper. Section 3 presents the details of the proposed rate control system. Simulation results are provided in Section 4. The paper closes with conclusions in Section 5.

2. PROPOSED SPLICING METHOD

A simplified block diagram of the proposed IPDC system is depicted in Figure 1. At the content encoding level, two video encoders encode a common uncompressed input video into two primary bit streams: a Spliceable Bit Stream (SBS) and a Decoder Refresh Bit Stream (DRBS). The spliceable bit stream includes very frequent spliceable pictures, which are reference pictures constrained as follows: no picture prior to a spliceable picture, in decoding order, is referred to in the inter prediction process of any reference picture at or after the spliceable picture, in decoding order. Non-reference pictures after the spliceable picture may refer to pictures earlier than the spliceable picture in decoding order. These non-reference pictures cannot be correctly decoded if the decoding process starts from the spliceable picture, but can be safely omitted as they are not used as reference for any other pictures. The decoder refresh bit stream contains only IDR pictures corresponding to spliceable pictures and with a picture quality similar to the corresponding spliceable pictures. The primary streams are transmitted from the server to the IP encapsulator. The IP encapsulator composes MPE-FEC frames in which the first picture in decoding order is an IDR picture from the decoder refresh bit stream and the other pictures are from the spliceable bit stream. In effect, a spliceable picture is replaced with the corresponding IDR picture in the spliced bit stream. The IDR pictures at the beginning of MPE-FEC frames minimize the decoder refresh time for newly-joined recipients. No changes in the receiver operation are required in the proposed system.
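The frame composition described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Picture` record, its `kind` labels and the `splice_frame` helper are hypothetical names, the decoder refresh stream is modelled as a dictionary keyed by decoding-order index, and the frame is assumed to start at a spliceable picture.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Picture:
    index: int   # position in decoding order (hypothetical field)
    kind: str    # "IDR", "SP" (spliceable) or "P" (other); labels are illustrative
    bits: int    # coded size in bits


def splice_frame(sbs_pictures: List[Picture], drbs: Dict[int, Picture]) -> List[Picture]:
    """Return the pictures of one MPE-FEC frame with an IDR picture in first position."""
    if not sbs_pictures:
        raise ValueError("empty frame")
    first = sbs_pictures[0]
    if first.kind != "SP":
        raise ValueError("frame must start at a spliceable picture")
    if first.index not in drbs:
        raise ValueError("no matching IDR picture in the decoder refresh stream")
    # Replace the leading spliceable picture; keep the remaining SBS pictures unchanged.
    return [drbs[first.index]] + list(sbs_pictures[1:])


if __name__ == "__main__":
    sbs = [Picture(8, "SP", 9000), Picture(9, "P", 4000), Picture(10, "SP", 8500)]
    drbs = {8: Picture(8, "IDR", 60000), 10: Picture(10, "IDR", 58000)}
    print([p.kind for p in splice_frame(sbs, drbs)])
```

The sketch leaves the receiver side untouched, which mirrors the point that the spliced stream is an ordinary bit stream from the decoder's perspective.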

Replacing an inter picture with an IDR picture causes a mismatch in the pixel values of the reference pictures between the encoder and decoder. The mismatch propagates as an error over time until the next IDR picture in the spliced stream. A technically elegant solution would be to use SP and SI pictures of H.264/AVC, but they are only included in the extended profile of H.264/AVC [4]. The extended profile of H.264/AVC is not allowed in the current DVB-H standard. Simulation results in [3] showed that the propagated error saturates to a constant value after a few pictures. The average degradation in quality can be small compared to the conventional IPDC system over DVB-H, where very frequent IDR pictures in the bit stream degrade the average quality and the decoder refresh time still cannot be minimized.

3. PROPOSED RATE CONTROL SYSTEM

According to the proposed splicing method, the spliceable pictures and the corresponding IDR pictures in the two primary streams should be encoded with similar qualities. At similar quality, an IDR picture can consume 5 to 10 times the bit budget of the corresponding inter picture. Furthermore, similar qualities for corresponding pictures in the two primary streams mean that only the bit rate of one primary stream can be controlled. Consequently, there is no real control over the bit rate of the spliced stream, and therefore it is hard to verify the HRD compliancy of the spliced stream. Moreover, the encoding parameters cannot be set according to the splicing results, because the encoding and splicing are performed independently and without any feedback link.
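A rough worked example may help show why the spliced bit rate escapes the encoder's control. The sketch below assumes, purely for illustration, that every inter picture costs the same number of bits and that the IDR picture costs a fixed multiple of that (the 5 to 10 range comes from the text; the function name and all numeric values are hypothetical).

```python
def spliced_bit_rate(inter_bits: float, idr_ratio: float,
                     pictures_per_frame: int, frame_rate: float) -> float:
    """Average bit rate (bit/s) when one picture per frame is replaced by an IDR picture."""
    if pictures_per_frame < 1:
        raise ValueError("a frame needs at least one picture")
    # One IDR picture costing idr_ratio * inter_bits, the rest are inter pictures.
    frame_bits = inter_bits * (pictures_per_frame - 1 + idr_ratio)
    return frame_bits * frame_rate / pictures_per_frame


if __name__ == "__main__":
    # Hypothetical stream: 8000 bits per inter picture at 15 pictures per second,
    # so the unspliced rate would be 120000 bit/s.
    for n in (15, 30):
        for ratio in (5.0, 10.0):
            rate = spliced_bit_rate(8000.0, ratio, n, 15.0)
            print(n, ratio, rate)
```

With these assumed numbers the spliced rate moves between 136000 and 192000 bit/s depending only on the frame length and the IDR cost ratio, neither of which the content encoder knows in advance.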

To solve the problem above, a comprehensive rate control system is proposed which is implemented in both the content encoder and the IP encapsulator. The content encoder rate control system (CERCS) controls the bit rate of the two primary streams considering an average value for the frequency of IDR pictures in a desired spliced bit stream. However, the frequency of IDR pictures in the spliced stream varies around the average value since the number of video pictures in MPE-FEC frames is not fixed. Moreover, in offline encoding the IDR frequency which has been used for the rate control of primary streams at the content encoder may be very different from the average IDR frequency of the spliced stream. The IP encapsulator rate control system (IERCS) implements another control to compensate for the deviation in the frequency of IDR pictures and to guarantee the HRD compliancy of the spliced bit stream. Furthermore, the Supplemental Enhancement Information (SEI) message parameters related to buffering of the spliced bit stream are provided by the IERCS.

The CERCS controls the bit rate of the primary streams according to encoding target data, which are set by the user, and according to several signals which are extracted from the uncompressed and compressed video. The encoding target data include the target bit rate of the spliced stream and the average frequency of IDR pictures in the desired spliced stream. Furthermore, the CERCS provides some encoding metadata as complementary information, which are sent to the server and then to the IP encapsulator.

The IERCS controls the bit rate of the spliced stream according to the encoding metadata and the encapsulating target data defined by the server. The encapsulating target data, including the target bit rate and the IDR frequency of the spliced stream, are of the same kind as the encoding target data, although they may have different values in offline applications. Although the CERCS can partially solve the HRD compliancy problem under some assumptions, the IP encapsulator rate control system is required to provide the final control on the spliced bit stream according to the real encapsulating conditions in the IP encapsulator. More details about the proposed IERCS are presented as follows.

Figure 2 illustrates the block diagram of the proposed IERCS. The IERCS utilizes a fuzzy rate controller and a virtual buffer. The fuzzy controller controls the bit rate of the spliced stream by controlling the frame rate and the type of pictures. It may drop a number of pictures from an MPE frame or it may replace one or more extra spliceable pictures by IDR pictures. The fuzzy controller and virtual buffer operate based on MPE frame. The size of the virtual buffer can be computed according to metadata and encapsulating target data.
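As a rough illustration of the two actions available to the controller, the sketch below applies an integer decision to the picture types of one MPE frame. It follows the sign convention and uniform placement that this section gives for the controller output (positive: drop pictures from the end of the frame; negative: promote extra spliceable pictures to IDR, spread evenly). The function name, the `"SP"`/`"IDR"` labels and the nearest-candidate placement rule are assumptions made for the example, not details taken from the paper.

```python
from typing import List


def apply_decision(kinds: List[str], decision: int) -> List[str]:
    """Apply a controller decision to the picture types of one MPE frame.

    decision > 0: drop that many pictures from the end of the frame.
    decision < 0: replace |decision| extra spliceable ("SP") pictures with "IDR",
                  at positions spread evenly along the frame.
    decision == 0: leave the frame unchanged.
    """
    out = list(kinds)
    if decision > 0:
        keep = max(1, len(out) - decision)  # always keep the leading picture
        return out[:keep]
    extra = -decision
    candidates = [i for i, kind in enumerate(out) if kind == "SP"]
    n = len(out)
    for j in range(1, extra + 1):
        if not candidates:
            break
        target = j * n / (extra + 1)  # ideal evenly spaced position
        best = min(candidates, key=lambda i: abs(i - target))
        out[best] = "IDR"
        candidates.remove(best)
    return out


if __name__ == "__main__":
    frame = ["IDR"] + ["SP"] * 11
    print(apply_decision(frame, 2))
    print(apply_decision(frame, -2))
```

Keeping the first picture even for very large positive decisions is a design choice of this sketch, so that a frame never loses its random access point.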

The fuzzy controller is configured to control the bit rate of the spliced stream with minimum variation in the frame rate. It minimizes the number of dropped pictures and also prevents unnecessary IDR pictures. The output of the controller is an integer number. A positive number gives the number of pictures that should be dropped from the end of the MPE frame, and a negative number gives the number of extra spliceable pictures which should be replaced by IDR pictures in the MPE frame. Locations of extra IDR pictures are distributed uniformly along the MPE frame.

Figure 2: Block diagram of proposed IERCS

The fuzzy controller uses the following two signals as inputs:

$$\mathrm{INPUT1} = \frac{\mathit{FB} \times \mathit{FR}}{\mathit{TRS}} - F, \tag{1}$$

$$\mathrm{INPUT2} = \frac{\mathit{BF} - I_j + P_j}{\mathit{BS}}, \tag{2}$$

where FB denotes the total number of bits consumed by the current MPE frame before any dropping or extra IDR picture. BF and BS refer to the fullness and size of the virtual buffer, respectively. TRS and FR are the target bit rate and frame rate of the spliced stream. F denotes the target number of pictures in one MPE frame. $I_j$ and $P_j$ stand for the bit budgets consumed by the jth replaced IDR picture and the corresponding spliceable picture, respectively. It is useful to replace the value of $(I_j - P_j)$ in (2) with a low-pass filtered version of it.

All the defined fuzzy rules are summarized in Table 1.

The content of the table specifies the output of the controller. The letters H, L, M, V, X and S correspond to the linguistic specifications high, low, medium, very, extremely and super. The desired central values for the output of the fuzzy system corresponding to VL, L, ML, M, MH, H, VH, XH and SH are -3, -2, -1, 0, 1, 2, 3, 4 and 5. The distributions of the fuzzy membership functions are shown in Figure 3.

Table 1: Summarization of IF-THEN fuzzy rules

INPUT1 \ INPUT2   VL   L    ML   M    MH   H    VH
XL                M    M    M    M    ML   L    VL
VL                M    M    M    M    M    ML   L
L                 M    M    M    M    M    M    ML
ML                M    M    M    M    M    M    M
M                 MH   M    M    M    M    M    M
MH                H    MH   M    M    M    M    M
H                 VH   H    MH   M    M    M    M
VH                XH   VH   H    MH   M    M    M
XH                SH   XH   VH   H    MH   M    M

Figure 3: Fuzzy membership functions (Fuzzy Input1: sets XL to XH over the range -10 to 15; Fuzzy Input2: sets VL to VH over the range 0 to 1)

We used a well-known and simple fuzzy system with two inputs using the "Product Inference Engine", singleton fuzzifier and centre average defuzzifier, which is

$$f(x_1, x_2) = \frac{\sum_{i_1=1}^{N_1} \sum_{i_2=1}^{N_2} \bar{y}^{i_1 i_2}\, \mu_{A_1^{i_1}}(x_1)\, \mu_{A_2^{i_2}}(x_2)}{\sum_{i_1=1}^{N_1} \sum_{i_2=1}^{N_2} \mu_{A_1^{i_1}}(x_1)\, \mu_{A_2^{i_2}}(x_2)} \tag{3}$$

where $f(x_1, x_2)$ denotes the approximated output and $\{A_i^1, A_i^2, \ldots, A_i^{N_i}\}_{i=1,2}$ are fuzzy sets with membership functions $\{\mu_{A_1^{i_1}}(x_1)\}_{1 \le i_1 \le N_1}$ and $\{\mu_{A_2^{i_2}}(x_2)\}_{1 \le i_2 \le N_2}$ defined for inputs $x_1$ and $x_2$. The centre of the output fuzzy set $B^{i_1 i_2}$, denoted by $\bar{y}^{i_1 i_2}$, is chosen as the desired output value based on our practical experience. More information about the above fuzzy system (3) is presented in [5].
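To make the inference step concrete, here is a minimal sketch of a two-input fuzzy system with product inference, singleton fuzzifier and centre average defuzzifier, fed by the two controller inputs. Several things are assumptions for illustration only: the triangular membership shape with saturation at the edges, the three set centres per input, the toy 3 x 3 rule base (it is not the paper's rule table), the reading of (2) as a ratio to the buffer size, and all numeric values.

```python
from typing import List, Sequence, Tuple


def controller_inputs(fb: float, fr: float, trs: float, f: float,
                      bf: float, bs: float, idr_bits: float,
                      sp_bits: float) -> Tuple[float, float]:
    """Evaluate INPUT1 and INPUT2 from the quantities named in (1) and (2)."""
    input1 = fb * fr / trs - f               # excess pictures' worth of bits in the frame
    input2 = (bf - idr_bits + sp_bits) / bs  # assumed normalisation by buffer size
    return input1, input2


def memberships(x: float, centres: Sequence[float]) -> List[float]:
    """Triangular membership values of x for sets centred on sorted `centres`."""
    x = min(max(x, centres[0]), centres[-1])  # saturate outside the covered range
    values = []
    for i, c in enumerate(centres):
        left = centres[i - 1] if i > 0 else None
        right = centres[i + 1] if i + 1 < len(centres) else None
        if x == c:
            values.append(1.0)
        elif left is not None and left < x < c:
            values.append((x - left) / (c - left))
        elif right is not None and c < x < right:
            values.append((right - x) / (right - c))
        else:
            values.append(0.0)
    return values


def fuzzy_output(x1: float, x2: float, centres1: Sequence[float],
                 centres2: Sequence[float],
                 rule_centres: Sequence[Sequence[float]]) -> float:
    """Centre average defuzzifier over product-inference rule weights."""
    num = den = 0.0
    for i1, m1 in enumerate(memberships(x1, centres1)):
        for i2, m2 in enumerate(memberships(x2, centres2)):
            weight = m1 * m2                       # product inference
            num += rule_centres[i1][i2] * weight   # output centre of rule (i1, i2)
            den += weight
    return num / den


if __name__ == "__main__":
    # Hypothetical set centres and toy rule base (rows: input 1, columns: input 2).
    c1, c2 = [-5.0, 0.0, 5.0], [0.0, 0.5, 1.0]
    rules = [[0, -1, -2], [0, 0, 0], [2, 1, 0]]
    x1, x2 = controller_inputs(fb=330000, fr=15, trs=300000, f=15,
                               bf=100000, bs=200000, idr_bits=60000, sp_bits=10000)
    y = fuzzy_output(x1, x2, c1, c2, rules)
    print(x1, x2, y, round(y))  # rounding to an integer decision is an assumption
```

Because the saturated triangular sets sum to one for each input, the denominator never vanishes in this sketch; other membership shapes would need an explicit guard.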

4. SIMULATION RESULTS

Typical intervals between time-slices containing content for a particular audio-visual service may range from one second to a couple of seconds. If IDR pictures are placed randomly in a normal bit stream and the average IDR picture interval is equal to the time-slice interval, the expected tune-in time due to decoder refresh is approximately half of the time-slice interval, i.e. typically from half a second to a few seconds. From the tune-in time reduction point of view, the proposed splicing method can typically decrease the decoder refresh time to very close to zero or even to zero.
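The "half of the time-slice interval" figure can be checked with a small calculation. The sketch below assumes a simplified model that is not spelled out in the paper: exactly one IDR picture per time-slice, equally likely to sit at any picture position, with the delay counted in picture intervals from the start of the slice. The function name and the example numbers are hypothetical.

```python
def mean_refresh_delay(slice_interval_s: float, frame_rate: float) -> float:
    """Mean wait (seconds) from the start of a time-slice to its single IDR picture."""
    n = round(slice_interval_s * frame_rate)  # pictures per time-slice
    if n < 1:
        raise ValueError("time-slice must contain at least one picture")
    # An IDR picture at position k is reached after k picture intervals.
    delays = [k / frame_rate for k in range(n)]
    return sum(delays) / n


if __name__ == "__main__":
    # Hypothetical service: 2 s between time-slices, 15 pictures per second.
    print(mean_refresh_delay(2.0, 15.0))
```

For a 2 s interval at 15 pictures per second this gives a mean of 29/30 s, close to half the interval, whereas a frame that starts with its IDR picture has zero delay in the same model.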

The propagated error in the spliced stream has been studied comprehensively in our previous work [3]. The simulation results showed that the propagated error saturates to a relatively small value after a few pictures. The average degradation of video quality resulting from the error can be less than 1.6 dB, which is almost independent of the MPE-FEC frame size. This means that, at a constant cost, using only one IDR picture in each MPE-FEC frame can minimize the tune-in time and the bit rate at the same time. Despite the PSNR drop, the subjective impact is hardly noticeable. If we try to decrease the decoder refresh time in a normal bit stream just by using frequent IDR pictures, a penalty much higher than the above degradation in quality has to be paid, and the tune-in time still cannot be minimized.

To evaluate the HRD compliancy of the proposed spliced bit stream, we encoded 45 minutes of video with 10 different contents to provide the primary streams for spliced streams with different target bit rates, IDR frequencies and frame rates. Simulation results show that the proposed rate control system can provide standard compliant bit streams with a small percentage of dropped pictures (less than 0.3%) and extra IDR pictures (less than 0.1%), even if the frequency of IDR pictures in the spliced stream in the encapsulator is very different from the average IDR frequency which is used for the rate control of primary streams. Moreover, the simulation results show that with a common content encoder rate control system and without the proposed fuzzy rate controller, the percentages of dropped pictures and extra IDR pictures can be far larger (about 30 times) than when the proposed fuzzy rate controller is used.

Simulation results show that the proposed splicing method and rate control system can provide standard compliant video bit streams for IPDC over DVB-H with good average quality and with minimum tune-in time and bit rate. Therefore, the proposed splicing method can be utilized when the use of SP and SI pictures of H.264/AVC is disallowed.

5. CONCLUSIONS

In this paper we proposed a video splicing method with a comprehensive rate control system utilizing a fuzzy controller, which minimizes the tune-in time in IP datacasting over DVB-H (Digital Video Broadcasting for Handheld terminals). The proposed system provides standard HRD (Hypothetical Reference Decoder) compliant bit streams including the desired random access points to minimize the decoder refresh time. Simulation results showed that the proposed splicing method and rate control system can minimize the tune-in time of IP Datacasting over DVB-H at the expense of only a relatively small degradation in quality and a very small percentage of dropped frames.

6. REFERENCES

[1] ETSI, "Digital Video Broadcasting (DVB): Transmission systems for handheld terminals," ETSI standard, EN 302 304 V1.1.1, 2004.

[2] J.M. Boyce, A.M. Tourapis, "Fast efficient channel change [set-top box applications]," IEEE International Conference on Consumer Electronics (ICCE), 8-12 Jan. 2005.

[3] Miska M. Hannuksela, Mehdi Rezaei, Moncef Gabbouj, "Video Encoding and Splicing for Tune-in Time Reduction in IP Datacasting (IPDC) over DVB-H," IEEE Int. Sym. on Broadband Multimedia Systems and Broadcasting, Las Vegas, April 2006.

[4] M. Karczewicz, R. Kurceren, "The SP- and SI-frames design for H.264/AVC," IEEE Transactions on Circuits and Systems for Video Technology, Vol. 13, No. 7, July 2003.

[5] L. X. Wang, Adaptive Fuzzy System and Control: Design and Stability Analysis. Englewood Cliffs, NJ: Prentice-Hall, 1994.
