Assessing Objective and Subjective Quality of Audio/Video for

Internet based Telemedicine Applications

Dissertation Proposal

Submitted to School of Information Science

Claremont Graduate University

Bengisu Tulu

November 8, 2004

Table of Contents

Chapter 1: Introduction
Chapter 2: Problem Statement
Chapter 3: Literature Review
    3.1 Telemedicine, Telehealth, and e-Health
    3.2 Audio/Speech Quality Measures
        3.2.1 Objective Measures
        3.2.2 Subjective Measures
    3.3 Video Quality Measures
        3.3.1 Objective Measures
        3.3.2 Subjective Measures
    3.4 Quality Measurement in Telemedicine applications
    3.5 Session Initiation Protocol (SIP) for Internet-based Videoconferencing
Chapter 4: Proposed Study
    4.1 Stage 1 – Development of Telemedicine Taxonomy
        4.1.1 Definition of Taxonomy Dimensions
        4.1.2 Interaction of Proposed Dimensions
    4.2 Stage 2 – Assessment of Objective and Subjective Quality Measures for Telemedicine
        4.2.1 Technological Factors Affecting Quality in Telemedicine over IP Networks
        4.2.2 Experimental Test-bed
        4.2.3 Experimental Procedures
    4.3 Stage 3 – Development of SIP-based Videoconferencing Tool with Real-time Telemedicine Capability Index
    4.4 Research Methodology
    4.5 Contributions and Potential Implications
    4.6 Timeline
References

Chapter 1: Introduction

Health expenditures as a share of Gross Domestic Product (GDP) have been rising in the

United States and other member countries of the Organisation for Economic Co-operation and

Development (OECD) since the 1960s. A study conducted to examine the reasons for this increase

concluded that information technology (IT) can play an important role in reducing these costs

[1]. However, experts around the world believe that new demands in providing healthcare will

require fundamental changes in the structure of the industry. Besides the failure to disseminate

medical knowledge quickly enough or use it in a methodical manner, there is another shortfall:

medical practitioners with scarce, specialized knowledge cannot bring it to bear beyond their

geographical confines [2-4]. Telemedicine’s effort to bridge this gap has been reported

repeatedly [2, 3].

Telemedicine and associated technologies are touted as critical to solving the above-mentioned

problems. As a result of growing interest in telemedicine during the last decade [5], many

telemedicine applications have been developed and deployed during recent years [6].

Telemedicine and related healthcare technologies aim to provide efficient healthcare to improve

the well-being of patients and bring medical expertise at a lower cost to the right people at the

right time. The Internet is becoming the most widely used communication

medium around the world. Despite this objective, telemedicine has been quite slow to use

Internet technologies to bring medical expertise to a larger audience. One of the reasons for

this slow adoption is the uncertain quality that varying Internet connections can provide.

In telemedicine, the quality of the data obtained at the receiving end of the connection is critical

for medical decisions. The final outcome of the medical session, and hence the success of the

whole process, depends on the amount and quality of the data received. Lack of necessary

information may increase the frustration and dissatisfaction for parties involved and may lead to

erroneous decisions, which can severely affect the overall outcome of a telemedicine event. To

prevent this frustration and the negative effects of an unsuccessful telemedicine experience,

current telemedicine applications are conducted with high quality equipment over high-speed

connections that are not IP based. Hence, the spread of telemedicine to areas which are

underdeveloped and which do not have a good telecommunications infrastructure to support such

expensive connection lines and equipment has been limited. The Internet is more or less

accessible from many locations in the world for a very low cost compared to the current

alternatives used in telemedicine. Figure 1.1 illustrates an end-to-end Internet-based telemedicine

network implementing several telemedicine scenarios. For each scenario, user requirements and

available technical infrastructure may vary and this variation affects the quality of medical

decisions made in a session.

IP-based telemedicine systems are vulnerable to various impairments that can occur at the

physical, network, and application levels. Unlike circuit-switched networks, the Internet is a

packet-switched network where packet loss, delay, and delay variation can occur easily. In

addition, available bandwidth can vary from one location to another. On the application level,

different coding and compression techniques have been developed to enable the transmission of

audio/video data, which can consume large amounts of bandwidth. On the network level, many

service providers offer Quality of Service (QoS) features to provide a more reliable and stable

connection. These features prioritize traffic and guarantee bandwidth for selected applications,

which gives some assurance of delivery priority, speed, delay, or delay variation. However,

implementation of these features is still very limited.

Figure 1.1 End-to-End IP-based Telemedicine Networks

Even though information quality is a critical factor in medical decision making, some decisions

may not require the highest quality of information possible. Certain decisions can be made with

limited quality. An examination of current knowledge reveals that there is no study focusing on

answering the following question: “For any given communication channel and a given

telemedicine purpose, what is the right amount of information to transfer given the limitations of

communication technology and the devices in use for specific scenarios?” Knowing the

boundary for the minimum information required can help us utilize limited channels and prevent

unnecessary information load on larger channels.

Today, there is an increased awareness of the errors that may occur in medicine and the lack of

decision support systems and tools that can help physicians. The problem is particularly acute

when it comes to telemedicine. To support physicians who are involved in

telemedicine events, this study proposes to develop and use objective and subjective

audio/video quality measures to calculate a real-time quality index for a specific telemedicine

setting on IP-based networks. Study participants will use this index when deciding

whether a telemedicine setting is capable of providing the quality required

to complete a specific task in a given application domain. Early evaluation of the existing setting

can prevent frustrations and time loss while enhancing the response time and satisfaction in

telemedicine events. First, this study will investigate the factors that affect the quality of

information required in a telemedicine event, which will lead to a taxonomy of telemedicine

applications. Second, because audio and video are the two data formats most

sensitive to these factors, objective and subjective quality measures for providing

real-time feedback to the participants of a telemedicine event will be collected in an experimental

testbed. Finally, a videoconferencing tool will be developed that can predict and present the perceived quality of a

telemedicine session based on objective values collected in real time.

Contributions of this study are threefold. First, it will provide a telemedicine taxonomy for

classifying different telemedicine events based on factors affecting quality and outcome. Second,

it will develop a subjective measure database for telemedicine in two application domains and

investigate a heuristic measure to predict perceived quality of a telemedicine event in real time.

Third, it will implement this heuristic method in a new artifact, a videoconferencing tool with a

telemedicine capability indicator, capable of measuring objective quality and providing

subjective quality feedback to users in real time.

One limitation of this study is that the subjective measures will be developed only for two

specific telemedicine events (listening to heartbeats and viewing an eye image). The

generalizability of these measures to other areas of telemedicine is unknown; hence future

research will be needed to test and expand these quality measures to other application domains

and telemedicine events. Another limitation may arise from the number of subjects that will be

recruited for subjective tests. To minimize the effects of this limitation, ITU’s minimum

requirements for subjective tests will be the criteria for recruiting subjects.

Chapter 2: Problem Statement

Since the introduction of the term “telemedicine”, various studies have outlined how one can

utilize this new technology and reap its benefits. However, telemedicine has remained a “black box”

for the public because even the authorities have not yet reached a consensus on

a clear and precise definition of telemedicine content and boundaries [7]. What is telemedicine

and how does it differ from traditional medicine? What are the necessary new laws and

regulations to bring this technology across the globe?

Despite government support and continued reductions

in equipment and transmission costs, actual implementations of telemedicine

remain limited in number. Among the many barriers listed in the literature, the lack of

information about the effect of telemedicine on cost, quality, and access has been a significant

one [8]. It is important to analyze and predict the success of future applications to make

better decisions. In telemedicine, the quality of the data obtained at the receiving end of the

connection is critical for the medical decision. The final outcome of the telemedicine session

and hence the success of the entire process depends on the amount and quality of the data

received. If the information necessary to make a decision cannot be retrieved during the

telemedicine event, both parties involved in the process will feel frustration and dissatisfaction,

which can severely affect the results of the telemedicine visit. To prevent this frustration and the

negative effects of an unsuccessful telemedicine experience, current telemedicine applications

are conducted over high-speed connections that are not IP based. One drawback of this approach

is that it hinders the spread of telemedicine to areas that are

underdeveloped and that lack the telecommunications infrastructure to support such

expensive connection lines. On the other hand, the Internet is more or less accessible from many

locations worldwide for a very low cost compared to the current alternatives used in

telemedicine.

Use of the Internet for telemedicine has been studied for the last ten years, and some of these

studies have demonstrated that it is feasible, even over the Internet and IP-based connections, to

conduct telemedicine sessions, but no study has been able to provide convincing evidence to the

telemedicine community for its widespread adoption. The unreliable connection properties of IP-

based systems prevent the spread of these applications. However, if one can measure the

predicted quality that can be obtained for a specific telemedicine setting and compare the

predicted quality with the requirements outlined by the parties involved, then a feasibility and

capability value can be presented to the parties involved and a decision to either continue with

the telemedicine event or switch to an alternative method can be made. Early evaluation of the

existing setting can prevent frustration and time loss while enhancing the response time and

satisfaction. The evaluation method described here addresses the existing gap in the

literature on predicting the quality of telemedicine event settings.

Existing studies [9] indicate the importance of studying the quality of existing and future

applications.

“Use of quality improvement process not only results in improved output quality but it

also makes the production process sensitive to changes in input, output and the

environment.” [10]

Quality is important; however, the channel used to deliver this quality information is limited.

Thus, there is a need to understand the requirements of each application in its own domain and

define the amount and quality of information required to provide telemedicine services. It is

important to classify telemedicine applications based on their potential use by taking the medical

domains they serve into consideration. Then one can identify the IT infrastructure needs and

requirements for each of these applications in order to provide a satisfactory telemedicine

experience to end users. There are a variety of applications, devices, and communication

technologies that are used in telemedicine. The reasons for this variety are: (1) the diversity of

telemedicine locations and physical limitations of each location; (2) the application areas that

utilize the telemedicine applications; and (3) the purpose for the use of telemedicine.

Communication infrastructure technologies, such as telephone lines or leased lines, also have a

critical impact on the applications utilized in telemedicine, and hence, on the outcome. A

telediagnosis case in the psychiatry domain and a teleconsultation case in the telecardiology domain

are expected to have different requirements since the information that is necessary to make a

clinical decision differs based on the application domain. Therefore, it is not reasonable to expect

similar results from the same technology when it is being used in different domains and/or for

different purposes.

With the goal of solving some of the problems introduced above, this proposed study will try

to answer the following research questions:

RQ1. What are the factors that affect the quality of information required in a telemedicine

event?

RQ2. Given the current set of objective and subjective audio and video quality

measurements, which ones – if any – provide the appropriate quality assessment for IP-based

telemedicine systems that can aid decision makers?

RQ3. Using the appropriate objective and subjective quality assessment measures for

telemedicine, is it possible to integrate them as a capability index in a SIP-based desktop

videoconferencing tool to provide real time feedback for decision makers regarding quality?

Chapter 3: Literature Review

In this chapter, an overview of literature in telemedicine as a concept, voice and video quality

measurement techniques in general, and quality measurement in telemedicine is presented. The

first section introduces a brief summary of telemedicine literature including various definitions

of telemedicine and related terminology, as well as how the technological changes affected the

evolution of telemedicine applications. The next two sections provide a summary of objective and

subjective quality measures developed and utilized in prior research and practice for voice and

video, respectively. The fourth section summarizes the quality measurement literature in

telemedicine and provides a list of factors used in these quality measurement studies. The last

section introduces the Session Initiation Protocol, which will be utilized in this study to develop

a videoconferencing application for telemedicine.

3.1 Telemedicine, Telehealth, and e-Health

Telemedicine has various potential uses such as clinical, educational and administrative. The

promising potential of bringing high quality service to under-served areas via telemedicine is an

example of how IT can reduce the quality-adjusted cost. Bashshur [7] notes that telemedicine

provides a solution to problems such as access to care for large segments of the population,

continuing healthcare cost inflation, and uneven geographic distribution of quality by: (1)

enhancing accessibility to care for underserved populations, (2) containing cost inflation as a

result of providing appropriate care to remote patients in their home communities, and (3)

improving quality as a result of providing coordinated and continuous care for patients, targeted

and highly effective continuous education for providers, and highly effective tools for decision

support.

The evolution and growth of telemedicine are highly correlated with developments in

communication technology and software. This dependency is evident from

the history of telemedicine technologies, which has been categorized into

three eras [11]. All the definitions during the first era of telemedicine focused on medical care as

the only function of telemedicine. The first era can be called the telecommunications era of the

1970s [11]. Telemedicine programs during the first era ended as the government terminated

funding before these programs matured. It is important to note that “telemedicine is a product of

the information age, just as the assembly line was the product of the industrial age.” [11]. The

applications in this era depended on broadcast and television technologies, and

telemedicine applications were not integrated with any other clinical data.

The second era of telemedicine, the dedicated era, started during the late 1980s as a result of

digitization in telecommunications, and it grew during the 1990s [11]. The transmission of data was

supported by various communication means ranging from telephone lines to Integrated Services

Digital Network (ISDN) lines. The high costs attached to the communications infrastructure that

can provide higher bandwidth became an important bottleneck for telemedicine.

The dedicated era gave way to the Internet era, in which more complex and ubiquitous networks

support telemedicine. The third era of telemedicine is supported by technology that is

cheaper and accessible to an increasing user population [11]. The enhanced speed and quality

offered by Internet2 is providing new opportunities in telemedicine as well. In this new era of

telemedicine, the research strategies should include “…an understanding of the functional

relationships between telemedicine technology and the outcomes of cost, quality, and access”

beyond the assessment of technical sufficiency [11].

During the evolution of telemedicine, new terminologies were developed as the applications

and delivery options increased in variety, and the application areas expanded to almost all the

fields medicine can cover. This resulted in confusion and misidentification of what could be

termed telemedicine and what could be termed telehealth or e-health. This became even more

complicated as these fields advanced. Cybermedicine is yet another term recently introduced into

the literature.

Since the first formal definition of telemedicine by Bird in 1971, many researchers have tried to

define this term in order to clarify the boundaries of telemedicine and its use. Even though the

core of these definitions is the same, telemedicine, and hence its definition, evolved dramatically

as a result of the tremendous changes experienced in the telecommunication and information

technologies. These changes were so significant that new terminologies like telehealth, e-health,

and others were introduced, and explaining the difference between telemedicine and these new

terms became important. Studies defined telehealth as a big umbrella that encompasses more

applications than the definition of telemedicine can cover [12, 13]. Table 3.1 presents a selected

list of definitions proposed in the literature for telemedicine, telehealth, and e-health.

This list of definitions gives an indication of the competing terminologies; more terminologies

may be introduced in the future as further technological advances are achieved. Therefore, it is

important to understand that the purpose of research in this field is to support the “ultimate

quest” which is to cure disease, prevent it if possible, reduce infirmity, and enhance quality of

life, as stated by Bashshur [7].

“Some may question whether this is telemedicine, telehealth, e-health, health informatics,

or biohealth informatics. It does not really matter what we call it or where we draw boundaries.

…collective and collaborative efforts from various fields of science, including what we call now

telemedicine is necessary. [7]”

Table 3.1. Definition of terms

| Definition | Ref. |
|---|---|
| Telemedicine is the practice of medicine without the usual physician-patient confrontation …via an interactive audio-video communications system. | 1971 Bird [11] |
| Telemedicine is a system of care composed of six elements: (1) geographic separation between provider and recipient of information, (2) use of information technology as a substitute for personal or face-to-face interaction, (3) staffing to perform necessary functions (including physicians, assistants, and technicians), (4) an organizational structure suitable for system or network development and implementation, (5) clinical protocols for treating and triaging patients, and (6) normative standards of behavior in terms of physician and administrator regard for quality of care, confidentiality, and the like. | 1975 Bashshur [11] |
| Telemedicine is the use of electronic information and communications technologies to provide and support healthcare when distance separates the participants. | 1996 Committee on Evaluating Clinical Applications of Telemedicine [14] |
| Telemedicine is the delivery of health services when there is geographical separation between healthcare provider and patient, or between healthcare professionals. | 2001 Miller [8] |
| Telemedicine is the provision of healthcare services, clinical information, and education over a distance using telecommunication technology. | 2001 Maheu [13] |
| Telehealth is the removal of time and distance barriers for the delivery of healthcare services or related healthcare activities. (In this study, telemedicine is a subset of telehealth) | 2001 American Nurses' Association [12] |
| E-health refers to all forms of electronic healthcare delivered over the Internet, ranging from informational, educational, and commercial “products” to direct services offered by professionals, non-professionals, businesses or consumers themselves. | 2001 Maheu [13] |

It is important to note that the ultimate goal of any telemedicine effort is to improve the

well-being of patients. However, since the term was first defined, uncertainty about the

meaning of telemedicine has become evident. This uncertainty hinders efforts to

develop a clear definition and a classification method for telemedicine. Previous attempts [14]

to classify telemedicine were motivated by the demand for evidence of its effectiveness and

therefore, were focused on developing a strategy to evaluate the telemedicine applications and

their effects on quality, accessibility, or cost of healthcare. In 1996, the Committee on Evaluating

Clinical Applications of Telemedicine published a report [14] that classified clinical applications

of telemedicine under six categories (p.30): (1) initial urgent evaluation, (2) supervision of

primary care, (3) provision of specialty care, (4) consultation, (5) monitoring, and (6) use of

remote information and decision analysis resources to support or guide care for specific patients.

The broad classification that will be developed for this study focuses instead on

identifying different dimensions of telemedicine and telehealth. It can then be used to

identify user requirements for different categories in an organized manner, and it is expected to have a

positive impact on the use and development of current and future applications.

3.2 Audio/Speech Quality Measures

Speech quality measurement has been researched for many years, and its results have been applied in

public switched telephone networks (PSTN). The goal of many studies was to find an objective

measure that can be used to predict the quality perceived by a human subject. Many

objective and subjective speech (audio) quality measures were developed. However, most of

these measures, if not all, were originally developed for the PSTN, which is a circuit-switched

network, and recent research indicates that these measures may not work well for packet-switched

networks, such as those used for Internet telephony [15]. In his work, Hall [15] compared three

different measures under conditions of a typical VoIP system: (1) perceptually weighted distortion

measures such as Enhanced Modified Bark Spectral Distortion (EMBSD) and Measuring Normalizing Blocks (MNB),

(2) word-error rates of continuous speech recognizers, and (3) the ITU E-model.

The results of his study indicate that the E-model provides the highest

correlation with Mean Opinion Score (MOS) for VoIP systems. This section summarizes the

most popular quality measures with a brief description of each one.

3.2.1 Objective Measures

One of the most widely adopted objective speech quality measures is the Signal-to-Noise Ratio (SNR),

which compares original and processed speech signals sample by sample. The SNR is the

simplest measure possible as it measures the distortion of the waveform coders that reproduce

the input waveform [16]. The SNR is also defined as “ratio of the energy of the original target

source to the energy of the difference between original and reconstruction – that is, the energy of

a signal which, when linearly added to the original, would give the reconstruction”[17]. A

modified version of the SNR is called segmental Signal-to-Noise Ratio (SNRseg), which

decomposes the entire signal into segments and calculates the average SNR of these short

segments [16]. These measures are easy to compute; however, their disadvantages limit their use

in various scenarios. First of all, SNR measures require access to the original signal, which

rules them out for real-time measurement. Other drawbacks of these time-domain

measures are reported in [16, 17].
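The two time-domain measures described above can be illustrated with a minimal Python sketch. It assumes that the original and processed signals are time-aligned lists of samples of equal length. The function names, the 160-sample frame length (20 ms at an assumed 8 kHz sampling rate), and the choice to skip frames whose SNR is not finite are illustrative assumptions and are not taken from [16, 17].

```python
import math


def snr_db(original, processed):
    """Overall SNR in dB: energy of the original over energy of the sample-wise error."""
    if len(original) != len(processed):
        raise ValueError("signals must have the same length")
    signal_energy = sum(x * x for x in original)
    noise_energy = sum((x - y) ** 2 for x, y in zip(original, processed))
    if noise_energy == 0:
        return math.inf   # processed signal is identical to the original
    if signal_energy == 0:
        return -math.inf  # silent original with a non-zero error
    return 10.0 * math.log10(signal_energy / noise_energy)


def segmental_snr_db(original, processed, frame_len=160):
    """Mean of per-frame SNR values over non-overlapping frames.

    frame_len=160 is an assumed default (20 ms at 8 kHz). Frames whose SNR
    is not finite (silent or error-free frames) are skipped; this is an
    assumption of the sketch, not part of the cited definition.
    """
    if len(original) != len(processed):
        raise ValueError("signals must have the same length")
    frame_values = []
    for start in range(0, len(original) - frame_len + 1, frame_len):
        value = snr_db(original[start:start + frame_len],
                       processed[start:start + frame_len])
        if math.isfinite(value):
            frame_values.append(value)
    if not frame_values:
        raise ValueError("no frame with a finite SNR")
    return sum(frame_values) / len(frame_values)
```

Both functions take the original signal as an argument, which makes concrete why these measures cannot be computed at a receiver that only has the processed signal.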

When speech quality must satisfy human listeners, there is no better way than performing

subjective tests. However, due to the cost of such evaluations, researchers often utilize

algorithms that can estimate the outcomes of these tests. These algorithms can be grouped under

perceptual models, whose measures are based on models of human auditory perception [16]. One

example of these perceptual models is Bark Spectral Distortion (BSD) [18], which is based on

the assumption that speech quality is directly related to speech loudness (the magnitude of

auditory sensation). It works well when the distortion in voiced regions represents the overall

distortion, and hence identifying the voiced regions is required [16]. Enhanced Modified

BSD (EMBSD), which consists of a perceptual transform followed by a distance measure that incorporates

a cognition model, was also proposed. Based on the test in [16], its correlation with subjective

results is relatively good for encoding impairments but poor on network impairments.

Another example of perceptual models for estimating subjective quality is the ITU

Recommendation P.861, Perceptual Speech Quality Measure (PSQM). The PSQM algorithm

measures the distortion experienced by a speech signal in an internal psychoacoustic domain

when transmitted through various codecs and transmission media. The transformation from the

physical domain to the loudness domain is used to mimic the sound perception of human subjects in

real-life situations. An extension of PSQM, named PSQM+, improves the performance of its

predecessor by adopting a simple algorithm in the cognition module. This improves the poor

performance of PSQM for temporal clipping distortions but the performance of PSQM+ for other

types of distortion is questionable [16]. There are other examples of perceptual models in the

literature such as Measuring Normalizing Blocks (MNB) [19]. However, each of these

measures has its limitations for certain impairment types.

Another commonly used objective speech quality measure is the ITU’s E-model, which was

originally developed to evaluate the speech quality for PSTN. It takes into account multiple

variables such as encoding distortion, delay, jitter, echo, etc. As mentioned above, the E-model

provides the closest correlation to MOS results among other measures discussed in this section

[15]. One important advantage of this model is that it does not require access to the original

speech signal and hence it can be used for real-time quality assessments [16]. The E-model

generates the rating R and the formula for its computation is provided below:

$$R = R_0 - I_s - I_d - I_e + A$$

$R_0$, the highest possible rating for this system with no distortion [15, 16], is the basic signal-to-noise

ratio based on send, receive loudness, electrical, and background noise [20]. $I_s$ is the

impairment of the speech signal itself [16] and captures impairments that happen simultaneously

with the voice signal, such as sidetone and PCM quantizing distortion [20]. These two values do

not depend on the transmission over the network. $I_d$ is the impairment level caused by delay,

jitter, and echo. $I_e$, also known as the “equipment factor” [20], is the level of impairments caused

by encoding and hence captures the degradation in quality due to compression and loss during

transmission. A stands for the advantage factor that captures the willingness of users to accept

some degradation of quality in return for the other benefits the system may provide such as

mobility in the case of cellular phones. E-model values can be directly matched with MOS

values by using a simple table provided in the standard.
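
As an illustration of the computation described above, the following minimal Python sketch evaluates the rating R from its components and converts it to an estimated MOS. The function names and the numeric inputs are hypothetical and chosen only for illustration. Instead of the lookup table mentioned above, the sketch uses the closed-form R-to-MOS conversion published in ITU-T G.107; this substitution is an assumption made for compactness.

```python
def e_model_rating(r0, i_s, i_d, i_e, a=0.0):
    """Combine the E-model components: R = R0 - Is - Id - Ie + A."""
    return r0 - i_s - i_d - i_e + a


def r_to_mos(r):
    """Map a rating R to an estimated MOS (closed-form conversion from ITU-T G.107)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7.0e-6


# Hypothetical impairment values, not measurements from this study.
rating = e_model_rating(r0=94.0, i_s=1.0, i_d=10.0, i_e=11.0, a=0.0)
print(rating, round(r_to_mos(rating), 2))
```

Because every term enters additively, the contribution of each impairment source to the final rating can be inspected separately.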

3.2.2 Subjective Measures

Measuring subjective quality of speech has been an important issue since the transmission of

audio over telephone networks began. Over the years, standards emerged based on the results of

various studies carried out in various laboratories. Today, ITU recommendations are the most

widely used standards among researchers working on quality assessment methods. The

ITU-T P.800 provides numerous methods for the subjective assessment of transmission quality.

Scales for these methods are provided in Table 3.2. Results from 5-point category scales are

averaged across participant responses to provide a Mean Opinion Score (MOS).
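
The averaging step can be sketched in a few lines of Python. The function name and the sample ratings are hypothetical; the sketch only assumes that each participant gives one integer rating on a 5-point category scale.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Average 5-point category ratings (1 = Bad ... 5 = Excellent) into a MOS."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    return mean(ratings)


# Hypothetical ratings from five participants for one test condition.
print(mean_opinion_score([4, 5, 3, 4, 4]))
```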

Table 3.2. Speech Quality Measurement Scales provided by ITU-T recommendations

Listening Quality Scale
Quality of the speech/connection                       Score
Excellent                                              5
Good                                                   4
Fair                                                   3
Poor                                                   2
Bad                                                    1

Conversation Difficulty Scale
Did you or your partner have any difficulty in
talking or hearing over the connection?                Score
Yes                                                    1
No                                                     2

Listening Effort Scale
Effort required to understand the meaning of the
sentences                                              Score
Complete relaxation possible; no effort required       5
Attention necessary; no appreciable effort required    4
Moderate effort required                               3
Considerable effort required                           2
No meaning understood with any feasible effort         1

Loudness Preference Scale
Loudness Preference                                    Score
Much louder than preferred                             5
Louder than preferred                                  4
Preferred                                              3
Quieter than preferred                                 2
Much quieter than preferred                            1

Comparison Category Rating Scale
The quality of the second compared to the first is     Score
Much better                                            3
Better                                                 2
Slightly better                                        1
About the same                                         0
Slightly worse                                         -1
Worse                                                  -2
Much worse                                             -3

Degradation Opinion Scale
Degradation                                            Score
Degradation is inaudible                               5
Degradation is audible but not annoying                4
Degradation is slightly annoying                       3
Degradation is annoying                                2
Degradation is very annoying                           1

Watson and Sasse [21] criticized these recommended scales, with respect to speech in real time

multimedia communications, in three main areas: (1) vocabulary of the scale labels, (2) length of

the recommended test material, and (3) conversation difficulty scale. They note that transmission

of speech in real time over IP-based networks may be carried on low bandwidth connections and

is subject to various network impairments. Hence, their first criticism, regarding

scale labels, is that even with training, responses are likely to be skewed towards the

lower end of the scale. Regarding their second criticism, they note that the recommended test

length of 10 seconds is too short in duration to understand the rapid and unpredictable changes

that can occur in speech quality due to changes in network conditions. And finally, they criticize

the binary scale by arguing that even a small amount of packet loss is likely to cause difficulty in

hearing or talking, even if it is short-lived. In one study [22] they proposed a new subjective

quality scale termed polar continuous quality scale, which was shown to be a reliable means of

measuring perceived quality. During their experiments, users were consistent in their use of it

and the rating trend followed the same slope obtained with MOS. One other important finding of

their study was that the perceived quality of speech is affected less by network impairments

than by factors such as volume discrepancies, poor-quality microphones, or

echo.

3.3 Video Quality Measures

Video quality first became an important issue in television broadcasting applications.

Various measures have been developed for analog video systems to evaluate the effects of

transmission on the original video signal. However, today, digital video systems are replacing

these analog systems and are becoming an essential part of the U.S. and world economy [23].

Wolf and Pinson [23] state that, “To be accurate, digital video quality measurements must be

based on the perceived quality of the actual video being received by the users of the digital video

system rather than the measured quality of traditional video test signals (e.g., color bar)”. Hence,

new measurement techniques for measuring quality of digital video signals are being developed

by various researchers and organizations. This section presents a survey of existing objective and

subjective video quality measurement techniques utilized in the literature.

3.3.1 Objective Measures

Peak Signal-to-Noise Ratio (PSNR) is the most commonly used metric for measuring video

and image quality. It measures how close a processed sequence is to the original one [16]. The

calculation of the PSNR for a video sequence of K frames each having NxM pixels with m-bit

depth is explained below [16]. First, the Root Mean Square Error is calculated according to the

following formula:

$$RMSE = \sqrt{\frac{1}{K \cdot M \cdot N} \sum_{k=1}^{K} \sum_{i=1}^{N} \sum_{j=1}^{M} \left[ x(i,j,k) - \bar{x}(i,j,k) \right]^2}$$

where $x(i,j,k)$ and $\bar{x}(i,j,k)$ are the pixel luminance values at location $(i,j)$ in frame $k$ for

the original and distorted sequences respectively. Once the RMSE is calculated, the PSNR can be

calculated using the following formula:

$$PSNR = 10 \cdot \log_{10} \frac{m^2}{RMSE^2}$$

The PSNR is usually reported in decibels (dB) [24]. An image with a PSNR of 25 dB or below

is usually unacceptable. Between 25 dB and 30 dB, perceived quality usually improves and

above 30 dB, images are often perceived to be as good as the original image. Markopoulou [25] notes

that the PSNR is exclusively used as a quality measure, partly because of its mathematical

tractability and partly because of the lack of better alternatives. It has also been noted [16] that

the PSNR does not always correlate well with subjective measures.
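
The two-step calculation above can be sketched in plain Python as follows. The function names and the sample frames are hypothetical. The sketch assumes that the peak term in the PSNR formula is the largest value a pixel can take, that is 2^m - 1 for an m-bit depth (255 for 8-bit video); this reading of m is an assumption and is not stated explicitly in the text.

```python
import math


def rmse(original, distorted):
    """Root mean square error over a sequence given as frames[k][i][j] luminance values."""
    total = 0.0
    count = 0
    for frame_o, frame_d in zip(original, distorted):
        for row_o, row_d in zip(frame_o, frame_d):
            for pixel_o, pixel_d in zip(row_o, row_d):
                total += (pixel_o - pixel_d) ** 2
                count += 1
    return math.sqrt(total / count)


def psnr(original, distorted, bit_depth=8):
    """PSNR in dB, assuming the peak value is 2**bit_depth - 1."""
    peak = 2 ** bit_depth - 1
    error = rmse(original, distorted)
    if error == 0:
        return math.inf  # identical sequences
    return 10.0 * math.log10(peak ** 2 / error ** 2)


# One hypothetical 2x2 frame; two of the four pixels differ by 10 luminance levels.
reference = [[[100, 100], [100, 100]]]
received = [[[110, 90], [100, 100]]]
print(round(psnr(reference, received), 2))
```

Note that the result depends only on pixel-wise differences, which is one reason the PSNR may disagree with how viewers actually judge a sequence.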

One other commonly used metric is the “Video Quality Metric (VQM)”[23], which was

developed by the Institute for Telecommunication Sciences (ITS). It requires the extraction and

classification of features from both the original and processed video sequences similar to the

other measurement techniques. Once these features are extracted, the distance between the

original and processed video sequences is computed based on these features, and this distance

is mapped to a subjective score [23]. Compared to the PSNR, this metric offers different models

for various transmission types, such as videoconferencing or TV models. It is also possible to

identify the nature of an impairment using the VQM, which the PSNR does not provide [25].

There are other standard and proprietary measurement techniques that have been developed

and reported in the literature that are not mentioned here. One commonality between these

objective measures, however, is that they require access to both original and processed video

sequences. One recent study [16] proposed a new measure, which does not require access to the

original video sequence, similar to the idea of E-model for voice quality. In this new method,

artificial neural networks (ANN) are used to predict perceived voice and video quality using a

trained engine based on previous objective and subjective tests. This type of measurement

technique enables real-time measurement of video quality and is an open area for research.

3.3.2 Subjective Measures

ITU-R Recommendation BT.500 is the standard for subjective assessment of image quality and has evolved over

the years to include measures for digital video transmissions as well. This standard provides

scales for single and double stimulus methods. The Absolute Category Rating (ACR) is a single

stimulus method where test sequences are presented one at a time and are rated on a category

scale after they are viewed. Usually a 5-point category scale is used as illustrated in Table 3.3.

The Single Stimulus Continuous Quality Evaluation (SSCQE) is different from the ACR in terms

of the scale it uses and the assessment process. The scale used in the SSCQE is a continuous

quality scale, illustrated in Figure 3.1, and assessment takes place in a continuous manner during

the presentation of the video sequence.

Table 3.3 ITU Video Quality Assessment Scales

5-point Quality Scale            5-point Impairment Scale
Estimated Quality    Score       Estimated Impairment Level    Score
Excellent            5           Imperceptible                 5
Good                 4           Perceptible                   4
Fair                 3           Slightly Annoying             3
Poor                 2           Annoying                      2
Bad                  1           Very Annoying                 1

Figure 3.1 Continuous 5-point Quality Scale

Figure 3.2 Continuous 5-point Quality Scale for DSCQS

Among the double stimulus methods, the Double Stimulus Impairment Scale (DSIS) - also

known as the Degradation Category Rating (DCR) - presents pairs of original and impaired video

sequences, in that order, during the test. In this case, subjects are asked to rate the impairment of

the second stimulus with respect to the reference (first stimulus) using the 5-point impairment

scale illustrated in Table 3.3. In the Double Stimulus Continuous Quality Scale (DSCQS)

method, the sequences are presented in pairs, as in the DSIS, and subjects are asked to evaluate

the quality of both sequences. The original sequence is included for reference; however, the

observers are not told which one is the reference sequence and the order of appearance changes

for each test. The scale used in this method is illustrated in Figure 3.2. There are other test

methods where the two sequences are shown simultaneously and the observers are asked to make

a comparison of the two based on a stimulus comparison scale.

3.4 Quality Measurement in Telemedicine applications

Quality in telemedicine has been studied from different perspectives in the literature. As a

common way of assessing quality of a telemedicine event, user satisfaction was used in a large

number of articles. Another approach common in the literature is to study the quality of the

transmitted media (image, audio, etc.). These studies have usually been limited to

compression techniques and their effects on the quality perceived by users. For example,

Eikelboom [26] investigated image compression of digital retinal images and the effect of

various levels of compression on the quality of the images. They compared JPEG and Wavelet

image compression techniques and concluded that “for situations where digital image

transmission time and costs should be minimized, Wavelet image compression to 15 KB is

recommended, although there is a slight cost of computational time. Where computational time

should be minimized, and to remain compatible with other imaging systems, the use of JPEG

compression to 29 KB is an excellent alternative”.

To answer the question of which compression technique is better in a generic way, some

studies focused on quality measures. Cosman et al. [27] studied an interesting question, “How

does one decide if an image is good enough for a specific application, such as diagnosis, recall

archival, or educational use?”, and compared and contrasted three approaches to the

measurement of medical image quality: the signal-to-noise ratio (SNR), a subjective rating, and

diagnostic accuracy. They concluded that there is a need for computable measures of image

quality that can accurately predict the outcomes of image quality evaluation studies. Another

recent article on image quality by Przelaskowski [28] states that, “A numerical measure, which is

able to predict diagnostic accuracy rather than subjective quality, is required for compressed

medical image assessment.” A new vector measure for image quality, reflecting diagnostic

accuracy, was developed in this recent study.

A recent study by Rosenthal [24] focused on understanding the impact of certain variables

affecting the transmission of video over IP networks. This study is one of the few studies that

investigated the effects of network impairments and the codec bit rate on the quality of video on

IP networks for telemedicine purposes. This study used the PSNR and a proprietary objective

measurement technique, the Picture Quality Rating (PQR). His findings suggest that increases

in codec bit rate and network bandwidth have positive effects on the PQR and the PSNR levels

for sequences subjected to delay and jitter impairments, but not for those in which periodic

packet drops were introduced. He concludes that with or without the existence of selected

packet-specific impairments, increases in bandwidth and codec bit rate improve the objective

quality of video transmitted over IP networks. Another study by Dev et al. [29] presented a

method to obtain an end-to-end characterization of the performance of an application over a

network by taking into account network impairments and application constraints. The

applications selected for testing were two medical education tools: (1) an image serving

application that delivers a sequence of linked images based on user movement of the mouse

cursor, and (2) an application intended to train students remotely in various surgical procedures.

They were tested on four different types of networks. They propose that the subjective

evaluations used in their study can be utilized to predict the conditions under which the

application will be running based on predefined requirements.

3.5 Session Initiation Protocol (SIP) for Internet-based Videoconferencing

The Session Initiation Protocol (SIP), a recent Voice over IP signaling standard approved by the IETF,

is attracting telemedicine application developers because it can handle voice, video, and other

multimedia communications over IP-based networks and has a native security

mechanism built in. Until the introduction of SIP, the only standard available for

videoconferencing applications was the H.323 family of ITU standards. However, the H.323

standard does not lend itself to integration with web and messaging, and does not have a native

security mechanism. Given the increasing importance of security in the medical field, H.323-based systems need additional

effort and integration with other security mechanisms to provide authentication and

authorization. A brief technical summary of SIP is provided in this section.

Session Initiation Protocol (SIP) is the Internet Engineering Task Force (IETF) standard for IP

Telephony [30]. It is an application layer control protocol that can create, modify, and terminate

multimedia sessions. Different types of entities are defined in SIP: user agents, proxy servers,

redirect servers, and registrar servers. Figure 3.3 shows a typical SIP session including these

entities.

Figure 3.3 Entities in Session Initiation Protocol

In a typical SIP session, user agents first register with a registrar and then send their requests to a

SIP proxy, which is responsible for discovering the location of the requested destination so that

the two user agents can negotiate their session description [31]. Figure 3.4 illustrates a simple call

flow with a single proxy server.
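
The division of labor between registrar and proxy can be sketched with a toy Python model. This is not a SIP implementation: the class names, method names, and addresses are hypothetical, no messages are sent over a network, and the session description negotiation is left out. Only the status text "404 Not Found" corresponds to an actual SIP response code.

```python
class Registrar:
    """Toy registrar: remembers where each registered user can currently be reached."""

    def __init__(self):
        self._bindings = {}

    def register(self, address_of_record, contact):
        self._bindings[address_of_record] = contact

    def lookup(self, address_of_record):
        return self._bindings.get(address_of_record)


class Proxy:
    """Toy proxy: resolves the callee's location and reports where the request would go."""

    def __init__(self, registrar):
        self._registrar = registrar

    def route_invite(self, caller, callee):
        contact = self._registrar.lookup(callee)
        if contact is None:
            return ("404 Not Found", None)
        return ("forwarded", contact)


# Hypothetical addresses for two user agents.
registrar = Registrar()
registrar.register("sip:specialist@hospital.example", "sip:specialist@10.0.0.7")
proxy = Proxy(registrar)
print(proxy.route_invite("sip:gp@clinic.example", "sip:specialist@hospital.example"))
```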

Using SIP for telemedicine is relatively new compared to the previous H.323 standard that has

been the dominant protocol since the mid-1990s. SIP was first introduced in 1999, and in 2002 a

revised version of the protocol was published. Since then, it has become a commonly used protocol

for Internet telephony and videoconferencing applications. However, adoption of SIP for

telemedicine has been slow. A small number of studies [32, 33] have mentioned how SIP can

be used in telemedicine applications.

Figure 3.4 SIP Call Flow

Chapter 4: Proposed Study

The proposed study consists of three stages. During the first stage of this study, a telemedicine

taxonomy will be developed to classify telemedicine applications based on their potential use

and by taking the medical domains they serve into consideration. The goal is to identify the IT

infrastructure needs and requirements for each of these applications in order to provide a

satisfactory telemedicine experience to end-users. The second stage of this study will first review

the available objective and subjective audio/video quality measurements in the literature and

select appropriate measures for telemedicine environments keeping the proposed telemedicine

taxonomy dimensions in mind which, as has been discussed, can play an important role in the

decision making process during telemedicine events. Experiments will then be conducted with

physicians to identify the proper subjective audio/video quality required to make a decision in a

telemedicine event for a specific application area and purpose over IP-based telemedicine networks.

The findings from these experiments will be used to define indices for telemedicine event

capability. The last stage will be the development and testing of a videoconferencing tool in a

telemedicine environment incorporating the developed index. The next three sections explain

each stage in more detail. Later, the research methodology, potential implications, and a timeline

for this study are presented.

4.1 Stage 1 – Development of Telemedicine Taxonomy

This study proposes five dimensions that will help to categorize different telemedicine efforts.

These dimensions were derived from a survey of literature and reflect a combination of various

classification schemes proposed in early studies. The first subsection will provide a description

of these five dimensions: Application Purpose, Application Area (Domain), Environmental

Setting, Communication Infrastructure, and Delivery Options. The next subsection will explain

how these dimensions are related in the taxonomy.

4.1.1 Definition of Taxonomy Dimensions

Application Purpose refers to the purpose of communication and is categorized under two

main groups: Clinical and Non-clinical [34]. In addition to the six categories proposed in [14], it

is stated that clinical purpose covers diagnostic and treatment (surgical and non-surgical)

components of patient care as well. Telemedicine not only provides a tool that can be utilized by

professional medical technicians, but it is slowly moving in the direction where a patient can be

treated through electronic channels without the intervention of a local professional. Hence Table

4.1 extends the previous classification and presents a list of clinical telemedicine application

purposes.

Non-clinical purpose includes medical education and administrative meetings, and does not

involve decisions about care for particular patients. Table 4.2 shows non-clinical purposes that

will be utilized in this taxonomy. This study will not focus on the non-clinical applications of

telemedicine.

Table 4.1. Clinical application purpose

Clinical        Triage
                Diagnostic
                Non-Surgical Treatment
                Surgical Treatment
                Consultation
                Monitoring
                Provision of specialty care
                Supervision of primary care

Table 4.2. Non-Clinical application purpose

Non-Clinical    Professional Medical Education
                Patient Education
                Research
                Public Health
                Administrative

Application Area refers to the domains in the medical field. The domains listed in Table 4.3

represent a high-level example list of medical domains and can be expanded as necessary. The

reason for including medical domains as a dimension in this taxonomy is to point out the domain

specific differences that affect the information required and gathered through communication

channels. For example, the information required to make a diagnostic decision may differ

significantly in the cardiology domain compared to the psychiatry domain. Information can be in

various formats, such as text, audio, and video, and the application purpose and application area

define the amount and type of information required to make a clinical decision. Based on a

review of the current literature, no studies have identified the application domain as a

classification criterion for telemedicine efforts.

Table 4.3. An example list of application areas

Neurology     Home Care        Microbiology and Immunology
Cardiology    Ophthalmology    Mental Health
Pathology     Dermatology      Otolaryngology
Radiology     Rheumatology     Emergency Room
Pediatrics    Surgery          Obstetrics and Gynecology

Environmental Setting refers to the type of physical environment that the physician or the

patient will be using during the telemedicine event. These settings can be dramatically different

and can range from a patient at a primary care hospital to a mobile patient, or a professional at a

fully equipped hospital to a professional being reached at home. Considering the physical

environment attributes of medical videoconferencing identified in [35], a difference in the

quality of the information transferred between two ends is inevitable regardless of the

communication channel, as long as the two sites involved are not identical in terms of

environmental setting. These physical attributes are usually related to the characteristics of the

physical location. Therefore, environmental setting was included in this taxonomy as the third

dimension. Table 4.4 illustrates some possible telemedicine settings that can be encountered

during a telemedicine event.

Table 4.4. Environmental settings

Location 1 Location 2

Large Hospital Large Hospital

Small Hospital Small Hospital

Outreach Clinic Outreach Clinic

Health Center Health Center

Home Home

Mobile Mobile

LeRouge et al. [35] have provided a list of physical environment attributes for

videoconferencing. These attributes are facilitating décor, quiet/soundproof environment, privacy

of the exam room, space and room size, and room lighting. Some of these attributes are very

specific to videoconferencing. However, some of them can be generalized to various delivery

options. The main idea is to be able to provide a meaningful description of the physical setting

and environmental values with regard to the telemedicine event. The personal preferences and

skills of patients and physicians should also be taken into account in order to assess the

feasibility of a telemedicine system use by the parties involved. Some patients may be capable of

performing related tasks only through the help of others as noted by Kaufman et al.[36].

Therefore, setting attributes should also include the presence of assistive personnel and their

relevant skills.

Communication Infrastructure refers to the channels that are available for the transmission,

emission, or reception of data or information in any format. The communications infrastructure

can be based on wired networks, radio waves, fiber optic lines, and many other forms of

telecommunication technologies. Each of these technologies comes with its own limitations

and advantages. These need to be considered carefully before a telemedicine event occurs in

order to understand the possible limitations, available resources, and how these various factors

can affect the event. Table 4.5 illustrates communication infrastructure possibilities as a function

of the telecommunication technologies that can be used in a telemedicine event and the

bandwidth they provide.

Table 4.5. Telecommunication technologies and their bandwidth capabilities

            Technology      Bandwidth
Wired       Dial-up         33.6kbps
            DSL             64kbps – 1.544Mbps up
                            128kbps – 1.544Mbps down
            Cable Modem     200kbps – 2Mbps
            High Speed      10/100Mbps to 1Gbps
Wireless    802.11b         11Mbps
            802.11g         54Mbps
            802.16a         70Mbps
            3G              144kbps – 1Mbps
            2G              >128kbps
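
As a simple illustration of how Table 4.5 could inform the planning of a telemedicine event, the following Python sketch lists the technologies whose nominal bandwidth meets a required media bit rate. The dictionary name, the function name, and the 384 kbps requirement are hypothetical. For technologies given as a range, the lower bound is used (the upstream lower bound for DSL), the 2G entry is omitted because the table gives only a bound, and actual throughput is assumed to be lower than these nominal figures.

```python
# Nominal rates in kbps from Table 4.5; lower bounds are used for ranges.
NOMINAL_KBPS = {
    "Dial-up": 33.6,
    "DSL": 64.0,
    "Cable Modem": 200.0,
    "High Speed": 10_000.0,
    "802.11b": 11_000.0,
    "802.11g": 54_000.0,
    "802.16a": 70_000.0,
    "3G": 144.0,
}


def technologies_supporting(required_kbps):
    """Return the technologies whose nominal rate is at least the required rate."""
    return sorted(
        name for name, kbps in NOMINAL_KBPS.items() if kbps >= required_kbps
    )


# Hypothetical requirement: a 384 kbps videoconferencing stream.
print(technologies_supporting(384))
```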

Delivery Options is the final dimension of the taxonomy and it refers to the applications

provided to conduct a telemedicine event by fully complying with the requirements generated

based on the other dimensions explained above, as well as the requirements posed by the

professionals and patients. Even though various delivery options exist in today’s world of

advanced technological innovations, delivery options in telemedicine can be categorized under

two main groups [13, 37]: (1) synchronous and (2) asynchronous. Synchronous and

asynchronous communication refer to information transactions that occur among two or more

participants simultaneously and at different points in time, respectively [38]. Table 4.6

presents some examples of these delivery options based on these two main categories. The

chosen delivery options can have an important effect on the final quality of the telemedicine

event and the outcome.

Table 4.6. Delivery options

         Synchronous                      Asynchronous
Audio    Telephone, Audioconferencing     Voicemail
Video    Videoconferencing                Video/Audiostreaming
Data     Instant Messaging, Shared        Paging, Fax, Email, Web Pages,
         Electronic white boards          Store and Forward, Web Forums

4.1.2 Interaction of Proposed Dimensions

These five dimensions can be grouped under two main themes. The first two are dimensions

strictly related to the medical field. Therefore, they are in the medical dimensions group. The rest

form a group of various dimensions (environmental setting, infrastructure, delivery options),

which are related to the way healthcare is delivered. This group is termed “delivery dimensions”

since all the dimensions have a common goal, that is, to support the medical dimensions’ needs

in order to deliver health services. A simple picture of the taxonomy is presented in Figure 4.1.

As Figure 4.1 illustrates, there is an additional group termed organizational dimension in the

taxonomy that is pervasive to all healthcare organizations and their activities. This group consists

of various aspects of the organization such as human resources and IT management. These issues

will not be addressed in this study since the main focus is on the higher levels of the taxonomy.

However, future studies will be conducted to more fully understand the effects of the

organizational dimension on the final outcome of the telemedicine event.
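
One possible way to represent the grouping of the five dimensions is sketched below in Python. The class and field names are hypothetical and simply mirror the dimension names used in this chapter; the organizational dimension is left out because it is not addressed in this study. The example values are taken from the tables in Section 4.1.1.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class DeliveryMode(Enum):
    SYNCHRONOUS = "synchronous"
    ASYNCHRONOUS = "asynchronous"


@dataclass(frozen=True)
class MedicalDimensions:
    application_purpose: str  # e.g. an entry from Table 4.1
    application_area: str     # e.g. an entry from Table 4.3


@dataclass(frozen=True)
class DeliveryDimensions:
    environmental_setting: Tuple[str, str]  # (location 1, location 2), Table 4.4
    communication_infrastructure: str       # e.g. an entry from Table 4.5
    delivery_option: str                    # e.g. an entry from Table 4.6
    delivery_mode: DeliveryMode


@dataclass(frozen=True)
class TelemedicineEvent:
    medical: MedicalDimensions
    delivery: DeliveryDimensions


# Hypothetical classification of a single telemedicine event.
event = TelemedicineEvent(
    medical=MedicalDimensions("Consultation", "Cardiology"),
    delivery=DeliveryDimensions(
        environmental_setting=("Outreach Clinic", "Large Hospital"),
        communication_infrastructure="DSL",
        delivery_option="Videoconferencing",
        delivery_mode=DeliveryMode.SYNCHRONOUS,
    ),
)
print(event.medical.application_area, event.delivery.delivery_mode.value)
```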

Figure 4.1. Telemedicine taxonomy

Two other important dimensions that were excluded from this study, but have significant

importance for future telemedicine efforts, are the cost dimension and the legal issues dimension,

which we grouped under the organizational dimension. The taxonomy excluded these two

dimensions so as to concentrate on the core dimensions of telemedicine and to provide a simple

way of identifying varying efforts. These core dimensions will eventually affect the cost and

legality of the telemedicine applications.

Legal issues and cost have been discussed and are very important in the healthcare industry.

One study [39] reported how laws regarding telemedicine are being enacted by different states

and how the cost of telemedicine applications is affecting the decision making process. Further

studies are needed to understand how the core dimensions can make a difference on the decision-

making processes of lawmakers and payers.

4.2 Stage 2 – Assessment of Objective and Subjective Quality Measures for Telemedicine

Based on the five core dimensions identified in the previous stage as factors affecting

telemedicine events, this phase of the study proposes to integrate specific technical factors into a

measurement technique, which can then be used to predict telemedicine capability within a

specific setting, potentially in real time. While doing so, the other dimensions will be treated as

constants in laboratory experiments to identify the effects of the selected technical factors on the

telemedicine capability of a setting. This study will focus on two application areas

(ophthalmology and cardiology) and one application purpose within each application area. The

delivery mechanisms will be videoconferencing and audioconferencing over IP networks using

SIP. The next subsection provides a brief overview regarding the technical factors that will be

included in this study. The experimental test-bed details are presented next. This section

concludes with a discussion on objective and subjective tests that will be conducted.

4.2.1 Technological Factors Affecting Quality in Telemedicine over IP Networks

The Internet Protocol (IP) is a packet-based network protocol that enables the transmission of

data packets from one end system to another based on address information carried in the data

message. It can be used with two different transport layer protocols: Transmission Control

Protocol (TCP) and User Datagram Protocol (UDP). TCP is a connection-oriented, reliable

transport protocol designed for data transmission. However, it is not suitable for real-time

applications because the retransmission of packets may cause high delay and increase delay

variation, which can significantly affect the quality of real-time applications. There are other

problems associated with using TCP for real time applications, which are not mentioned here.

Hence, real-time applications use UDP, a connectionless transport layer protocol that does not

guarantee the arrival of a packet.

Real-time multimedia applications use two protocols that run over UDP: the real-time transport

protocol (RTP) and the RTP control protocol (RTCP) [40]. RTP is designed to carry data that

has real-time properties. RTCP is designed to monitor the quality of service and to convey

information about the participants in an on-going session. Even though RTP is the most commonly

used protocol for real-time applications, it does not, by design, provide any mechanism to

ensure timely delivery or provide other quality-of-service guarantees, but relies on lower-layer

services to do so. Therefore, real-time multimedia applications are vulnerable to any

impairment that can happen in the lower layers of the network. The next subsection presents

these impairments followed by two additional subsections that provide an overview of audio and

video codecs available for multimedia applications.
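
To make the role of RTP more concrete, the following Python sketch packs and unpacks the 12-byte fixed RTP header defined in RFC 3550 (without padding, header extension, or CSRC entries). The function names and the field values in the example are hypothetical. The sequence number and timestamp carried in the header allow a receiver to detect loss and delay variation, but, as noted above, they do not prevent either.

```python
import struct

RTP_VERSION = 2


def pack_rtp_header(payload_type, sequence_number, timestamp, ssrc, marker=False):
    """Build the 12-byte fixed RTP header (no padding, extension, or CSRC list)."""
    first_byte = RTP_VERSION << 6  # version in the top two bits, other bits zero
    second_byte = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack(
        "!BBHII",
        first_byte,
        second_byte,
        sequence_number & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )


def unpack_rtp_header(header):
    """Parse the fixed header fields back into a dictionary."""
    first_byte, second_byte, seq, timestamp, ssrc = struct.unpack("!BBHII", header[:12])
    return {
        "version": first_byte >> 6,
        "marker": bool(second_byte >> 7),
        "payload_type": second_byte & 0x7F,
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# Hypothetical values for one audio packet.
header = pack_rtp_header(payload_type=0, sequence_number=1, timestamp=160, ssrc=0x12345678)
print(len(header), unpack_rtp_header(header))
```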

4.2.1.1 Network Impairments

Since the Internet was not designed for real-time applications and only provides best effort

service, carrying real-time applications over the Internet presents a number of challenges. These

include lack of guarantee in terms of bandwidth, packet loss, delay, and jitter, all of which affect

the quality of voice and video over the Internet as reported in various studies [20, 24, 41].

Packet loss – Unlike circuit-switched networks, in packet switched networks no physical end-to-end

circuit is established [41]. Packets are transmitted from the source to the destination over

the Internet with the help of routers. Packets arriving at a router are first queued and then

transmitted one-by-one, usually with the first in first out (FIFO) policy. However, if the queue

(buffer) of a router is already full when a packet arrives, then this packet is dropped and

consequently, is not transmitted to its destination. Such drops are a typical symptom of network

congestion. The effects of packet loss on real-time multimedia applications are critical.

During a voice conversation, human cognition can handle only a certain amount of packet loss. If

too many packets are lost, the voice becomes incomprehensible. For video the effect of extensive

packet loss is more acute. If packet loss happens, some parts of the video cannot be decoded and

displayed. It is easy to understand the effects of packet loss on the perceived quality of voice and

video applications. Researchers have developed various techniques to overcome, or at least ease,

the effects of packet loss on applications; some of these techniques are discussed in [16, 41].
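
The queue behavior described above can be sketched with a small Python simulation of a single FIFO router queue that drops packets arriving when the buffer is full. The function name, the tick-based timing, and the traffic pattern are hypothetical simplifications; real routers forward at line rate and may use other queue management policies.

```python
from collections import deque


def simulate_fifo_router(arrivals_per_tick, capacity, service_per_tick=1):
    """Simulate a FIFO queue with a fixed buffer size.

    Packets are numbered in arrival order. In each tick, arrivals are
    enqueued first (or dropped if the buffer is full), then up to
    service_per_tick packets are forwarded from the head of the queue.
    """
    queue = deque()
    forwarded, dropped = [], []
    next_id = 0
    for n_arrivals in arrivals_per_tick:
        for _ in range(n_arrivals):
            if len(queue) < capacity:
                queue.append(next_id)
            else:
                dropped.append(next_id)  # buffer full: packet is lost
            next_id += 1
        for _ in range(min(service_per_tick, len(queue))):
            forwarded.append(queue.popleft())
    return forwarded, dropped, list(queue)


# Hypothetical burst: three packets per tick for two ticks, buffer of two packets.
print(simulate_fifo_router([3, 3, 0], capacity=2))
```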

Packet Delay – End-to-end packet delay is typically caused by a number of components [41]:

(1) codec delay is the time it takes to convert analog voice to digital data and vice versa, (2)

serialization delay is the time it takes to place a packet on the transmission line, and is

determined by the speed of the line, (3) queuing delay occurs at the various switching and

transmission points of the network, such as routers and gateways, where voice packets wait

behind other packets waiting to be transmitted over the same outgoing link, and (4) propagation

delay is the time required by signals to travel from one point to another, which is fixed as

determined by the speed of light. The effects of large packet delay become even more severe for

voice communications, as timing is an important characteristic of voice. This is especially true

when an interactive conversation is being transmitted on the network; delay effects can turn the

conversation into a half-duplex mode where one party speaks and the other listens and pauses to make

sure the other is done. Echo is another unwanted effect of packet delay. Various techniques were

also developed to overcome these problems over packet-based networks since in current circuit-

switched networks the primary source of delay is propagation delay.
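The four delay components listed above simply add up along the path. The sketch below assumes a single link and uses illustrative values; the function names and the default signal speed (roughly two thirds of the speed of light, a typical figure for cable) are assumptions, not values from this study.

```python
def serialization_delay_ms(packet_bytes, line_rate_bps):
    """Time to place one packet on the transmission line, in milliseconds."""
    return packet_bytes * 8 / line_rate_bps * 1000.0


def propagation_delay_ms(distance_km, signal_speed_km_per_s=200_000.0):
    """Time for the signal to travel the distance, in milliseconds.

    The default signal speed is an assumed value for illustration.
    """
    return distance_km / signal_speed_km_per_s * 1000.0


def end_to_end_delay_ms(codec_ms, queuing_ms, packet_bytes, line_rate_bps, distance_km):
    """Sum codec, serialization, queuing, and propagation delay for one link."""
    return (codec_ms
            + serialization_delay_ms(packet_bytes, line_rate_bps)
            + queuing_ms
            + propagation_delay_ms(distance_km))


# Hypothetical path: 200-byte packets on a 64 Kbits/s line over 4,000 km.
total_ms = end_to_end_delay_ms(codec_ms=20.0, queuing_ms=15.0,
                               packet_bytes=200, line_rate_bps=64_000,
                               distance_km=4_000)
```

For these assumed values the serialization delay is 25 ms and the propagation delay 20 ms, giving about 80 ms in total. Only the queuing term varies from packet to packet, which is what produces the delay variation discussed next.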

Packet Delay Variation (Jitter) – Packet delay variation refers to the variation or gaps

between packet arrival times at the receiving buffer. This occurs due to the variability in queuing

and propagation delays. To eliminate the effects of this variation, usually a playout buffer is

used. The receiver holds the first packet in the buffer for a specific amount of time before

playing it out. Therefore, a small amount of jitter is tolerable, but large fluctuations cause difficulty in

decoding and playback and degrade quality. The effects of delay variation are similar

to the effects of packet loss. Large variation in delay will result in some packets arriving long

after the playout time scheduled for them based on the buffer size. The receiver will discard

these packets since they arrive too late to be played out.
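A fixed playout buffer of the kind described above can be sketched as follows. The function name, the packet representation as (send time, arrival time) pairs, and the timing values are assumptions made for illustration only.

```python
def playout_schedule(packets, playout_delay_ms):
    """Split packets into those played out and those discarded as late.

    packets is a list of (send_ms, arrival_ms) pairs in sending order.
    The first packet is held for playout_delay_ms after it arrives; every
    later packet keeps its original spacing relative to the first one.
    """
    first_send, first_arrival = packets[0]
    base = first_arrival + playout_delay_ms
    played, discarded = [], []
    for send_ms, arrival_ms in packets:
        deadline = base + (send_ms - first_send)
        if arrival_ms <= deadline:
            played.append(send_ms)
        else:
            discarded.append(send_ms)  # arrived after its scheduled playout time
    return played, discarded


# Hypothetical trace: the third packet is delayed far more than the others.
packets = [(0, 50), (20, 75), (40, 130), (60, 115)]
played, discarded = playout_schedule(packets, playout_delay_ms=30)
played_all, discarded_none = playout_schedule(packets, playout_delay_ms=60)
```

With a 30 ms buffer the packet sent at 40 ms misses its deadline and is discarded, so jitter shows up as loss. Doubling the buffer to 60 ms saves that packet at the cost of more end-to-end delay.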

4.2.1.2 Audio Codecs

Audio data does not contain as much redundant data as video data and hence, it is harder to

compress. Speech coding techniques can be categorized in three groups: (1) waveform coding,

(2) source coding, and (3) hybrid coding. They are used at high, low, and moderate bit rates

respectively.

Waveform encoding is almost a lossless coding scheme since the resultant signal is very close

to the original one. The simplest form of this coding is Pulse Code Modulation (PCM). Many

codecs try to predict the value of the next sample from the previous ones and an error signal is

computed from the original and predicted signals. Another method that utilizes this error signal

for encoding is called Differential Pulse Code Modulation (DPCM). Other examples of

waveform coding are sub-band coding (SBC) and discrete cosine transform (DCT). Source

codecs implement the idea of understanding how the speech signal is produced and sending

certain parameters of the signal to the decoder. Hybrid coding is a mix of these two techniques.

Analysis-by-Synthesis (AbS) coding is the most famous type of hybrid coding. Using these

coding techniques, a large number of audio codecs have been developed over time and below is

an overview of some of these codecs.

The ITU-T G.711 codec (PCM at 64 Kbits/s), also known as µ-law, is a variant of the PCM codec,

which is commonly used in North America and Japan for digital telephony. It does not require

much CPU power and it provides good quality with simplicity. However, sometimes the

resulting bit rate may be higher compared to other codecs. Two other public standards by the

ITU-T for compressing voice data are G.721 (ADPCM at 32 Kbits/s) and G.723 (ADPCM at 24

and 40 Kbits/s). They both use Adaptive DPCM (ADPCM), which utilizes an adaptive prediction

and quantization scheme to increase the performance of DPCM coding. Another application of

ADPCM is the DVI codec, a recommendation from the Interactive Multimedia Association

(IMA) Digital Audio Technical Working Group. It compresses 16-bit linear PCM samples into

4-bit samples, yielding a compression rate of 4:1.
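The prediction-error idea behind DPCM, and the bit rates quoted above, can be illustrated with a short sketch. It makes two simplifying assumptions: the predictor is just the previous sample, and the error signal is not quantized (a real DPCM or ADPCM codec quantizes it, which is where the bit savings and the loss come from). The 8 kHz sampling rate is the usual telephony value and is assumed here.

```python
def dpcm_encode(samples):
    """Return the error signal, predicting each sample as the previous one."""
    prediction = 0
    errors = []
    for sample in samples:
        errors.append(sample - prediction)
        prediction = sample
    return errors


def dpcm_decode(errors):
    """Rebuild the samples by accumulating the error signal."""
    prediction = 0
    samples = []
    for error in errors:
        prediction += error
        samples.append(prediction)
    return samples


def bit_rate_kbps(bits_per_sample, sample_rate_hz=8000):
    """Bit rate of a stream at the assumed telephony sampling rate."""
    return bits_per_sample * sample_rate_hz / 1000


samples = [100, 102, 105, 104, 101]
errors = dpcm_encode(samples)  # small values after the first sample
```

The error values are much smaller than the samples themselves, so they can be coded with fewer bits. At the assumed 8 kHz rate, 8, 4, 3, and 5 bits per sample correspond to 64, 32, 24, and 40 Kbits/s, and coding 16-bit samples with 4 bits gives the 4:1 ratio mentioned for the DVI codec.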

Finally, GSM stands for Global System for Mobile Communications and is a variant of LPC

called RPE-LPC (Regular Pulse Excited - Linear Predictive Coder). It is a European standard

originally for use in encoding speech for satellite distribution to mobile phones. Its use results in

very good compression with good quality output but is very costly in terms of performance.

4.2.1.3 Video Codecs

Video streaming is a resource and bandwidth intensive application type [24] that requires the

video to be compressed before transmission to utilize the existing resources efficiently without

saturating them. The goal of video compression is to remove the redundancy in the original

source signal, which will eventually reduce the amount of bandwidth required for transmission

[16]. There are three types of video compression coding: (1) lossless coding, (2) lossy

coding, and (3) hybrid coding.

Lossless coding (e.g. Huffman coding) is a reversible process with perfect recovery of the

original data. Therefore no quality degradation exists due to lossless coding. Lossy coding (e.g.

source coding) is an irreversible process in which the recovered data is degraded. Hybrid coding

(e.g. JPEG) is the one used by most multimedia systems and it combines both lossy and

lossless coding. H.261, H.263, MPEG-1, MPEG-2, and MPEG-4 are the most popular video

codec standards. In this study, H.261 and H.263 will be used as video codecs during the

experiments.

H.261 is an ITU video-coding standard originally designed for ISDN lines. Its output bit rates

are multiples of 64 Kbits/s. It is a constant-bit-rate codec rather than a constant-quality,

variable-bit-rate codec, meaning that the encoding algorithm trades picture quality against motion.

Therefore, to obtain higher quality, this codec is best suited to scenes having a small

amount of motion. It supports only two resolutions: (1) Common Interchange Format (CIF),

which is 352x288 pixels and, (2) Quarter CIF (QCIF), which is 176x144 pixels.

H.263 is also an ITU video-coding standard originally designed for low bit rate

communications (less than 64 Kbits/s; this limitation has now been removed). It uses a coding

algorithm similar to that of H.261, with some changes to improve performance and error recovery.

As a result of these improvements, the H.263 output stream is more resilient to packet loss, which

makes it very attractive for real-time communications over the Internet. It supports five

resolutions. In addition to CIF and QCIF, it provides resolution at SQCIF (128x96 pixels), 4CIF

(704x576 pixels), and 16CIF (1408x1152 pixels).
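To give a sense of why compression is needed at these resolutions, the sketch below computes the uncompressed bit rate for each picture format. The 12 bits per pixel (4:2:0 chroma subsampling) and 30 frames per second are assumed values for illustration; the resolutions are the ones listed above.

```python
# Picture formats named in the text, as (width, height) in pixels.
RESOLUTIONS = {
    "SQCIF": (128, 96),
    "QCIF": (176, 144),
    "CIF": (352, 288),
    "4CIF": (704, 576),
    "16CIF": (1408, 1152),
}


def raw_bit_rate_bps(name, fps=30, bits_per_pixel=12):
    """Uncompressed bit rate for a format (fps and bits per pixel are assumptions)."""
    width, height = RESOLUTIONS[name]
    return width * height * bits_per_pixel * fps


cif_rate = raw_bit_rate_bps("CIF")
qcif_rate = raw_bit_rate_bps("QCIF")
```

Under these assumptions raw CIF video needs roughly 36.5 Mbits/s, several hundred times a 64 Kbits/s channel, and QCIF needs one quarter of that.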

4.2.1.4 Working Around Impairments: Application and Network Level Quality of Service

Previous subsections summarized how certain network impairments can affect real-time

applications on IP-based networks. Currently two approaches exist to provide Quality of Service

(QoS) for real-time applications: (1) QoS at the application level and, (2) QoS at the network

level. Application-level QoS provides quality improvements without requiring changes to the

network infrastructure. In initial implementations of real-time applications, incoming data was

played out either immediately upon arrival or after a fixed delay. Since both methods lead to

significant signal degradation under high delay variance conditions, adaptive playout techniques

were introduced to make real-time applications more tolerant of delays and delay jitter and to

dynamically adjust the playback point [25]. Researchers have also studied reconstruction

methods at the receiver to compensate for packet loss in real-time applications. Various error

concealment methods for audio are summarized in Table 4.7.

Table 4.7 Error Concealment Techniques for Audio [41]

Name: Technique

Silence Substitution: Substitutes lost packet with silence. Causes voice clipping. Deteriorates voice quality when packet size is large and loss rate is high.

Noise Substitution: Substitutes lost packet with background noise. Better than silence substitution. Relies on the ability of human brain to repair the received message if there is background noise.

Packet Repetition: Substitutes the lost packet with the replays of the last correctly received packet.

Packet Interpolation: Substitutes lost packet with a replacement packet produced based on the characteristics of the packets in the neighborhood of the lost one (a.k.a. waveform substitution).

Frame Interleaving: Reduces the effect of packet loss by interleaving voice frames across different packets.
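Three of the techniques in Table 4.7 can be sketched as follows. This is a simplified illustration: frames are short lists of samples, a lost frame is marked with `None`, and the interpolation shown is a plain sample-wise average of the neighboring frames, a crude stand-in for real waveform substitution. The function name and data layout are assumptions.

```python
def conceal(frames, method, frame_len):
    """Replace lost frames (None) using the named concealment method."""
    silence = [0] * frame_len
    repaired = []
    for i, frame in enumerate(frames):
        if frame is not None:
            repaired.append(list(frame))
            continue
        # Most recent output frame (received or already concealed).
        previous = repaired[-1] if repaired else silence
        # Next correctly received frame, if there is one.
        following = next((f for f in frames[i + 1:] if f is not None), None)
        if method == "silence":
            repaired.append(list(silence))
        elif method == "repetition":
            repaired.append(list(previous))
        elif method == "interpolation":
            if following is None:
                repaired.append(list(previous))  # nothing to interpolate toward
            else:
                repaired.append([(a + b) / 2 for a, b in zip(previous, following)])
        else:
            raise ValueError("unknown concealment method: " + method)
    return repaired


frames = [[2, 4], None, [6, 8]]  # the middle frame was lost
with_silence = conceal(frames, "silence", frame_len=2)
with_repetition = conceal(frames, "repetition", frame_len=2)
with_interpolation = conceal(frames, "interpolation", frame_len=2)
```

Note that interpolation needs the frame after the gap, so the receiver must wait for it; repetition and silence substitution can be applied immediately.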

Error concealment techniques for video try to recover the corrupted data by exploiting the

spatial and temporal redundancies of the video data [43]. Spatial-domain error concealment

algorithms interpolate the lost area using spatially neighboring image data. Because these

algorithms recover an isolated lost macroblock (MB), which is made by the coded modification,

they provide good performance. On the other hand, temporal-domain error concealment schemes

utilize the previously decoded image data to recover the lost MBs: they estimate motion

vectors (MVs) for the lost MBs, and compensate for the lost MBs with the estimated MVs. Some

error concealment techniques are provided in Table 4.8.

Table 4.8 Error Concealment Techniques for Video [16]

Name: Technique

Block Replacement: Replaces the lost areas with the corresponding areas of the previous frame or field. Works quite well in still parts of the picture but fails in areas where there is a lot of motion.

Linear Interpolation: Replaces the lost areas with the linearly interpolated values calculated from the neighboring areas of the same frame. Assumes that surrounding areas are correctly received and works well in a uniform surface.

Motion Vector: Replaces the lost areas with pixel blocks of the previous frame shifted by the average motion vector of the neighboring blocks. Performance drops when the blocks have different motion vectors.

Hybrid Technique: Uses both spatial and temporal redundancies to predict the lost MBs.
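The block replacement technique in Table 4.8 is the simplest to illustrate. In the sketch below, frames are small two-dimensional lists of pixel values and lost blocks are identified by their block row and column; the function name, the frame representation, and the tiny 4x4 frame are assumptions made for the example.

```python
def block_replacement(previous_frame, current_frame, lost_blocks, block_size):
    """Fill each lost block with the co-located block of the previous frame."""
    repaired = [row[:] for row in current_frame]  # copy, leave the input untouched
    for block_row, block_col in lost_blocks:
        for y in range(block_row * block_size, (block_row + 1) * block_size):
            for x in range(block_col * block_size, (block_col + 1) * block_size):
                repaired[y][x] = previous_frame[y][x]
    return repaired


previous_frame = [[1, 1, 1, 1] for _ in range(4)]
current_frame = [
    [9, 9, None, None],  # the top-right 2x2 block was lost
    [9, 9, None, None],
    [9, 9, 9, 9],
    [9, 9, 9, 9],
]
repaired = block_replacement(previous_frame, current_frame,
                             lost_blocks={(0, 1)}, block_size=2)
```

The repaired block is only correct if that region did not change between frames, which is why the table notes that this technique fails in areas with a lot of motion.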

Development of network Quality of Service (QoS) features was partially motivated by the fact

that real-time traffic (as well as other applications) may sometimes require priority treatment to

achieve good performance on the Internet [44]. QoS can be achieved by managing router queues

and by routing traffic around congested parts of the network. The IETF proposed two models to

provide Internet QoS: Integrated Services (IntServ) [45] and Differentiated Services (DiffServ)

[46].

In IntServ, resources are reserved for each flow through the network using the Resource

ReSerVation Protocol (RSVP) [47]. When an application requests a specific QoS for its data

stream, RSVP can be used to deliver the request to each router along the path and to maintain

router state to provide the requested service [44]. Current implementations of IntServ allow a

choice of Guaranteed Service [48] or Controlled-Load Service [49]. In Guaranteed Service

agreements, peak traffic is limited by a certain rate and packet size is restricted to be in a specific

range at all times. Based on these limitations and restrictions, a bandwidth requirement is

declared, and sufficient bandwidth is reserved on each hop to satisfy all the requirements of the

flow. If each node and hop can accept the service request, the flow should be lossless.

Controlled-Load Service [49], on the other hand, uses only traffic specifications and does not

define any service request specifications. Hence, flows using this service should experience the

same performance as they would in a lightly loaded “best-effort” network.

Several reasons for not using IntServ for IP-based real-time applications, including scalability

problems, were reported in [44]. To overcome these problems, a simpler framework and

architecture to support DiffServ was developed [46]. The primary goal of differentiated services

is to allow different levels of service to be provided for traffic streams on a common network

infrastructure [44]. In the DiffServ model, the QoS information is carried in-band within the

packet, in the Type of Service (TOS) field of the IPv4 header or the Differentiated Services (DS)

field in IPv6 [50]. The TOS or the DS field is used to indicate the need for low-delay, high-

throughput, or low-loss-rate service. Backbone routers provide per-hop differential treatments to

different service classes as defined by Per Hop Behavior (PHB) that describes the forwarding

behavior a packet receives at a given network node. Despite the fact that DiffServ is a simpler

mechanism that provides performance improvements compared to “best effort” IP networks, it

has some shortcomings; it relies on ample network capacity for expedited forwarding traffic and

makes use of standard routing protocols that make no attempt to use the network efficiently [44].
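From the application side, marking traffic for DiffServ amounts to setting the TOS/DS byte on outgoing packets. The sketch below uses the standard Python `socket` module; the helper names are hypothetical, and the choice of the Expedited Forwarding code point (DSCP 46) is an assumption for illustration. Whether routers honor the marking depends entirely on how the network is configured, and some operating systems ignore or restrict this option.

```python
import socket

DSCP_EF = 46  # Expedited Forwarding code point, used here as an example


def dscp_to_tos(dscp):
    """Place a 6-bit DSCP value in the upper six bits of the TOS/DS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    return dscp << 2


def open_marked_udp_socket(dscp=DSCP_EF):
    """Open a UDP socket whose outgoing IPv4 packets carry the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


tos_byte = dscp_to_tos(DSCP_EF)
```

For DSCP 46 the resulting byte is 0xB8. The marking only requests a per-hop behavior; it does not reserve resources the way RSVP does in IntServ.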

One other type of network level QoS technique is provided by the Multiprotocol Label

Switching (MPLS) architecture offering IP networks the capability to provide traffic engineering

as well as a differentiated services approach to voice quality [44]. In IP networks, as packets

travel from one router to another, each router independently chooses a next hop for the packet,

based on its analysis of the packet's header and the results of running the routing algorithm [51].

Analysis of the packet header identifies the forwarding equivalence class (FEC) of a packet and

the routing algorithm maps this FEC to a next hop. This is repeated at each hop until the packets

reach their destination. Notice that no distinction can be made between the packets with the same

FEC value in conventional IP networks. In MPLS, the assignment of a particular packet to a

particular FEC is done just once, as the packet enters the network [51], and hence MPLS

separates routing from forwarding [44]. This FEC value is encoded as a short fixed length value

known as a "label" and when a packet is forwarded to its next hop, the label is sent along as well.

DiffServ and MPLS can be combined to provide better QoS for real-time applications.
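The separation of routing from forwarding can be illustrated with a toy example: the ingress router classifies a packet into an FEC once and attaches a label, and core routers then forward on the label alone. The prefixes, label values, and router names below are hypothetical, and the tables are greatly simplified compared with a real label-switching router.

```python
import ipaddress

# Ingress router: destination prefix -> label for that FEC (hypothetical values).
FEC_TABLE = {
    ipaddress.ip_network("10.1.0.0/16"): 100,
    ipaddress.ip_network("10.2.0.0/16"): 200,
}

# Core router: incoming label -> (outgoing label, next hop); no header analysis.
LABEL_TABLE = {
    100: (110, "router-B"),
    200: (210, "router-C"),
}


def ingress_label(dest_ip):
    """Classify the packet once, at network entry, by longest matching prefix."""
    address = ipaddress.ip_address(dest_ip)
    matches = [net for net in FEC_TABLE if address in net]
    if not matches:
        return None  # no FEC for this destination in the toy table
    return FEC_TABLE[max(matches, key=lambda net: net.prefixlen)]


def forward(label):
    """Forward on the label alone: a single exact-match table lookup."""
    return LABEL_TABLE[label]


label = ingress_label("10.1.5.9")
next_label, next_hop = forward(label)
```

The header analysis happens only in `ingress_label`; every later hop performs the cheaper exact-match lookup in `forward`, which is what allows traffic to be steered along engineered paths.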

Regardless of the techniques developed for network QoS, the implementation of these

techniques is limited and available to only a small group of users. The reasons for this slow

adoption of network QoS techniques are discussed extensively in [52], and the conclusions of that

study suggest that the QoS community and researchers need to reach out and include business,

systems control, and marketing expertise in their efforts to get IP QoS meaningfully deployed

and used.

4.2.2 Experimental Test-bed

Previous sections identified the important factors that play a role in the perceived and

measured quality of voice and video over the Internet. This study will set up a test-bed where

these factors will be individually defined as variables within the experiments. Objective and

subjective evaluations will measure the effects of variance in these variables on the decisions

made for the selected telemedicine purpose. Figure 4.2 illustrates a summary of the application

and network components and how the identified factors can be positioned between these

components.

Figure 4.2 Application and Network Components

Figure 4.3 illustrates the simple test-bed setup for the experiments. There will be two

computers with video cameras to capture voice and video for an ophthalmology scenario and a

stethoscope for the cardiology scenario. Several software tools will be utilized in this test-bed to

capture audio and video, to transmit the captured information on the network, and to manipulate

and measure network impairments during the transmission.

Figure 4.3 Experimental Testbed

Hardware and equipment available for the experiments of this study are listed in Table 4.9. Table

4.10 presents a list of tools that will be utilized during the experiments for manipulating network

parameters and for capturing and storing video and audio sequences. Network monitoring tools

presented in this table are selected from [53] where a comprehensive list of products can be

found.

Table 4.9 List of Available Hardware for the Testbed

Network Equipment

Device      Make      Model
Hub 1       SMC       EtherEZ Hub 3605T 10Mbps
Hub 2       D-Link    Hub DSH-5 100Mbps
Router 1    D-Link    2.4Ghz Wireless Router DI-614+
Router 2    Linksys   EtherFast Cable/DSL Router BEFSR41 ver.3

Computers, Laptops, and Servers

Machine        Operating System            CPU       RAM
Laptop 1       Windows XP Home Edition     2.0 GHz   256 MB
PC 1           Windows XP Professional     1.8 GHz   256 MB
PC 2           Windows 2000 Professional   1.8 GHz   256 MB
PC 3           Windows 2000 Professional   1.8 GHz   256 MB
Proxy Server   Linux Red Hat 7.3           1.8 GHz   256 MB

Table 4.10 Software List for the Testbed

Name (Type): Description

JMStudio (Java-based Media Player): The Java Media Framework API (JMF) enables audio, video and other time-based media to be added to applications and applets built on Java technology. JMStudio is an application developed based on JMF which can capture, play, and record audio and video files. It can also receive and play RTP media streams.

Ethereal (Packet Capture Tool): It is a free network protocol analyzer for Unix and Windows that provides features to examine data from a live network or from a capture file on disk.

Distributed Internet Traffic Generator, D-ITG (Network Monitoring Tool): D-ITG is a platform capable of producing traffic (network, transport and application layer) and accurately replicating appropriate stochastic processes for both IDT (Inter Departure Time) and PS (Packet Size) random variables (exponential, uniform, cauchy, normal, pareto, etc.).

Netperf (Throughput Tool): It provides general measures of performance of a network such as latency between request and response of generic transactions across a TCP/IP network. It is maintained by HP.

Bing, Pathrate, Pipechar (Bandwidth Estimation Tools): Bing is a point-to-point bandwidth measurement tool (hence the 'b'), based on ping. Pathrate measures end-to-end capacity. Pipechar is a tool for reporting dynamic network characteristics, in particular the bottleneck bandwidth.

Traceping (Ping): It measures the packet loss to nodes along a route.

4.2.3 Experimental Procedures

The initial step for the experiments is to identify the telemedicine application area and purpose

under consideration. The two telemedicine application areas selected for this study are

ophthalmology and cardiology. The application purpose is currently restricted to diagnosis. The

next step is to obtain sample exam sequences for the selected application area and purpose. For

the diagnosis experiments in the ophthalmology application area, video sequences of an eye

examination session will be necessary. There are two possible ways of obtaining this video

sequence. One way is to request a readily available video sequence from the National Library of

Medicine (NLM) and feed this video sequence into the experimental test-bed for objective and

subjective measurement collection. Another way is to record a live session using the listed

devices in the previous section and use this self-obtained video sequence for testing purposes.

For the diagnosis experiments in the cardiology application area, audio sequences of heartbeats

will be required. The first possible way of obtaining this audio sequence is to request it from a

source like NLM. Another possible way is to use electronic stethoscopes to capture the

sounds of heartbeats and feed them directly to the computer as audio input.

Once the audio and video files are captured and ready for use in the test-bed, the next step will

be to feed these files into the test-bed while manipulating the factors identified to have an effect

on quality degradation for voice and video over IP-based networks. Factors that will be

manipulated during the experiments are audio/video codecs, packet loss, packet delay, packet

delay variation, and bandwidth. As a result of these experiments, a set of distorted signals will be

collected and stored for future use in subjective tests. During the transmission of the original

signal over the test-bed, data for objective measurements will be collected. Ethereal will be used

as the main tool to monitor network traffic and to capture traffic on the network for further

packet and traffic analysis. In this test-bed, the sender (patient-end) will control the selection of

the codec; the router will control the loss rate, loss pattern, delay, and delay variation; and the

receiver (physician-end) will store the received signals, decode them, and use concealment

methods selected by the application in use to recover lost packets.
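One way to organize the manipulated factors is as a full-factorial grid of test conditions, each of which yields one distorted signal. The sketch below is only an illustration of that bookkeeping: the factor levels are hypothetical placeholders, not the values that will be used in the experiments.

```python
from itertools import product

# Hypothetical factor levels for the video experiments (placeholders only).
factors = {
    "codec": ["H.261", "H.263"],
    "loss_rate_pct": [0, 1, 5],
    "delay_ms": [50, 150],
    "jitter_ms": [0, 20],
    "bandwidth_kbps": [128, 384],
}

# One dictionary per test condition, covering every combination of levels.
conditions = [dict(zip(factors, values)) for values in product(*factors.values())]
```

Even this small grid produces 48 conditions per source sequence, which is worth keeping in mind when sizing the subjective tests, since each distorted signal has to be rated by the test subjects.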

At this point, all values necessary to evaluate objective measures will be collected and stored.

The next step is to measure objective quality. Among the several objective quality measures

introduced in Section 3.2.1, the ITU E-model will be used for measuring audio quality for the test

sequences. This measure was chosen based on evidence that it is the only available measure that

does not require the original signal for calculations and it correlates well with the MOS values.

As mentioned in section 3.3.1, there are no objective measurements available other than the ones

proposed in [16] that can measure the objective quality of a video sequence in the absence of the

original sequence. However, the VQM explained in section 3.3.1 can be used for this study to

measure the video quality of the distorted signals since the original signal will be available.

This will not affect the results of the final real-time tool for quality measurement because the

new tool will rely on the previously collected values in the database for assessment. The

objective measurement values calculated in this step will be stored in a database with the values

of the impairment factors. Table 4.11 illustrates the predicted fields of the quality database for

audio and video quality excluding objective measurement value field and MOS field.

Table 4.11 Fields of the proposed Quality Database

For Audio Quality: Packet Loss Rate; Consecutive Lost Packets; Max. Jitter; Max. Packet Delay; Available Bandwidth; Audio Codec

For Video Quality: Packet Loss Rate; Consecutive Lost Packets; Max. Jitter; Max. Packet Delay; Available Bandwidth; Video Codec; Bit Rate; Frame Rate
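The E-model produces a transmission rating factor R, which is then mapped to an estimated MOS. The sketch below shows that final mapping as given in ITU-T Recommendation G.107; computing R itself from the impairment factors is not shown. The function name and the lower clamp at 1.0 are additions made for this illustration.

```python
def r_to_mos(r):
    """Map an E-model rating factor R to an estimated MOS (G.107 mapping)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    mos = 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6
    # The polynomial dips slightly below 1 for very small R; the clamp is a
    # guard added in this sketch so the result stays on the MOS scale.
    return max(1.0, mos)


default_mos = r_to_mos(93.2)  # R for the G.107 default parameter values
```

For the default rating of 93.2 the mapping gives an estimated MOS of about 4.41. Storing both R and the mapped MOS in the quality database would allow direct comparison with the subjective scores collected later.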

Subjective measurements will follow the objective measurements. Selection of test subjects

will be based on availability. First, invitations to physicians familiar with telemedicine

applications in Loma Linda University Medical School will be sent. Based on the response rate,

if further recruitment of subjects is required, final year medical students will be recruited for

subjective tests. The ITU recommends that 4 to 40 test subjects be used for completing

subjective quality tests. Subjective tests will involve at least the minimum required number of

subjects. Since the use of subjective measurements for telemedicine related to voice and video

is still an immature area of research, this study will utilize different subjective measurement

techniques discussed in 3.2.2 and 3.3.2. Test subjects will be asked to view the recorded sessions

and provide their opinion for the questions asked in the standard. MOS scores of the subjective

test results will also be calculated and added as a new field to the quality database illustrated in

Table 4.11. The last step in this stage is to find a correspondence between objective and

subjective measures in the database. The quality database will be the final outcome of the second

stage in this study. A summary of the second stage is illustrated in Figure 4.4 below.

Figure 4.4 Process flow for the second stage of the study
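The last two steps of this stage, computing MOS values and relating them to the objective measures, can be sketched as follows. The five-point rating scale and the use of the Pearson correlation coefficient as the measure of correspondence are assumptions made for this illustration; the study may adopt a different scale or statistic.

```python
from math import sqrt
from statistics import mean


def mean_opinion_score(ratings):
    """Average the subjects' ratings for one test sequence (assumed 1 to 5 scale)."""
    if not ratings or any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings must be a non-empty list of values from 1 to 5")
    return mean(ratings)


def pearson(xs, ys):
    """Pearson correlation between objective scores xs and subjective scores ys."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical ratings from four subjects for a single distorted sequence.
mos = mean_opinion_score([4, 5, 3, 4])
```

A correlation close to 1 between the objective values and the MOS values across all test conditions would indicate that the objective measure is a usable proxy for the physicians' judgments.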

4.3 Stage 3 – Development of SIP-based Videoconferencing Tool with Real-time

Telemedicine Capability Index

In this last stage, an existing SIP videoconferencing client, the CGUsipClient, will be enhanced

with a simple quality indicator based on the results obtained in the previous stage of this study.

The CGUsipClient was developed by the Network Convergence Laboratory (NCL) to provide

low-cost, low-bandwidth videoconferencing. It is a Java-based client that utilizes the Java Media

Framework (JMF) Sun libraries for voice and video handling. The video codecs supported by

this client are H.261 and H.263, the latter being the default codec for video communications. The

audio codecs supported are G.723, DVI, GSM, and G.711 (µ-law); the user can change the

default audio codec. Detailed information regarding the CGUsipClient architecture can be found

at [54]. Another study [32] reported the many useful features of this client for use in

telemedicine and how it can add value in the telemedicine setting.

The CGUsipClient will feature new user interface windows that will provide real-time quality

information, derived from the objective measures that will be collected in real-time during a

telemedicine session and the calculation of their correspondence to subjective measures using the

quality database. In order to achieve this goal, several improvements are required on the client.

First, a real-time objective measure collection module will be incorporated with the existing

client. This module will collect packet loss, delay, bit rate, and frames per second information

from the network. Second, a new module for calculating a correspondence to these objective

measures in terms of a subjective MOS value will be developed and incorporated into the

CGUsipClient. Finally, two graphical user interfaces (GUI) will be developed. The Session

information GUI will collect information regarding application area, purpose, and delivery

option (only audio, audio and video) before the session begins as part of objective measures.

Based on this information, the relevant quality database will be used for calculations. The

Telemedicine Capability Index GUI will provide the outcomes of the correspondence

calculations in real-time to the user. A snapshot of the predicted Telemedicine Capability Index

GUI is provided in Figure 4.5. One final improvement can be to add a module to obtain instant

evaluations from the users and add these values to the relevant session database for future use.

Figure 4.5 GUI for Telemedicine Capability Index Indicator
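One simple way the correspondence module could turn live measurements into an index is a nearest-neighbor lookup in the quality database. The sketch below is written in Python for brevity, although the client itself is Java-based, and it is not the design of the CGUsipClient: the record fields are a subset of Table 4.11, and the scale factors and MOS values are hypothetical placeholders, not results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityRecord:
    """One row of a (hypothetical) quality database."""
    loss_rate_pct: float
    max_delay_ms: float
    max_jitter_ms: float
    mos: float


# Assumed scale factors that put the three measures on comparable ranges.
SCALES = {"loss_rate_pct": 10.0, "max_delay_ms": 400.0, "max_jitter_ms": 100.0}


def estimate_mos(database, loss_rate_pct, max_delay_ms, max_jitter_ms):
    """Return the MOS of the stored condition closest to the live measurements."""
    measured = {"loss_rate_pct": loss_rate_pct,
                "max_delay_ms": max_delay_ms,
                "max_jitter_ms": max_jitter_ms}

    def distance(record):
        return sum(((getattr(record, name) - measured[name]) / SCALES[name]) ** 2
                   for name in SCALES)

    return min(database, key=distance).mos


# Placeholder rows; real rows would come from the Stage 2 experiments.
database = [
    QualityRecord(0.0, 50.0, 5.0, 4.3),
    QualityRecord(2.0, 150.0, 20.0, 3.4),
    QualityRecord(8.0, 350.0, 80.0, 1.9),
]
index = estimate_mos(database, loss_rate_pct=1.5, max_delay_ms=140.0, max_jitter_ms=25.0)
```

A lookup like this is cheap enough to run on every measurement interval during a session. Interpolating between neighboring records, or fitting a regression model to the database, would be natural refinements.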

4.4 Research Methodology

This study focuses on three research objectives. First, provide a telemedicine taxonomy as a

method to classify different telemedicine events while defining them based on five dimensions.

Second, evaluate the quality of information necessary to make medical decisions under

fluctuating conditions of network and application parameters. Third, develop an artifact that

provides a real-time quality and capability index for users based on evaluation results. To meet

these research objectives, a hybrid research methodology is utilized.

This study will first define a new taxonomy based on the existing definitions and theories for

telemedicine after an extensive literature review. Later, an evaluation study will be conducted to

complete the second stage. There are two types of evaluation – formative and summative. As

described in [55] (p.208) “Formative evaluation is intended to help in the development of the

programme, innovation or whatever is the focus of evaluation. Summative evaluation

concentrates on assessing the effects and effectiveness of the programme.” In the context of this

study, the formative evaluation will address the effects of impairments caused by the network or

application parameters on perceived quality and hence, the medical decision making capability

of a physician. The evaluation will utilize two different data collection techniques – objective

data collection (quantitative) and subjective data collection (qualitative). Quantitative methods

will be used to analyze the results of the experiments.

The results of the formative evaluation will then be used to build an artifact. As stated by

Hevner et al. in [56] “Design science,…, creates and evaluates IT artifacts intended to solve

identified organizational problems.” They also state that the goal of behavioral science research

is truth whereas the goal of design science research is utility. “Truth informs design and utility

informs theory [56].” Utility is also one of the four features of evaluation research design as

reported in [55] (p.209). Hence, both research methodologies are expected to produce an

outcome that is useful to its intended audience. The audience of this study is users of

telemedicine systems. Using formative evaluation and design research methodologies together to

build an artifact will provide utility for this audience.

4.5 Contributions and Potential Implications

The first research contribution of this study will be the telemedicine taxonomy, which

addresses multiple dimensions of telemedicine environments that need to be considered while

planning such systems or operating them. Implications of using this taxonomy may be important

for physicians, patients, medical organizations, and researchers. Using this taxonomy can also

help medical providers in understanding the building blocks of telemedicine systems and provide

them with possible explanations as to why their telemedicine system is a success or a failure.

Moreover, providers can outline their current status under each dimension, learn where they fit in

the taxonomy, and utilize this positioning to initiate new services that they are capable of

providing. This taxonomy may also help patients grasp the unfamiliar world of telemedicine,

inform their telemedicine expectations, and evaluate their own telemedicine capabilities at home

or in their local communities. This may eventually improve the acceptance and adoption of

telemedicine applications among patients. Organizations, such as HMOs and hospitals, can make

use of this taxonomy while identifying which dimensions are most critical for the services they

provide currently or in the future. It can be used as a guideline for planning or evaluating existing

or new services by hospital management. Finally, for the researchers, this taxonomy presents an

original effort to put all important telemedicine dimensions and their interactions together in

order to develop a comprehensive taxonomy and provides a method to compare and contrast

different efforts and studies in the field.

The channel used to deliver telemedicine services is always limited. It should be used wisely to

allocate enough capacity based on the priority of data required on each end. Unfortunately, every

channel and setting can support only a limited variety of medical dimensions. Before starting a

telemedicine event, it is useful to understand the capabilities of the support dimensions in hand

and what types of scenarios under specific medical dimensions can utilize that capability. The

measurement results of this study will help to further understand the acceptable quality levels for

confidently making medical decisions in a given telemedicine channel. The subjective results of

the experiment will be a first step in understanding the effects of impairments on telemedicine

events. Even though many studies have been conducted to measure user acceptance of

telemedicine systems, very few studies consider quality of information and its effects on decision

making. Using standards to measure perceived quality, this study will extend the telemedicine

research within the information systems field.

The final contribution of this study will be a videoconferencing client with a capability index

indicator. This new tool will fill a technology gap by providing a low-cost, low-bandwidth

telemedicine tool that can be applied in several settings. The quality indicator will help users

make decisions regarding the sessions they are planning through the existing channels. Even

though the telemedicine capability index will be limited to a very small subset of possible

telemedicine settings (cardiology and ophthalmology diagnosis), it can be improved by further

research since the procedures and tests necessary to conduct the experiments are selected from

standards available today. Moreover, the results of this study can be a starting point for

understanding how the objective/subjective audio/video quality assessment should be carried out

to extend the results to a larger subset by further experiments.

4.6 Timeline

The timeline for this study is provided in Figure 4.6 below.

Figure 4.6 Study Timeline

References

[1] C. I. Jones, "Why have health expenditures as a share of GDP risen so much?," National

Bureau of Economic Research, Working Paper 9325, 2002.

[2] S. K. Moore, "Extending Healthcare's Reach: Telemedicine can help spread medical

expertise around the globe," IEEE Spectrum, vol. 39, pp. 66 - 71, 2002.

[3] J.-M. Ho, J.-C. Hu, and P. Steenkiste, "A conference gateway supporting interoperability

between SIP and H.323," presented at the ninth ACM International Multimedia

Conference, Ottawa, Canada, 2001.

[4] PricewaterhouseCoopers, "HealthCast 2010: Smaller World, Bigger Expectations,"

PricewaterhouseCoopers, 1999.

[5] H. C. J. Linderoth, "Implementation and Evaluation of Telemedicine - a Catch 22?,"

presented at 35th Hawaii International Conference on Systems Sciences, Hawaii, USA,

2002.

[6] K. Hung and Y. T. Zhang, "On the feasibility of the usage of WAP devices in

telemedicine," presented at IEEE EMBS International Conference on Information

Technology Applications in Biomedicine, Arlington, Virginia - USA, 2000.

[7] R. L. Bashshur, "Telemedicine and Health Care," Telemedicine Journal and e-Health,

vol. 8, pp. 5-12, 2002.

[8] E. A. Miller, "Telemedicine and doctor-patient communications: an analytical survey of

the literature," Journal of Telemedicine and Telecare, vol. 7, pp. 1-17, 2001.

[9] W. H. DeLone and E. R. McLean, "Information systems success revisited," presented at

35th Hawaii International Conference on Systems Sciences, Hawaii, USA, 2002.

[10] J. G. McDaniel, "Improving system quality through software evaluation," Computers in

Biology and Medicine, vol. 32, pp. 127-140, 2002.

[11] R. L. Bashshur, T. G. Reardon, and G. W. Shannon, "Telemedicine: A New Health Care

Delivery System," Annual Review of Public Health, vol. 21, pp. 613-37, 2000.

[12] American Nurses Association, Developing Telehealth Protocols: A Blueprint for Success.

Washington, DC: American Nurses Association, 2001.

[13] M. M. Maheu, P. Whitten, and A. Allen, E-Health, Telehealth, and Telemedicine: A

Guide to Start-Up and Success, First ed. San Francisco: Jossey-Bass Inc., 2001.

[14] Committee on Evaluating Clinical Applications of Telemedicine, Telemedicine: A Guide

to Assessing Telecommunications in Health Care. Washington, D.C.: National Academy

Press, 1996.

[15] T. A. Hall, "Objective Speech Quality Measures for Internet Telephony," presented at

Proceedings of SPIE on Voice Over IP (VoIP) Technology, 2001.

[16] S. Mohamed, "Automatic Evaluation of Real-Time Multimedia Quality: a Neural

Network Approach." Rennes: University of Rennes I, 2003.

[17] D. P. W. Ellis, "Evaluating Speech Separation Systems," in Perspectives on Speech

Separation, P. Divenyi, Ed. New York: Kluwer Academic Publishers, 2004.

[18] S. Wang, A. Sekey, and A. Gersho, "An objective measure for predicting subjective

quality of speech coders," IEEE Journal on Selected Areas in Communications, vol. 10,

pp. 819 - 829, 1992.

[19] S. Voran, "Objective Estimation of Perceived Speech Quality-Part I: Development of the

Measuring Normalizing Block Technique," IEEE Transactions on Speech and Audio

Processing, vol. 7, pp. 371-382, 1999.

[20] A. P. Markopoulou, F. A. Tobagi, and M. J. Karam, "Assessing the quality of voice

communications over internet backbones," IEEE/ACM Transactions on Networking, vol.

11, pp. 747-760, 2003.

[21] A. Watson and M. A. Sasse, "Measuring perceived quality of speech and video in

multimedia conferencing applications," presented at The Sixth ACM International

Conference on Multimedia, Bristol, United Kingdom, 1998.

[22] A. Watson, "Assessing the Quality of Audio and Video Components in Desktop

Multimedia Conferencing," in Department of Computer Science. London, UK: University

of London, 2001.

[23] S. Wolf and M. Pinson, "Video Quality Measurement Techniques," U.S. Department

of Commerce, National Telecommunications and Information Administration, NTIA

Report 02-392, June 2002.

[24] D. A. Rosenthal, "Analyses of selected variables effecting video streamed over IP,"

International Journal of Network Management, vol. 14, pp. 193-211, 2004.

[25] A. P. Markopoulou, "Assessing the Quality of Multimedia Communications Over

Internet Backbone Networks," in Department of Electrical Engineering. Stanford, CA:

Stanford University, 2002.

[26] R. H. Eikelboom, K. Yogesan, C. J. Barry, I. J. Constable, L. Jitskaia, P. H. House, and

M. L. Tay-Kearney, "Methods and limits of digital image compression of retinal images

for telemedicine," Investigative Ophthalmology and Visual Science, vol. 41, pp. 1916-24,

2000.

[27] P. C. Cosman, R. M. Gray, and R. A. Olshen, "Evaluating quality of compressed medical

images: SNR, subjective rating, and diagnostic accuracy," Proceedings of the IEEE, vol.

82, pp. 919-932, 1994.

[28] A. Przelaskowski, "Vector quality measure of lossy compressed medical images,"

Computers in Biology and Medicine, vol. 34, pp. 193-207, 2004.

[29] P. Dev, D. Harris, D. Gutierrez, A. Shah, and S. Senger, "End-to-End Performance

Measurement of Internet Based Medical Applications," presented at American Medical

Informatics Association (AMIA) Symposium, 2002.

[30] J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, R. Sparks, M.

Handley, and E. Schooler, "SIP: Session Initiation Protocol," Internet Engineering Task

Force RFC 3261, June 2002.

[31] H. Schulzrinne and J. Rosenberg, "The Session Initiation Protocol: Internet-centric

signaling," IEEE Communications Magazine, vol. 38, pp. 134 - 141, 2000.

[32] B. Tulu, S. Chatterjee, T. Abhichandani, and H. Li, "Secured video conferencing desktop

client for telemedicine," presented at 5th International Workshop on Enterprise

Networking and Computing in Healthcare Industry (Healthcom), Santa Monica, CA,

2003.

[33] K. Arabshian and H. Schulzrinne, "A SIP-based medical event monitoring system,"

presented at 5th International Workshop on Enterprise Networking and Computing in

Healthcare Industry (Healthcom), Santa Monica, CA, 2003.

[34] M. J. Field, Telemedicine: A Guide to Assessing Telecommunications in Health Care.

Washington, D.C.: National Academy Press, 1996.

[35] C. LeRouge, M. J. Garfield, and A. R. Hevner, "Quality attributes in Telemedicine Video

Conferencing," presented at 35th Hawaii International Conference on Systems Sciences,

Hawaii, USA, 2002.

[36] D. R. Kaufman, V. L. Patel, C. Hilliman, P. C. Morin, J. Pevzner, R. S. Weinstock, R.

Goland, S. Shea, and J. Starren, "Usability in the real world: assessing medical

information technologies in patients’ homes," Journal of Biomedical Informatics, vol. 36,

pp. 45-60, 2003.

[37] E. Coiera, Guide to Medical Informatics, The Internet and Telemedicine, First ed.

London, UK: Chapman & Hall, 1997.

[38] R. L. Glueckauf, J. D. Whitton, and D. W. Nickelson, "Telehealth: The New Frontier in

Rehabilitation and Health Care," in Assistive Technology: Matching Device and

Consumer for Successful Rehabilitation, M. J. Scherer, Ed., 1st ed. Washington D.C.:

American Psychological Association, 2002.

[39] T. L. Huston and J. L. Huston, "Is Telemedicine a Practical Reality?," Communications

of the ACM, vol. 43, pp. 91-95, 2000.

[40] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: A Transport Protocol

for Real-Time Applications," Internet Engineering Task Force (IETF) RFC 3550, July

2003.

[41] M. Hassan, A. Nayandoro, and M. Atiquzzaman, "Internet telephony: services, technical

challenges, and products," IEEE Communications Magazine, vol. 38, pp. 96 - 103, 2000.

[42] C. Demichelis and P. Chimento, "IP Packet Delay Variation Metric for IP Performance

Metrics (IPPM)," Internet Engineering Task Force (IETF), RFC 3393, November 2002.

[43] J.-W. Suh and Y.-S. Ho, "Error Concealment Techniques for Digital TV," IEEE

Transactions on Broadcasting, vol. 48, pp. 299-306, 2002.

[44] B. Goode, "Voice Over Internet Protocol (VoIP)," Proceedings of the IEEE, vol. 90, pp.

1495-1517, 2002.

[45] S. Shenker and J. Wroclawski, "General Characterization Parameters for Integrated

Service Network Elements," Internet Engineering Task Force (IETF), RFC 2215,

September 1997.

[46] S. Blake, D. Black, M. Carlson, E. Davies, Z. Wang, and W. Weiss, "An Architecture for

Differentiated Services," Internet Engineering Task Force (IETF), RFC 2475, December

1998.

[47] R. Braden, L. Zhang, S. Berson, S. Herzog, and S. Jamin, "Resource Reservation

Protocol (RSVP) - Version 1 Functional Specification," Internet Engineering Task Force

(IETF), RFC 2205, September 1997.

[48] S. Shenker, C. Partridge, and R. Guerin, "Specification of Guaranteed Quality of

Service," Internet Engineering Task Force (IETF), RFC 2212, September 1997.

[49] J. Wroclawski, "Specification of the Controlled-Load Network Element Service," Internet

Engineering Task Force (IETF), RFC 2211, September 1997.

[50] K. Nichols, S. Blake, F. Baker, and D. Black, "Definition of the Differentiated Services

Field (DS Field) in the IPv4 and IPv6 Headers," Internet Engineering Task Force (IETF),

RFC 2474, December 1998.

[51] E. Rosen, A. Viswanathan, and R. Callon, "Multiprotocol Label Switching Architecture,"

Internet Engineering Task Force (IETF), RFC 3031, January 2001.

[52] G. J. Armitage, "Revisiting IP QoS: why do we care, what have we learned? ACM

SIGCOMM 2003 RIPQOS workshop report," ACM SIGCOMM Computer

Communication Review, vol. 33, pp. 81 - 88, 2003.

[53] L. Cottrell, "Network Monitoring Tools,"

http://www.slac.stanford.edu/xorg/nmtf/nmtf-tools.html#public, accessed on: November 4, 2004.

[54] B. Tulu, T. Abhichandani, S. Chatterjee, and H. Li, "Design and Development of a SIP-

based Video-Conferencing Application," presented at IEEE 6th International Conference

on High Speed Networks and Multimedia Communications, Estoril, Portugal, 2003.

[55] C. Robson, Real World Research, Second ed. Malden, Massachusetts: Blackwell

Publishers Inc., 2002.

[56] A. R. Hevner, S. T. March, J. Park, and S. Ram, "Design Science in Information Systems

Research," MIS Quarterly, vol. 28, pp. 75-105, 2004.
