Quality Planning for Distributed

Collaborative Multimedia Applications

PhD Thesis

Mark Claypool

Advisor: John Riedl

University of Minnesota

Computer Science Department

September

Acknowledgements

There are many people who have helped me in this work, either directly by adding their work to mine or indirectly by giving me vision and support. However, above all, there are two people who deserve my deepest gratitude: Professor John Riedl and my wife, Kajal. As my advisor, John has contributed to all levels of this thesis, from guidance on writing C code and English grammar rules, to experimental design and data analysis, to a rich, exciting vision of science research. Most importantly, he has provided motivation to persevere through the slow, difficult parts of research, motivation to pursue research problems that make a difference in the world, and motivation to pursue a career as a professor myself. I hope to emulate much of John's style when I advise my future graduate students.

Equally important, Kajal has been a pillar of strength and support. She has been understanding of the fluctuations in the demands on the time of a graduate student. She has been patient through the many years she worked to support us while I was still in school working on this thesis. She has been inspirational when I was down about progress or down about some experiment results. And her technical knowledge has provided me with nudges in the right direction when my research lacked focus. I hope I can be as supportive of her during her PhD quest as she was during mine.

I would like to thank the members of my examining committee: Professors John Carlis, Shashi Shekhar, David Lilja, and George Wilcox. Professors Shekhar and Lilja were also on my Master's committee; that work was later molded into this thesis. Professors Carlis and Wilcox collaborated with me on the Flying interface work, which later became a chapter of this thesis. Professor Carlis provided valuable insights during the very last phase of this document, including writing the abstract and making my final defense a successful talk.

I also must thank those who long ago gave me the tools to achieve a PhD. Professors Steven Janke, David Roeder, and Michael Siddoway, professors at Colorado College, taught me the elegance of academic work and gave me insights into computer science. My high school teacher, Herr Berndt, gave me an appreciation for the richness of life, a vision which helped me see the beauty in computer science as well. Above all, my parents taught me how to set my sights high and how to succeed, as well as providing me with my first computer, a TRS-80 Model III. Little did any of us know it at the time, but that computer became the reason I am writing this document today.

I would like to thank several additional people for their direct contributions to my work. Dan Frankowski, Mike Maley, Mike Stein, Vahid Mashayekhi, and Joe Habermann all worked with me closely during several smaller research projects that led to this thesis.

Without the contributions of the above people, this work would never have been completed, nor even begun.

Vita

Mark Claypool was born in Washington, D.C. He attended primary and secondary school in Colorado and finished his secondary education at Nuremberg American High School in Nuremberg, Germany. He completed his Bachelor of Arts in Mathematics at Colorado College in Colorado Springs, Colorado. He entered the Computer Science program at the University of Minnesota, where he received his M.S. and Ph.D. in Computer Science.

Abstract

The tremendous power and low price of today's computer systems have created the opportunity for exciting applications rich with graphics, audio, and video. These new applications promise to support and even enhance the work we do in teams by allowing users to collaborate across both time and space. Despite their exciting potential, building distributed collaborative multimedia applications is very difficult, and predicting their performance can be even more difficult. Performance predictions must deal with future hardware, future software, and future users and their requirements. The rate of change in new technologies continues to accelerate, with each new generation of hardware and software introducing new capabilities. Furthermore, distributed environments have a wider variety of potential configurations than do centralized systems.

To determine the computer resources needed to meet distributed application demands, we have developed a flexible model and a method for applying it that allow us to predict multimedia application performance from the user's perspective. Our model takes into account the components fundamental to multimedia applications: latency, jitter, and data loss. The contributions of this thesis to both computer scientists and computer system developers are: (1) a multimedia application quality model and method for predicting application performance and evaluating system design tradeoffs; (2) detailed performance predictions for three distributed collaborative multimedia applications, namely an audio conference, a flying interface to a 3-d scientific database, and a collaborative flight simulator; (3) the effects of system improvements on these multimedia applications; (4) a demonstration of measures of jitter that have been used by jitter researchers, showing that all reasonable choices of jitter metrics are statistically similar; and (5) experiment-based studies of the effects of high-performance processors, real-time priorities, and high-speed networks on jitter.

In predicting the bottlenecks in quality for the above three applications, we have identified three general traits: (1) processors are the bottleneck in performance for many multimedia applications; (2) networks with more bandwidth often do not increase the quality of multimedia applications; and (3) performance for many multimedia applications can be improved greatly by shifting capacity demand from computer system components that are heavily loaded to those that are more lightly loaded.

Contents

Introduction
  Motivation
  Problem
  Our Solution
  Applications

A Taxonomy of Computer Support for Multimedia Quality
  Overview
  Related Work
    Taxonomy
    Quality of Service
  Quality
  Computer Support for Multimedia
    Capacity
    Scheduling
    Reservation
  Summary

Jitter
  Overview
  Hypotheses
  Related Work
    Teleconferencing Systems
    Delay Buffering
    Real-time Performance
  Shared Experimental Design
  Processor Experiments
    Specific Experimental Design
    Jitter versus Processor Load
    Jitter versus Processor Power
    Section Summary
  Real-time Priority Experiments
    Specific Experimental Design
    Results and Analysis
    Section Summary
  Network Experiments
    Specific Experimental Design
    Results and Analysis
    Section Summary
  Summary

Application Method
  Overview
  Related Work
    Benchmarks
    Capacity Planning
  Study Application
  Model
  Micro Experiments
  Macro Experiments
  Predictions
  Summary

Audioconferences
  Overview
  Related Work
  Audioconference Experiments
    Measuring Processor Load
    Analyzing UDP
    Using Silence Deletion
    Digital Audio Compression
  Model
  Micro Experiments
    Design
    Data Collection
    Results
  Macro Experiments
    Design
    Results
  Predictions
    Increasing Participants
    Increasing Silence
    Potential Audioconference Improvements
  Quality
  Summary

Flying through the Zoomable Brain Database
  Overview
  Related Work
    Scientific Visualization
    Neuroscience
    Compression
    Network Performance
    Disk Performance
  Model
  Micro Experiments
  Predictions
    Networks
    Compression
    Disks
    Processors
  Quality
  Summary

The Virtual Cockpit
  Overview
  Related Work
    Distributed Interactive Simulation
    Geographic Information Systems
  Model
  Micro Experiments
  Macro Experiments
  Predictions
    Networks
    Processors
  Quality
  Summary

Conclusions
  Future Work

List of Figures

Relative Bytes Required for Multimedia
Our Contributions
The Process for Computing Application Quality
Application Quality Space
The Region of Scarce Resources
Taxonomy of Multimedia Application Quality
A Jitter-Free Stream
A Stream with Jitter
The Reflection Effect
Jitter versus Packet Rate
Jitter versus Packet Rate for Byte Packets
Sun SLC Jitter versus Receiver Load
Jitter versus Receiver Load
Effects on Jitter
Jitter versus Power
Jitter versus Receiver Load
Zoom of Jitter versus Receiver Load
Interarrival Times for Different Networks and Network Loads
Jitter versus Network Bandwidth
Jitter versus Years
Quality Planning Method
Scope of Quality Planning Models
Quality Planning Model
The Counter Process
Count versus Number of Counters
Audioconference Model
Absolute Silence Deletion Algorithm
Ham Silence Deletion Algorithm
Exponential Silence Deletion Algorithm
Differential Silence Deletion Algorithm
Processor Time for Deletion Algorithms
Sun IPX Processor Load without Silence Deletion
IPX Processor Load with Silence Deletion
Comparison of Processor Loads
Percentage of Packets Removed
Effects of Faster Processors
Effects of a Faster Network
Effects of Multicast
Effects of Compression
Effects of DSP
Effects of DSP
Effects of Mixing
Times per Byte
Effects of Faster Audio
Effects of More Silence Deleted
Jitter Compensation
Jitter Compensation Area versus Buffer Size
Jitter versus Jitter Compensation Area
Audioconference Quality versus Processor or Network Increase
Audioconference Quality versus Users
Audioconference Quality versus Load
Flying Model
Processor Load for JPEG Compression and Decompression
Bandwidth versus Number of Simultaneous Users
Bandwidth versus Servers
Bandwidth versus Simultaneous Users
Bandwidth versus Servers
Bandwidth versus Simultaneous Users
Bandwidth versus Number of Servers
Bandwidth versus Simultaneous Users
Bandwidth versus Number of Servers
Client Application Quality
Quality versus Clients
Bandwidth Requirements for the Zoomable Brain Database
A Proposed National Data Highway Network
Bandwidth Requirements for the Zoomable Brain Database
Actual Path versus Dead Reckoning Path
Dead Reckoning Accuracy
Virtual Cockpit Model
Network Bandwidth versus Soldiers
Processor Load versus Soldiers
Virtual Cockpit Quality versus Soldiers
Virtual Cockpit Quality versus Soldiers
Virtual Cockpit Quality versus Soldiers
Virtual Cockpit Quality

List of Tables

Three Distributed Collaborative Multimedia Applications
Hierarchy of Possible Jitter Reducing Techniques
Correlation Among Jitter Measures
Workstations Used in Processor Experiments
Bare Counts
Values for Sun SLC and Sun IPX Line Fits
Predicted and Measured Loads from the Macro Experiments
Predicted versus Actual Jitter
Flying Throughput
Flying Component Loads
Hardware Flying Throughput
Processor Load Breakdown
Component Predictions for the Virtual Cockpit

Chapter 1

Introduction

Motivation

If the automobile had followed the same development cycle as the computer, a Rolls-Royce would today cost $100 and get a million miles per gallon. (Robert X. Cringely, InfoWorld)

Never before has industry seen the tremendous change that is transforming today's computer industry. There has been tremendous growth in computer performance and reliability, accompanied by an equally dramatic decrease in price. Today, a few thousand dollars will buy more performance, memory, and disk storage than a million-dollar computer bought decades ago. The performance of supercomputers and mainframes has grown substantially each year, and the performance of microprocessor-based computers has grown even faster. There has been even more explosive growth in computer networks. The number of hosts on the Internet has roughly tripled in recent years, and the World Wide Web has grown substantially faster than the Internet. The rate of the World Wide Web's growth has been, and continues to be, exponential, doubling in number of hosts in under six months.

With worldwide networks connecting a myriad of powerful computers, there is a growing need for collaborative applications. For thousands of years, people have done their best work when they can interact and work together. Only recently have they discovered the computer, and now they seek to use it to solve problems. The application of the computer to cooperative work can support and even enhance the way we work together. Computer-supported distance learning provides educational opportunities to remote students. Scientific collaboration can be enhanced when supported by computers, as can military training. Businesses can also benefit from computer-assisted collaboration. Business transactions can be more efficient through the use of electronic cash. Business costs and errors can be reduced by the use of Electronic Data Interchange applications.

Computer-supported cooperative work (CSCW) helps to overcome time-space limitations and allows people to share information from distributed locations at the same or different times. Collaborative applications can be enhanced by multimedia. People communicate best when they can draw pictures and use voice inflections and body language, rather than simply type text. Communication can more closely resemble face-to-face interaction when computers are used, through the use of record-playback, on-screen speaker identification, floor control, and subgroup communication. Multimedia can even enable collaboration environments that do not exist today, such as virtual reality.

Problem

A picture is worth a thousand words. That's why it takes a thousand times longer to load. (Anonymous)

Despite the growing need for and prevalence of distributed collaborative multimedia applications, planning for proper computer systems to support multimedia effectively is still difficult. Multimedia has greater system requirements than traditional text-based applications. The figure below gives a coarse illustration of the number of bytes required to convey one page of information in various media.

Figure: Relative Bytes Required for Multimedia. The bars indicate the number of bytes required to convey a paper-sized page of information in each type of media: roughly 2K for text, 38K for graphics, 300K for color, 720K for audio, and 7000K for video. We assume a standard 8.5 x 11 inch page of text. Text is the number of bytes required to store the page in ASCII characters. Graphics is the number of bytes required to render those characters in PostScript. Color is the number of bytes required to convey the PostScript in color. Audio is the number of bytes required to read that page aloud as sampled audio. Video is the number of bytes required to read that page of text over a small video stream.

Each step up in media requires increased computing capability. Text, graphics, and color require relatively little computing capability. Audio is feasible on most workstations currently being marketed. Video, however, requires high-end multimedia workstations. And when multimedia applications support multiple users, the system requirements increase even more.

Moreover, the different media each place different constraints on computer systems. For instance, human eyes can smooth over occasional glitches in a video stream more readily than human ears can smooth over breaks in an audio stream. Having the computer system place greater emphasis on preserving audio data than on preserving video data might be important for user satisfaction.

In addition, the constraints for each type of media can vary from application to application. For example, the acceptable delay for an audio broadcast application, such as a radio program, may be far greater than the acceptable delay for an audio conference. In an audio conference, users require low latencies so that the conversation is as lifelike as possible. However, in an audio broadcast program, the users do not interact, allowing a larger delay to go unnoticed. You could imagine a case where a user downloads an entire radio program overnight and then plays it back in the morning. In this case, a delay of over twelve hours might be quite acceptable.

Multimedia applications that must run in a distributed environment and support many users pose even more challenges. Network software components for multimedia environments have often been overshadowed by issues such as human interface design or graphics realism. As a result, there are not many good solutions to network problems that scale to many users. Even schemes that have readily apparent benefits, such as IP multicast, have been slow to be adopted. Even commercial systems are finding the limits of current technology in creating networked environments for large numbers of users. America Online provides low-bandwidth multi-user games, messaging, and chat services for millions of subscribers, yet its recent attempt to provide unlimited access time created too much demand for their modem servers to handle.

Collaborative applications have additional communication requirements and paradigms that need to be explored and supported. Cooperative processes among humans, involving issues of conflict, selective attention, self-interest, trust, and privacy, are not understood well enough, especially when computers are used to support such processes. Separating time and space introduces new problems in building work-enhancing collaborative applications, such as the continued need for face-to-face meetings and support of the roles found in normal communication environments.

Planning is the first fundamental step in developing a software system. And not only are distributed collaborative multimedia applications difficult to build, their performance is also difficult to predict. Much of the difficulty lies in predicting the future. Performance predictions must deal with future hardware, future software, and future users and their requirements. Even the users themselves often cannot predict what they want. The rate of change in new technologies continues to accelerate. Often, by the time the performance of a technology is understood, it is obsolete. Furthermore, distributed environments have a wider variety of potential configurations than do centralized systems. And each new generation of hardware and software introduces new capabilities.

Our Solution

In order to build systems that will adequately support distributed collaborative multimedia applications, it is important to predict application performance as the number of users increases and to evaluate performance and cost tradeoffs for different system designs. We have developed a solution to predict bottlenecks in the performance of such applications. The contribution of our solution is depicted in the figure below.

Our solution consists of a general-purpose method to predict the performance of multimedia applications. At the heart of our method is a flexible model, adjustable to applications with different user requirements and to systems with different system designs and hardware. Our model uses a quality metric as a means of measuring the performance of an entire computer system.

One indication of the performance of an entire computer system is the users' opinions on the multimedia quality of the applications they run. Multimedia quality is a measure of the performance of a multimedia application based on the requirements expected by the user. If the user performance requirements are met, application quality will be acceptable. If the user performance requirements are not met, application quality will be unacceptable. We have developed a quality planning model to aid in designing systems that meet users' quality requirements for multimedia applications in the future.

Figure: Our Contributions. The area on the left, labeled Multimedia, represents the space of future multimedia applications; A and B are two different applications. "Before" shows how the performance of A and B is predicted without the contribution of our method, and "After" shows how the performance of A and B is predicted with the contribution of our method. The two figures on the right represent the users of the applications. The shaded triangles represent the performance parameters (delay, jitter, and data loss) that the users care about.

Although we often think of a multimedia application as a continuous stream of data, computer systems handle multimedia in discrete events. An event may be receiving an update packet or displaying a rendered video frame on the screen. The quantity and timing of these events give us measures that affect application quality. Based on previous multimedia application research, we have identified three measures that determine quality for most distributed multimedia applications:

Latency: The time it takes information to move from the server, through the client, to the user we call latency. Latency decreases the effectiveness of applications by making them less like real-life interaction.

Data Loss: Any data less than the amount determined by the user requirements we call data loss. Data loss takes many forms, such as reduced bits of color, pixel groups, smaller images, dropped frames, and lossy compression. Data loss may be done voluntarily by either the client or the server in order to reduce load, or to reduce jitter and/or latency.

Jitter: Distributed applications usually run on non-dedicated systems. The underlying networks are often packet-switched, and the workstations are often running multiple processes. These non-dedicated systems cause variance in the latency, which we call jitter. Jitter can cause gaps in the playout of a stream, such as in an audio conference, or a choppy appearance in a video display.
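As a concrete illustration (a sketch of our own, not a definition taken from this thesis, which compares several jitter metrics in a later chapter), one common way to quantify jitter is the standard deviation of packet interarrival times:

    /* Sketch: jitter as the standard deviation of packet interarrival times.
     * The arrival[] values are hypothetical receive timestamps in seconds. */
    #include <math.h>
    #include <stdio.h>

    double jitter_stddev(const double *arrival, int n)
    {
        double sum = 0.0, sumsq = 0.0, mean, var;
        int i, m;

        if (n < 2)
            return 0.0;
        for (i = 1; i < n; i++) {            /* form interarrival times */
            double dt = arrival[i] - arrival[i - 1];
            sum += dt;
            sumsq += dt * dt;
        }
        m = n - 1;                           /* number of interarrival samples */
        mean = sum / m;
        var = sumsq / m - mean * mean;
        return var > 0.0 ? sqrt(var) : 0.0;
    }

    int main(void)
    {
        /* a 20 ms audio packet stream with some delay variance */
        double arrival[] = { 0.000, 0.021, 0.039, 0.062, 0.080, 0.101 };
        printf("jitter = %.4f seconds\n", jitter_stddev(arrival, 6));
        return 0;
    }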

The effects of latency on a user's perception of an application are well understood and well researched. Similarly, there is a clear relationship between data loss and application quality deterioration. Methods to ameliorate the effects of jitter have been explored by many researchers. The tradeoff between buffering and jitter has also been explored. We contribute to the investigation of the effects of high-performance workstations, real-time priorities, and high-speed networks on jitter. Moreover, we incorporate these jitter results, along with latency and data loss, into a general model for application quality. Lastly, we study the effects of different system configurations in our general model for application quality.

There may be additional measures that affect application quality that are application specific. For example, Distributed Interactive Simulation applications use a process of computing the location of other simulators through dead reckoning. When state update packets are dropped, the accuracy of the simulation decreases. The use of dead reckoning creates an additional quality measure specific to DIS applications. In the rest of this thesis we consider only delay, jitter, and data loss, except where noted, but our analysis could be similarly applied to such application-specific measures.

Applications

In order to determine the quality of distributed collaborative multimedia applications, we must first understand the user requirements and derive the system requirements. Some of the user requirements that affect quality and determine system requirements include the number of users, type of media, time of collaboration, and distribution of data. Users can range from several to several thousand. Media can be any combination of text, color, audio, video, and graphics. Collaboration can be at the same time (synchronous) or at different times (asynchronous). Data can reside in a central repository (client-server), or data can be communicated equally among clients (peer-to-peer). We study three emerging applications that span these characteristics for distributed collaborative multimedia applications:

Audioconferences: Audioconferences have been shown to enhance collaborative work among users on distributed workstations. Why are audioconferences becoming so important? Hearing is one of our strongest senses, so sound is one of our most powerful forms of communication. If we wish to use the flexibility and power of computers to support communication and collaboration, then they must support audio data.

Zoomable Database: Neuroscientists from diverse disciplines plan to collaborate across distances in exploring various aspects of brain structure. Their design includes a zoomable multimedia database of images of brain tissue. High-resolution magnetic resonance imaging (MRI) shows the entire brain in a single dataset. Even higher-resolution confocal microscope images are anchored to these MR images in three dimensions. The user starts a typical investigation by navigating through the MR images in a coarse 3-d model of the brain to a site of interest. The user then zooms to higher-resolution confocal images embedded in the MRI landscape. This real-time navigating and zooming is called flying. In order to be an effective collaboration tool, flying must provide high-resolution images and a high frame rate, as well as high-quality audio and video, to allow neuroscientists to communicate effectively.

Distributed Interactive Simulation: Distributed Interactive Simulation (DIS) applications are designed to enable soldiers to engage in simulated combat. The DIS protocol allows participation from soldiers at military bases across the country using current packet-switched networks, saving the time and trouble of traveling for combat training. Many DIS developers are designing simulators that use off-the-shelf, general-purpose hardware. In order for the combat to be realistic, the simulators use high-quality graphics and allow communication among the soldiers with audio and video. With their high multimedia system requirements and many users, applications such as DIS will stress all parts of a computer system.

Many text-based applications, such as email, have simple user requirements: send and receive mail, byte for byte, in a reasonable amount of time (i.e., reasonable latency and no data loss). Other text-based applications, such as file transfer, have more stringent user requirements: put and get files, byte for byte, in a short amount of time (i.e., short latency and no data loss).

Application                           Users   Media                          Time           Distribution
Audioconference                               audio                          synchronous    peer-to-peer
Zoomable Database                             color, text, video, graphics   asynchronous   client-server
Distributed Interactive Simulation            color, audio, video            synchronous    peer-to-peer

Table: Three Distributed Collaborative Multimedia Applications. We study the three applications listed in this table. Users is the number of users who will simultaneously use the application. Media is the type of media the application provides. Time is synchronous if users work at the same time and asynchronous if users work at different times. Distribution is client-server if workstations access a centralized data repository, or peer-to-peer if workstations communicate equally with other workstations.

Multimedia applications, such as a video conference, have more complicated user requirements: record and play video frames, keep the frames mostly clear, and deliver them at a fast and reasonably steady rate (i.e., low latency, low data loss, and low jitter).

The table above summarizes the characteristics of the three applications.

The rest of this thesis is organized as follows.

Section 2 introduces a taxonomy of computer support for multimedia quality. Our taxonomy describes our quality model for multimedia applications and categorizes the means with which computer systems support multimedia.

Section 3 presents an in-depth study of jitter, one of the fundamental components of our model. We present a comparison of measures of jitter that have been used by jitter researchers, an experimentally-based study of the effects of high-performance processors, real-time priorities, and high-speed networks on jitter, and predictions on the importance of application jitter reduction techniques in the future.

Section 4 describes the method we use to apply our quality planning model to distributed collaborative multimedia applications. We describe our model of the user, application, and system, and detail how our experiments are used as the basis for our model and as a means of testing the accuracy of our predictions.

Section 5 applies our method to audioconferences. We introduce audioconferences and present our hypotheses. We then apply our model and predict audioconference performance under a variety of system configurations, including faster processors, networks, and Digital Signal Processing (DSP) hardware.

Section 6 applies our method to a 3-d interface to a scientific brain database. In applying our model, we move through bottlenecks in application performance from the network, processor, and disk, and arrive at a means to configure the flying interface to fit on a proposed national network topology.

Section 7 applies our method to the Virtual Cockpit. We describe the Virtual Cockpit in detail, including a protocol designed to reduce network bandwidth. We then apply our model and predict the performance of the Virtual Cockpit under high-speed networks and high-performance processors.

A man who does not read good books has no advantage over the man who can't read them. (Mark Twain)

Each chapter has a Related Work section that discusses research that is specifically related to that chapter.

Chapter 2

A Taxonomy of Computer Support for Multimedia Quality

Overview

We present a taxonomy of multimedia application quality. Our taxonomy seeks to categorize popular multimedia research topics into three main groups. It defines and clarifies the usage of several common terms used when building multimedia systems. In addition, it provides a framework on which research in application quality can build. We may use our taxonomy as a map toward finding and improving the performance of a multimedia application. In looking to improve a computer system's support for multimedia, we can use our model to obtain a quantitative measure of application quality. We locate the current bottleneck in acceptable application quality and determine how appropriate system changes from the taxonomy categories affect the application quality.

A taxonomy represents a view of the subfields of a research area. For a rapidly developing and evolving research area such as multimedia, a taxonomy represents a snapshot of the field at one point in time. A taxonomy can change, and should change, as research in the field evolves. The view presented in any taxonomy is subjective, based on the background and experience of those developing the taxonomy.

A taxonomy can be useful for categorizing work in the field. Note that research very rarely fits neatly into exactly one category. More often it overlaps two or three topics, so taxonomies should not be considered strict or binding. However, the very process of developing a taxonomy forces a focus on the differences between the defining keywords of the field and may help clarify their usage.

This chapter is organized as follows. We first present related work in taxonomy and quality. We then introduce our model of multimedia applications, which allows us to predict user satisfaction with a computer system and application configuration. Next, we describe computer system support for multimedia and present our taxonomy. Finally, we summarize the important contributions of this chapter, including a start at defining the relationship between our quality model and our taxonomy.

Related Work

Taxonomy

Heller and Martin described a media taxonomy that serves both research and development of multimedia applications. It correlated well with previous categorizations of multimedia and helped researchers better understand the impact and value added by an individual medium in a multimedia presentation. Their media taxonomy included a table with rows specifying media types (text, graphics, sound, and motion) and columns specifying media expressions (elaboration, representation, and abstraction).

In describing a taxonomy for music, Pope made a strong case for the usefulness of taxonomies. Researchers in the field of computer music, struggling with the notion of a taxonomy, identified seven hierarchical areas related to composition. Their categories were theory, acoustics, representation, synthesis, hardware, education, and history.

We present a taxonomy for multimedia application quality. We examine how computer systems support multimedia applications in light of this taxonomy.

Quality of Service

Quality of Service (QoS) represents the set of those quantitative and qualitative characteristics of a distributed multimedia system necessary to achieve the required functionality of an application. The user should be the starting point for overall QoS considerations. Some QoS research defines user limits of tolerability for application performance. The user-defined QoS parameters are converted into system-level QoS parameters. System parameters include such notions as throughput and delay.

Kalkbrenner presented an interface to map user choices into system parameters. For example, users chose image size, resolution, and color for video, and speech of telephone or CD quality for audio. These choices are then mapped into lower-level system parameters.

Wijesekera and Srivastava defined QoS parameters for the continuity and synchronization specification of continuous media presentations. They defined metrics for average frame rate, variance in average frame rate, frame losses, and synchronization.

Another category of QoS research takes user-defined parameters and maps them into parameters for various system-level components, such as network throughput. Naylor and Kleinrock developed a model for measuring the quality of an audio conference based on the amount of dropped frames and client-side buffering.

We develop a model for application quality that covers a wider scope than past QoS research. Our model provides a view of quality from the user level. Thus, our model incorporates both the workstation and the network when determining application quality. In addition, our model is tunable to different networks and workstations, allowing it to be applied to systems of the future.

Quality

There is an old network saying: Bandwidth problems can be cured with money. Latency problems are harder because the speed of light is fixed; you can't bribe God. (David Clark, MIT)

Although there are many different measures of computer system performance, ultimately the users of the applications must be satisfied. A multimedia performance metric must account for the components fundamental to multimedia applications. As described in the previous chapter, we have identified three fundamental measures that determine quality for most distributed multimedia applications: latency, jitter, and data loss. The quality of a distributed multimedia application is a measure of the application's acceptability to the user. Our quality model allows us to compare performance for different system configurations from a user-centered approach.

Ideally, we would like there to be no latency, jitter, or data loss. Unfortunately, on a variable-delay network and a non-dedicated computer this can never be achieved. To compute the application quality, we use the above quality components in the process depicted in the figure below. The user requirements for the application define the acceptable latency, jitter, and data loss. The system determines the predicted latency, jitter, and data loss. Acceptable and projected data are fed into a quality metric for the application. The quality metric is a function, based on the acceptable components and dependent upon the projected components, that computes the application quality.

Figure: The Process for Computing Application Quality. The user defines the acceptable latency, jitter, and data loss, and the system determines the actual values. Based on the acceptable values specified in the user requirements, a quality metric computes the application quality from the actual values.

In order to quantitatively compare application quality for different system configurations, we need a reasonable quality metric. In the mathematical sense, given a space S with elements x, y, z, a metric is a real function of two variables, D(x, y), such that:

1. D(x, y) = 0 if and only if x and y are the same element;

2. D(x, y) = D(y, x) (symmetry);

3. D(x, y) ≥ 0 (non-negativity);

4. D(x, y) + D(y, z) ≥ D(x, z) (the triangle inequality, which says you cannot gain by going through an intermediate point).

A metric is sometimes called a distance between points in a space. We further define a multimedia quality metric as having several other important properties:

1. It incorporates the three fundamental multimedia quality components: latency, jitter, and data loss.

2. It treats the fundamental components equally, which seems appropriate in the absence of user studies to the contrary.

3. It produces a convex region of acceptable quality. This fits our intuition about changes in quality: the measure increases total quality with any increase in quality along one axis. There are no pockets of unacceptable quality within the acceptable quality region, nor can you move from unacceptable to acceptable by any combination of increases along the axes.

To form our quality metric, we build upon the work of Naylor and Kleinrock, who developed a model for measuring the quality of an audio conference based on the amount of dropped frames and client-side buffering. We extend this model by using each quality component as one axis, creating a multidimensional quality space. We place the best quality value for each axis at the origin and scale each axis so that the user-defined minimum acceptable values have an equal weight. An instantiation of the application lies at one point in this space. The location of the point is determined by our predictions of the amount of latency, jitter, and data loss that would occur with the given system configuration. In order to satisfy the mathematical properties above, we compute the application quality by taking the Euclidean distance from the point to the origin. All points inside the region defined by the user-defined minimums have acceptable quality, while points outside do not.
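To make the computation concrete, the sketch below (our own illustration; the thesis does not prescribe an implementation, and the acceptable limits shown are hypothetical) scales each predicted component by its user-defined acceptable limit, so the limits carry equal weight, and returns the Euclidean distance of the scaled point from the origin. Whether the point is flagged acceptable follows one reading of the region in the figure: every scaled component must lie within its limit.

    /* Sketch of the quality metric: scale each predicted component by the
     * user-defined acceptable limit and take the Euclidean distance from the
     * origin (smaller is better).  Limits are assumed to be nonzero. */
    #include <math.h>
    #include <stdio.h>

    struct quality_point {
        double latency;     /* e.g., seconds                */
        double jitter;      /* same units as latency        */
        double data_loss;   /* e.g., fraction of data lost  */
    };

    double application_quality(struct quality_point predicted,
                               struct quality_point acceptable,
                               int *is_acceptable)
    {
        double l = predicted.latency   / acceptable.latency;
        double j = predicted.jitter    / acceptable.jitter;
        double d = predicted.data_loss / acceptable.data_loss;

        *is_acceptable = (l <= 1.0 && j <= 1.0 && d <= 1.0);
        return sqrt(l * l + j * j + d * d);
    }

    int main(void)
    {
        /* hypothetical user requirements and system predictions */
        struct quality_point acceptable = { 0.200, 0.050, 0.05 };
        struct quality_point predicted  = { 0.120, 0.030, 0.02 };
        int ok;
        double q = application_quality(predicted, acceptable, &ok);

        printf("quality = %.3f (%s)\n", q, ok ? "acceptable" : "unacceptable");
        return 0;
    }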

The figure below depicts a three-dimensional quality space for multimedia applications. The user requirements determine a region of acceptable application quality, depicted by the shaded region. All points inside the shaded region have acceptable quality, while those outside the region do not. An instantiation of the application and the underlying computer system would lie at one point in this space.

Our metric attains all of the mathematical metric properties and multimedia metric properties listed above.

There can be many possible quality metrics for a given application. In fact, there may be many quality metrics that agree with a user's perception of the application. Mean opinion score (MOS) testing can be used to determine if a metric agrees with users' perception. The MOS is a five-point scale, where a MOS of 5 indicates perfect quality and a score of 4 or more represents high quality. MOS has been used extensively in determining the acceptability of coded speech. MOS testing is beyond the scope of this thesis, so we cannot be certain our quality metric fits user perceptions. However, the rest of our model is independent of the quality metric chosen. If new metrics are developed and validated with MOS testing, they can be used in place of our quality metric.

Figure: Application Quality Space. The three axes are latency, jitter, and data loss, bounded by the limits of acceptable latency, acceptable jitter, and acceptable data loss. The user defines the acceptable latency, jitter, and data loss; these values determine a region of acceptable application quality, depicted by the shaded region. All points inside the shaded region have acceptable quality, while those outside the region do not. An instantiation of the application and the underlying computer system would lie at one point in this space.

One limitation of multimedia quality metrics is that, after scaling, the upper limits on the axes have different characteristics. The data loss axis has a finite upper limit of 100%, while the latency and jitter axes each have an infinite bound. Comparing application quality for two different configurations at the upper limit of any of the axes may not match user perception. Fortunately, this limitation only arises when comparing two unacceptable configurations. The metric is most valuable for determining whether a configuration provides acceptable or unacceptable application quality, and for comparing configurations within the acceptable region.

Computer Support for Multimedia

From the time that there was more than one computer, there has been a range of computer processing capabilities. As the number of low- and high-end computers continues to grow, the size of this range continues to expand. Just as the capacity of computer systems has grown, so has the number of applications. Today, as new computer systems emerge, new applications are being developed and old applications are being enhanced. The growth of computers has resulted in a large space of possible system capacities, increasingly being filled by developing and emerging applications.

The ability of today's computer systems to support a given multimedia application falls into one of three categories, represented in the figure below. The computer system can have abundant enough resources to always provide acceptable application performance; for example, today most computer workstations provide a graphical user interface. Or the computer system can have insufficient resources to ever provide acceptable application quality; for example, a typical workstation and network will not have enough resources to provide adequate quality for the Virtual Cockpit.

In between these first two scenarios lies the Region of Scarce Resources. Within this region, a computer system will have enough resources to adequately support the application only if the resources are carefully utilized. For example, workstations that can provide acceptable audio quality most of the time may suffer from gaps in an audio stream when the audio application is competing for resources with another application.

Yesterday, computer systems could handle text but struggled with graphics. Today, they can handle audio and video, but to do so computer resources must be carefully managed. Tomorrow, we will have GigaFLOP processors and Terabit networks, so we will not have a bottleneck for audio and video, just as today there is no bottleneck for graphics. However, there will always be a region of scarce resources. As capacity and capacity demand both increase, so does the size of the region of scarce resources, enabling more applications that will require careful capacity planning. Moreover, applications continue to expand to fill or surpass available capacity. A recent quote from Byte Magazine summarizes this situation:

Imagine a MHz processor, MB of RAM, and a GB hard drive for less than dollars, and the new types of applications that it would spawn. I personally think the new word processor releases would have enough fatware to bring that system to its knees. (Daren Coppock, Byte, December)

If Daren Coppock is correct, some multimedia applications will always lie in the region of insufficient or scarce resources, particularly as the number of collaborative users increases. Applications and system resources must be enhanced so that applications currently in the region of insufficient resources will lie in the region of scarce resources. At the same time, current and future resources must be carefully managed so that applications in the region of scarce resources will have acceptable quality. In light of this view, we have begun developing a taxonomy for the areas of computer science research that benefit the performance of multimedia applications.

Figure: The Region of Scarce Resources. The horizontal axis represents the year (1980 through 2010) and the corresponding increase in capacity; the vertical axis is the capacity demand. The regions represent areas in which there are insufficient resources to meet demand, sufficient but scarce resources to meet demand, and abundant resources to meet demand. Example applications, from abundant toward insufficient resources, include remote login, 1-person and 10-person audio, the 1-person, 100-person, and 1000-person Cockpit, and 10-person Flying.

Our taxonomy is a developing document and is meant to represent the current research areas for multimedia application performance.

In our taxonomy, we identify three categories of research aimed at improving multimedia application performance:

Capacity research seeks to increase computer system resources to meet the performance requirements of multimedia applications. Capacity research strives to move applications currently in the region of insufficient or scarce resources into the region of abundant resources.

Scheduling research seeks to use computer resources in a more effective manner for supporting multimedia applications. Scheduling research attempts to make the performance of applications inside the region of scarce resources more acceptable.

Reservation research seeks methods to reserve computer resources for more consistent multimedia application performance. Reservation research attempts to improve the performance of applications inside the regions of scarce and abundant resources.

These three categories are depicted in the figure below. The boxes inside each category represent subcategories of multimedia quality research, and the words in italics represent specific examples of research taking place within a subcategory. As our taxonomy will show, there is enormous breadth and depth to the research in improving multimedia application performance. Our work concentrates on carefully analyzing the performance improvements of a few specific areas: the capacity improvements of processors, networks, and disks, and real-time operating system priorities.

Figure: Taxonomy of Multimedia Application Quality. There are three categories depicted for improving computer system support for multimedia. Computer system capacity is the maximum available amount of output over a given time period. Scheduling is the process of controlling computer resources so that application quality is maximized. Reservation is a method of controlling resource access. The boxes in the figure show subcategories of each, such as increasing capacity, decreasing capacity demand, and shifting capacity demand under Capacity, and the words in italics name specific example systems, such as RSVP, ST-II, RTP, YARTOS, Zebra, DASH, and RETHER.

Capacity

Computer system capacity is the maximum available amount of output over a given time period. Capacity is the ultimate prerequisite for handling multimedia. Neither careful scheduling nor optimal reservation schemes will help if the computer system does not have the capacity to support the multimedia application.

The computer system component with the smallest capacity will determine the total capacity. For example, it does not matter how many Mbits/second of video your processor can send if your network can carry only a fraction of that. Demand also has an impact on capacity: a workstation and network that can process far more Mbits/second of video than the application will ever require results in excess capacity.

We consider computer systems made up of the following components. (We also consider the software that drives computer systems, such as the operating system and network protocol, as a component that affects the computer system capacity.)

Workstations: memory, processor, disk, display, bus, audio/video cards, and other specialized hardware. Capacity for the workstation is the workstation's multimedia throughput, such as frames/second or Kbits/second of audio.

Networks: network cards, cables, and routers for multi-hop networks. Capacity for the network is measured in Mbits/second.

The capacity of a computer system and application can be changed by increasing the capacity, decreasing the capacity demand from the application, or shifting the capacity demand from one computer component to another.

Increasing Capacity. Application quality may be improved by increasing the capacity of the supporting computer system, particularly when the current computer resources provide insufficient capacity for minimal services. Capacity may be increased through better hardware, more hardware, or specialized hardware.

Better Hardware: With better hardware, either the workstation can be upgraded, or the network, or both. Increasing the capacity of the workstation involves improving any of the disk, processor, memory, bus, graphics card, or monitor or display device. Increasing the capacity of the network involves improving the network card, the network cable, or the network protocol.

More Hardware: Workstations can be improved by adding more basic hardware or specialized multimedia hardware. Some ordinary hardware that may be added to increase workstation capacity includes:

Memory: Most workstations will have increased capacity with increased memory.

Processors: Some applications can greatly benefit from a multiprocessor workstation.

Replication: Duplicating servers will often increase the capacity of a computer system, especially when there are multi-user applications.

Specialized Hardware: Hardware specific to multimedia tasks will often increase system capacity. Some examples of specialized hardware include graphics boards for video frame rendering and DSP chips for specialized signal processing algorithms such as silence deletion.

Decreasing Capacity Demand. When a computer system does not have sufficient capacity to support the application, the application's capacity requirements may be reduced. For example, with unicast routing, you do one send for each of N other receivers. With multicast routing, you do only one send for all N other receivers. Multicast routing has been shown to dramatically reduce network capacity requirements and to moderately reduce processor capacity requirements.
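For instance, as a purely hypothetical illustration, a conference participant sending a 64 Kbit/second audio stream to nine other participants must transmit nine separate copies (576 Kbit/second) with unicast, but only a single 64 Kbit/second stream with multicast, cutting the sender's network demand by a factor of nine.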

Alternatively, capacity demand may be decreased through voluntary data loss. As mentioned earlier, data loss takes many forms, such as decreased frame rate, reduced bits of color, jumbo pixels, smaller images, dropped frames, and lossy compression.

Shifting Capacity Demand. When user requirements cannot be completely satisfied, there is often a tradeoff in system choices. If one part of a computer system has insufficient capacity while another part has surplus capacity, it may be possible to shift some of the application capacity requirements from the component with scarce resources to the component with surplus resources.

For example, compression can reduce the size of the data packets that are sent from server to client, decreasing the network capacity requirements. However, the compression and decompression increase the workstation capacity requirements.

Capacity requirements can often be shifted between the client and the server. Many applications have data that needs processing before it is displayed to the user. This processing can be done at either the client or the server. For example, in a Geographic Information Systems application, the frame computation can take place by two different methods: in remote image processing, the server does the image computation and transfers only a 2-D frame to the client; in local image processing, the server transfers the 3-D data and the client does the image computation. In remote image processing, the server capacity requirements are high but the network capacity requirements are low. In local image processing, the processing capacity demand has shifted from the server to the client and the network.
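As a hypothetical illustration of the tradeoff, shipping a single rendered 640 x 480 frame at one byte per pixel costs the network roughly 300 Kbytes no matter how complex the scene is, while shipping the underlying 3-D data can cost far more up front but lets the client compute later views locally with no further network traffic.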

Another example that shifts capacity demand is dead reckoning. Dead reckoning is used in Distributed Interactive Simulation (DIS). DIS is a virtual environment being designed to allow networked simulators to interact through simulation, using compliant architectures, modeling protocols, standards, and databases. DIS simulators use dead reckoning algorithms to compute position. Dead reckoning shifts the capacity demand from the network to the simulators. See the Virtual Cockpit chapter for more information.
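To make the shift concrete, the sketch below (our own, not code from the DIS standard) extrapolates a remote entity's position from its last reported position and velocity, and has the sender issue a new state update only when the extrapolation error exceeds a threshold, trading simulator processing for network bandwidth:

    /* Dead-reckoning sketch: receivers extrapolate from the last state
     * update; the sender transmits a new update only when the true position
     * drifts more than 'threshold' from the dead-reckoned position. */
    #include <math.h>

    struct entity_state {
        double x, y;     /* last reported position  */
        double vx, vy;   /* last reported velocity  */
        double t;        /* time of the last report */
    };

    /* Position that other simulators will assume at time 'now'. */
    void dead_reckon(const struct entity_state *s, double now,
                     double *x, double *y)
    {
        double dt = now - s->t;
        *x = s->x + s->vx * dt;
        *y = s->y + s->vy * dt;
    }

    /* Sender-side check: is a new state update needed? */
    int needs_update(const struct entity_state *s, double now,
                     double true_x, double true_y, double threshold)
    {
        double dr_x, dr_y;
        dead_reckon(s, now, &dr_x, &dr_y);
        return hypot(true_x - dr_x, true_y - dr_y) > threshold;
    }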

Scheduling

Let's do smart things with stupid technology today, rather than wait and do stupid things with smart technology tomorrow. (William Buxton, University of Toronto)

Scheduling is the process of controlling computer resources so that application quality is maximized. Traditional scheduling techniques for workstations and networks are Round-Robin or First-Come-First-Served, or variants of them. While fair, these algorithms generally are not careful in ensuring that the users' quality requirements are met. Scheduling that favors multimedia data may be done in several ways.

Real-Time Priorities. Multimedia applications incorporating audio and video often have stringent delay requirements. Real-time priorities seek to reduce delay by giving multimedia processes and network data precedence over traditional processes or data.

Solaris: Some modern versions of Unix have support for real-time scheduling. In particular, SunOS Solaris allows for fixed-priority processes and has a fully preemptive kernel. These features produce dispatch latencies on the order of a few milliseconds.

YARTOS: Jeffay and Stone have built an experimental real-time operating system that provides guaranteed response times to tasks. It is used as a vehicle for research in the design, analysis, and implementation of real-time applications.

RETHER: Venkatramani and Chiueh propose and evaluate a software-based protocol called RETHER (Real-time ETHERnet) that provides real-time performance guarantees to multimedia applications without modifying existing Ethernet hardware. RETHER features a hybrid mode of operation to reduce the performance impact on non-real-time network packets.

Real-Time Network: Verma, Zhang, and Ferrari describe a scheme for setting up a connection on a packet-switched network with guaranteed performance parameters. Such parameters include a bound on the minimum bandwidth, maximum packet delay, and maximum packet loss rate. Their scheme is theoretically capable of providing a significant reduction in jitter.

RTP: Schulzrinne, Casner, Frederick, and Jacobson propose RTP, the real-time transport protocol, as an Internet standard. RTP provides end-to-end network transport functions suitable for applications transmitting real-time data, such as audio, video, or simulation data, over multicast or unicast network services. The data transport is augmented by a control protocol (RTCP) to allow monitoring of the data delivery in a manner scalable to large multicast networks and to provide minimal control and identification functionality. RTP and RTCP are designed to be independent of the underlying transport and network layers. The protocol supports the use of RTP-level translators and mixers.

Playout Policies. Ideally, we would like to play out multimedia frames continuously, with each successive frame right after the previous one. However, this smooth playout is often not possible unless strict policies that control the playout are enforced. There are several categories of such policies.

Buffering Policies: To support continuous playout, we must compensate for jitter by having a delay buffer at the receiver end. Buffering can compensate for jitter at the expense of latency. Transmitted frames are buffered in memory by the receiver for a period of time. Then the receiver plays out each frame with a constant latency, achieving a steady stream. If the buffer is made sufficiently large that it can hold all arriving data for a period of time as long as the tardiest frame, then the user receives a complete, steady stream. However, it is not clear that the primary goal should be playout with no gaps. In most multimedia applications, gaps from delay variance, or even dropping frames occasionally, are tolerable. In these cases, playout with low latency and some gaps is preferable to playout with high latency and no gaps. The added latency from buffering can be disturbing, so minimizing the amount of delay compensation is desirable.

Another buering technique to comp ensate for jitter is to discard any late frame

at the exp ense of data loss Discarding frames causes a temp oral gap in the

playout of the stream Discarding frames can keep playout latency low and

constant but as little as gaps in the playout stream can also b e disturbing

In the case of audio sp eech the listener would exp erience annoying gaps

somewhat similar sounding to static during this p erio d In the case of video

the viewer would see the frozen image of the most recently delivered frame

Naylor and Kleinro ck describ e two p olicies that make use of these buering

techniques the EPolicy for Expanded time and the IPolicy for late data

Ignored Under the Ep olicy frames are never dropp ed Under the Ip olicy

frames later than a given amount are dropp ed Since it has b een observed that

using a strict EPolicy tends to cause the playout latency to grow excessively

and that dropping frames o ccasionally is tolerable we use the IPolicy

as a means of examining needed jitter comp ensation for a multimedia stream
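To make the I-Policy concrete, the following is a minimal sketch of an I-Policy playout loop (not the implementation used in this thesis): a frame that arrives more than a chosen threshold after its scheduled playout time is dropped rather than allowed to delay the stream. The inter-frame period, lateness threshold, and arrival times are illustrative assumptions.

    #include <stdio.h>

    #define PERIOD_US 160000L   /* nominal inter-frame spacing (assumed value) */
    #define LATE_US    40000L   /* I-Policy lateness threshold (assumed value) */

    /* Play out frames under the I-Policy: a frame that arrives more than
     * LATE_US after its scheduled playout time is ignored, so playout
     * latency stays low and constant at the cost of an occasional gap.  */
    static void ipolicy_playout(const long arrival_us[], int n, long start_us)
    {
        for (int i = 0; i < n; i++) {
            long scheduled = start_us + (long)i * PERIOD_US;
            long lateness  = arrival_us[i] - scheduled;

            if (lateness > LATE_US)
                printf("frame %d: dropped, %ld us late\n", i, lateness);
            else
                printf("frame %d: played at %ld us\n", i, scheduled);
        }
    }

    int main(void)
    {
        /* Hypothetical arrival times: the third frame is late. */
        long arrivals[] = { 10000L, 170000L, 420000L, 490000L, 650000L };
        ipolicy_playout(arrivals, 5, 10000L);
        return 0;
    }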

Queue monitoring uses a policy for decreasing display latency based on observing queue lengths over time at the receiver. If the queue length becomes sufficiently long compared with previous queue lengths, we can conclude that display latency can be reduced, without causing gaps, by discarding a frame.

Frankowski and Riedl define a heuristic solution for choosing the buffer time, called History. The buffer times are based on an ordered list of past interarrival times. They find History to have some benefits over buffering methods that use the standard deviation or absolute deviation of past interarrival times for choosing the delay buffer size.

As opposed to policies of receiver-side buffering, Ferrari presents a scheme for buffering data at the network nodes between the sender and receiver. His buffering scheme is capable of providing a significant reduction in jitter with no accumulation of jitter along the path of a channel. This form of buffering significantly reduces the buffer space required in the network.

Jeffay presents a method of buffering at the side of the sender instead of the receiver. He argues that a sender-based buffering scheme may be more effective than receiver-based schemes.

Continuous Media File Systems: Continuous media file systems seek to improve multimedia application quality by decreasing either the time to access continuous multimedia data on disk or the variance in this access time. Continuous media file systems are most used to improve the quality of asynchronous multimedia applications, as synchronous multimedia applications generally do not rely on multimedia data on a disk. There are two ways multimedia disk access may be improved:

Scheduling: Under scheduling, accesses to the disk occur at the time when the multimedia data is needed.

Layout: One approach to providing continuous media is to organize disk layout to allow efficient access to continuous data. For example, Zebra is a network file system that stripes file data across multiple servers for increased file throughput. Rather than striping each file separately, Zebra forms all the new data from each client into a single stream, which it then stripes. This provides high performance for reads and writes of large files, and also for writes of small files. Zebra also writes parity information in each stripe, in the style of RAID disk arrays; this increases storage costs slightly but allows the system to continue operation even while a single storage server is unavailable. A prototype implementation of Zebra, built in the Sprite operating system, provides several times the throughput of the standard Sprite file system or NFS.

Synchronization: Many multimedia applications have synchronized renditions of audio and video. The granularity of synchronization may differ from application to application. For example, one application may require tight lip synchronization between a speaker and his voice, while another application may only require spoken words displayed with a changing background. There are two kinds of synchronization:

Interleaved: With interleaved synchronization, the audio data is interleaved with the video data with which it is to be played. The client, upon receiving the data, separates the audio and video data and plays it without performing any manual synchronizing. An example of interleaved synchronization is the standard for video from the Motion Picture Experts Group (MPEG).

Timed: Synchronization of multimedia streams can be achieved by the use of timestamps. The sender timestamps each sample and the receiver uses the timestamps to restore the inter-sample spacing. An example is vat, a voice conferencing system developed by Van Jacobson, in experimental use on the Internet.
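The idea of timed synchronization can be sketched in a few lines of code. The example below (a sketch, not vat's actual code) maps each sender timestamp to a local playout time using a fixed offset, which restores the original inter-sample spacing regardless of how irregularly the packets arrived; the timestamps, arrival times, and offset are hypothetical values.

    #include <stdio.h>

    /* Timed synchronization: the receiver maps each sender timestamp to a
     * local playout time using a fixed offset, restoring the original
     * inter-sample spacing regardless of when packets actually arrived.  */
    static long playout_time(long sender_ts_us, long offset_us)
    {
        return sender_ts_us + offset_us;   /* offset covers transit + buffering */
    }

    int main(void)
    {
        /* Hypothetical sender timestamps (microseconds), evenly spaced,   */
        /* and the local times at which the packets happened to arrive.    */
        long ts[]      = {      0L, 160000L, 320000L, 480000L };
        long arrived[] = {  90000L, 255000L, 530000L, 560000L };
        long offset    = 200000L;          /* assumed fixed playout offset */

        for (int i = 0; i < 4; i++) {
            long play = playout_time(ts[i], offset);
            printf("sample %d: arrived %ld us, played %ld us%s\n",
                   i, arrived[i], play,
                   arrived[i] > play ? " (late: gap in playout)" : "");
        }
        return 0;
    }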

Reservation

Reservations recommended. - Ristorante Luci, St. Paul, MN

The best scheduling method is useless if the requirements imposed on a resource exceed the system capacity. Reservation is a method of controlling resource access. Setting aside resource capacity implies some form of reservation. We have identified two categories of reservation schemes.

Immediate: Most reservation services for real-time communication assume that real-time channels are requested for an indefinite duration. Clients are not asked how long such channels will be alive; the channels are assumed to exist for an indefinite duration. This approach is similar to the model of a telephone call, where you do not have to specify how long you are going to talk.

Anderson, Herrtwich, and Schaefer describe the Session Reservation Protocol (SRP), which allows communicating peer entities to reserve the resources, such as CPU and network bandwidth, necessary to achieve given delay and throughput requirements. The immediate goal of SRP is to support continuous media (digital audio and video) in IP-based distributed systems. SRP is based on a workload and scheduling model called the DASH resource model. This model defines a parameterization of client workload, an abstract interface for hardware resources, and an end-to-end algorithm for negotiated resource reservation based on cost minimization.

Yau and Lam present a framework for scheduling multimedia applications that allows processes to reserve CPU time to achieve guarantees. It provides firewall protection between processes, such that the progress guarantee to a process is independent of how other processes actually make scheduling requests.

The Experimental Internet Stream Protocol (ST-II) is a network layer protocol providing multicast services. It provides facilities to negotiate and reserve resources such as packet size and data rate.

Advanced: Some important multimedia applications require that advance reservations be possible. For example, users wanting a multi-person video conference might desire to reserve resources in advance so that a meeting can be held at the appointed time. Advanced reservation involves setting aside resources in advance such that at a specified future time the resources will be available. Advanced reservations have a start time and a duration. The crucial difference between immediate reservations and advanced reservations is in the duration.

The coarseness of the time granule for reservation slots has a large impact on the amount of processing time needed to determine the acceptance of an advanced reservation. Coarse time granules will require less processing time. Limiting the duration of advanced reservation requirements will allow more sharing of resources.

Ferrari, Gupta, and Ventre describe the Tenet Real-Time Protocol Suite, a suite being developed for multiparty communication which will offer advance reservation capabilities to its clients.

Summary

This section presents a taxonomy of computer support for multimedia quality. Our taxonomy may be useful for categorizing related research in multimedia applications, and it can provide a framework on which research in application quality can build.

At the heart of our taxonomy is our model for determining distributed collaborative multimedia application quality. Our quality model can be used to compare the effects of different system configurations on application quality. Using our model we can find the bottlenecks in application quality and determine which categories under our taxonomy improve quality the most. In addition, we can use our model to evaluate the potential performance benefits from expensive high-performance processors and high-speed networks before installing them. We can even investigate possible performance benefits from networks and processors that have not yet been built. In order to improve multimedia quality, system configurations can be altered, affecting one of the three categories in our taxonomy: capacity, scheduling, and reservation.

After determining application quality, we may seek to improve it by making a change in the underlying computer system. The change is reflected by tuning the system configuration part of our model. The potential benefit from the system change can improve latency, jitter, and/or data loss. We have begun to identify how system changes from each of the taxonomy categories affect latency, jitter, or data loss.

Since capacity improvements to computer systems will continue independent of other research in multimedia, it is important to understand what impacts larger capacities will have on multimedia application performance. In the rest of this thesis we concentrate on the effects that changes in capacity have on multimedia application performance. Capacity improvements usually improve multimedia quality by reducing latency, jitter, and data loss. As shown in Section , capacity improvements reduce latency because faster hardware can respond faster. As shown in Section , capacity improvements reduce jitter because faster hardware also reduces variance in response time. And as described in Section , larger capacities reduce the amount of data loss from system saturation.

In order to support current and future multimedia application requirements, it will always be crucial to use available resources in an effective manner. In Section we take an in-depth look at how real-time scheduling affects jitter, a fundamental component of multimedia application quality. We demonstrate that some methods of scheduling improve jitter more than do capacity improvements.

Reservation has the potential to improve multimedia application quality by reducing latency, jitter, and data loss. However, this thesis focuses on the effects of capacity and scheduling on multimedia application quality.

Synchronization is an interesting case. Synchronization does not improve latency: synchronized data arrives no sooner, and may even arrive later, than non-synchronized data. Synchronization does not improve jitter: synchronization does not improve the variance in the multimedia data. And synchronization does not improve data loss. Synchronization points out the need for some applications to have an additional axis to determine application quality, in addition to the three fundamental axes. We may choose to call the additional axis the accuracy axis. For example, the quality of Distributed Interactive Simulation applications includes an accuracy axis (see Section ). Previous quality research by Jeffay et al. has evaluated video conference quality based on audio and video synchronization.

A computer system change in one category might require a change in another. For example, to implement a buffering scheme (a scheduling change) you might need more memory (a capacity increase). Or, to carry out a network reservation scheme, you might lose some capacity from inefficiency, requiring an increase in network bandwidth (another capacity increase).

In Sections , , and we apply our quality model to three applications: audio conferences, a 3-d interface to a scientific database, and a distributed multi-person combat simulator. We vary system configurations in order to predict the effect of capacity and scheduling changes on application quality.

Chapter

Jitter

Overview

In a distributed multimedia application, a multimedia stream is generated at the sending workstation and sent over a network to the receiving workstation. The data is generated and sent in fixed-size quantities called frames. The end-to-end frame delay is the time between a frame's generation on the sender and the time it is processed by the receiver. Variation in this end-to-end delay we call jitter. We have observed jitter on the order of a few hundred milliseconds when sending a multimedia stream using unloaded workstations on a quiet Ethernet local area network.

How does jitter affect a multimedia stream? In the absence of jitter, the frames can be played as they are received, resulting in a smooth playout, as depicted by Figure . However, in the presence of jitter, interarrival times will vary, as depicted in Figure , where the third frame arrives late. In the case of audio (speech), the listener would experience an annoying pause during this period. In the case of video, the viewer would see the frozen image of the most recently delivered frame.

If we decrease jitter we will have a less choppy playout and better application quality. In seeking to decrease jitter we can tune the application, the operating system, or the underlying hardware. This gives us a hierarchy of possible jitter-reducing techniques.

[Figure: sender times s0 s1 s2 s3 s4; receiver times r0 r1 r2 r3 r4]
Figure : A Jitter-Free Stream. The above figure is a model of a jitter-free stream. Each si is the time at which the send process initiates the transmission of frame i. Each ri is the time at which the receiving process plays frame i.

[Figure: sender times s0 s1 s2 s3 s4; receiver times r0 r1 r2 r3 r4]
Figure : A Stream with Jitter. The above figure is a model of a stream with jitter. Each si is the time at which the send process initiates the transmission of frame i. Each ri is the time at which the receiving process plays frame i.

Table depicts this hierarchy. Application techniques for reducing jitter include application tuning and buffering. System techniques for reducing jitter include disk layout/scheduling, alternate network protocols, and operating system priorities. Hardware techniques for reducing jitter include hardware improvements such as high-performance processors, high-speed networks, and fast disks. We consider the merits of conducting research on each jitter-reducing technique.

    Level         Possible Jitter Reducing Technique
    Application   Application Tuning
                  Buffering
    System        Disk Layout/Scheduling
                  Alternate Network Protocols
                  Operating System Priorities
    Hardware      High-Performance Processors
                  High-Speed Networks
                  Fast Disks

Table : Hierarchy of Possible Jitter Reducing Techniques

Application Tuning and Buffering: Application-level techniques for reducing jitter are done in the context of compensating for the jitter produced by the underlying computer system. There has been extensive research done on application techniques. Controlled frame playout through buffering, in particular, has been well studied. With buffering, transmitted frames are held in memory by the receiver for a period of time. Then the receiver plays out each frame with a constant latency, achieving a steady stream. If the buffer is made sufficiently large, so that it can hold all arriving data for a period of time as long as the tardiest frame, then the user receives a complete, steady stream. However, the added latency of data can be disturbing. So there is a tradeoff between ameliorating the effects of jitter and minimizing the amount of latency due to buffering.

Computer systems continue to get faster, while human perceptions remain the same. Future system improvements may remove enough of the underlying jitter such that application-level jitter reduction techniques are unnecessary. How far in the future will this be? In this work we seek to answer this question by experimentally measuring the effects of system improvements on jitter. Then, based on the rate of hardware improvements, we predict when the jitter levels contributed by the underlying system will drop below human perception.

Disk Layout/Scheduling and Fast Disks: Fast disks and disk layout/scheduling strategies may be crucial for reducing jitter for some multimedia applications. In this work, none of the three multimedia applications we study (audio conferences, the scientific brain database, and the virtual cockpit) access multimedia data on disks directly, so we do not consider the effects of disk layout or scheduling on jitter further.

Alternate Network Protocols: Alternate network protocols may reduce jitter over traditional network protocols. There has been a lot of research into network protocols designed for handling multimedia data. Our work experimentally measures the effects on jitter of two high-speed network protocols: Fibre Channel and HIPPI.

Operating System Priorities: There is evidence to suggest that a significant source of jitter in the transmission of multimedia may be found in the operating system of the sending and receiving workstations. The principal deficiency in process scheduling in modern operating systems is that user-level processes are not preempted while in kernel mode. As a result, increased system activity can increase response time without bound. In addition, modern operating systems allow priority inversions to occur, whereby a low-priority process excludes high-priority processes from accessing time-critical data, thus causing the high-priority processes to miss deadlines. These deficiencies can result in latency on the order of milliseconds.

Real-time scheduling allows the operating system to accurately time events and the allocation of memory to the process, and provides for priority scheduling. This feature can reduce latency to the order of a few milliseconds. Our work experimentally compares the benefits of real-time scheduling to normal scheduling to determine if real-time scheduling does reduce jitter.

High-Performance Processors and High-Speed Networks: High-performance processors have higher throughput and a faster context switch time than typical processors, resulting in better application response time. High-speed networks deliver frames from the sender to the receiver faster than typical networks, reducing the network transmission time. Together, high-performance processors and high-speed networks will reduce application latency. The reduced latency should be accompanied by a reduction in latency variation, or jitter. We ran experiments on SGI Challenge workstation clusters and Fibre Channel and HIPPI networks, under both light and heavy load, to determine the effects that hardware improvements have on jitter.

Hypotheses

Given the above discussion, we propose the following hypotheses:

High-performance processors reduce jitter.
Real-time operating system priorities reduce jitter.
High-speed networks reduce jitter.

Processor performance approximately doubles every year. As processor performance increases, latency decreases. This decrease in latency may be accompanied by a decrease in jitter. However, in order to significantly improve application quality, any jitter reduction from high-performance processors needs to be large compared to the total jitter contributed by the network and operating system. In other words, the processor may not be the bottleneck in reducing jitter.

Since real-time operating system priorities have been shown to decrease latency, it seems natural to assume that they may decrease jitter also. If real-time priorities do prove to significantly reduce jitter, application quality may be improved without expensive hardware upgrades or time-consuming application tuning.

Under heavy loads, high-speed networks should deliver multimedia frames faster than traditional networks. By delivering frames faster, high-speed networks will reduce contention for the network, decreasing the latency of frames waiting to be delivered by the network. We would also expect this reduced contention to decrease jitter under such conditions. However, under light loads, the reduced latency from high-speed networks may not significantly improve application quality.

In order to test our hypotheses, we looked at three possible performance evaluation techniques: analytic modeling, simulation, and experimental measurement. Frankowski and Riedl found that analytically modeling jitter is very difficult, even on a quiet single-hop network. The contributions to jitter from the operating systems on both the sender and receiver, and the contribution to jitter from the network, are difficult to capture mathematically. Stein and Riedl had some success in using simulation to evaluate the effects of jitter on audio conference quality. However, according to Jain, simulations are most effective when they are based on previous measurement. Unfortunately, to the best of our knowledge, careful measurements of the contributions to jitter from high-performance processors, real-time priorities, and high-speed networks have not been made. Simulating the effects of such components on jitter may give rise to inaccurate or misleading results. We use experimental measurement to test our hypotheses about the effects of high-performance processors, priorities, and high-speed networks on jitter reduction. Future research may be able to use our performance measurements as a basis for simulations.

Although our hypotheses are simply stated, the answers to these hypotheses are not quite as simple as true or false. The rest of this chapter explains why. Section lists related work in teleconferencing systems, delay buffering, and real-time performance. Section details the experimental design components that were common to all experiments. Section explores the effects of processor performance on jitter. Section examines the effects of real-time priorities on jitter. Section looks at the effects of high-speed networks on jitter. And Section summarizes the results from this section.

Related Work

Teleconferencing Systems

Several experimental teleconferencing systems have been designed to explore teleconferencing performance issues such as jitter.

Riedl, Mashayekhi, Schnepf, Claypool, and Frankowski developed SuiteSound. SuiteSound attempted to integrate support for multimedia into the Suite programming environment. They performed experiments to determine the network and CPU load of the SuiteSound tools, including the effects of an algorithm that removes silence from digitized speech.

Hopper developed the Pandora system to investigate the potential for creating and deploying a desktop multimedia environment based on advanced digital video and audio technology. Pandora offered a tool box to those analyzing and working on information by providing a flexible communications system that allowed effective interaction between a number of users.

Jeffay, Stone, and Smith developed a transport protocol that supports real-time communication of audio/video frames across campus-area packet-switched networks. They demonstrated the effectiveness of their protocol by measuring the performance of their protocol when transmitting audio and video across congested networks.

Teleconferencing systems attempt to provide quality audio and video to groups of users. One component of teleconferencing quality is jitter. We provide experimental results on the effects that system improvements will have on reducing jitter. We also present a quality model that can be used to evaluate system configuration tradeoffs in predicting teleconferencing quality from the user's perspective.

Delay Buffering

Research in delay buffering has looked at ways to ameliorate the effects of jitter by controlling the playout and delivery of frames at either the sender or the receiver.

Ramjee, Kurose, Towsley, and Schulzrinne compared the effects of four different buffering algorithms for adaptively adjusting the playout delay of audio packets over a wide area network. They found that an adaptive algorithm which explicitly adjusts to the sharp, spike-like increases in packet delay achieved the lowest rate of lost packets.

Stone and Jeffay presented an empirical study of several buffering policies for managing the effect of jitter on the playout of audio and video in computer-based conferences. They evaluated a particular policy, called queue monitoring, by comparing it with two policies from other literature. They showed that queue monitoring performs as well as or better than the other policies over the range of observed network loads.

As opposed to the two papers above that use receiver-side buffering, Ferrari presented a scheme for buffering data at the network nodes between the sender and receiver. He studied the feasibility of bounding jitter in packet-switched wide-area networks with a general topology. He presented a buffering scheme that is capable of providing a significant reduction in jitter with no accumulation of jitter along the path of a channel, and demonstrated that jitter control significantly reduces the buffer space required in the network.

Talley and Jeffay presented a method of buffering at the side of the sender instead of the receiver. They presented a framework for transmission control that describes the current network environment as a set of sustainable bit and packet transmission-rate combinations. They empirically demonstrated the validity of adapting both packet and bit rate using simple adaptation heuristics.

Naylor and Kleinrock developed a model for measuring the quality of an audio conference based on the amount of dropped frames and client-side buffering. They used their model to investigate two adaptive receiver-side buffering schemes which may be used to achieve a smooth playout. Method E expands the buffer to preserve all incoming frames. Method I ignores all late frames in order to preserve timing.

We explore alternative means to delay buffering to reduce jitter. Specifically, we examine when hardware improvements might make delay buffering techniques unnecessary. We extend the work of Naylor and Kleinrock to develop a more general model for multimedia application quality.

Realtime Performance

Govindan and Anderson conjectured that the traditional operating system goals of fairness, maximum system throughput, and fast interactive response may conflict with the needs of real-time applications such as continuous media, and proposed a new processor scheduling algorithm. Jeffay, Stone, and Smith made similar remarks and provided experimental evidence. Jeffay and Stone went so far as to build an operating system that supports real-time multimedia. Khanna, Sebree, and Zolnowsky discussed the design, implementation, and performance of the SunOS operating system as a real-time system. Gerhan presents the new real-time enhancements to Microsoft Windows NT.

We experimentally measure the effects of Solaris real-time priorities and the Unix nice facility on jitter. We compare the effects of using operating system priorities with processes run under default priorities.

Shared Experimental Design

The fundamental principle of science, the definition almost, is this: the sole test of the validity of any idea is experiment. - Richard P. Feynman

We ran a series of experiments to test our hypotheses. This section details the design components that were common to all experiments.

Our experiments were all conducted on a single-hop LAN. In our first set of experiments we attempted to measure jitter contributions on a WAN. However, getting tight confidence intervals on WAN jitter proved extremely difficult. Perhaps in the future our LAN measurements may be used in simulations to explore the effects of a WAN on jitter.

Each experiment simulated the transmission of a multimedia stream under various conditions and measured the amount of jitter. We used pairs of user-level processes that sent and received UDP datagrams using Berkeley socket I/O. The send process used an interval timer to initiate frame delivery. The receive process took timestamps using the gettimeofday system call. These timestamps are used to measure the amount of jitter.
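As a concrete illustration, here is a minimal sketch of a receive process in the spirit of the one just described: it reads UDP datagrams from a Berkeley socket and takes a gettimeofday timestamp for each arrival, from which interarrival times are later computed. The port number, datagram size, and frame count are placeholders rather than the values used in our experiments, and error checking is omitted.

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <sys/time.h>

    #define PORT       5001    /* placeholder port                    */
    #define FRAME_SIZE 1024    /* placeholder datagram size, in bytes */
    #define NUM_FRAMES 100     /* placeholder number of frames        */

    int main(void)
    {
        char buf[FRAME_SIZE];
        struct timeval stamp[NUM_FRAMES];
        struct sockaddr_in addr;
        int sock = socket(AF_INET, SOCK_DGRAM, 0);

        memset(&addr, 0, sizeof(addr));
        addr.sin_family      = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port        = htons(PORT);
        bind(sock, (struct sockaddr *)&addr, sizeof(addr));

        /* Timestamp each datagram as it arrives; interarrival times are
         * computed afterwards so the measurement loop stays lightweight. */
        for (int i = 0; i < NUM_FRAMES; i++) {
            recv(sock, buf, sizeof(buf), 0);
            gettimeofday(&stamp[i], NULL);
        }

        for (int i = 1; i < NUM_FRAMES; i++) {
            long us = (stamp[i].tv_sec  - stamp[i-1].tv_sec)  * 1000000L
                    + (stamp[i].tv_usec - stamp[i-1].tv_usec);
            printf("%d %ld\n", i, us);   /* packet number, interarrival time */
        }
        close(sock);
        return 0;
    }

The send process, not shown, would pace its frame transmissions with an interval timer (typically via setitimer on Unix systems).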

A jitter measure must reflect the extent to which the interarrival times vary. There are several classical statistical measures of variation that have been used in jitter research.

Range: The simplest measure of variability is the maximum delay between any two consecutive frames. Generally speaking, more variability is reflected in a larger range. The range also provides a maximum delay variance for providing jitter guarantees to multimedia applications. For this reason, range has been used in some past jitter research.

Variance: The primary measure of variability determines the extent to which each frame deviates from the mean frame interarrival time. The standard way to prevent values below the mean interarrival time from negating values above the mean interarrival time is to square them. Dividing by the number of frames gives the average squared deviation. This is the standard definition of variance and is used by several jitter researchers. The formula for variance is

    Variance = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}

Standard Deviation: By taking the square root of the variance we get the same units as the frame delay. This is the standard deviation, the classical measure of spread. Several researchers have used the standard deviation of packet interarrival times as a measure of jitter. The formula for standard deviation is

    StandardDeviation = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}}

Absolute Deviation: Variance and standard deviation have proven to be very sensitive to outliers. Since human eyes and ears can smooth over an occasional glitch in an audio or video stream, we do not want our jitter measure to be especially sensitive to outliers. For this reason, some researchers have used absolute deviation as a measure of jitter, because it is less sensitive to values far from the mean. The formula for absolute deviation is

    Absdev = \sum_{i=1}^{n} |x_i - \bar{x}|
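These measures translate directly into code. The sketch below computes each of them over an array of interarrival times, following the formulas above (variance divides the summed squared deviations by n; absolute deviation is left as a sum); the sample values in main are illustrative only.

    #include <math.h>
    #include <stdio.h>

    /* Classical jitter measures over n interarrival times x[0..n-1]. */
    static double mean(const double x[], int n)
    {
        double s = 0.0;
        for (int i = 0; i < n; i++) s += x[i];
        return s / n;
    }

    static double variance(const double x[], int n)
    {
        double m = mean(x, n), s = 0.0;
        for (int i = 0; i < n; i++) s += (x[i] - m) * (x[i] - m);
        return s / n;
    }

    static double std_deviation(const double x[], int n)
    {
        return sqrt(variance(x, n));
    }

    static double abs_deviation(const double x[], int n)
    {
        double m = mean(x, n), s = 0.0;
        for (int i = 0; i < n; i++) s += fabs(x[i] - m);
        return s;
    }

    int main(void)
    {
        /* Hypothetical interarrival times, in microseconds. */
        double x[] = { 160000, 150000, 175000, 158000, 162000 };
        int n = 5;
        printf("variance = %.1f\n", variance(x, n));
        printf("stddev   = %.1f\n", std_deviation(x, n));
        printf("absdev   = %.1f\n", abs_deviation(x, n));
        return 0;
    }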

In addition, there has been jitter research that has incorporated derived measures of jitter:

Gaps: When playing out a multimedia stream, a late frame will cause a gap in the smooth playout. The total number of gaps and the number of gaps per second can provide a measure of how much jitter there is in the stream. Gaps as a measure of jitter has been used by several jitter researchers.

Jitter Compensation: In addition to determining buffer size, the jitter compensation curve can be used to measure jitter. The jitter compensation curve is a measure of the amount of delay buffering needed to achieve a smooth multimedia stream (see Section for a complete description). The more jitter in a multimedia stream, the larger the area under the jitter compensation curve. This jitter measure has been used in past jitter research as well.

[Table: pairwise correlation coefficients among the jitter measures Range, Variance, Stddev, Absdev, Gaps, and Area]

Table : Correlation Among Jitter Measures. This table depicts the correlation coefficients for different jitter measures. Range is the maximum interarrival time, Variance is the variance in interarrival times, Stddev is the standard deviation of interarrival times, Absdev is the absolute deviation of interarrival times, Gaps is the number of playout gaps per second with a fixed amount of buffering, and Area is the area under the jitter compensation curve. The table is symmetric about the diagonal.

There are many more possible measures of jitter: interquartile range, mean length of gaps, median gap length, and even second-order statistics such as the variance on the mean gap length. However, such measures are less likely to depict the amount of jitter meaningful to the multimedia application than the more direct methods we detailed above. Because of this, to the best of our knowledge, other researchers have not used them to measure jitter. We do not consider these methods further.

However, we did want to compare the equivalence of jitter measures that had been used in previous research. We recorded interarrival times for all experiments in Sections through and computed jitter values for range, variance, standard deviation, absolute deviation, gaps, and jitter compensation. We then computed Pearson's correlation coefficient for each pair of jitter measures. For the gaps jitter measure we assumed all late data is ignored and that there is a fixed-size buffer, values used by other researchers. Table gives the correlation coefficient for each jitter measure pair.
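For reference, Pearson's correlation coefficient between two jitter measures (one value per experiment run) can be computed as sketched below. This is the textbook formula, not the analysis code actually used to produce Table .

    #include <math.h>

    /* Pearson's correlation coefficient between two series a[] and b[]
     * of equal length n (e.g., two jitter measures over the same runs).
     * The result lies between -1 and 1.                                */
    static double pearson(const double a[], const double b[], int n)
    {
        double sa = 0.0, sb = 0.0;
        for (int i = 0; i < n; i++) { sa += a[i]; sb += b[i]; }
        double ma = sa / n, mb = sb / n;

        double cov = 0.0, va = 0.0, vb = 0.0;
        for (int i = 0; i < n; i++) {
            cov += (a[i] - ma) * (b[i] - mb);
            va  += (a[i] - ma) * (a[i] - ma);
            vb  += (b[i] - mb) * (b[i] - mb);
        }
        return cov / sqrt(va * vb);
    }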

Range does not correlate well with any of the other jitter measures. Range measures variation as the distance between the two most extreme values. Variation depends on more than just the extreme values: two samples can have the same range, yet the sample whose values cluster near the mean has less dispersion than the sample whose values are spread across the whole range.

Gaps does not correlate well with Absolute Deviation, and it correlates only slightly well with Area. The number of gaps per second depends upon the initial buffer chosen, so different buffer sizes might have different correlation results. The rest of the jitter measures correlate well to extremely well with each other. With the exception of range, the measurement of jitter chosen does not appear to matter. We choose variance as our measure of jitter because it is easy to understand and compute.

Audio and video have very different bandwidth needs. The Sun audio device records audio at a low, fixed byte rate, while video acquisition hardware typically generates many frames per second and compressed video frames require megabits per second of bandwidth. We wanted to see if either packet size or packet rate changed the frequency and/or magnitude of the interarrival times.

Pairs of packets tend to reflect about the mean packet arrival time. If a packet arrives late, the next packet usually arrives early by the same amount of time that the previous packet was late. When interarrival times become small, the packet following a late packet is unable to arrive an equal amount early, being unable to arrive in fewer than zero microseconds. Figure depicts packet reflection. There are two multimedia streams shown. The top stream has a mean interarrival time of 160,000 microseconds. The lower stream has a mean interarrival time of 30,000 microseconds. When the first packet arrives late in the 160,000-microsecond stream, the subsequent packet arrives an equal amount early. However, when a packet arrives late in the 30,000-microsecond stream, the subsequent packet arrives almost immediately, being unable to reflect an equal amount.

[Figure: interarrival time (microseconds) versus packet number for two streams; the 160,000-microsecond stream shows a reflection after a late packet, while the 30,000-microsecond stream cannot reflect]

Figure : The Reflection Effect. The horizontal axis is the packet number. The vertical axis is the interarrival time in microseconds. The zig-zag lines represent two multimedia streams, with the top having a mean interarrival time of 160,000 microseconds and the bottom having a mean interarrival time of 30,000 microseconds.

Mathematically, the top stream will have a larger variance, even though both streams have an equal chance of having a packet arrive late. This reflection effect is an artifact of the environment and does not accurately indicate the frequency and magnitude of late packets.

We performed three mini-experiments to test whether either packet size or packet rate changed the frequency and/or magnitude of the interarrival times. We subtract the mean from each interarrival time and drop all points that are below zero. This avoids the reflection effect and allows us to compare the frequency and magnitude of late packets. In the first experiment we sent packets of three different sizes at three different rates: an audio-rate and audio-size combination, a video-rate and video-size combination, and a mid-rate and mid-size combination. Figure depicts the results. The correlation between packet rate and variance is an extremely low -.02.

Figure : Jitter versus Packet Rate. The horizontal axis is the time between packets. The vertical axis is the variance. Each point represents an independent experiment. The correlation is -.02.
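A minimal sketch of the filtering step described above: subtract the mean from the interarrival times and keep only the positive deviations (the late packets), so that reflection artifacts do not enter the comparison. The function and array names are ours, not code from the experiments.

    /* Keep only positive deviations from the mean interarrival time,
     * i.e., the late packets; early "reflected" packets are dropped.   */
    static int late_deviations(const double x[], int n, double out[])
    {
        double m = 0.0;
        for (int i = 0; i < n; i++) m += x[i];
        m /= n;

        int k = 0;
        for (int i = 0; i < n; i++) {
            double d = x[i] - m;
            if (d > 0.0)
                out[k++] = d;   /* lateness of packet i, in the same units */
        }
        return k;               /* number of late packets kept             */
    }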

In the second experiment we sent packets, all of the same size, at four different rates: 30,000, 70,000, 110,000, and 160,000 microseconds between packets. Figure depicts the results. The horizontal axis is the time between packets. The vertical axis is the variance. The correlation between packet rate and variance is an extremely low .09.

In both experiments the correlation between packet rate and jitter was extremely low. This indicates that the amount of jitter, in terms of the number and magnitude of late packets, we would expect to see in a high-rate video stream would be the same as the jitter from a lower-rate audio stream. For all subsequent experiments we used the audio rate in order to avoid the reflection effect and to keep network and processor loads low. We can then measure the effects that increased network and processor load had on jitter. To simulate the transmission of audio we sent fixed-size datagrams at the audio rate.

Figure : Jitter versus Packet Rate for Fixed-Size Packets. The horizontal axis is the time between packets. The vertical axis is the variance. Each point represents an independent experiment. The correlation is .09.

Table : Workstations used in Processor Experiments. The table lists each workstation class (SLC, IPC, IPX, and Sparc 5) with its clock rate in MHz and its SPECint rating.

Processor Experiments

Processor performance approximately doubles every year. These high-performance processors will probably have a huge effect on improving the quality of current multimedia applications, and may enable new multimedia applications. We explore the effects of processor performance on jitter, one component in multimedia application quality.

Specific Experimental Design

We used four classes of Sun processors: SLC, IPC, IPX, and Sparc 5. Table summarizes the workstation attributes. The workstations were connected to an Ethernet. We used a process that increments a long integer variable in a tight loop (a counter process) to induce processor load.
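The counter process is simple enough to show in full. The sketch below matches the description (a tight loop incrementing a long integer); declaring the counter volatile is our addition, to keep a modern compiler from optimizing the loop away.

    /* Counter process: induce processor load by incrementing a long
     * integer in a tight loop until the process is killed.           */
    int main(void)
    {
        volatile long counter = 0;
        for (;;)
            counter++;
        return 0;   /* never reached */
    }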

Figure : Sun SLC Jitter versus Receiver Load. The horizontal axis is the processor load as reported by the Unix w command. The vertical axis is the variance in interarrival times. The middle line is the least squares line fit. The outer two lines form a confidence interval around the line.

Jitter versus Processor Load

We first test whether increasing processor load increases jitter. We ran an increasing number of counter processes on the receiver and obtained the processor load from the Unix w command at the end of each experiment run. The w command displays a summary of the current activity on the system, including the average number of jobs in the run queue over the last few minutes. Lastly, we did a least squares line fit for the load versus variance and computed the correlation coefficient. Figure shows the results of our experiments on the Sun SLC.
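For completeness, the least squares line fit used throughout these plots can be computed with the standard closed-form solution, sketched below; this is not the thesis's analysis script, just the usual formula for the slope and intercept of y on x (here, variance on load).

    /* Ordinary least squares fit of y = slope * x + intercept over n points
     * (e.g., x = processor load, y = variance in interarrival times).      */
    static void least_squares(const double x[], const double y[], int n,
                              double *slope, double *intercept)
    {
        double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
        for (int i = 0; i < n; i++) {
            sx  += x[i];
            sy  += y[i];
            sxx += x[i] * x[i];
            sxy += x[i] * y[i];
        }
        *slope     = (n * sxy - sx * sy) / (n * sxx - sx * sx);
        *intercept = (sy - *slope * sx) / n;
    }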

Figure shows the results of our experiments on all the Sun workstations listed in Table . Again, we did a least-squares line fit for the load versus variance for each class of processor. The slopes for all processors are positive, meaning that an increase in load results in an increase in jitter no matter how powerful the processor. In addition, the slopes of the least squares line fits decrease as the processors get more powerful. This means that under loaded conditions, more powerful processors will contribute less jitter to a multimedia stream than will less powerful processors.

Figure : Jitter versus Receiver Load for the SLC, IPC, IPX, and Sparc 5. The horizontal axis is the processor load as reported by the Unix w command. The vertical axis is the variance in interarrival times. The lines are the least squares line fits for each workstation class.

Jitter versus Processor Power

We observed that more powerful processors decrease jitter under high loads, but what happens under conditions of more normal load?

In Figure , the lines come to nearly the same point at a low processor load. At that load we could show no correlation between jitter reduction and processor power, so we sought to isolate our experimental environment from the outside world to better observe the effects of the processor.

Figure : Effects on Jitter. There are five multimedia streams represented in this picture (multi day, multi night, single day, single night, and single ded), each by a horizontal line depicting the interarrival times. Single indicates an experiment run in single-user mode. Multi indicates an experiment run in typical multi-user mode. Day indicates an experiment run in the middle of the day. Night indicates an experiment run at night. Ded indicates an experiment run on a dedicated network. Each multimedia stream is offset from the one below it by 200 milliseconds. The horizontal axis is the time in seconds.

Figure depicts a series of attempts to do this. The top line in the picture represents the interarrival times for an experiment run in normal multi-user processor mode during the day.

We wanted a quiet network to reduce the effects of the network on jitter. We assumed that most users do their work in the day, so there should have been less network traffic at night. We recorded interarrival times at night to see if they appeared significantly different than those recorded during the day. These times are depicted by the 2nd line from the top of Figure . The amount of jitter at night appears no better than the amount of jitter during the day. We attribute this to the academic environment in which we tested, in which some students and faculty members are likely to work in the early morning hours. In addition, many system backups are scheduled to be done at night in order to interfere less with the majority of the users during the day.

We also wanted to reduce the effects of other workstation processes on jitter. We ran an experiment in single-user mode during the day and recorded the interarrival times, depicted by the 3rd line from the top of Figure . Still, the amount of jitter in single-user mode appears only slightly better than the amount of jitter in multi-user mode. We could still show no correlation between jitter reduction and processor power.

We then tried a combination of single-user mode and nighttime. These interarrival times are depicted by the 2nd line from the bottom of Figure . The amount of jitter in this case is noticeably less than the amount of jitter from the multi-user mode experiment run during the day. However, we could still show no correlation between jitter reduction and processor power.

In our last effort to remove as much jitter as possible, we physically disconnected our workstations and subnet from the surrounding networks and ran our experiments in single-user mode. Success! These interarrival times are depicted by the bottom line in Figure . This nearly flat line represents a multimedia stream that is almost jitter-free.

Once we isolated the machines on a quiet network and ran our experiments in single-user mode, we were able to observe the effects different power processors have on jitter under light loads. Figure depicts these results. There is a strong correlation between increased processor power and decreased jitter. However, having dedicated workstations and networks is implausible in many real-world cases, and probably impossible for wide-area networks such as the Internet.

Figure : Jitter versus Power. The horizontal axis is the SPECint result of the workstation. The vertical axis is the inverse of the variance in packet interarrival times. The middle line is the least squares line fit. The outer two lines form a confidence interval around the line. The correlation coefficient is .99.

Section Summary

Jitter correlates highly with processor load. Since multimedia applications are often processor intensive, they often force processors to run at a heavy load. Moreover, in the past, applications have tended to expand to fill or surpass available system capacity, making heavy-load conditions likely in the future. These heavily loaded workstations will contribute to the jitter in multimedia applications.

Fortunately, under heavy loads, faster processors reduce jitter compared to slower processors. However, under normal load, the reduction in jitter from the faster processors is overshadowed by the variance in average processor and network loads. To fully determine the benefits of faster processors to jitter, it is imperative to determine if future multimedia applications will in fact push processor capacities to the limit. If so, future multimedia application quality will benefit from faster processors. If not, application quality should be improved by means other than processor improvement.

We can use our model from Section to determine if current and future processor capacities are being consumed by multimedia applications. In addition, as we show in Sections and , our model allows us to quantitatively determine the benefit to application quality from high-performance processors for today's and tomorrow's multimedia applications.

Realtime Priority Experiments

From the previous experiments, single-user mode and a dedicated network can nearly eliminate jitter. However, in a multi-user system a dedicated processor is seldom available, and a dedicated network is even more rare. It seems unlikely that the above results alone will help users in most cases. We turn to real-time priorities to see how they compare to the benefits of a single-user mode processor under realistic load conditions.

Specific Experimental Design

We used Sun SPARC Classics running SunOS, a Unix variant. In Unix, a process's priority determines how much CPU time the process gets. The kernel will decrease the priority of processes that have accumulated what it considers excessive CPU time. The priority of a process is not the only thing that determines when a process will be run. To determine which process should be run next, the scheduling mechanism in the kernel uses a formula that takes into account each process's priority, how much CPU time each process has gotten recently, and how long it has been since each process has run. SunOS extends Unix process scheduling with real-time support. A real-time process runs in a separate scheduling class from normal processes. In the default configuration, a runnable real-time process runs before any other process. This gives real-time processes the highest priority.

Even without real-time extensions, Unix provides the nice facility. Nice allows you to change the default priority of a process. Using nice, it is possible to improve the user-mode priority of a process, allowing it to run ahead of other eligible runnable user-mode processes. Processes using the nice facility may still suffer long dispatch latencies, however. Any process that is suspended waiting for I/O will, upon being reactivated, have a higher priority than any process that is in user mode, regardless of its nice value.

We varied the process priority: Default processes were run without enhanced priority; Niced processes were invoked using the setpriority system call with an offset to determine the final user-mode priority of the process; and Real-Time processes were given fixed priorities in the SunOS real-time scheduling class using the priocntl system call.
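The three priority settings can be illustrated in code. The sketch below shows how a process might lower its nice value with setpriority and place itself in the Solaris real-time (RT) scheduling class with priocntl. The priority values and time quantum are assumptions, and the priocntl structures shown follow the usual Solaris idiom, which may differ in detail from the SunOS release used in these experiments; treat this as a sketch rather than the experiment code.

    #include <string.h>
    #include <unistd.h>
    #include <sys/resource.h>    /* setpriority()                          */
    #include <sys/types.h>
    #include <sys/priocntl.h>    /* priocntl(), PC_GETCID, PC_SETPARMS     */
    #include <sys/rtpriocntl.h>  /* rtparms_t: real-time class parameters  */

    /* Nice priority: raise this process's user-mode priority.             */
    /* A negative offset (assumed value) requires superuser privileges.    */
    static int run_niced(int offset)
    {
        return setpriority(PRIO_PROCESS, 0, offset);
    }

    /* Real-time priority: move this process into the Solaris RT class.    */
    static int run_realtime(short rt_pri)
    {
        pcinfo_t  pcinfo;
        pcparms_t pcparms;
        rtparms_t *rtp = (rtparms_t *)pcparms.pc_clparms;

        /* Look up the class id of the "RT" scheduling class.              */
        strcpy(pcinfo.pc_clname, "RT");
        if (priocntl(0, 0, PC_GETCID, (caddr_t)&pcinfo) == -1)
            return -1;

        pcparms.pc_cid  = pcinfo.pc_cid;
        rtp->rt_pri     = rt_pri;      /* fixed priority within the class  */
        rtp->rt_tqsecs  = 0;
        rtp->rt_tqnsecs = RT_TQDEF;    /* default time quantum             */

        return priocntl(P_PID, getpid(), PC_SETPARMS, (caddr_t)&pcparms);
    }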

As in Section , we used a process that increments a long integer variable in a tight loop to induce processor load. We ran separate experiments with one through six of these counter processes.

Figure : Jitter versus Receiver Load. The horizontal axis is the processor load as reported by the Unix w command. The vertical axis is the variance in interarrival times. The lines are the least squares line fits for default priorities, niced priorities, and real-time priorities, as indicated.

Results and Analysis

Figure compares the jitter for real-time, nice, and default priorities as processor load increases. The slopes of the lines indicate how sensitive a process running under the given priority is to the effects of increases in processor load. The steeper the slope, the more jitter the process contributes as processor load increases. Both nice and real-time priorities have significantly gentler slopes than default priority, indicating that nice and real-time processes do not suffer nearly as much jitter as do default priority processes as processor load increases.

Figure is a close-up of the nice and real-time priorities in Figure . The most steeply sloped line is for default priority. The next most steeply sloped line is nice priority, and the horizontal line is real-time priority. Real-time priority has a nearly flat slope,

Figure : Zoom of Jitter versus Receiver Load. This graph is a close-up of Figure , obtained by reducing the upper bound on the vertical axis. The horizontal axis is the processor load as reported by the Unix w command. The vertical axis is the variance in interarrival times. The lines are the least squares line fits for niced priorities and real-time priorities, as indicated.

indicating that real-time processes show almost no increase in jitter as processor load increases. Nice priority processes, however, do suffer from increased jitter with increased processor load, as indicated by the steeper slope. Notice that the line for nice priority intersects the x-axis at about a load of two, as opposed to default priority, which intersects the x-axis around a load of one. This indicates that even under light loads, nice priority processes reduce jitter compared to default priority processes.

Section Summary

Real-time priorities significantly reduce jitter as processor load increases. The severity of jitter that was observed when real-time priorities were used was extremely light compared to the jitter without real-time priorities. Real-time priorities show almost no increase in jitter as processor load increases. Real-time processes have priority over other processes, allowing them to respond to multimedia data with less jitter than do nice or default priority processes. Note, however, that real-time priorities must be used carefully. A real-time process can starve other processes of CPU time, including critical operating system processes. We found this out the hard way when we ran a counter process in real-time mode. The counter process consumed all of the CPU cycles, not even allowing enough CPU time to kill the process as the superuser. We were forced to reboot our computer, which displeased our system administrator and other users, as our computer was a server. In addition, if many processes are run with real-time priorities, the benefits in jitter reduction will most certainly be reduced compared with having just one process with real-time priority. To picture this, assume that every process ran in real-time mode. All processes would compete for CPU time in the real-time queue, and scheduling would occur just as it does in the usual case. The benefits of having a real-time priority mode would be lost.

Nice priorities significantly reduce jitter as processor load increases. Under heavy loads, nice priority processes had much less jitter than did default priority processes. Even under light loads, nice priorities significantly reduced jitter versus default priorities. The improved user-mode priority of a niced process causes it to run ahead of other eligible runnable user-mode processes, allowing the niced process to respond to multimedia data with less jitter. However, under heavy processor loads, nice priority processes did suffer from some increased jitter, while jitter under real-time processes remained nearly constant. Even a nice priority process will run after any newly activated process, regardless of its nice value, causing the niced process to suffer from more jitter than would a real-time process. The newly-activated process will complete its timeslice before the CPU reschedules and activates the niced process. When there are more processes this becomes more likely, causing the niced process to suffer from more jitter.

The amount of jitter reduction from nice or real-time priorities depends upon the processor load. We can use our model in Section to determine the processor load from multimedia applications. As we show in Sections and , with our model we can also explore how the reduction in jitter from nice and real-time priorities benefits the application quality as future multimedia applications push processors to the limit.

Network Experiments

From the experiments in Section and Section , we know that high-performance processors and real-time priorities affect jitter. We next examine the effects that high-speed networks have on jitter.

Specific Experimental Design

We used multiprocessor SGI Challenge L workstations. Each workstation was connected to three networks: an Ethernet, a Fibre Channel network, and a HIPPI network.

We induced network load using netperf. Netperf is a benchmark that can be used to measure the performance of many different types of networks. It provides tests for both unidirectional throughput and end-to-end latency.

Both HIPPI and Fibre Channel are point-to-point networks, unlike Ethernet, which is a shared-medium network. Inducing network load between two workstations that were not the sender and receiver did not increase jitter, because they did not share the same physical wire. Instead, we ran the netperf and netserver processes on the same workstations as the sender and receiver, respectively. We ran the netperf processes on different processors than the sender and receiver by using the runon command. This minimized the jitter that might be contributed by the processor load induced by the netperf processes.

Results and Analysis

Figure depicts the experiment results that show the interarrival times for the three networks under two different conditions of network load. The no load runs are the experiments on a quiet network. The load runs are the experiments run with netperf. The three no load lines are almost indistinguishable, indicating no correlation between jitter and network bandwidth. However, under high network load there is a noticeable difference in the interarrival times for the three networks, indicating there may be a correlation between jitter and network bandwidth.

Figure shows the relationship between jitter and network bandwidth for loaded networks. We expect that jitter decreases as network bandwidth increases, so we graphed the theoretical network bandwidth versus variance. There are three data points plotted, one for each of Ethernet, Fibre Channel, and HIPPI. The line represents a least squares line fit through the three data points. The correlation coefficient indicates that there is a strong inverse relationship between jitter and network bandwidth. In other words, as network bandwidth increases, jitter decreases. Note that although the correlation coefficient is high, the fact that there are only three data points is evident in the width of the confidence intervals around the least squares line fit. It would be nice to have more data points in order to strengthen the statistical significance of our results. To do this, however, we need to have new networks on which to experiment.

Section Summary

Under low network loads, high-speed networks do not significantly reduce jitter. Under low loads, the network contributes little variation to the interarrival times for the multimedia frames. Most of the variance is caused outside the network, rendering any jitter reduction from the high-speed networks insignificant.

Figure : Interarrival Times for Different Networks and Network Loads. There are six multimedia streams represented in this picture (Ethernet, Fibre Channel, and HIPPI, each with and without load), each by a horizontal line depicting the interarrival times on the indicated network. The network was either loaded or unloaded. Each multimedia stream is offset from the one below it. The horizontal axis is the packet number.

5e-06

4e-06 HIPPI

3e-06

Ethernet

2e-06 1 / Variance (in microseconds ^ 2)

1e-06

Fibre Channel

0 0 100 200 300 400 500 600 700 800

Mbits/Second

Figure Jitter versus Network Bandwidth The horizontal axis is the network bandwidth

The vertical axis is variance The line is a least squares line t of jitter versus theoretical network

bandwidth for three networks Ethernet Fibre Channel and HIPPI The correlation co ecient is

any jitter reduction from the highsp eed networks insignicant

However, under heavy network loads, high-speed networks significantly reduce
jitter. Under heavy loads, network contention causes multimedia frames that are
ready to be delivered to be delayed. Because network traffic is often bursty, the
delays are nondeterministic, causing increased variation in the interarrival times for
the multimedia frames. High-speed networks reduce the amount of time to deliver
the interfering network traffic, decreasing the variation in the multimedia interarrival
times.

The amount of jitter reduction from high-speed networks depends upon the network
load. We use our model from Section to determine the network load from multimedia
applications as the number of users increases. Then, using the results from this
section, we can determine what effect jitter reduction from high-speed networks has
on the application quality.

Summary

Jitter hampers computer support for multimedia. Jitter is the variation in the end-to-end
delay for sending data from one user to another. Jitter can cause gaps in
the playout of a stream, such as in an audio conference, or a choppy appearance in
the video display of a video conference. There are several techniques that can be
used to reduce jitter. In this work we have experimentally measured the effects of
three jitter reduction techniques: high-performance processors, real-time priorities,
and high-speed networks.

Jitter researchers have used a variety of metrics to measure the amount of jitter
in a multimedia stream. However, with the exception of range, all previously-used
jitter metrics are statistically similar. Range is probably not an appropriate measure
of jitter because it is very susceptible to outliers. While outliers do create glitches in
an audio or video stream, they can often be smoothed over and ignored by human
eyes and ears.

We find that high-performance processors, real-time priorities, and high-speed networks
all significantly reduce jitter under conditions of heavy load. Multimedia applications,
tending to be resource intensive, are likely to push processor capacities to the limit,
making conditions of heavy load likely. As the growth in distributed collaborative
applications continues, multimedia applications will push network bandwidths to the
limits also. Thus high-performance processors, real-time priorities, and high-speed
networks will all be effective in reducing jitter.

Computer systems continue to get faster while human perceptions remain the
same. Future system improvements may remove enough of the underlying jitter that
application-level jitter reduction techniques become unnecessary. How far in the
future will this be?

In Section we saw that as the area under the jitter compensation curve decreased,
the required buffer decreased. From Figure, if the area were small enough we would
require virtually no buffering in order to achieve acceptable quality. From Figure we
can predict that this will happen if jitter falls below a threshold on the order of 10^9
square microseconds. From our results in Sections and we can predict how powerful
hardware must become in order to achieve this low jitter rate. We can determine
when this will be if we assume both processor power and network bandwidth double
each year, as has been the trend in the past. Figure depicts these predictions.
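The extrapolation itself can be written compactly. Assuming, as stated above, that
processor power and network bandwidth double each year and that measured jitter
scales down in proportion (our reading of the prediction; the thesis's exact curves may
differ), the predicted jitter in year y is roughly

\[ J(y) \;\approx\; J(y_0) \cdot 2^{-(y - y_0)} \]

where J(y_0) is the jitter measured on current hardware in the base year y_0, with
separate starting values for the cases with and without real-time priorities.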

For the next several years, hardware improvements alone will not reduce the effects of
jitter enough to eliminate the need for application buffering techniques. However, if
we implement real-time priorities in scheduling our multimedia stream, we can reduce
jitter enough to eliminate the need for application buffering today.

However, improving jitter alone is not sufficient to guarantee improving application
quality. In addition to jitter, the application quality also depends upon latency, which
decreases the application realism, and data loss, which reduces the application
resolution. We have developed a model that allows us to evaluate the effects of jitter
reduction in the larger context of a user's perception of the multimedia application.
Our model allows us to show how advances in networks and processors will improve
application quality without tuning the operating system or the application.

Figure: Jitter versus Years. The horizontal axis is the year (1998 through 2006). The vertical axis
is the amount of jitter (variance in interarrival times). There are two sets of predictions: the top
slightly curved line depicts the amount of jitter in a system without real-time priorities, and the
lower slightly curved line depicts the amount of jitter in a system with real-time priorities. The
horizontal line represents the possible threshold of human perception, below which application
buffering would not be needed.

Chapter

Application Method

Overview

In order to plan for the quality of computer enhancements, we have developed a
method to apply our quality planning model to distributed collaborative multimedia
applications. We start by studying the application from the perspective of the user.
The user describes a distributed collaborative multimedia application and defines a
set of requirements that need to be fulfilled if the application performance is to be
acceptable. We model the user, application, computer system, and quality metric. We
perform some detailed experiments, called micro experiments, to measure the fundamental
components of the application. We test the accuracy of an analytic model
based on the micro experiments through larger experiments, called macro experiments.
We use these macro experiments to calibrate the model. Lastly, we predict the
application quality as various model components change. Figure depicts our method.

Figure: Quality Planning Method. We have developed a method for applying our model to
distributed multimedia applications. We start with an application, develop our model, perform
micro and macro experiments, and make quality predictions.

We present the methods we use in applying our quality planning model to multimedia
applications. Our model is intended to locate the bottlenecks in the performance of
distributed collaborative multimedia applications. There are three classes of solutions
to this problem:

Analytic Model: An analytic model is a mathematical representation of a computer
system. Analytic models are fast and flexible, making them inexpensive to develop
and to run. Analytic models allow easy variation of system parameters, which
provides insights into the effects of various parameters and their interactions.

Simulation: A simulation is a software representation of a computer system that may
include application, operating system, processor, and more. Simulations often
capture more system details than do analytic models, but are often less useful for
comparing tradeoffs among different parameters. Simulation measurements can
often take a long time, either to build the simulation or to run it. However,
simulation time can often compress wall-clock time that is on the order of days
or weeks. In these cases simulation runs can be much faster than measurement.

Measurement: Measurement is possible only if something similar to the proposed
system already exists. When planning for future application performance on
future systems this is not possible. When they are possible, measurements usually
take longer than analytic models but shorter than building and running a
simulation. However, there is greater variability in the time required to perform
measurements than in the time required for simulation or analytic modeling.
Measurements are often the most believable to users, which also makes them
useful for validating simulation or analytic models.

We use all three solutions where appropriate. At the heart of our method is an
analytic model. Analytic models are most effective when they are based on actual
measurements. We base our models on careful measurements of fundamental
application components in our micro experiments. One of the concerns with analytic
models is that they abstract away too many details to be useful for real-world
application predictions. In order to address this concern, we perform macro experiments
that compare our model predictions to real-world application performance in an
attempt to test the accuracy of our models. If the real-world application we wish to
study is not available, either because the users are not at hand or the application has
not yet been built, we simulate the application performance in our macro experiment
and compare the measured simulation results to our model predictions.

This chapter is organized as follows: Section presents related work in benchmarks
and capacity planning. Section describes the first step of our method, how we study
the application to obtain information for the model. Section details our model of the
user, application, and computer system. Section introduces the experiments we use
to measure fundamental application components. Section introduces the experiments
we use to test the accuracy of predictions we make based on the fundamental
components. Section briefly mentions our methods for predicting application
performance under different system configurations. And Section summarizes the
important points of this chapter.

Related Work

Benchmarks

benchmark, v. trans. To subject a system to a series of tests in order to obtain
prearranged results not available on competitive systems.
-- S. Kelly-Bootle, The Devil's DP Dictionary

The process of performance comparison for two or more systems by measurement
is called benchmarking, and the workloads used in the measurements are called
benchmarks. Standard measures of performance provide a basis for comparison, which
can lead to improvements and predict effectiveness under different applications. If
computer users ran the same programs day in and day out, performance comparisons
would be straightforward. However, as this will never be the case, systems must rely
on benchmarks to predict performance of estimated workloads.

An early survey by Lucas categorized benchmarks, dividing them into the following
categories:

Timings: Early computer systems were evaluated based on a comparison of processor
cycle times and add times.

Mixes: The instruction mix is an attempt to provide a broader range of evaluation
than timings. A frequency of execution is specified and the timings given
appropriate weights. An example of a previously used instruction mix is the
Gibson mix.

Kernels: A kernel program is a typical program that has been partially coded and
timed. The kernel includes more parameters than mixes; however, kernels
generally do not include adequate I/O considerations. The Sieve of Eratosthenes
is a program sometimes used as a kernel.

Models: An analytic model is a mathematical representation of a computer system.
The performance of the slotted Aloha network protocol is a well-known example
of an analytic model.

Benchmarks: A benchmark is a specific program that is coded in a specific language
and executed on the machine being evaluated. The Livermore Loops are one of
the oldest widely-used benchmarks. LINPACK has seen extensive use even up
until today. HINT promises to be a benchmark that measures workstation
capacity in a realistic manner.

Synthetic Programs: A synthetic program is a representation of a real-world program
that is coded and executed on the machine being evaluated. Unlike benchmarks,
synthetic programs are intended to represent an application that will be used on
the machine. Two popular synthetic programs are Whetstone and Dhrystone.

Simulation: Simulation uses software to represent some or all of a computer system,
including the application and hardware. Simulations can be trace-driven,
event-driven, or probabilistic, such as in a Monte Carlo simulation.

Monitor: Monitoring is a method of collecting data on the performance of an existing
system. Gprof is a method of collecting profile data on a running program.
Etherfind is a method of collecting network information from an active network.

Figure depicts our layout of Lucas's categories and the scope of our quality planning
models.

Figure: Scope of Quality Planning Models. This picture depicts the scope of our quality planning
models in relation to benchmarks. Included as a subset of our quality planning models are our
macro and micro experiments. The vertical axes depict approximate ranges that the various
benchmark categories span.

Jain and Hennessy and Patterson break benchmarks into four categories:

Real Programs: Real programs are applications that users will actually run on the
system. A system under such programs can be observed to get a good indication
of performance under normal operations. Examples are language compilers such
as gcc and acc, text-processing tools like TeX and FrameMaker, and CAD tools
like Spice.

Kernels: Kernels are small, key pieces from real programs. Unlike real programs, no
user runs kernel programs, for they exist only for performance measurements.
They are best used to isolate performance of key system features. Examples are
LINPACK and the Livermore Loops.

Synthetic Benchmarks: Synthetic benchmarks have characteristics similar to those
of a set of real programs and can be applied repeatedly in a controlled manner.
They require no real-world data files, which may be large and contain sensitive
data, and can be easily modified without affecting operation. They often have
built-in measurement capabilities. Examples are the Byte benchmarks,
Whetstone, and Dhrystone.

Toy Benchmarks: Toy benchmarks are small pieces of code that produce results
known beforehand. They are small, easy to type, and can be run on almost any
computer. They may be used in some real programs, but often are not. Examples
include the Sieve of Eratosthenes and Quicksort.

SPEC, the Standard Performance Evaluation Corporation, has sought to create
an objective series of synthetic benchmarks. SPEC is a nonprofit corporation formed
by leading computer vendors to develop a standardized set of benchmarks. Founders,
including Apollo/Hewlett-Packard, DEC, MIPS, and Sun, have agreed on a set of real
programs and inputs that all will run. The benchmarks are meant for comparing
processor speeds. The SPEC benchmark numbers are the ratio of the time to run the
benchmarks on a reference system to the time on the system being tested. We use
SPEC results to make predictions in our quality planning model. Performance results
from our research may also be useful to other benchmark researchers.

Capacity Planning

Do not plan a bridge's capacity by counting the number of people who swim across
the river today.
-- Heard at a presentation

The study of computer resources needed to meet expected computer demand is
called capacity planning. Many experts agree that the main goal of capacity planning
is to maintain a balance between business growth and needs and the explicit or
implicit service level objectives of computing support. In other words, capacity
planning is a method for projecting computer workload and planning to meet the
future demand for computing resources in a cost-effective manner.

Tim Browning applies the concepts of capacity planning to computer systems
with a blend of measurement, modeling, and analytic methods. He provides
business-oriented forecasts based on service level objectives, chargeback, and cost
control. However, capacity planning is a difficult task because no clear economic
framework exists to do a cost-benefit analysis of information technology. There have
been several general frameworks established in an attempt to manage the growth of
information technology. There are also many specific capacity planning solutions
designed for specific platforms and intended to plan for specific workloads. Generally,
it is still easier to find that a configuration will not support a specific level of service
than to predict that it will.

Snell describes several commercial products that are designed for capacity planning
on the Internet. Most of the tools she describes are high in cost and difficult to deploy.
She cites integration of the tools into a company's environment and filtering the data
as major obstacles.

We develop a form of capacity planning that emphasizes the quality of the application
as perceived by the user, enabling designers to trade off application performance
and system cost.

Study Application

Our method begins by studying the application to obtain information on the users
and their requirements. We start with the application users. People interact with
touch, sight, and hearing. We would like computers to come as close to real-life
interaction as possible, even enhancing personal interaction by allowing it across both
time and space. The application is founded on a set of user requirements that need
to be fulfilled for the application to be effective for the user. These include
information such as frame rate and frame size, acceptable latency and jitter, and
tolerance of data loss.

Figure: Quality Planning Model. Our model of application performance incorporates users,
applications, user requirements, system requirements, architecture, and hardware. In addition, we
include a measure of application quality as perceived by the user.

Model

We use the information about the users and their requirements in our model. Our
model for the quality of a distributed multimedia application incorporates the user,
application, and hardware. Figure depicts our model.

Users: The users of the application are those we studied during the Study Application
phase of our method, as described in Section. Application users would like to
interactively view a 3-D brain database in Sweden while working in Minnesota,
synchronously train as a fighter pilot in a virtual battlefield with a thousand other
soldiers spread across the country, or discuss research with a half-dozen scientists in
a computer-mediated meeting.

Applications: The applications are the software programs the users will run. For
the examples under Users above, these would be a flying interface to a 3-D brain
database, a DIS flight simulator, and an audio conference.

User Requirements: The user requirements are the users' interface to our model.
The requirements they specify may drive the selection of the underlying system in
order to make the application acceptable for the user.

System Requirements: The user requirements impose a series of requirements on
the system. Some of these include network bandwidth, disk throughput, and processor
power. The method to determine the system requirements from the user requirements
depends upon the application and, to some extent, its implementation. For instance,
the workstation can make rendering frames the highest priority, possibly forsaking
sending and receiving data to do so. Or sending and receiving data can be the top
priority, at the expense of a lower display frame rate. Data packets can be compressed
before sending, reducing network bandwidth but possibly increasing processor load
from compression and decompression.

Architecture: Architecture is the structure of the distributed program, which
determines the location of data and the distribution of the processing. Architecture
can greatly affect the application. For example, a multimedia application that supports
all users on one central mainframe would perform differently than one in which each
user had a dedicated workstation connected by a network.

Hardware: Given the system requirements and architecture, the hardware needed to
support the application can be determined. Hardware might range from a low-end
workstation on a low-bandwidth network up to a high-performance workstation with
a HIPPI network.

Quality: The variations in hardware, architecture, system requirements, user
requirements, and the application all affect the application quality as perceived by the
user. The acceptability of the application to the user is determined by how closely
the application performance matches the user requirements. We are developing a
quantitative measure of the difference between application performance and user
requirements. We call this the application quality. See Section for more details.

As a brief example of the application of our model, suppose we wish to predict
the performance of a proposed voice mail system that will allow a group of software
engineers to browse their archived voicemail. We first determine the quality of the
audio required by the users, either by MOS user testing or by analogy to similar
applications. The audio quality determines the user requirements. The system
requirements are derived from the user requirements, with key system components used
to examine tradeoffs. For example, we might vary the number of users, the amount
of compression, or the network protocol. We choose an architecture and hardware on
which to analyze the system. For example, we might pick Sun Sparc workstations
connected via a Base-T Ethernet cable. As described in Section, we build a quality
model based on the user requirements. The system requirements, architecture, and
hardware are all used in the quality model to determine if the proposed configuration
is acceptable to the users. We can then iterate on our method, modify the component
parameters, and determine a new application quality.

Micro Experiments

Experiments that measure processor performance of the fundamental components of
an application we call micro experiments. We do micro experiments to allow us to
predict the effects of systems on applications built with those components. Some
fundamental components for many multimedia applications include:

Record data from the microphone or video codec
Play data to the speakers
Render a frame to be displayed
Display a frame to the screen
Read data from a disk
Write data to a disk
Compress data
Decompress data
Send a data packet to a client
Receive a data packet from a server

We use a process that increments a long integer variable in a tight loop to measure
the processor load for the individual components. The use of this counter process
is depicted in Figure. To obtain a baseline for our counter, we run the counter process
on a quiet machine. This gives the processor potential for the machine, depicted by
the single column on the far left; for example, a quiet Sun IPX counts to many millions
per second. We then run the counter process with each component in the model. The
difference between the bare count and the component count is the component-induced
load. For example, the middle two columns show the count obtained if two counter
processes were run simultaneously; on a Sun IPX each counter would count to about
half the bare count per second. In the last two columns, the processor load of the send
process is the count obtained on a bare machine less the count obtained by the counter
process. If the Sun IPX were sending a packet every few milliseconds, the counter
would fall short of the bare count over one second, and the difference between the bare
count and the send count can be converted into milliseconds of processor load, giving
the send load in milliseconds per second.
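A minimal sketch of such a counter process is below. The one-second default duration
and the output format are our assumptions, not the thesis code.

/* counter.c -- sketch of the counter process: increment a long integer in a
 * tight loop until an alarm fires, then report the count. */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static volatile sig_atomic_t done = 0;

static void stop(int sig) { done = 1; }

int main(int argc, char *argv[])
{
    long count = 0;
    int seconds = (argc > 1) ? atoi(argv[1]) : 1;   /* assumed one-second run */

    signal(SIGALRM, stop);
    alarm(seconds);
    while (!done)
        count++;               /* tight loop: consumes all spare processor time */

    printf("count = %ld in %d second(s)\n", count, seconds);
    return 0;
}

Running this alone on a quiet machine gives the bare count; running it alongside a
component process gives the reduced count from which the component-induced load is
derived.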

Figure: The Counter Process. The single column on the far left is the count on a bare machine; the
middle two columns show two counter processes run simultaneously; the last two columns show the
counter process run together with the send process.

To verify that the counter process does indeed accurately report the loads of
processor-bound processes with which it runs concurrently, we ran the counter process
with up to five other counter processes. Since the counter processes have the same
priorities, we expect the counter value to be (count on a bare machine) / (N + 1), where
N is the number of other counters running. Figure shows the results of this experiment.
The predicted values were within the confidence intervals for all the measured values.

After carefully measuring the processor load of each component, we can predict the
processor load of an application built with those components. Changes in application
configuration or changes in hardware are represented by modifying the individual
components and observing how that affects performance.

Macro Experiments

Experiments that measure performance of applications built with micro experiment
components we call macro experiments. We do macro experiments to test the accuracy
of micro experiment-based predictions of application performance.

For example, assume we have a two-person audio conference that lasts for three
minutes. Each component of the audio conference (record, send, receive, and play)
processes the three minutes of audio conference data. We predict the total processor
load from our micro experiment measurements of the record, send, receive, and play
loads. In addition, we predict the network load based on the audio data rate of the
workstations. In our macro experiments we run a two-person audio conference and
carefully measure the processor and network load. We then compare these measured
values to the predicted values in an attempt to test the accuracy of our prediction
methods.

Figure: Count versus Number of Counters. The vertical axis is the count obtained by the counter
process. The horizontal axis is the number of additional counters running (zero through five). The
curving line is the predicted value. All points are shown with 95% confidence intervals. The
measurements were taken on a Sun Sparc IPX.

Predictions

By modifying the fundamental application components, we can predict performance
on alternate system configurations. This allows us to evaluate the potential
performance benefits from expensive high-performance workstations and high-speed
networks before installing them. Moreover, we can investigate possible performance
benefits from networks and workstations that have not yet been built.

Our approach for evaluation of each alternative system is the same. We modify
the parameters of our performance model to fit the new system, then evaluate the
resulting model to obtain performance predictions. These analyses are intended to
provide a sense of the relative merits of the various alternatives rather than present
absolute measures of their performance.

Our micro and macro experiments are done on only a handful of platforms. However,
we would like our predictions to be accurate for untested platforms and even future,
as yet unbuilt, hardware. In order to attempt these extrapolations, we rely on research
in benchmarks that compare performance among systems and alternate system
configurations. In particular, we rely upon SPEC benchmark results to predict the
performance of application components on untested workstations. (Many SPEC
benchmark results can be found at the Performance Database Server, PDS, at
http://netlib.cs.utk.edu/performance/html/PDStop.html.) We rely upon landmark
studies in network and disk performance to predict performance on alternate networks.

Summary

The strengths of our model and implementation method include:

Tradeoffs: A flexible way to compare the tradeoff in performance for alternate
system configurations.

Users: A measure of performance from the point of view of the application users.

Validation: Experimental confirmation that our methods of predicting processor,
disk, and network throughput, the basis of our model, are accurate.

In the next three sections we apply our model to three distributed collaborative
multimedia applications:

Audioconferences: Multi-person synchronous audio conferencing.

Flying: 3-D asynchronous scientific database browsing.

Virtual Cockpit: Distributed virtual-reality flight-combat training.


Chapter

Audio conferences

Overview

There is an increasing interest in the use of audio for computer-based communication
applications:

Electronic mail includes audio along with text. Multimedia editors enhance text
documents with audio annotations. World Wide Web radio stations spread audio
across the world. Movies containing audio in addition to video are starting to
grace consoles everywhere. And audio conferences synchronously link workstations.

General-purpose workstations can have advantages over the use of specialized
hardware: corporate and academic environments have ready access to the necessary
hardware, and teleconferencing can be enhanced when computers are used, through
record/playback, on-screen speaker identification, floor control, and subgroup
communication.

Audio and video streams similar to those in a teleconference are often integrated
into larger distributed multimedia applications. For example, a shared editor may
allow several users to simultaneously collaborate on a document from separate
workstations. Audio and video links coupled with the shared editor enhance the
editing process by making it more like face-to-face collaboration.

Audio conferences frequently support other collaborative tasks that are themselves
processor-intensive. Therefore, efficient processor use is essential. Our goal is to
identify improvements that reduce audio conference processor load. The major
contribution of this section is an experimentally-based comparison of the processor
performance benefits of five potential audio conference improvements:

Faster processors: How much do faster processors benefit audio conference
processor load?

Faster communication: How much do more efficient network protocols and faster
network speeds benefit audio conference processor load?

Better compression: How much do improved compression techniques benefit audio
conference processor load?

Hardware support: How much does Digital Signal Processing (DSP) hardware
reduce audio conference processor load?

Silence deletion: How much does removing the silent parts from a conversation
reduce audio conference processor load?

We focus particularly on a comparison of the first four areas to silence deletion.
Without silence deletion in a unicast network environment, network bandwidth will
increase as O(N^2), where N is the number of audio conference participants. However,
in most conversations only one person speaks at a time. Silence deletion detects
silence, only transmitting the sound of the person who is presently speaking, reducing
the network bandwidth increase to O(N). Although silence deletion algorithms take
additional processing time, they may yield a net savings by reducing the amount of
data that must be communicated from O(N^2) to O(N). Therefore, we hypothesize
that silence deletion will improve the scalability of audio conferences more than any
of the other four improvements.

Our analysis can help direct research in networks, multimedia, and operating
systems toward techniques that will have a significant impact on audio conferences.
In addition, our approach may generalize to video and other forms of multimedia.

This chapter is organized as follows: Section describes related work in audio
conference experiments, measuring processor load, analyzing UDP, using silence
deletion, and audio compression. Section introduces our model of an audio conference.
Section details micro experiments that measure the processor load of each component.
Section describes macro experiments that test the accuracy of the predictions based
on the micro experiments. Section analyzes the experiments and projects the results
to future environments. And Section summarizes the important contributions of this
chapter.

Related Work

Audio Conference Experiments

There has been a variety of experimental audio conference work. Casner and Deering
performed a wide-area network audio conference using UDP multicast. They found
that disabling silence suppression increases average bandwidth and eliminates the gaps
between packets that give routers a chance to empty their queues. They recognized
that experimenters need better tools to measure audio conference performance; our
model may be one of the tools they seek. They conjectured that ubiquitous multicast
routing support can greatly reduce network and processor loads. In addition, they
described several ongoing experiments in which readers can participate.

Terek and Pasquale implemented an audio conference with an X window server.
They described the structure and performance of their system. In particular, they
described a strategy for dealing with real-time guarantees.

Gonsalves predicted that, without software or protocol overhead, a 3 Mbps
Ethernet could support many simultaneous two-way conversations at Kbps voice
data rates. Thus, if our results show what is needed to enable the processors to
handle the conversation loads, the networks can.

Riedl, Mashayekhi, Schnepf, Frankowski, and Claypool measured network loads of
audio conferences using silence deletion. They found silence deletion significantly
reduces network loads. We analyze how silence deletion affects processor loads.

Frankowski and Riedl studied the effects of delay jitter on the delivery of audio
data. They developed a heuristic for managing the audio playout buffer and compared
it to several alternative heuristics. They found that none of the heuristics is superior
under all possible arrival distributions.

Measuring Processor Load

Jeffay, Stone, and Smith discussed a real-time kernel designed for the support of
multimedia applications. They achieved some real-time guarantees while utilizing
close to eighty percent of the processor. Our results may indicate methods that can
trim audio conference processor loads while achieving the same guarantees.

Lazowska parameterized queuing network performance models to assess the
alternatives for diskless workstations. He used a counter process to measure processor
loads. We use a similar process; see Section of the present document.

Riedl, Mashayekhi, Schnepf, Frankowski, and Claypool measured the processor load
of an audio conference with no floor control and no silence deletion using a counter
process. They showed that the processor loads quickly become prohibitive for
increasingly large audio conferences. We provide further analysis and a component
breakdown of the processor load.

Analyzing UDP

Cabrera measured throughput for UDP and TCP for connected Sun workstations.
In analyzing the call stack for UDP in detail, he determined that the checksum at
the receiving end takes the most processor time; this checksum may not be needed
for adequate quality sound.

Kay and Pasquale measured delay at the processor end for sending UDP packets
on a DEC workstation. They found that checksumming and copying dominated the
processing time for high-throughput applications.

Bhargava, Mueller, and Riedl divided communications delay in Sun's implementation
of UDP/IP into categories such as buffer copying, context switching, protocol
layering, Internet address translation, and checksum implementation. They found
socket layering and connection were the most expensive categories. We use their
analysis of kernel buffering techniques in defining our experiment.

Using Silence Deletion

Rabiner viewed voice as a measure of energy and presented an algorithm for
discovering the endpoints of words. Henning Schulzrinne implemented an audio
conferencer with silence deletion. We adapt and build upon their view of voice and
their silence deletion algorithms.

Digital Audio Compression

Pan surveyed techniques used to compress digital audio signals. He discussed u-law,
Adaptive Differential Pulse Code Modulation (ADPCM), and Motion Picture Experts
Group (MPEG) compression techniques, which are not specifically tuned to human
voice. Kroon and Swaminathan described techniques to improve the performance of
Code Excited Linear Prediction (CELP) type coders, compression techniques tuned to
human voice. We predict the processor load of audio conferences using such
compression techniques.

Figure: Audio Conference Model. Our model of audio conference processor load includes recording
from the audio device, silence deletion, sending and receiving packets, mixing sound packets that
should be played simultaneously, and writing to the audio device (Record + Silence Deletion + Send
+ Receive + Mix + Play).

Model

Figure depicts our model of audio conference processor load. Our model is based
on the components of recording, silence deletion, sending, receiving, mixing, and
writing. Recording is the processor load for taking the digitized sound samples from
the audio device. Silence deletion is the processor load for applying one of the deletion
algorithms to the recorded sample. Sending is the processor load for packetizing the
sample and sending it to all other stations. Receiving is the processor load for
processing all incoming packetized samples. Mixing is the processor load for combining
sound packets that arrive simultaneously. Writing is the processor load for delivering
the incoming samples to the audio device. Our model asserts we can predict audio
conference processor load from the sum of the above components.

Silence deletion removes silent parts from speech. Experiments have shown that
silence deletion substantially reduces network load, for two reasons:

In a typical N-person conversation, at any given time one person is talking and the
others are silent. With silence deletion, only the talking person's packets are sent;
each workstation must send only 1/N of the packets on average. Without silence
deletion, all packets are sent by every workstation, so a linear increase in participants
results in an N^2 increase in network load.

Pilot tests suggest that a substantial fraction of the digitized speech data can be
identified as pauses between words or sentences. Silence deletion may remove these
pauses.

Although silence deletion algorithms themselves take additional processing time,
they may yield a net savings in total processor load by decreasing the processor costs
of the communication.

Figure: Absolute Silence Deletion Algorithm. The zigzag line represents the energy of each byte.
The horizontal line represents the sound/silence threshold. Bytes with energy above the threshold
are treated as sound. Bytes with energy below the threshold are treated as silence.

We consider four common silence deletion algorithms. All four algorithms use the
energy contained in each byte. (A code sketch of the simplest, Absolute, appears after
this list.)

Absolute uses the average energy over chunks of bytes to determine silence. It
removes all chunks with energy less than a threshold. See Figure for a visual
representation of the Absolute algorithm.

Ham, an adaptation of an algorithm used by Ham radio, removes chunks with
enough consecutive byte energies below a threshold. Ham decreases a counter for
each byte with energy below the threshold. When the counter reaches zero, the
chunk is not sent on. Any byte above the threshold resets the counter to a
maximum value. See Figure for a visual representation of the Ham algorithm.

Figure: Ham Silence Deletion Algorithm. The zigzag line represents the counter used to determine
sound or silence. When the counter reaches zero, the sound bytes are treated as silence.

Figure: Exponential Silence Deletion Algorithm. The curved line represents the weighting of each
byte in determining sound or silence. The more recent bytes (to the left) have a greater weight than
older bytes (to the right).

Exponential also uses the energy in each byte. It removes all exponentially-weighted
byte energies that are below a threshold. The most recent bytes are weighted most
heavily. Exponential decreases a counter value according to the decay value. When
the counter drops below the threshold, the bytes are not sent on. The decay
determines the exponential change: when decay is small, the counter fluctuates
quickly; when decay is large, the counter fluctuates slowly. See Figure for a visual
representation of the Exponential algorithm.

Differential uses the changes in energy of each byte to determine silence. All
chunks with changes below a threshold are interpreted as silence. The algorithm
keeps a counter of the number of non-changes. If there are too few changes, so
that the counter gets higher than a threshold, then Differential does not send the
chunk on. When a change is encountered, the counter is reset to zero. See Figure
for a visual representation of the Differential algorithm.

Figure: Differential Silence Deletion Algorithm. The zigzag line represents the energy of each byte.
The horizontal lines represent periods where there was little change in the energy of consecutive
bytes. Little or no change in sound energy is treated as silence. Significant change in sound energy
is treated as sound.
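As promised above, here is a minimal sketch of the Absolute algorithm. The chunk
size, the threshold, and the per-byte energy measure (the magnitude bits of the
complemented u-law byte) are our assumptions; the thesis's tuned values and exact
energy definition may differ.

/* absolute.c -- sketch of the Absolute silence deletion test: average the
 * per-byte energy over a chunk and treat the chunk as silence if the
 * average falls below a threshold. */
#include <stdio.h>

#define CHUNK_SIZE 160     /* assumed chunk of u-law bytes */
#define THRESHOLD   20     /* assumed average-energy threshold */

/* Energy proxy: u-law bytes are stored complemented, so the low seven bits
 * of the complement grow with the sample magnitude. */
static int byte_energy(unsigned char b)
{
    return (~b) & 0x7f;
}

/* Returns 1 if the chunk is sound (send it), 0 if it is silence (drop it). */
int chunk_is_sound(const unsigned char *chunk, int len)
{
    long energy = 0;
    int i;

    for (i = 0; i < len; i++)
        energy += byte_energy(chunk[i]);
    return (energy / len) >= THRESHOLD;
}

int main(void)
{
    unsigned char silence[CHUNK_SIZE];
    int i;

    for (i = 0; i < CHUNK_SIZE; i++)
        silence[i] = 0xff;                 /* u-law byte for zero amplitude */
    printf("silent chunk -> %s\n",
           chunk_is_sound(silence, CHUNK_SIZE) ? "sound" : "silence");
    return 0;
}

The other three algorithms replace the per-chunk average with their own counters and
weightings but keep the same per-byte energy test.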

In our pilot tests, the Absolute and Exponential algorithms often yielded poor sound
quality. Absolute is effective when the signal-to-noise ratio is very high. This may
occur in a recording studio or with very high fidelity magnetic tape; however, it is
not practical in real-world situations. Further MOS testing should be done to verify
that Absolute and Exponential do indeed yield poor sound quality. In the meantime,
we concentrate on the performance of the Differential and Ham silence deletion
algorithms.

Micro Experiments

Design

Our micro experiments were designed to measure the processor load of audio
conference components. We chose two Sun workstations, the SLC and the faster IPX,
to test whether the components of the audio conference scale with processor speed.
To obtain the processor load for each component of the model, we ran the counter
process described in Section in conjunction with a process for each of the components
of the audio conference, described below.

Read: The read process takes sound from the audio device. User-level processes
access Sun's audio device driver, /dev/audio, through open and read calls. The
audio device delivers sound at a fixed number of bytes per second. By default it
delivers the sound in fixed-size chunks, but this can be changed by modifying a
kernel variable. The audio device compresses the sound samples into 8-bit u-law
form.

The packet size determines the read frequency. The read process sets a signal and
alarm to read from the audio device at the appropriate frequency. The process
reads until killed by the counter process.

Deletion: The deletion process measures the load of the deletion algorithms. Since
we want to measure the load of the algorithms only, we need to eliminate reads
from the disk. Thus the deletion measurement must follow a different paradigm
than the other pieces.

The deletion process forks a counter process, which waits for a signal to start.
Deletion then reads several seconds of sound into RAM and records the number
of page faults from the getrusage call. It then sends a signal to the counter to
start it counting. Deletion applies the deletion algorithms described above to the
in-memory sound buffer. After repeating the processing for the appropriate number
of iterations (see the Data Collection section), deletion kills the counter process
and records the page faults. The page faults are recorded to be sure that paging
activity is not measured along with the deletion.

Send: The send process sends UDP packets. Send opens and binds to a socket. It
sets a signal and alarm to send a packet at a fixed interval. Send delivers packets
until killed by the counter.

A shell script starts the send process and a dummy receive process to pull the
packets off the network.
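A sketch of the send component is below. The destination address, port, packet size,
and send interval are our assumptions; like the thesis process, it is driven by a periodic
alarm (setitimer is used here for sub-second intervals).

/* send.c -- sketch of the send component: transmit a fixed-size UDP packet
 * at a fixed interval until the process is killed. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

#define PACKET_BYTES 800        /* assumed audio packet size           */
#define INTERVAL_MS  100        /* assumed send period in milliseconds */

static volatile sig_atomic_t tick = 0;
static void on_alarm(int sig) { tick = 1; }

int main(void)
{
    unsigned char packet[PACKET_BYTES] = {0};
    struct sockaddr_in dest;
    struct itimerval timer;
    int sock = socket(AF_INET, SOCK_DGRAM, 0);

    memset(&dest, 0, sizeof(dest));
    dest.sin_family = AF_INET;
    dest.sin_port   = htons(7654);                    /* assumed port     */
    dest.sin_addr.s_addr = inet_addr("127.0.0.1");    /* assumed receiver */

    signal(SIGALRM, on_alarm);
    timer.it_interval.tv_sec  = 0;
    timer.it_interval.tv_usec = INTERVAL_MS * 1000;
    timer.it_value = timer.it_interval;
    setitimer(ITIMER_REAL, &timer, NULL);             /* periodic alarm   */

    for (;;) {                                        /* until killed     */
        pause();                                      /* wait for alarm   */
        if (tick) {
            tick = 0;
            sendto(sock, packet, sizeof(packet), 0,
                   (struct sockaddr *)&dest, sizeof(dest));
        }
    }
}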

Receive: The receive process receives UDP packets. Receive opens and binds to a
socket. It blocks on the socket until a packet arrives. Receive listens on the socket
until killed by the counter.

Add: In an audio conference with more than two participants, sound recorded at
two different stations may need to be played simultaneously. To do this, the u-law
sound bytes must be converted to linear form, added, and converted back to u-law.
We use an efficient table lookup to do both conversions.

Add repeatedly adds two arrays of sound together, byte by byte. The two bytes to
be added are converted from u-law to linear, added, and converted back to u-law.
When killed by the counter, add reports the number of bytes it added.
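A sketch of the add (mix) step is below. The u-law decode follows the standard G.711
formula; re-encoding by nearest decoded value stands in for the thesis's second lookup
table, which is not reproduced here.

/* mix.c -- sketch of the mix component: decode two u-law bytes to linear,
 * add them, and convert the sum back to u-law. */
#include <stdio.h>
#include <stdlib.h>

static short dec[256];                    /* u-law byte -> linear sample */

static short ulaw_decode(unsigned char u)
{
    int t;
    u = ~u;                               /* u-law bytes are stored complemented */
    t = ((u & 0x0f) << 3) + 0x84;
    t <<= (u & 0x70) >> 4;
    return (u & 0x80) ? (0x84 - t) : (t - 0x84);
}

static void build_table(void)
{
    int i;
    for (i = 0; i < 256; i++)
        dec[i] = ulaw_decode((unsigned char)i);
}

/* Encode by picking the u-law byte whose decoded value is closest to the sum. */
static unsigned char ulaw_encode(long sample)
{
    int i, best = 0;
    long err, best_err = labs(sample - dec[0]);
    for (i = 1; i < 256; i++) {
        err = labs(sample - dec[i]);
        if (err < best_err) { best_err = err; best = i; }
    }
    return (unsigned char)best;
}

/* Mix two u-law buffers of equal length into out. */
void mix(const unsigned char *a, const unsigned char *b,
         unsigned char *out, int len)
{
    int i;
    for (i = 0; i < len; i++)
        out[i] = ulaw_encode((long)dec[a[i]] + (long)dec[b[i]]);
}

int main(void)
{
    unsigned char a[4] = {0x80, 0xff, 0x80, 0xff};
    unsigned char b[4] = {0xff, 0xff, 0x80, 0x00};
    unsigned char out[4];

    build_table();
    mix(a, b, out, 4);
    printf("mixed bytes: %02x %02x %02x %02x\n", out[0], out[1], out[2], out[3]);
    return 0;
}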

Play: The play process delivers sound to the audio device. The packet size determines
the write frequency. The play process sets a signal and alarm to write to the audio
device at the appropriate frequency. The process writes until killed by the counter.

Data Collection

Since the counter process measurements are sensitive to other processes, we performed
the experiments on machines in single-user mode. In single-user mode the processor
runs a bare minimum of system processes and no other user processes.

We ran the counter process long enough to amortize startup costs. To determine the
duration, we ran the counter process for increasing times and recorded the standard
deviation of its counts. The standard deviations level out at a point where the
standard deviation is only a small fraction of the mean. Thus we chose a one-second
counter run as one iteration. Pilot tests indicated that five iterations for each machine
at each packet size were needed to achieve reasonable confidence intervals.

The process we used to measure the deletion algorithms was memory intensive, to
avoid I/O costs. In order to avoid measuring unwanted paging, we recorded the total
page faults during the experiments. The number of page faults during the deletion
iterations is almost always three, which we accepted as the baseline case. We decided
to discard cases that had more than three page faults, as they incur extra processor
load from paging. In our experimental runs this situation never occurred.

The sound chunk size and the threshold levels determine how the deletion algorithms
perform. We tuned them so that they deleted roughly the fraction of live audio
conference speech identified as pauses in the pilot tests mentioned above. We confirmed
that the speech with the sound deleted was still very understandable.

The kernel's switch between large and small buffers at various packet sizes has a
direct influence on packet sending and receiving times. User messages are copied into
buffers in kernel space. The kernel may use one large buffer and one copy for certain
messages, while it uses several smaller buffers and several copies for a slightly smaller
message. While the kernel is using the small buffers there may be a steady, continuous
increase in time per message as the message sizes increase. When the kernel stops
using the smaller buffers in favor of the larger buffers there may be a break in the
linear continuity. To avoid these possible nonlinearities, we collected data on a
restricted range of packet sizes, which we assume covers most audio conference packet
sizes.

Results

The one-second counts on the bare machines are shown in Table.

Table: Bare Counts for the SLC and IPX, with Confidence Intervals Indicated by the Left and
Right Endpoints.

Figure shows the line equations obtained from the counter measurements for the
deletion algorithms on the IPX. We have similar graphs for the SLC and for the other
components of the model, but to avoid redundancy we do not present them here.
Table shows the values for the line equations for each of the audio conference
components for each machine type.

The per-packet and per-byte terms pertain to equations of the form

    Load_component = per-packet + (per-byte x bytes)

giving the processor cost of each component for a packet of a given size, from which
we can project the cost of a complete audio conference (see Section).
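The line-fit equations can be applied directly in code. A minimal sketch is below; the
structure mirrors the equation above, and the coefficient values would be filled in from
Table (the values here are placeholders, not the measured numbers).

/* load.c -- sketch of the per-component cost model: each component's
 * processor cost for one packet is per-packet + per-byte * bytes, and a
 * conversation's load is that cost times the number of packets handled. */
struct line_fit {
    double per_packet_ms;   /* fixed cost per packet, in milliseconds    */
    double per_byte_ms;     /* additional cost per byte, in milliseconds */
};

/* Processor load, in milliseconds, of one component over a conversation. */
double component_load_ms(struct line_fit f, long packets, long bytes_per_packet)
{
    return packets * (f.per_packet_ms + f.per_byte_ms * bytes_per_packet);
}

Summing component_load_ms over the record, deletion, send, receive, mix, and play
components gives the predicted audio conference load compared against the macro
experiments below.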

Macro Experiments

Design

In order to test the accuracy of our model in predicting audio conference processor
loads, we measured the performance of a simple audio conferencer, Speak. Speak is
two-person, uses UDP, can employ any of the five deletion settings (Absolute,
Differential, Exponential, Ham, or None), and has little extra user-interface overhead.

We used Internet Talk Radio (ITR) files rather than real conversants. This made
our experiments more reproducible and gave us a large conversation sample space
from which to choose. Since the ITR files have one person speaking most of the time,
the silence deletion algorithms typically deleted only a small fraction of the packets.
As the number of audio conference participants increases, the one person speaking in
the ITR audio would reflect the group communication characteristics less and less.

Figure: Processor Time for Deletion Algorithms on the Sun IPX. The four deletion algorithms
(Absolute, Differential, Exponential, and Ham) are shown by the time they take to process the
sound, plotted against the amount of data processed. All points are shown with confidence
intervals.

Table: Values for the Sun SLC and Sun IPX Line Fits for Audio Conference Components. The table
lists the per-packet and per-byte terms, in milliseconds, for the record, Absolute, Differential,
Exponential, Ham, send, receive, mix, and play components on each machine.

However, the actual audio data used in these experiments does not matter, since our
model is parameterized by the amount of silence deleted.

We did experiments on the five possible silence deletion methods on the SLC and
two such methods on the IPX. A shell script initiated a remote Speak process and a
local Speak process. One two-hundred-second conversation was one data point. We
repeated each data point several times. We predict the load from the Speak processes
by using the micro experiment results. From the conversation length, the record size,
and the sample rate we calculate the total packets read. By profiling the sound files
with the deletion algorithms we know the number and size of the packets sent,
received, and written. Because sound only arrives from one other Speak process, there
is no mix component.

Results

Table shows the results of the seven macro experiments.

Table: Predicted and Measured Loads from the Macro Experiments, with Confidence Intervals
Indicated by Left and Right Endpoints. Loads are in seconds; the maximum possible load is the
200-second conversation length. The seven experiments are Absolute, Differential, Exponential,
Ham, and None on the SLC, and Differential and None on the IPX.

The discrepancy between predicted and actual values may be due to the unforeseen
costs that occur when putting micro components together. In most cases the predicted
values are within a small percentage of the actual values. We therefore consider the
projected results presented in Section to be significant only if the differences are
larger than that margin.

Predictions

The processor loads in the macro experiments, compared to the processor loads
predicted by adding the components of the micro experiments, suggest that our model
may be a reasonable way to predict the processor load of audio conferences. It is
difficult and expensive to run experiments with many people on a large number of
workstations. Instead, from our micro experiments we extrapolate our results to
conversations that have more silence and audio conferences that have more
participants. Furthermore, we adjust the pieces of our model to compare the
performance benefits of four potential audio conference improvements: faster
processors, faster communication, better compression, and Digital Signal Processing
(DSP) hardware. We also compare the benefits of better silence deletion algorithms.

All analyses can be done for both the SLC and IPX and with any of the four silence
deletion algorithms. To avoid repetition, we do our extrapolations on one machine
type (IPX) with one deletion algorithm (Differential). We use the fastest processor we
studied to improve the quality of our extrapolations to even faster machines. Our
results are largely independent of the type of silence deletion.

Increasing Participants

What happens when we have an increasing number of audio conference participants?
We can extrapolate the loads at each workstation to an N-person audio conference.
The load depends upon the number of participants, the total conversation time, the
packet size, the percentage of packets deleted, and the percentage of sound removed
from the remaining packets. The load without silence deletion does not have the
deletion algorithm component. The load with silence deletion does not have the mix
component, since we assume only one person speaks at a time and the deletion
algorithm removes all sound of those who are not speaking.
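One way to write down the extrapolation (our notation and bookkeeping, which may
differ in detail from the thesis's) is to count, for each station, how many packets each
component handles. Without silence deletion every station sends to and receives from
the other N - 1 participants and mixes the incoming streams; with silence deletion,
under the one-speaker-at-a-time assumption above, a station sends only while its user
is speaking and receives only the current speaker's stream. Per recorded packet, the
communication work per station is then roughly

\[ W_{\text{no del}}(N) \;\approx\; (N-1)\,(C_{send} + C_{recv}) + (N-2)\,C_{mix}, \qquad W_{\text{del}}(N) \;\approx\; \frac{N-1}{N}\,(C_{send} + C_{recv}) \]

where the C terms are the per-packet costs measured in the micro experiments; the
record, deletion, and play costs are added per station independently of N.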

Figure shows the predicted load for an audio conference without silence deletion
and an increasing number of participants. The audio conference components are
displayed, each line representing the sum of the particular piece and the pieces below
it. The total load is the value of the line labeled play, since it is the sum of all the
components. The communication components (send, receive, and mix) all increase as
the number of participants increases. The mix component increases the fastest,
accounting for a large share of the processor load at the largest number of
participants.

Figure shows the predicted load for an audio conference with silence deletion and
an increasing number of participants. Again the audio conference components are all
displayed. There is no mix component, since we assume only one participant is
speaking at a time. The communication components are only a small fraction of the
total load for all audio conference sizes. The record and play from the audio device
account for a large share of the audio conference load, which seems disproportionately
high compared to the communication components. The silence deletion component is
also large, accounting for a substantial fraction of the audio conference load. Section
investigates the effects of reducing this component through software optimization or
DSP hardware.

Figure: Sun IPX Processor Load Without Silence Deletion. The graph reads from the bottom:
each piece (record, send, receive, mix, play) is the sum of the pieces below it, so the total load is
indicated by the play piece at the top. The horizontal axis is the number of participants; the
vertical axis is load in seconds (200 maximum).

Figure: Sun IPX Processor Load With Silence Deletion. The graph reads from the bottom: each
piece (record, delete, send, receive, play) is the sum of the pieces below it, so the total load is
indicated by the play piece at the top. Since only one person sends packets at a time, there is no
mix component. The horizontal axis is the number of participants; the vertical axis is load in
seconds (200 maximum).

Figure shows the total loads with silence deletion and the total loads without
silence deletion. Here, and in all subsequent graphs, only the total loads are displayed.
For three or more participants, silence deletion reduces processor load compared to
not using silence deletion.

Figure: Comparison of Processor Loads on a Sun IPX With and Without Silence Deletion. Only
the total processor load is indicated. The horizontal axis is the number of participants; the vertical
axis is load in seconds (200 maximum).

Increasing Silence

What happens when the conversation has more silence in it? The amount of silence will vary as conversation characteristics, such as speakers and topics, change. The percentage of bytes in packets after deletion has an inconsequential effect on processor load, because most cost is per-packet, but the percentage of packets remaining after deletion may have a greater effect.

Figure shows the result of our extrapolations to an increasing amount of silence. For a two-person audio conference, the processor load without silence deletion is always less than the load with silence deletion. Thus, for a two-person audio conference there is never a processor benefit from using silence deletion, even when most of the packets are removed. This is because it takes less processor time to send a packet once than it does to remove silence from it. For 4- and 8-person audio conferences, the load without silence deletion is always greater than the load with silence deletion. Thus, for audio conferences of three or more, any silence deletion results in a reduction in processor load over no silence deletion. The largest part of this effect is that there is only one person speaking at a time, and even basic silence deletion algorithms delete all of the speech of silent participants.

Potential Audio Conference Improvements

We adjust the components of our model to compare the performance benefits of four potential ways technology could be used to improve audio conference performance: faster processors, faster communication, better compression, and Digital Signal Processing (DSP) hardware.

Faster Processors. What happens when the processors become faster? Faster processors decrease the per-byte times and the per-packet times of all components. Figure shows the effects of faster processors for an audio conference with a fixed number of participants. Notice that even the conversations without silence deletion use only a small fraction of the processor with an eight times faster processor. With a fast enough processor, the load from audio conferences may be insignificant, with or without silence deletion. Similar results hold for other numbers of participants.

(Plot: load with deletion for 2-, 4-, and 8-person conferences versus percent packets removed; loads without deletion shown as horizontal lines; load in seconds, 200 maximum.)

Figure: Effect of the percentage of packets removed on conversations with silence deletion for 2, 4, and 8 participants. For comparison, conversations without silence deletion are also shown, depicted by the horizontal lines.

(Plot: load with and without deletion versus factor of increase in processor speed, from 1 to 8; load in seconds, 200 maximum.)

Figure: The effects of faster processors on audio conferences with and without silence deletion. The number of participants is fixed. Maximum load is 200 seconds.

Faster Communication. What happens when the network protocols become more efficient? A more efficient network protocol decreases the amount of processor time required to send and receive the sound packets. Figure shows the predicted effect of an eight times faster network protocol. Since the processor load of the network protocol is so slight, the audio conference processor load does not improve much despite a significant decrease in network protocol load.

What happens when multicasting is used in place of unicasting? With multicasting, only one send is required for each sound packet, regardless of the number of participants. Thus the send component decreases by a factor of N. However, the receive, mix, and play components remain unchanged. Figure shows the predicted effect of multicasting on audio conferences with and without silence deletion. Since the processor load for sending packets is small compared to the mix and play components, multicast routing reduces processor load only slightly.

(Plot: load with and without deletion, under the current and the faster network protocol, versus number of participants; load in seconds, 200 maximum.)

Figure: Audio conference processor load with and without silence deletion with an eight times faster network protocol. The processor loads under the current network protocol are shown for comparison. With silence deletion the two lines overlap. Maximum load is 200 seconds.

(Plot: load with and without deletion, under unicast and multicast, versus number of participants; load in seconds, 200 maximum.)

Figure: Audio conference processor load with and without silence deletion for multicasting. The loads under unicasting are shown for comparison. With silence deletion the two lines overlap. Maximum load is 200 seconds.

Better Compression. How do forms of compression other than silence deletion affect audio conference processor loads? Compression reduces the packet size but not the number of packets. The silence deletion algorithms we used are fairly simple; we assume they are a lower bound on the processor complexity of most compression and decompression algorithms. We estimate the processor load of compression from the processor load of silence deletion and use a range of values for the amount the audio is compressed. Audio conference loads with other forms of compression will have full-sized packets for recording and compressing and reduced packets for sending, receiving, mixing, and writing. In addition, they will have an additional uncompressing component.

Figure shows the predicted effects of other forms of compression. Only algorithms that compress sound bytes by a large fraction perform better than silence deletion for more than a few people. It appears unlikely that compression algorithms better than this can be expected to run in real time. Audio conference processor load using compression scales worse than audio conference processor load using silence deletion in every case.

(Plot: load for compression ratios from 10% to 90%, and for silence deletion, versus number of participants; load in seconds, 200 maximum.)

Figure: The effects of other forms of compression on audio conference processor load. For comparison, processor load with silence deletion is displayed. The maximum load is 200 seconds.

Digital Signal Processing Hardware. In Section we observe that the silence deletion component accounts for almost half the processor load. A DSP chip might completely remove the load of silence deletion from the processor. Such hardware silence deletion could be available at the user level through memory mapping, performing silence deletion at near-zero processor cost. Or the chip could be in the kernel, removing silence before the user makes the record call and reducing the number of bytes in each record call. Figure shows the predicted effects of the two hardware silence deletion designs. Hardware silence deletion always reduces processor load compared to no silence deletion. In addition, hardware silence deletion halves processor load compared to software silence deletion. There is no significant difference between the two hardware deletion implementations.

(Plot: load without deletion, with software deletion, with hardware deletion at user level, and with hardware deletion in the kernel, versus number of participants; load in seconds, 200 maximum.)

Figure: Audio conference processor load with silence deletion done in hardware, compared with software deletion and with no silence deletion. The two different hardware deletion designs, a zero-cost user-level call and hardware in the kernel, are presented. The maximum load is 200 seconds.

What happens when we have both compression and silence deletion in hardware? DSP chips could completely remove the load of silence deletion and compression from the processor. Since we observed that a zero-cost user-level call to the DSP and a DSP in the kernel perform similarly (see the previous figure), we only present data for the DSP in the kernel. Figure shows the processor load for hardware silence deletion and compression. Having both silence deletion and compression in hardware decreases total processor load further.

(Plot: load for software compression, hardware compression, software silence deletion, hardware silence deletion, and hardware compression with hardware silence deletion, versus number of participants; load in seconds, 200 maximum.)

Figure: Audio conference processor load with hardware silence deletion and hardware compression. For comparison, software and hardware silence deletion and software and hardware compression are shown. All hardware implementations have the DSP chip in the kernel. Maximum load is 200 seconds.

In Section we noted that, in conversations without silence deletion, the mixing component accounts for a large share of the processor load. Hardware mixing can be done with a DSP chip or on a multi-channel audio board. When mixing is done in hardware, the processor does not have to mix; it plays all incoming sound packets to the audio device, so the load of writing then scales with the number of participants. Figure shows the predicted effects of hardware mixing. Hardware mixing significantly reduces non-silence-deletion audio conference processor loads. However, such loads still increase rapidly with the number of participants, becoming larger than the loads under silence deletion at four or more people.

(Plot: load with software mixing and without silence deletion, with hardware mixing and without silence deletion, and with silence deletion, versus number of participants; load in seconds, 200 maximum.)

Figure: Audio conference processor load with packet combining done in hardware, compared with packet combining done in software. The silence deletion conversations are unaffected by hardware mixing. Maximum load is 200 seconds.

We observed in the micro experiments that the audio component appeared comparatively large. Measurements of three classes of Sun audio devices, shown in the figure below, indicate that audio device efficiency is improving disproportionately to processor speed. Perhaps the audio device will eventually be made to record and play as efficiently as sending and receiving packets, a manyfold increase. Figure shows the processor load for an eight times faster audio device. For audio conferences with silence deletion, a faster audio device can reduce total Sun processor load by almost one-half.

(Plot: record time per chunk in milliseconds for the SLC, IPC, and IPX versus chunk size in bytes.)

Figure: Record times per byte for different classes of machines. We would expect the slopes to scale according to the MHz clock speeds, but they do not scale in that ratio. The improved performance on the IPC and IPX may be due to better audio hardware. The dashed lines represent predictions beyond the measured data.

(Plot: load with and without silence deletion, under the current and an eight times faster audio device, versus number of participants; load in seconds, 200 maximum.)

Figure: Effects of an eight times faster audio device on processor loads. For comparison, loads under the current device are shown. Maximum load is 200 seconds.

(Plot: load with 15%, 60%, and 90% of packets deleted versus number of participants; load in seconds, 200 maximum.)

Figure: Audio conference processor load with different amounts of silence deleted. The 15% figure is based on a fairly poor algorithm; 60% is the maximum observed during our pilot tests; we consider 90% deletion to be an upper bound on silence deletion algorithms. The maximum load is 200 seconds.

Improved Silence Deletion Algorithms. The above experiments and equations all used a silence deletion algorithm that removed only a modest fraction of the packets. Our pilot tests suggest that deletion algorithms could theoretically remove a much larger fraction of the speech packets. Figure shows the predicted effects of increasing the amount of silence deleted from the speaker's speech. Improving the deletion algorithm to remove far more of the packets improves the processor load by only a small amount. The biggest reduction in load is in removing the silence of the non-speakers; doing better than that improves processor load only a little.

Quality

The above predictions are all concerned with the processor load of an audio conference. However, having adequate processor capacity to meet the predicted audio conference load does not guarantee that the user will find the audio conference acceptable. How does the user perceive the audio? What is the audio conference quality?

In order to apply our quality model to an audio conference under various system configurations, we must determine the region of acceptable audio conference quality, predict jitter, predict latency, and predict data loss.

The Region of Acceptable Audio Conference Quality. To determine the region of acceptable audio conference quality, we need to define acceptable limits for audio conferences along each of the latency, jitter, and data loss axes.

According to earlier results, audio quality is acceptable when the percentage of gaps in the audio stream playout and the milliseconds of delay both stay under threshold values. Audio conference quality is then the Euclidean distance from the origin to a point represented by the delay in milliseconds (normalized over the delay threshold) and the percentage of audio gaps (normalized over the gap threshold). Any quality value under the acceptability limit is considered acceptable. But how do we predict latency and data loss?
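A minimal sketch of this quality computation in C follows. The normalization limits and the acceptability threshold are placeholder constants standing in for the thresholds cited above, not the actual values.

/*
 * Sketch of the audio quality metric: the Euclidean distance from the
 * origin in a normalized (latency, gaps) plane.  The limits below are
 * hypothetical placeholders.
 */
#include <math.h>
#include <stdio.h>

#define LATENCY_LIMIT_MS 200.0  /* hypothetical delay normalization */
#define GAP_LIMIT_PCT    5.0    /* hypothetical gap normalization   */
#define ACCEPTABLE       1.0    /* quality at or below this is fine */

static double audio_quality(double latency_ms, double gap_pct)
{
    double x = latency_ms / LATENCY_LIMIT_MS;
    double y = gap_pct / GAP_LIMIT_PCT;
    return sqrt(x * x + y * y);
}

int main(void)
{
    double q = audio_quality(120.0, 2.0);
    printf("quality = %.2f (%s)\n", q,
           q <= ACCEPTABLE ? "acceptable" : "unacceptable");
    return 0;
}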

The presence of jitter often presents an opportunity for a tradeoff between latency and data loss. Buffering, an application-level technique for ameliorating the effects of jitter, can compensate for jitter at the expense of latency. Transmitted frames are buffered in memory by the receiver for a period of time. Then the receiver plays out each frame with a constant latency, achieving a steady stream. If the buffer is made sufficiently large that it can hold all arriving data for a period of time as long as the tardiest frame, then the user receives a complete, steady stream. However, the added latency from buffering can be disturbing, so minimizing the amount of delay compensation is desirable.

Another buffering technique to compensate for jitter is to discard any late frame, at the expense of data loss. Discarding frames causes a temporal gap in the playout of the stream. Discarding frames can keep playout latency low and constant, but even a small percentage of gaps in the playout stream can be disturbing. In the case of audio speech, the listener would experience an annoying pause during this period. In the case of video, the viewer would see the frozen image of the most recently delivered frame.

Naylor and Kleinrock describe two policies that make use of these buffering techniques: the E-Policy, for Expanded time, and the I-Policy, for late data Ignored. Under the E-Policy, frames are never dropped. Under the I-Policy, frames later than a given amount are dropped. Since it has been observed that using a strict E-Policy tends to cause the playout latency to grow excessively, and that dropping frames occasionally is tolerable, we use the I-Policy as a means of examining the jitter compensation needed for a multimedia stream.

The I-Policy leads to a useful way to view the effects of jitter on a multimedia stream. Figure depicts the tradeoff between dropped frames and buffering as a result of jitter. We generated the graph by first recording a trace of interarrival times. We then fixed a delay buffer for the receiver and computed the percentage of frames that would be dropped; this represents one point in the graph. We repeated this computation with a range of buffer sizes in milliseconds to generate the curved line. The graph can be read in two ways. In the first, we choose a tolerable amount of dropped frames (the horizontal axis), then follow that point up to the line to determine how many milliseconds of buffering are required. In the second, we choose a fixed buffer size (the vertical axis), then follow that point over to the line to determine what percentage of frames are dropped. In this figure, if we wish to restrict the amount of buffering to a given number of milliseconds, then we must drop the corresponding percentage of frames, since that is how many will be later than that bound on average. For the audio stream used here, this equates to dropping roughly one frame every few seconds. On the other hand, if we wish to drop no frames at all, we have to buffer for considerably longer.

(Plot: buffering in milliseconds versus dropped frames in percent.)

Figure: Jitter Compensation. This picture depicts the amount of buffering needed for a given number of dropped frames. The horizontal axis is the percentage of dropped frames. The vertical axis is the number of milliseconds of buffering needed.
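One point on such a jitter compensation curve can be computed as sketched below. The trace values and frame period are invented example data, and the lateness computation is a simplification (each frame is compared against the nominal period) rather than the exact procedure used for the experiments.

/*
 * Sketch of computing one point on the jitter compensation curve
 * under the I-Policy: frames later than the chosen buffering delay
 * are dropped.  Trace values are made-up example data.
 */
#include <stdio.h>

/* Fraction of frames dropped for a given buffer size (ms), given a
 * nominal frame period and a trace of interarrival times (ms). */
static double dropped_fraction(const double *interarrival, int n,
                               double period_ms, double buffer_ms)
{
    int dropped = 0;
    for (int i = 0; i < n; i++) {
        double lateness = interarrival[i] - period_ms;
        if (lateness > buffer_ms)
            dropped++;
    }
    return (double)dropped / n;
}

int main(void)
{
    /* Hypothetical interarrival trace for a nominally 50 ms stream. */
    double trace[] = { 48, 52, 50, 95, 49, 51, 130, 50, 47, 55 };
    int n = sizeof(trace) / sizeof(trace[0]);

    /* Sweep buffer sizes to trace out the jitter compensation curve. */
    for (double buf = 0; buf <= 100; buf += 20)
        printf("buffer %3.0f ms -> %.0f%% dropped\n",
               buf, 100.0 * dropped_fraction(trace, n, 50.0, buf));
    return 0;
}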

How much buffering should we choose? We normalize the two axes, the percentage of gaps and the milliseconds of buffering. The best quality value along the jitter compensation curve is the point closest to the origin. In practice, there is no way for an application to determine exactly what buffer will give it this point. However, there are heuristics that consider past jitter in determining the most appropriate buffer size for future jitter. We assume a heuristic can provide an application with a near-optimal buffer size. We would like to know how much latency is added from buffering at this point. It seems natural to assume that as the area under the jitter compensation curve gets larger, the amount of buffering at the closest point along the curve gets larger. We hypothesize that there is a strong correlation between the area under the jitter compensation curve and the buffer size. If this hypothesis is true, we can determine the latency attributed to buffering from the jitter compensation area. We tested our hypothesis by generating jitter compensation curves for all data points from the experiments detailed in the earlier sections. We then computed the area under each curve. We computed the buffer size by normalizing the axes as described above and finding the lowest Euclidean distance to the origin along the curve. We plotted buffer size versus area and computed the correlation coefficient. Figure depicts these results. There is a high correlation between jitter compensation area and buffer size.

(Plot: buffer size in microseconds versus area under the jitter compensation curve in milliseconds squared; points are experiment runs, with a least-squares fit line and confidence interval; correlation 0.96.)

Figure: Jitter Compensation Area versus Buffer Size. The horizontal axis is the area under the jitter compensation curve. The vertical axis is the buffer size in microseconds. The points are each a separate experiment run. The middle line is the least-squares line fit. The outer two lines form a confidence interval around the line. The correlation coefficient is 0.96.
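The least-squares fit and correlation coefficient behind this comparison can be computed as in the following sketch; the (area, buffer size) pairs shown are placeholders for the values produced by the experiment runs.

/*
 * Sketch of the least-squares line fit and correlation coefficient
 * relating jitter compensation area to buffer size.  The data pairs
 * are placeholders, one per experiment run.
 */
#include <math.h>
#include <stdio.h>

/* Compute slope, intercept, and correlation coefficient of y on x. */
static void linear_fit(const double *x, const double *y, int n,
                       double *slope, double *intercept, double *r)
{
    double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
    for (int i = 0; i < n; i++) {
        sx += x[i];  sy += y[i];
        sxx += x[i] * x[i];  syy += y[i] * y[i];  sxy += x[i] * y[i];
    }
    double cov = n * sxy - sx * sy;
    double vx  = n * sxx - sx * sx;
    double vy  = n * syy - sy * sy;
    *slope     = cov / vx;
    *intercept = (sy - *slope * sx) / n;
    *r         = cov / sqrt(vx * vy);
}

int main(void)
{
    /* Placeholder (area, buffer size) pairs. */
    double area[]   = { 1e4, 5e4, 9e4, 2e5, 3e5 };
    double buffer[] = { 4e5, 1.5e6, 3e6, 6e6, 9e6 };
    double m, b, r;

    linear_fit(area, buffer, 5, &m, &b, &r);
    printf("buffer = %.3g * area + %.3g, correlation = %.2f\n", m, b, r);
    return 0;
}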

We can predict the optimal amount of buffering if we know the area under the jitter compensation curve. How can we determine the area under the jitter compensation curve? As the amount of jitter experienced by the system gets larger, more buffering should be required to alleviate the effects of jitter, and more gaps should appear in the multimedia stream. Thus, the area under the jitter compensation curve should get larger as jitter increases. We hypothesize that there is a strong correlation between the area under the jitter compensation curve and the variance in the packet interarrival times. If this hypothesis is true, then we can predict the area under the jitter compensation curve from the amount of jitter, and in turn predict the buffer size from the jitter compensation area. We tested our hypothesis by computing jitter for all data points from the experiments detailed in the earlier sections. We then compared these jitter values to the areas under the jitter compensation curves from our previous hypothesis. We plotted area versus jitter and computed the correlation coefficient. Figure depicts these results. There is a high correlation between jitter and compensation curve area. Note that the streams with the most jitter (the rightmost points) are all outside the confidence interval curves. Further work might be needed to determine the relationship between high-jitter multimedia streams and the area under the jitter compensation curve. Fortunately, our predictions deal almost exclusively with streams with far less jitter than the streams represented by those rightmost points.

From the above graphs, the least-squares fits give

    Buf  = a * Area
    Area = b * Jitter

where a and b are the fitted coefficients from the regression lines above. Substituting the equation for Area into the equation for Buf, we have

    Buf = a * b * Jitter

This last equation allows us to accurately estimate the optimal buffer size for an application, given the amount of jitter.

(Plot: area under the jitter compensation curve in milliseconds squared versus variance in microseconds squared; points are experiment runs, with a least-squares fit line and confidence interval; correlation 0.95.)

Figure: Jitter versus Jitter Compensation Area. The horizontal axis is the jitter, the variance in packet interarrival times. The vertical axis is the area under the jitter compensation curve. The points are each a separate experiment run. The middle line is the least-squares line fit. The outer two lines form a confidence interval around the line. The correlation coefficient is 0.95.

Predicting Jitter. From our results in earlier sections, we know the relationship between load and jitter for faster processors and networks. We hypothesize that a high network load along with a high processor load increases jitter by the sum of the jitter from the network and the jitter from the processor. While perhaps seeming obvious, our hypothesis will be false if processor jitter and network jitter are not independent. If some of the jitter attributed to the processor is actually due to jitter from the network, and/or some of the jitter attributed to the network is actually due to jitter from the processor, adding the jitter from each component will result in more jitter than the system actually experiences. However, if our hypothesis is true, we can add the predicted jitter from each component in predicting jitter for systems with both components.

We tested our hypothesis with an experiment. We first computed the jitter we would predict on a system with a loaded processor and a loaded network. This prediction is based on the amount of jitter attributed to the processor and to the network, as obtained in earlier results. We next experimentally measured the jitter with a loaded processor and a loaded network for each of the Ethernet, Fibre Channel, and HIPPI networks. We then compared the predicted results to the actual results. Table depicts this comparison. The predicted jitter values are close to the actual jitter values. It therefore seems appropriate to add the jitter attributed to processor load alone to the jitter attributed to network load alone in order to predict the jitter attributed to processor load and network load together.

Network          Processor    Network    Predicted    Actual
Ethernet
Fibre Channel
HIPPI

Table: Predicted versus Actual Jitter. This table depicts total jitter predictions based on the amount of jitter contributed by the processor and the network. Processor is the amount of jitter contributed by a loaded processor. Network is the amount of jitter contributed by a loaded network. Predicted is the sum of the Processor and Network values. Actual is the amount of jitter measured during our experiments.

Predicting Latency. We can predict the amount of latency from the jitter compensation buffer by using predictions of the amount of jitter. In addition to the buffering latency, there is additional latency from the sender processing, the network transmitting, and the receiver processing. From our micro experiments, we know the latency from recording and playing audio and the latency attributed to sending and receiving packets. We can compute the latency from the network based on the frame size and network bandwidth. To predict the total latency, we add the latencies from recording the audio frame, sending the audio frame to the client, receiving the audio frame at the receiver, buffering for jitter compensation, and playing the audio frame.
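A small sketch of this latency sum is given below; every component value is hypothetical and stands in for the measured or predicted quantities described above.

/*
 * Sketch of the total latency prediction: the sum of recording,
 * sending, network transmission, receiving, jitter-compensation
 * buffering, and playing.  All values are hypothetical.
 */
#include <stdio.h>

struct latency_ms {
    double record;    /* record one audio frame                     */
    double send;      /* sender-side protocol processing            */
    double network;   /* frame bits / bandwidth, time on the wire   */
    double receive;   /* receiver-side protocol processing          */
    double buffer;    /* jitter compensation, predicted from jitter */
    double play;      /* play one audio frame                       */
};

static double total_latency(const struct latency_ms *l)
{
    return l->record + l->send + l->network +
           l->receive + l->buffer + l->play;
}

int main(void)
{
    /* A 1000-byte frame over a 10 Mbps network, hypothetical costs. */
    struct latency_ms l = {
        .record = 20, .send = 2,
        .network = (1000.0 * 8.0) / (10.0 * 1e6) * 1000.0,
        .receive = 2, .buffer = 60, .play = 20,
    };
    printf("predicted end-to-end latency: %.1f ms\n", total_latency(&l));
    return 0;
}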

Predicting Data Loss. In order to predict data loss, we need to identify what form data loss may take and when data loss may occur. In general, data loss can take many forms, such as reduced bits from encoding, dropped frames, and lossy compression. For an audio conference, we assume data loss only in the form of dropped frames or reduced frame rate. We assume data loss under three conditions:

Voluntary. As described above, an application may choose to discard late frames in order to keep playout latency low and constant, and we assume the audio conference uses heuristics to discard enough frames to achieve the best quality.

Saturation. When either the network or the processor does not have sufficient capacity to transmit data at the required frame rate, data loss occurs. For example, if the audio conference requires more bandwidth than the network can deliver, there will be data loss (a small sketch of this condition follows the list). We can compute when systems reach capacity based on our previous work measuring processor capacities and on theoretical network bandwidths. Although the actual network bandwidth can often be less than the theoretical network bandwidth, using the theoretical bandwidth gives us an upper bound on network utilization. If future work measures the actual network bandwidth appropriate for an application, it can be used in place of the theoretical network bandwidths we use.

Transmission Loss. In our previous experiments we found that, typically, a small percentage of packets, on average, are lost when the network is running under maximum load. We assume a correspondingly small maximum rate of lost data due to network transmission.
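The saturation condition can be expressed as a one-line calculation, sketched here with illustrative bandwidth numbers.

/*
 * Sketch of the saturation condition for data loss: when the required
 * rate exceeds what the network (or processor) can carry, the excess
 * is lost.  Bandwidth numbers are illustrative only.
 */
#include <stdio.h>

/* Fraction of data lost when required bandwidth exceeds capacity. */
static double saturation_loss(double required_mbps, double capacity_mbps)
{
    if (required_mbps <= capacity_mbps)
        return 0.0;
    return 1.0 - capacity_mbps / required_mbps;
}

int main(void)
{
    /* e.g. an audio conference needing 12 Mbps on a 10 Mbps network */
    printf("loss: %.0f%%\n", 100.0 * saturation_loss(12.0, 10.0));
    return 0;
}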

Predicting Quality. At last! We have built and tested the accuracy of an experiment-based model that will allow us to explore audio conference quality under different system configurations. We can quantify how effectively today's computer systems support multi-person audio conferences. We can predict when today's systems will fail due to too many users or too much load on the processors or networks. We can see how much using real-time priorities will help audio conference quality. We can evaluate the benefits of expensive high-performance processors and high-speed networks before installing them. We can even investigate possible performance benefits from networks and processors that have not yet been built. Let's go exploring.

We predict application quality for three scenarios: high-performance processors and high-speed networks, increasing users, and increasing load.

High-Performance Processors and High-Speed Networks. Our results in earlier sections showed that both high-performance processors and high-speed networks reduce jitter. However, which reduces jitter more? And, more importantly, which improves application quality more?

We assume we have twenty audio conference participants. Below, we use our model to evaluate quality for a variable number of users, but here we evaluate one possible audio conference configuration that has interesting quality predictions. We compute quality under two different scenarios. In the first, processor load remains constant while the network bandwidth increases. In the second, network bandwidth remains constant while processor power increases. Figure shows these predictions. For twenty users, increasing the processor power to a sufficiently high SPECint92 value results in acceptable audio conference quality. At no time does increasing the network bandwidth result in acceptable quality. In this scenario, we conclude that processor power influences audio conference quality more than network bandwidth does.

Number of Users. While today's computer systems may struggle to support even twenty audio conference participants, tomorrow's processor improvements promise to support more and more users. But how many more? How do more and more audio conference users affect application quality? Figure depicts the predicted effects of increasing users on audio conference quality. We predict audio conference quality for three different audio conference configurations: a low-end workstation with a typical network (Sun IPX and Ethernet), a mid-range workstation with a fast network (Sun Sparc 5 and Fibre Channel), and a high-performance workstation with a high-speed network (DEC Alpha and HIPPI). As we saw earlier, today's typical workstations and networks have trouble supporting even a small number of audio conference participants. However, workstations such as Sun Sparc 5s connected by fast networks such as Fibre Channel can support many more users, and very high-performance workstations such as DEC Alphas connected by high-speed networks such as HIPPI can support more users still.

(Plot: quality versus SPECint92 or Mbps, for a varying network (Ethernet and Sun IPX baseline, Fibre Channel, HIPPI) and for a varying processor (SGI Indigo 2, DEC Alpha Server); a horizontal line separates acceptable from unacceptable quality.)

Figure: Audio Conference Quality versus Processor or Network Increase. The horizontal axis is the SPECint92 power of the workstation or the network Mbps. The vertical axis is the predicted quality. There are two scenarios depicted. In the first, the processor power is constant (equivalent to a Sun IPX) while the network bandwidth increases; this is depicted by the solid curve. In the second, the network bandwidth is constant (equivalent to an Ethernet) while the processor power increases; this is depicted by the dashed curve. The horizontal line marks the limit between acceptable and unacceptable audio conference quality.

(Plot: quality versus number of users for IPX with Ethernet, Sparc 5 with Fibre Channel, and DEC Alpha with HIPPI; a horizontal line separates acceptable from unacceptable quality.)

Figure: Audio Conference Quality versus Users. The horizontal axis is the number of users. The vertical axis is the predicted quality. There are three scenarios depicted. In the first, the processors are Sun IPXs connected by an Ethernet. In the second, the processors are Sun Sparc 5s connected by a Fibre Channel. In the third, the processors are DEC Alphas connected by a HIPPI. The horizontal line marks the limit between acceptable and unacceptable audio conference quality.

Processor and Network Load. Audio conferences with many users are resource intensive, forcing processors and networks to run at heavy loads. In addition, audio conference streams are often integrated into larger distributed multimedia applications. In the past, applications have tended to expand to fill or surpass available system capacity. As system capacities increase, audio conference users will demand higher frame rates and better resolution, making heavy-load conditions likely in the future. We predict the effects of increasing load on audio conference quality.

Figure depicts the predicted effects of load on audio conference quality. There are three classes of systems depicted. A traditional system has Sun IPXs connected by an Ethernet. A mid-range system has Sun Sparc 5s connected by a Fibre Channel. A high-end system has DEC Alphas connected by a HIPPI. The predictions for audio conference quality are almost identical for the three systems. We saw earlier that the processor is more crucial than the network for audio conference quality. Increasing processor load decreases audio conference quality more than improving the network speed and processor power increases it.

Figure also depicts Sun IPXs connected by an Ethernet but using real-time priorities instead of default priorities, shown by the bottom line. With real-time priorities, audio conference quality does not suffer from increased processor jitter as processor load increases. Under conditions of increasing load, real-time priorities have a greater effect on improving quality than do faster processors and faster networks.

Summary

Our goal is to identify improvements that reduce audio conference processor load. We developed a model for audio conference processor load and presented experimentally based analysis of the effects on processor load of improving each component of the model. In addition, we explored the effects of system improvements on audio conference quality.

(Plot: quality versus processor load for IPX with Ethernet, Sparc 5 with Fibre Channel, DEC Alpha with HIPPI, and IPX with Ethernet using real-time priorities; a horizontal line separates acceptable from unacceptable quality.)

Figure: Audio Conference Quality versus Load. The horizontal axis is the processor load. The vertical axis is the quality prediction. There are four system configurations depicted. In the first, the processors are Sun IPXs connected by an Ethernet. In the second, the processors are Sun Sparc 5s connected by a Fibre Channel. In the third, the processors are DEC Alphas connected by a HIPPI. In the fourth, the processors are again Sun IPXs connected by an Ethernet, but they are using real-time priorities instead of default priorities. The upper horizontal line marks the limit between acceptable and unacceptable audio conference quality.

Based on our analysis of individual components, silence deletion appears to improve the scalability of audio more than all other improvements. This result holds even when using compression, and even for ten times faster processors, networks, and Digital Signal Processing (DSP) hardware. Further, the simple silence deletion algorithms we studied achieve most of the potential benefit from silence deletion for audio conference processor load. Simple silence deletion algorithms are sufficient because the biggest reduction in processor load from silence deletion is in removing the silence of the non-speakers, which even the basic algorithms do well.

Sending and receiving packets are already relatively low-cost, so reducing that cost further yields relatively little benefit. Better network protocols, faster networks, and even multicasting reduce load only slightly, with or without silence deletion. Techniques based on DSP hardware alone do not scale as well as silence deletion alone. However, DSP-based silence deletion and compression together scale better than any other technique. Silence deletion done in a DSP chip can reduce processor load substantially, and compression and silence deletion together in a DSP chip can reduce processor load by a further amount.

In applying our quality model, we found that for some configurations high-performance processors will improve audio conference quality more than high-speed networks will. However, the typical Ethernet network becomes the bottleneck in application quality as the number of users increases. Thus, even high-performance processors will not sufficiently improve application quality as the growing number of users saturates existing networks. The traditional Ethernet network will become the bottleneck in approximately a year, when workstation performance has doubled, and may even be the bottleneck for networks of today's high-performance workstations.

When multimedia applications are running under conditions of increasing load, real-time priorities have a greater effect on improving quality than do faster processors and faster networks. In fact, hardware improvements alone will not reduce jitter enough to eliminate the need for application buffering techniques. However, for multimedia on a Local Area Network (LAN), real-time priorities can reduce jitter enough to eliminate the need for application buffering today. On a Wide Area Network (WAN), especially the Internet, real-time priorities will not be available on all routers, reducing the effectiveness of real-time priorities in reducing jitter. In this case, buffering techniques may still be needed.

Chapter

Flying through the Zoomable Brain Database

Overview

Gradually, in laboratories around the world, neuroscientists from diverse disciplines are exploring various aspects of brain structure. Since there is so much research to be done on the nervous system, neuroscientists must specialize. An undesirable result of this specialization is that it is difficult for individual neuroscientists to investigate how their results fit together with results from other scientists. Moreover, they sometimes duplicate research efforts, since there is no easy way to share information.

To enhance the work of neuroscientists, we propose a zoomable database of images of brain tissue. We begin with the acquisition of 3-d structural maps of the nervous system using high-field, high-resolution magnetic resonance imaging (MRI). The MR images show the entire brain in a single dataset and preserve spatial relationships between structures within the brain. However, even high-resolution MRI cannot show individual cells within the brain. We therefore anchor confocal microscope images to these 3-d brain maps. Because of their higher resolution, the confocal images are of smaller regions of the brain. Many such images are montaged into larger images(1) by aligning cellular landmarks between images. These montages are then aligned with structural landmarks from the MR images so the high-resolution images can be anchored accurately within the MR image. In addition to the image data, other types of brain data can be linked to the 3-d structure.

The brain database will be embedded in three dimensions. The user starts a typical investigation by navigating through the MR images in a coarse 3-d model of the brain to a site of interest. The user then zooms to higher-resolution confocal images embedded in the MRI landscape. This real-time navigating and zooming we call flying.

The scientific value of the data and the distributed nature of the database impose a series of user-level requirements that the flying interface should satisfy:

Wide Spread Use. We estimate the number of users who may be interested in using the database to be half the number of members of the Society for Neuroscience, and we assume standard figures for work hours per day, work days per week, and work weeks per month. We can predict the average number of simultaneous database users based on some possible usage amounts:

Uses the Database      Simultaneous Database Users
one hour per day
one hour per week
one hour per month

Twenty-four Bit Color. As often as possible, the database will express experimental data in its purest form, so expensive raw data can be analyzed and reanalyzed by researchers worldwide. Confocal microscope images include up to three layers of eight-bit grayscale, with each layer representing one brain chemical. The final images include 24 bits of color, all important to the scientists who view them.

(1) We have created a WWW server that includes an example of a montage of confocal micrographs at http://www.cs.umn.edu/Research/neural.

Three Dimensions. Since the anchoring MR images are three-dimensional, flying must be allowed in all three dimensions. This necessitates computing two-dimensional frames from the three-dimensional images. The computation can take place by two different methods:

Remote Flying. In remote image processing, the server does the image computation and transfers only a 2-d frame to the client. We estimate Remote Flying will require sending a megapixel of data per frame. Remote flying shifts the image processing load from the client to the server.

Local Flying. In local image processing, the server transfers the 3-d data and the client does the image computation. We estimate Local Flying will require sending the 3-d region of the brain per frame, substantially more data. Local flying shifts the image processing load from the server to the client.

Smooth Navigation. Flying must be very smooth, even over a varied, non-dedicated network. This necessitates a motion-picture-quality frame rate and jitter control to compensate for network variance.

Adaptability. Flying must adapt to a wide variety of resources, including varying processor and disk types and non-dedicated, variable-bandwidth networks.

This chapter is organized as follows. The next section describes related work in scientific visualization, neuroscience, compression, network performance, and disk performance. We then introduce our model of flying, detail micro experiments that measure the processor load of each component, analyze the experiments and project the results to future environments, and finally summarize the important contributions of this chapter.

Related Work

Scientific Visualization

Hibbard, Paul, Santek, Dyer, Battaiola, and Voidrot-Martinez designed an interactive scientific visualization application. They were seeking to bridge the barrier between scientists and their computations and to allow them to experiment with their algorithms.

Elvins described five foundation algorithms for volume visualization in an intuitive, straightforward manner. His paper included references to many other papers that give the algorithm details.

Singh, Gupta, and Levoy showed that shared-address-space multiprocessors are effective vehicles for speeding up visualization and image synthesis algorithms. Their article demonstrated excellent parallel speedups on some well-known sequential algorithms.

We investigate the performance of an application that will use techniques developed in other scientific visualization applications. In particular, we use performance results from Singh et al. and may implement some of the algorithms that Elvins describes.

Neuroscience

Carlis et al. present the database design for the Zoomable Brain Database. We have developed data models for novel neuroscience. One model focuses on MR and confocal microscopy images, the connections between them, and the notions of macroscopic and microscopic neural pathways in the brain. We have implemented this connections data model in the Montage DBMS and have populated the schema with neuroscience data.

Kandel, Schwartz, and Jessel discussed in detail the fundamentals behind neural science.

Slotnick and Leonard produced an extensive photo atlas of an albino mouse forebrain. The idea of a zoomable digitized rat brain came from atlases such as this one.

The Zoomable Brain Database is based on neuroscience fundamentals. At its heart is a digital brain atlas. Our work presents network, processor, and disk performance analysis in accessing the digital images.

Compression

Wallace presented the Joint Photographic Experts Group (JPEG) still picture standard. He described encoding and decoding algorithms and discussed some relations to the Motion Picture Experts Group (MPEG) standard.

Patel, Smith, and Rowe designed and implemented a software decoder for MPEG video bitstreams. They gave performance comparisons for several different bit rates and platforms. They claimed that memory bandwidth, not processor speed, was the bottleneck for the decoder. They also gave a cost analysis of three different machines for doing MPEG decoding.

We expand compression research by providing careful processor load measurements of JPEG compression and decompression on a Sun IPX. In addition, we predict the effects of compression on network, processor, and disk performance.

Network Performance

Hansen and Tenbrink investigated gigabyte networks and their application to scientific visualization. They explored various system topologies centered around the HIPPI switch. They analyzed network load under some possible visualization applications and hypothesized on the effects of compression.

Lin, Hsieh, Du, Thomas, and MacDonald studied the performance characteristics of several types of workstations running on a local Asynchronous Transfer Mode (ATM) network. They measured the throughput of four different application programming interfaces (APIs). They found that the native API achieves the highest throughput, while TCP/IP delivers considerably less.

(Figure: Flying Model. No Compression: read + render + send. Network Compression: read + render + compress + send. Network and Disk Compression: read + decompress + render + compress + send.)

Figure: Flying Model. Our model of the flying software used by the server includes reading, rendering, and sending. Compression reduces the data before sending; decompression expands the data before rendering.

Disk Performance

Ruwart and O'Keefe describe a hierarchical disk array configuration that is capable of sustaining high transfer rates. They show that virtual memory management and striping granularity play key roles in enhancing performance.

Model

We have three different models for the flying software, depicted in the Flying Model figure.

No Compression. Flying with no compression has three processor load components: read is the processor load for reading the image from the disk, render is the processor load for computing the 2-d frame from the 3-d image, and send is the processor load for sending the frame to the user.

Network Compression. Flying with network compression has four processor load components: read is the processor load for reading the image from the disk, render is the processor load for computing the 2-d frame from the 3-d image, compress is the processor load for compressing the frame, and send is the processor load for sending the compressed frame to the user.

Network and Disk Compression. Flying with network compression and disk compression has five processor load components: read is the processor load for reading the compressed image from the disk, decompress is the processor load for decompressing the image, render is the processor load for computing the 2-d frame from the 3-d image, compress is the processor load for compressing the frame, and send is the processor load for sending the compressed frame to the user.(2)

(2) A clever image computation algorithm might do the calculation on the compressed images.

Micro Experiments

We can predict the processor throughput required for flying by using the processor load for individual components, as in the audio conference analysis. We obtain the flying throughput for a Sun IPX to provide a baseline for predictions for faster machines. We reuse the appropriate micro experiment results that were used for the audio conference application, and run experiments to measure the components new to the flying application. The components we reuse are send, receive, read, and write. The new components are compress and decompress.

We obtained the processor load for doing JPEG compression and decompression.(3) We modified the source code for cjpeg and djpeg to perform compression and decompression in separate processes. We followed experimental techniques identical to those used to obtain the processor loads for the previous components. As before, we used a counter process that incremented a double variable in a tight loop to measure the processor load of the JPEG components. We compressed and decompressed images to and from PPM files (see the PBMPLUS footnote in the next section).

The independent variable in our measurements was the JPEG compression quality. Figure depicts the processor load for compression and decompression versus quality. All points are shown with confidence intervals.

(3) The official archive site for this software is ftp.uu.net. The most recent released version can always be found there in the directory graphics/jpeg. The particular version we used is archived as jpegsrc.v.tar.Z.

(Plot: processor load in milliseconds per byte for JPEG decompression and compression versus JPEG quality setting.)

Figure: Processor load for JPEG compression and decompression versus compression quality. The vertical axis is processor time in milliseconds/byte. The horizontal axis is the JPEG quality setting. All points are shown with confidence intervals.

Using linear regression, we can derive the JPEG processor load in milliseconds per bit of the original PPM image:

JPEG             msec/bit
compression
decompression
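Applying the regression to a whole frame is a simple multiplication, sketched below. The per-bit coefficients are placeholders for the fitted values, which depend on the JPEG quality setting; the frame size assumes a megapixel image at 24 bits of color, as in the flying requirements.

/*
 * Sketch of applying the per-bit JPEG regression to estimate the
 * per-frame compression and decompression cost.  The coefficients
 * are placeholders for the fitted values.
 */
#include <stdio.h>

#define MS_PER_BIT_COMPRESS   0.00025  /* hypothetical fitted cost */
#define MS_PER_BIT_DECOMPRESS 0.00020  /* hypothetical fitted cost */

int main(void)
{
    /* A megapixel frame at 24 bits of color, as assumed for flying. */
    double frame_bits = 1024.0 * 1024.0 * 24.0;

    printf("compress:   %.0f ms/frame\n", frame_bits * MS_PER_BIT_COMPRESS);
    printf("decompress: %.0f ms/frame\n", frame_bits * MS_PER_BIT_DECOMPRESS);
    return 0;
}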

Predictions

Networks

Since the database is distributed, the above user-level requirements determine the network requirements. For one user, we can predict the network load that flying will induce under the image processing types described above (a small sketch of this calculation follows the table):

Flying Type      Mbits/second
Remote
Local
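A sketch of this bandwidth estimate follows. The frame size and frame rate are assumptions (a megapixel at 24 bits of color, and a motion-picture rate of 30 frames per second) rather than figures taken from the measurements.

/*
 * Sketch of the per-user and aggregate bandwidth needed for remote
 * flying: bits per frame times frame rate times simultaneous users.
 * Frame size and frame rate are assumptions, not measured values.
 */
#include <stdio.h>

#define BITS_PER_FRAME (1024.0 * 1024.0 * 24.0)  /* megapixel, 24-bit color */
#define FRAMES_PER_SEC 30.0                      /* assumed smooth rate     */

static double total_mbits_per_sec(int users)
{
    return users * BITS_PER_FRAME * FRAMES_PER_SEC / 1.0e6;
}

int main(void)
{
    int counts[] = { 1, 10, 100, 200 };
    for (int i = 0; i < 4; i++)
        printf("%3d users: %8.0f Mbits/second\n",
               counts[i], total_mbits_per_sec(counts[i]));
    return 0;
}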

We can determine the minimal network required for flying using projected network bandwidths. The following table lists the Optical Carrier (OC) network rates:

Protocol      Mbits/second
OC
OC
OC
OC

We infer from the above two tables that we need more bandwidth than even an OC network can deliver to support just one remote flying user. Since local flying appears to be very difficult under all projected network bandwidths, we will assume remote flying in the rest of the flying predictions. We also assume a flying rate of one hour per week. The network bandwidth requirement increases further with additional simultaneous users. Figure shows the load predictions for a variable number of users.

(Plot: total bandwidth in Mbits/second, log scale, versus number of simultaneous users, with OC-24, OC-12, and OC-3 rates as horizontal lines.)

Figure: Bandwidth versus Number of Simultaneous Users. The curve is the total bandwidth required. The horizontal lines are various Optical Carrier (OC) network bandwidths. On this and subsequent graphs, the Mbits/second axis is in log scale.

(Plot: average bandwidth per server in Mbits/second, log scale, versus number of servers, with OC-24, OC-12, and OC-3 rates as horizontal lines.)

Figure: Bandwidth versus Servers. The curve is the average bandwidth per server. The horizontal lines are OC network bandwidths.

To counteract the huge bandwidth requirements, we can increase the number of servers. If we assume that each user flies through the database for one hour per week, we have the average number of simultaneous users estimated above. Figure shows the load predictions for a variable number of servers.

At the upper end there is, on average, one server per connected user; we take this to be the upper limit on the number of servers. But even then, with each server connected to its clients by an OC network, each server can only support one user. Clearly, there is a need to find ways to reduce network load.

Compression

We turn to compression to reduce network bandwidth. If each frame is compressed before sending, the network bandwidth will be reduced.

We assume JPEG compression for flying. The Joint Photographic Experts Group (JPEG) has been working to establish the first international compression standard for continuous-tone still images, both grayscale and color. Although JPEG is intended as a still picture standard, it has greater flexibility for video editing than does MPEG and is likely to become a de facto intraframe motion standard as well. The quality factor in JPEG lets you trade off compressed image size against quality of the reconstructed image: the higher the quality setting, the closer the output image will be to the original image, and the larger the JPEG file.(4) A quality of 100 will generate a quantization table of all 1s, eliminating loss in the quantization step, but there is still information loss in subsampling as well as roundoff error.

Since the viewed images are to be as close to the original data as possible, we assume a quality of 100 must be used for flying compression. Pilot tests show that JPEG with a quality of 100 reduces the size of PPM images(5) by about 70%. In all subsequent predictions we assume a compression ratio of 70%.

Figure shows our predictions of the effects of compression versus the number of simultaneous users. Figure shows our predictions of the effects of compression versus the number of servers.

With compression and additional servers, we can satisfy the user-level network bandwidth requirements for the expected number of simultaneous users. However, doing compression and decompression induces additional processor load (see the processor predictions below).

(4) Quality values below a certain threshold generate quantization tables that are considered optional in the JPEG standard. Some commercial JPEG programs may be unable to decode the resulting file.

(5) Jef Poskanzer's PBMPLUS image software can be obtained via FTP from export.lcs.mit.edu (contrib/pbmplus.tar.Z) or ftp.ee.lbl.gov (pbmplus.tar.Z).

(Plot: total bandwidth without compression and with 70% compression, in Mbits/second on a log scale, versus number of simultaneous users, with OC-24, OC-12, and OC-3 rates as horizontal lines.)

Figure: Bandwidth versus Simultaneous Users. The top curve depicts the total bandwidth without compression. The lower curve depicts the total bandwidth with 70% compression. The horizontal lines are OC network bandwidths.

(Plot: average bandwidth per server without compression and with 70% compression, in Mbits/second on a log scale, versus number of servers, with OC-24, OC-12, and OC-3 rates as horizontal lines.)

Figure: Bandwidth versus Servers. This graph depicts the average bandwidth per server for a variable number of servers. The top curve represents bandwidth without compression. The lower curve represents bandwidth with 70% compression. The horizontal lines are OC network bandwidths.

Disks

Compression may also be used for reducing the size of images on disk. The most obvious effect is the decrease in storage space. A fully-mapped mouse brain will take on the order of terabits of data, and a rat brain will take even more; a 70% compression rate would reduce the required storage space for both accordingly.

Compressed disk images will also increase the disk flying throughput. With an image compression of 70%, disk drives can supply over three times more frames per second.

Using measured disk bandwidths, we can determine the disks required for meeting the flying throughput requirements. Studies at the Army High Performance Computing Research Center at the University of Minnesota (AHPCRC) have measured the performance of single disk arrays. When going through a file system, normal I/O runs at a modest rate in Mbits/second. Direct I/O, which bypasses the file system buffer cache and beams data directly into the user's application buffer, runs considerably faster on a non-fragmented file. Direct I/O and striping across multiple disk arrays can achieve up to N times the single-array rate, where N is the number of arrays striped over. The AHPCRC currently stripes over a small number of arrays, producing correspondingly higher transfer rates. Within a few years they expect the single-array speed to increase substantially, possibly reaching Gbits/second for large file transfers. If the physical design matches users' needs effectively, the database retrieval rate may approach the maximum disk rate, but applications cannot exceed this retrieval speed.
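How many striped arrays a given flying load requires can be estimated with the small calculation below; the per-array rate, the per-user requirement, and the compression ratio are illustrative placeholders.

/*
 * Sketch of how many striped disk arrays are needed to sustain a
 * given flying bandwidth, with and without compression.  The
 * per-array rate and compression ratio are placeholder values.
 */
#include <math.h>
#include <stdio.h>

#define ARRAY_MBPS        400.0  /* hypothetical direct-I/O array rate */
#define COMPRESSION_RATIO 0.70   /* fraction of bytes removed          */

static int arrays_needed(double required_mbps, int compressed)
{
    if (compressed)
        required_mbps *= (1.0 - COMPRESSION_RATIO);
    return (int)ceil(required_mbps / ARRAY_MBPS);
}

int main(void)
{
    double one_user_mbps = 750.0;  /* illustrative per-user requirement */
    printf("uncompressed: %d arrays, compressed: %d arrays\n",
           arrays_needed(one_user_mbps, 0), arrays_needed(one_user_mbps, 1));
    return 0;
}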

Figure shows the disk drive throughputs compared to the increase in bandwidth as simultaneous users increase. Several direct I/O disk arrays are required to meet the flying requirements for one user. However, with compression, one user's flying needs can be met with fewer direct I/O disk arrays.

Figure shows the disk rates compared to the decrease in bandwidth as servers increase. A single disk array will not satisfy even a single user's flying requirements.

(Plot: total bandwidth without compression and with 70% compression versus number of simultaneous users, with 8, 4, and 1 direct-I/O disk array throughputs as horizontal lines; Mbits/second on a log scale.)

Figure: Bandwidth versus Simultaneous Users. The Mbits/second axis is a log scale. The increasing curves are bandwidths without compression and with compression. The horizontal lines are all disk throughput rates.

(Plot: average bandwidth without compression and with 70% compression versus number of servers, with 8, 4, and 1 direct-I/O disk array throughputs as horizontal lines; Mbits/second on a log scale.)

Figure: Bandwidth versus Number of Servers. The decreasing curves are bandwidths with compression and without compression. The horizontal lines are disk rates.

An 8-array configuration will satisfy the user flying requirements only with compression and enough servers; without compression, an 8-array configuration needs more servers still. A 4-array configuration will need both compression and a larger number of servers to meet the flying requirements.

A drawback of compression is that it will decrease processor frame throughput. Compressed images on disk must be uncompressed before the processor can perform the image computation to generate the 2-d frame, and compressing before sending takes additional processor time. What are the processor requirements for flying?

Processors

To estimate the effects of new high-performance graphical workstations on the processor load for image calculation, we use the results reported in . They report that image calculation from a voxel volume data set (the same size we assume for our region) takes about seconds per frame on a MHz Silicon Graphics Incorporated (SGI) Indigo workstation using Levoy's ray-casting algorithm, and about a second per frame using a new shear-warp algorithm. We will assume a processor load of one second per frame for an Indigo workstation.

Compression          Operations                                     Seconds/frame    Mbits/second
None                 read, render, send
Network              read, render, compress, send
Network and Disk     read, decompress, render, compress, send

Table: Flying Throughput. This table shows the predicted Sun IPX processor loads with three ways to use compression for flying.

We compare the processing power of the Sun IPX to the SGI Indigo by comparing their performance under the Systems Performance Evaluation Cooperative (SPEC) benchmark integer suite. The SPECint value for a Sun IPX is and the SPECint value for the MHz Indigo is . Roughly, the Indigo is times faster than the IPX, so we assume the IPX takes seconds to do the image calculation from the region.
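This scaling argument multiplies a measured time by a ratio of benchmark scores. The sketch below is a minimal illustration of that arithmetic, with placeholder SPECint values and render time standing in for the measured numbers in the text; it assumes, as the text does, that execution time scales inversely with SPECint.

    /* A minimal sketch (placeholder benchmark values) of scaling render time
     * by the SPECint ratio between two processors. */
    #include <stdio.h>

    int main(void) {
        double specint_indigo = 1.0;   /* placeholder: SPECint of the SGI Indigo */
        double specint_ipx    = 1.0;   /* placeholder: SPECint of the Sun IPX    */
        double indigo_seconds_per_frame = 1.0;  /* assumed one second per frame  */

        /* Assume execution time scales inversely with SPECint. */
        double ratio = specint_indigo / specint_ipx;
        double ipx_seconds_per_frame = indigo_seconds_per_frame * ratio;

        printf("Predicted IPX render time: %.1f seconds/frame\n",
               ipx_seconds_per_frame);
        return 0;
    }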

We can now predict the flying throughput for the Sun IPX processor server. Table gives the Sun IPX processor throughput predictions for the above methods for the server to provide flying described in Section .

Network compression slightly reduces processor frame throughput. Disk compression, however, reduces processor frame throughput by more than . Most importantly, it would take a processor over nine hundred times faster than a Sun IPX to satisfy the flying requirements for even one user. Even an SGI Indigo with processors would still only have a flying throughput of Mbits/second, not enough for even one user. Clearly, there is a need to find ways to increase processor flying throughput.

Component                      Software Load    Hardware Load
read uncompressed image        seconds          seconds
read compressed image          seconds          seconds
decompress image               seconds
compress frame                 seconds
render frame                   seconds          seconds
send uncompressed frame        seconds          seconds
send compressed frame          seconds          seconds

Table: Flying Component Loads. This table shows Sun IPX processor loads induced by various flying components when done in software and in hardware.

One such method may be specialized hardware. There are coprocessors that perform JPEG compression and decompression. Similarly, many computers have specialized graphics rendering hardware. To estimate the processor load for using specialized hardware, we assume that accessing specialized hardware is equivalent to one kernel call and that calls can take place in block sizes of Kbytes. We also assume that all hardware is sufficiently fast to keep up with the processor. Table analyzes the Sun IPX processor improvements from using such hardware for each flying component operating on one frame. Table gives the processor throughput predictions for an SGI Indigo workstation using hardware support for the flying components.

We can now predict the processor throughput required for flying. Figure shows the load predictions versus simultaneous users, and Figure shows the load predictions versus servers. Flying throughput for a Sun IPX would be at the very bottom of these graphs. Even a 20-processor SGI Indigo 2 using compression cannot satisfy flying requirements for even one user. We therefore consider the use of specialized hardware and kernel support for reading, compressing, calculating, and sending.

Compression          Operations                                     Seconds/frame    Mbits/second
None                 read, render, send
Network              read, render, compress, send
Network and Disk     read, decompress, render, compress, send

Table: Hardware Flying Throughput. Possible SGI Indigo processor loads induced by the three forms of flying when the processor is equipped with specialized flying hardware.

Quality

So far, we have explored the network, disk, and processor requirements necessary to completely satisfy our user requirements. But when user requirements cannot be completely satisfied, there is often a tradeoff in system choices. For example, if more compression is applied, bandwidth requirements for the network and disk are reduced, but the processor may become overloaded with the added decompression cost. We apply the model presented in Section to predict the quality for flying on different system configurations. In order to apply our quality model we must determine the region of acceptable flying quality, predict jitter, predict latency, and predict data loss.

To determine the region of acceptable flying quality, we need to define acceptable limits for flying along each of the latency, jitter, and data loss axes. According to Jeffay and Stone, delays of milliseconds or under are acceptable for a video conference. We assume this is a lower bound on the acceptable latency for flying. For data loss, research in remote teleoperator performance has found that task performance is virtually impossible below a threshold of frames per second. We also assume that fewer than Mbits/second of data provides too little data to be useful, and that more than Mbits/second, frames/second, and bits of color provide no more useful data.

[Figure: log-scale plot of Mbits/second versus number of simultaneous users (0 to 200), with curves for total bandwidth without compression and with 70% compression, and horizontal lines for an SGI Indigo 2 with flying hardware and for a 20-processor SGI Indigo 2 with and without compression.]

Figure: Bandwidth versus Simultaneous Users. The upward sloping curves are total bandwidths with compression and without compression. The horizontal lines are processor flying throughput rates. The top horizontal line is the predicted flying throughput of an SGI Indigo 2 with specialized flying hardware.

[Figure: log-scale plot of Mbits/second versus number of servers (20 to 200), with curves for average bandwidth without compression and with 70% compression, and horizontal lines for an SGI Indigo 2 with flying hardware and for a 20-processor SGI Indigo 2 with and without compression.]

Figure: Bandwidth versus Number of Servers. The upward sloping curves are the average bandwidths per server with compression and without compression. The horizontal lines are processor flying throughput rates. The top horizontal line is the predicted flying throughput of an SGI Indigo 2 with specialized flying hardware.

[Figure: plot of latency in milliseconds added by the client versus data reduction in Mbits/second, with points for 1-, 4-, 8-, 16-, and 20-processor clients, marks for 16-bit color, 15 frames/s, half-flying, and quarter-flying, and a curve separating acceptable from unacceptable quality.]

Figure: Client Application Quality. This graph depicts a measure of quality for the client. The horizontal axis is the number of Mbits/second of data reduction received by the client. The vertical axis is the latency added by the client. The points are SGI Indigo 2 clients with different numbers of processors. The curve represents an acceptable level of quality: all points inside the curve will have acceptable quality, while points outside will not. Note that the clients are not equipped with any special flying hardware.

As described in Section , the presence of jitter often presents an opportunity for a tradeoff between latency and data loss. We predict jitter, latency, and data loss as in Section , Section , and Section , respectively.

To compute application quality, we use the methods described in Section . Figure depicts our quality predictions. The individual points are all SGI Indigo 2 clients with different numbers of processors. In this figure we have not assumed any specialized flying hardware. To simplify the computation, we assume that the servers will be able to provide the bandwidth requested by all the clients.

[Figure: plot of quality versus number of clients (5 to 30), with curves for flying without compression and with 70% compression, an acceptable-quality level, and arrows marking where the server bandwidth capacity is reached.]

Figure: Quality versus Clients. Clients are -processor SGI Indigo 2 workstations with no specialized hardware. The server is an SGI Indigo workstation with specialized hardware (see Section ). The arrows indicate points at which the server can no longer keep up with the bandwidth requests by the clients. At this point, perceived application performance decreases, as depicted by the increasing quality values.

Figure depicts quality versus the number of clients for flying both with compression and without. Clients are assumed to be -processor Indigo 2s without specialized hardware. The server is assumed to be an SGI Indigo workstation with specialized hardware (see Section ). The arrows indicate points at which the server can no longer keep up with the bandwidth requests by the clients. At this point, application performance decreases, as depicted by the increasing quality values. Thus, for fewer than clients, compression decreases client-side quality, mostly because of the latency increase from the clients decompressing the images. However, for or more clients, compression increases application quality because the server can meet the bandwidth requirements of more clients.

Note that this graph depicts the quality tradeoffs with one particular system configuration. Different system configurations may have different quality tradeoffs.

Summary

Since the users of the Zoomable Brain Database will be distributed across the country, the system stress from flying will be spread over the underlying network. If the network topology appears as a backbone with six Network Access Points (NAPs), like the proposals for the national data highway, we can determine the bandwidth required for each link.

We assume:

an equal distribution of users among servers,

servers each directly connected to the backbone, and

compression.

We can use the analysis from previous sections to determine the system requirements:

Component                    Mbits/second
Backbone
NAPs
Individual Connections
Processor Servers

Figure depicts the network for the constructed Zoomable Brain Database. Unfortunately, some of the proposals for the national data highway have bandwidths like those in Figure .

We can only hope to fly on such a network by reducing the user-level flying requirements; a number of modifications can be made to the user requirements that will ease the system requirements.

[Figure: network diagram with links labeled 43 Gigabits and 7 Gigabits.]

Figure: Bandwidth requirements for the Zoomable Brain Database system on a topology similar to a possible national data highway.

[Figure: network diagram with links labeled 155 Mbits and 50 Mbits.]

Figure: A proposed national data highway network. In order to allow flying in this environment, user-level requirements must be relaxed.

The server can reduce the frame resolution in each dimension by one-half (half-flying) or one-fourth (quarter-flying), reducing the data needed by a factor of four or sixteen, respectively. Likewise, fewer bits of color decrease the data needed for each pixel. The table below summarizes the system benefits of several of these modifications.

                          Reduction
Modification         I/O        Processor        Network
16-bit color
half-flying
quarter-flying
15 frames/sec

Processing and sending 16-bit color reduces processor and network requirements. Yet many neuroscientists may find scientific analysis difficult under such conditions. In half- or quarter-flying, one frame pixel represents 4 or 16 display pixels, respectively. Thus network data rates are reduced by factors of 4 and 16, respectively, but at the expense of image quality. Sending fewer frames per second reduces all system components. However, motion displayed at 15 frames/second has a much rougher flow than motion displayed at the full frame rate.
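As a rough illustration of these reductions, the sketch below computes a raw data rate as width × height × color bits × frame rate and applies each modification; the baseline frame size, color depth, and frame rate are placeholders, not the user requirements from the earlier sections.

    /* A minimal sketch (assumed baseline values) of how the user-level
     * modifications scale the data rate: halving the resolution in each
     * dimension quarters the pixels per frame, fewer color bits shrink each
     * pixel, and a lower frame rate scales the whole stream. */
    #include <stdio.h>

    static double mbits_per_sec(double w, double h, double bits, double fps) {
        return w * h * bits * fps / 1e6;
    }

    int main(void) {
        /* Placeholder baseline; substitute the flying frame size, color depth,
         * and frame rate from the user requirements. */
        double w = 640, h = 512, bits = 24, fps = 30;

        printf("full flying      : %8.1f Mbits/s\n", mbits_per_sec(w, h, bits, fps));
        printf("16-bit color     : %8.1f Mbits/s\n", mbits_per_sec(w, h, 16, fps));
        printf("half-flying      : %8.1f Mbits/s\n", mbits_per_sec(w / 2, h / 2, bits, fps));
        printf("quarter-flying   : %8.1f Mbits/s\n", mbits_per_sec(w / 4, h / 4, bits, fps));
        printf("15 frames/second : %8.1f Mbits/s\n", mbits_per_sec(w, h, bits, 15));
        return 0;
    }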

The data reduction from the above modifications can also be combined. If we modify the user requirements to allow 16-bit color, 15 frames/second, and reduced-resolution flying, the system requirements for the Zoomable Brain Database would appear as in Figure . These reductions in user requirements make it possible to use the proposed infrastructure to achieve some of our goals, but will not be suitable for some neuroscience analysis.

All of our predictions have done the image computation at the server (remote flying), as opposed to image computation done at the client (local flying). But remote flying, while inducing significantly less bandwidth than local flying, increases the amount of server computation. With many database users, the server computation may become prohibitive. As client workstations become increasingly powerful, they may be able to readily perform the needed computations at a sufficient flying rate, reducing server load. In other environments, server load has proven critical for performance.

[Figure: network diagram with links labeled 88 Mbits and 15 Mbits.]

Figure: Bandwidth requirements for the Zoomable Brain Database system on a topology similar to a proposed national data highway. User requirements have been reduced to ease system requirements.

Advances in networking such as those described in this paper are needed to enable a new era of neuroscience. Researchers will be able to directly access high-quality, pristine data from all areas of the brain. Hypotheses will be formulated, evaluated, and scientifically tested through interaction with data collected by scientists on the other side of the world. As we have shown, realizing this dream fully will require gigabits per second of sustained throughput per user and terabits per second of aggregate bandwidth.

Chapter

The Virtual Cockpit

Overview

We have applied our model to the specifications and preliminary performance results of a flight simulator called the Virtual Cockpit. The Virtual Cockpit is a low-cost manned flight simulator of an F-15E built by the Air Force Institute of Technology. A soldier flies the Virtual Cockpit using the hands-on throttle and stick, while the interior and out-the-window views are presented in a head-mounted display. The Virtual Cockpit research was undertaken to build an inexpensive system, for use as a tactics trainer at the squadron level, that could participate in a Distributed Interactive Simulation (DIS).

DIS is a virtual environment being designed to allow networked simulators to interact through simulation, using compliant architecture, modeling, protocols, standards, and databases. The DIS system must be flexible and powerful enough to support increasingly large numbers of simulators. The Advanced Research Projects Agency (ARPA) estimates the need for exercises with simulators.

In addition to latency, jitter, and lost data, DIS simulators have an additional application-specific measure that affects application quality. DIS simulators use dead reckoning algorithms to compute position. Each simulator maintains a simplified representation of other simulators and extrapolates their positions based on their last reported states. When a simulator determines that other simulators cannot accurately predict its position within a predetermined threshold, it sends a state update packet. The state update contains the correct position and orientation, as well as velocity vectors and other derivatives, that the other simulators can use to initiate a new prediction. Figure depicts the difference between the actual flight path of a simulator and the dead reckoning flight path computed by the other simulators.

[Figure: diagram of an actual flight path and the dead reckoning path, marking where the deviation exceeds the threshold and where the update is received.]

Figure: Actual Path versus Dead Reckoning Path. The solid line represents the actual flight path of the simulator. The grey line represents the flight path as computed by the other simulators using a simple dead reckoning algorithm based only on direction and velocity. The dashed line represents a time when the perceived path deviated from the actual path by more than a preset threshold. At this time, a packet updating position, orientation, and direction is sent, and the dead reckoning resumes from this new posture.
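The decision of when to send a state update can be sketched in a few lines of code. The following is a minimal illustration, not the Virtual Cockpit implementation: it assumes a simple first-order (position plus velocity) extrapolation and made-up type and function names, and signals an update only when the position other simulators would dead reckon drifts beyond the threshold.

    /* A minimal sketch (assumed names, first-order extrapolation) of
     * threshold-based dead reckoning. */
    #include <math.h>
    #include <stdio.h>

    typedef struct { double x, y, z; } Vec3;

    typedef struct {
        Vec3 pos;   /* last position broadcast in a state update */
        Vec3 vel;   /* last velocity broadcast in a state update */
        double t;   /* time of the last state update             */
    } DeadReckonState;

    /* Position the other simulators will compute for us at time t. */
    static Vec3 extrapolate(const DeadReckonState *s, double t) {
        double dt = t - s->t;
        Vec3 p = { s->pos.x + s->vel.x * dt,
                   s->pos.y + s->vel.y * dt,
                   s->pos.z + s->vel.z * dt };
        return p;
    }

    /* Returns 1 if the deviation between the actual state and the dead
     * reckoned state exceeds the threshold, so an update must be sent. */
    static int needs_update(const DeadReckonState *s, Vec3 actual,
                            double t, double threshold) {
        Vec3 dr = extrapolate(s, t);
        double dx = actual.x - dr.x, dy = actual.y - dr.y, dz = actual.z - dr.z;
        return sqrt(dx * dx + dy * dy + dz * dz) > threshold;
    }

    int main(void) {
        DeadReckonState s = { {0, 0, 0}, {100, 0, 0}, 0.0 };  /* 100 m/s along x   */
        Vec3 actual = { 205.0, 3.0, 0.0 };                    /* true state at t=2 */
        printf("update needed: %s\n",
               needs_update(&s, actual, 2.0, 1.0) ? "yes" : "no");
        return 0;
    }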

Dead reckoning creates an additional quality measure specific to DIS applications:

Missed Updates: If a simulator is unable to send an update or process an incoming update, the accuracy of the simulation decreases.

Figure shows the effects of missed updates on accuracy. The horizontal line is the position threshold, set at meter. When the inaccuracy surpasses the threshold, an update is sent, bringing inaccuracy back to zero. Inaccuracy increases one meter on average for each update missed. These missed updates are shown by the taller triangles.

[Figure: plot of dead reckoning accuracy versus time, showing meters of inaccuracy over several seconds for low-order and high-order algorithms, the threshold line, and annotations marking one and two missed updates.]

Figure: Dead Reckoning Accuracy. The horizontal line is the position threshold, set at meter. The triangular shapes represent areas of inaccuracy. When they reach the threshold, an update is sent, bringing inaccuracy back to zero. Inaccuracy increases by as much as a meter on the average for each update missed. These missed updates are shown by the taller triangles.

The total inaccuracy for the simulation is represented by the sum of the areas of the triangles. The larger the sum, the more inaccurate the simulation. As we would expect, consecutive missed updates affect accuracy more than non-consecutive missed updates. There are two different simulations depicted in the graph. One uses a dead reckoning algorithm that does more computation (high-order), while the other uses a dead reckoning algorithm that does less computation (low-order). The total accuracy is independent of the dead reckoning algorithm used; changing the dead reckoning algorithm affects only the number of updates sent, not the total accuracy.

Our objective is to enhance our understanding of the performance bottlenecks that arise in the design of a multi-person distributed multimedia application. Among the factors we investigate are the performance of the Virtual Cockpit on existing networks and processors, and the effects of high-speed networks and high-performance processors on Virtual Cockpit quality.

This chapter is organized as follows. Section describes related work in Distributed Interactive Simulation and geographic information systems. Section introduces our model of the Virtual Cockpit. Section details micro experiments that measure the processor load of each component. Section describes macro experiments that test the accuracy of the predictions based on the micro experiments. Section analyzes the experiments and projects the results to future environments. And Section summarizes the important contributions of this chapter.

Related work

Distributed Interactive Simulation

Distributed Interactive Simulation (DIS) is a virtual environment being designed to allow networked simulators to interact through simulation, using compliant architecture, modeling, protocols, standards, and databases. DIS exercises involve the interconnection of a number of simulators in which simulated entities are able to interact within a computer-generated environment. The simulators may be present in one location or be distributed geographically, and the communications are conducted over the network. The DIS system must be flexible and powerful enough to support increasingly large numbers of simulators. The Advanced Research Projects Agency (ARPA) estimates the need for exercises with users.

We study the performance of a flight simulator, the Virtual Cockpit, designed to participate in DIS exercises. In particular, we examine the network and processor requirements as the number of users scales to .

Geographic Information Systems

A Geographic Information System (GIS) is a computer system capable of assembling, storing, manipulating, and displaying geographically referenced information, i.e., data identified according to their locations. GIS are often processor and network intensive. GIS research includes ways to effectively distribute processing time on high-performance parallel computer systems while obtaining enough fidelity to support realistic simulations, and developing network software architectures to support large-scale virtual environments.

GIS are the basis for many vehicle simulators. We use performance numbers from the GIS used by the Virtual Cockpit, Software Systems' MultiGen, in our experiments.

Model

The fundamental components for the Virtual Cockpit are: send a dead reckoning update packet, receive update packets, update dead reckoning status, read GIS information from the database, perform dead reckoning, render the frame, and display the frame.[1] The fundamental components of the Virtual Cockpit are depicted in Figure . Send is the processor load for sending updates to the other soldiers. Receive is the processor load for receiving updates. Update is the processor load for updating dead reckoning status structures. Read is the processor load for reading from a file. Reckon is the processor load for doing a dead reckoning computation to determine the position of another soldier. Render is the processor load for generating a frame to be displayed. Display is the processor load for displaying the frame on the screen. Send is done once for each update packet sent. Receive and Update are done once for each update packet received. Read, Render, and Display are done once per frame. Reckon is done once per soldier per frame.

Micro Experiments

After carefully measuring the processor, network, and disk loads induced by each component, we can predict the load of an application built with those components.

[1] We assume that the user interface contributes an insignificant amount of processor load.

[Figure: block diagram of the Virtual Cockpit model. Read (per frame), Reckon (per soldier), Render (per frame), and Display (per frame), together with Send, Receive, and Update (per packet), are grouped into Send, Receive, and Display groups.]

Figure: Virtual Cockpit Model. The boxes above each contain a fundamental component of the Virtual Cockpit. The larger dashed boxes place the fundamental components into conceptual groups.

Changes in application configuration or changes in hardware are represented by modifying the individual components and observing how that affects the application performance.

We reuse the appropriate micro experiment results from Section , Section , and Section that were used for the audio conference and flying applications, and run experiments to measure the components new to the Virtual Cockpit. The components we reuse are send, receive, read, and write. The new components are update, reckon, render, and display.

We ran our experiments on a dedicated network of SGI Personal Iris workstations. Each workstation had a MHz IP R processor, Mbytes of RAM, and Kbytes of cache.

We use a high-order dead reckoning algorithm as an upper bound on the processor load required (see Section ).

We compare the load for sending and receiving unicast packets to that of multicast, so we can better analyze the benefits of multicast in the Predictions section below. The update packet size is bytes, as defined by the DIS standard.

We followed experimental techniques identical to those used to obtain the processor loads for the previous components. As in Sections and , we used a counter process that incremented a double variable in a tight loop to measure the processor load for the model components.
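The counter-process technique can be sketched as follows. This is a minimal illustration under assumed details (the measurement interval and output are placeholders, not the thesis harness): a competing process counts in a tight loop for a fixed interval, and the drop in its count relative to an idle baseline indicates how much processor time the component under test consumed.

    /* A minimal sketch (assumed interval) of a counter process that soaks up
     * idle processor cycles by incrementing a double in a tight loop. */
    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    static volatile sig_atomic_t done = 0;
    static void stop(int sig) { (void)sig; done = 1; }

    int main(void) {
        double count = 0.0;
        signal(SIGALRM, stop);
        alarm(10);               /* count for a fixed interval, e.g. 10 seconds */
        while (!done)
            count += 1.0;        /* tight loop consuming idle processor cycles */
        /* A lower count than an idle baseline run means the measured component
         * consumed the difference in processor time. */
        printf("counter reached %.0f\n", count);
        return 0;
    }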

Table gives the processor times for the Virtual Cockpit components. All points are shown with a confidence interval. The times for multicast send and unicast send are indistinguishable at the confidence level. The time for multicast receive is significantly more than the time for unicast receive.

Although we present only the measurements of the processor load, in the past workstation performance has scaled with processor performance. We assume that processor performance will continue to reflect workstation performance.

Macro Experiments

We ran macro experiments for the Virtual Cockpit model with simulated soldiers. Each soldier ran a Virtual Cockpit on an SGI Personal Iris, one workstation per soldier. The workstations were connected by an Ethernet. Each Virtual Cockpit sent update packets per second, the average for most vehicles in a typical DIS exercise.

Our micro experiments measured the processor load for the fundamental components of the Virtual Cockpit. Using these measurements, we can predict the number of occurrences of each of the fundamental components in the macro experiments. For example, in a -second two-soldier simulation, we would expect each soldier to send and receive update packets. Table gives the predicted versus actual counts for all the Virtual Cockpit components for an experiment with soldiers. Predicted are the predicted numbers of times each component occurred per second, based on the performance of the individual components. Actual are the actual numbers of times each component was recorded per second in the experiment.

Component              msec    Low    High    Per Unit
Display                                       frame
Read                                          byte
Render                                        frame
Update                                        soldier
Compute                                       soldier
Unicast Receive                               update
Multicast Receive                             update
Unicast Send                                  update
Multicast Send                                update

Table: Processor load breakdown for the fundamental components of the Virtual Cockpit. Send is the processor load for sending updates to the other soldiers. Receive is the processor load for receiving updates. Update is the processor load for updating dead reckoning status structures. Read is the processor load for reading from a file. Reckon is the processor load for doing a dead reckoning computation to determine the position of another soldier. Render is the processor load for generating a frame to be displayed. Display is the processor load for displaying the frame on the screen. Low and High are the lower and upper bounds, respectively, from the confidence interval.

Component      Predicted    Actual
Receive
Send
Update
Compute
Render
Display
Read

Table: Component predictions for a Virtual Cockpit simulation with soldiers. Predicted are the predicted performance numbers based on the performance of the individual components. Actual are the actual performance numbers reported in the experiment. All predicted values are for one second.

Other macro experiments with soldiers had similar results, but we do not show them here to avoid redundancy.

All of the predicted performance results are within of the actual performance results. We regard future comparisons of predicted performance results that combine the Virtual Cockpit components into alternate configurations to be significantly different if they vary by more than .

Our micro experiments give us the performance results for the fundamental components of the Virtual Cockpit. Our macro experiments test the accuracy of combining the values from our micro experiments into accurate Virtual Cockpit predictions.

Predictions

By modifying the fundamental Virtual Cockpit components, we can predict performance on alternate system configurations. This allows us to evaluate the potential performance benefits from expensive high-performance processors and high-speed networks before installing them. Moreover, we can investigate possible performance benefits from networks and processors that have not yet been built.

Our approach for evaluating each alternative system is the same. We modify the parameters of our performance model to fit the new system, then evaluate the resulting model to obtain performance predictions. These analyses are intended to provide a sense of the relative merits of the various alternatives rather than present absolute measures of their performance.

Networks

We first investigate the network bandwidth requirements for a DIS simulation. Network bandwidth requirements depend upon the frequency with which each Virtual Cockpit broadcasts its position. In the absence of dead reckoning, each simulator would have to send an update every frame, for a maximum of 30 updates per second. However, with dead reckoning, the average rate is 3 updates per second, with a maximum of 17 updates per second.
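The predicted bandwidth is essentially the product of the number of soldiers, the update rate, and the update packet size, with unicast additionally multiplying by the number of receivers. The sketch below is a minimal illustration of that arithmetic; the packet size is a placeholder for the DIS-defined size mentioned above, and the 3 updates/second rate is the average rate from the figure.

    /* A minimal sketch (placeholder packet size) of the bandwidth prediction:
     * with multicast each soldier's updates are sent once, while with unicast
     * every update is sent to every other soldier. */
    #include <stdio.h>

    static double mbits(double bytes_per_sec) { return bytes_per_sec * 8.0 / 1e6; }

    int main(void) {
        double update_bytes    = 144.0;  /* placeholder for the DIS update size    */
        double updates_per_sec = 3.0;    /* average dead-reckoned rate from figure */
        for (int soldiers = 10; soldiers <= 10000; soldiers *= 10) {
            double multicast = soldiers * updates_per_sec * update_bytes;
            double unicast   = multicast * (soldiers - 1);
            printf("%6d soldiers: multicast %10.2f Mbits/s, unicast %12.2f Mbits/s\n",
                   soldiers, mbits(multicast), mbits(unicast));
        }
        return 0;
    }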

Figure depicts the predicted network bandwidth versus the number of soldiers in a Virtual Cockpit exercise. The upward sloping lines are the predicted network bandwidth for updates sent at the frame rate (without dead reckoning), at the maximum update rate with dead reckoning, and at the average update rate with dead reckoning. The three lines with the lower slopes all use multicast routing, as specified by the DIS standard. The line with the steepest slope represents the network bandwidth required for a unicast simulation at the average update rate. The horizontal lines are the maximum bandwidths for a T1, an Ethernet, and an ATM network.

[Figure: log-log plot of Mbits/second versus soldiers, with curves for unicast (3 updates/sec), frame rate (30 updates/sec), maximum (17 updates/sec), and average (3 updates/sec) update rates, and horizontal lines for ATM, Ethernet, and T1 bandwidths.]

Figure: Network Bandwidth versus Soldiers. This graph shows the predicted network bandwidth as the number of soldiers increases. The upward sloping lines are the predicted bandwidth for the frame rate, maximum rate, and average update rate. The steeply sloped line is the network bandwidth required for unicast with an average update rate. The horizontal lines are the maximum bandwidths for a T1, Ethernet, and ATM network. Both the horizontal and vertical axes are in log scale.

Without dead reckoning, a T1 network becomes saturated at about soldiers and an Ethernet at about soldiers. With dead reckoning and the maximum update rate, a T1 can support nearly soldiers, and an Ethernet can support nearly soldiers. With dead reckoning and the average update rate, a T1 can support about soldiers, and an Ethernet can support over soldiers. We use the maximum update rate in situations in which we are concerned about peak network bandwidth. We use the average update rate in situations in which we are concerned about network throughput.

Multicast is crucial for soldier scalability. Even at the average update rate, a Virtual Cockpit exercise using unicast rapidly exceeds the bandwidth available on T1 and Ethernet networks.

Processors

Existing networks are capable of supporting many soldiers, but what about existing processors? We investigate the performance of Virtual Cockpit exercises with more powerful processors.

We compare the performance of the SGI Personal Iris to other processors by comparing performance results from the Systems Performance Evaluation Cooperative (SPEC) benchmark integer suite. In past work, we have found SPEC results correlate inversely with execution times for processor-intensive tasks. Since the largest Virtual Cockpit component, render, is largely processor-intensive, we assume that Virtual Cockpit processor performance will correlate well with SPEC results.

The SPECint value for an SGI Personal Iris is and the SPECint value for the MHz SGI Indigo 2 is . Roughly, the Indigo 2 is times faster than the Personal Iris, so we assume it does all the Virtual Cockpit components measured in Section times faster.

Swartz et al. did experiments to measure the effects of frame rate on the ability to perform tasks through Unmanned Aerial Vehicles (UAVs). UAVs are used to conduct a variety of reconnaissance missions, with human operators interpreting the transmitted imagery at ground stations. The experiments found that human performance with frames per second is significantly better than with frames per second, but not consistently worse than with frames per second. Therefore, we assume a rate of frames per second in our processor load predictions.

[Figure: log-log plot of seconds of processor load versus soldiers, with stacked curves for send, receive, and display, and horizontal lines for the Personal Iris and Indigo 2 processor throughputs.]

Figure: Processor Load versus Soldiers. This graph depicts the predicted processor load in seconds on an SGI Personal Iris versus the number of soldiers. The graph reads from the bottom: the load of each piece is the sum of the pieces below it. Thus the total processor load is indicated by the display line at the top. The horizontal lines are the maximum processor throughputs for an SGI Personal Iris and an SGI Indigo 2. Both axes are in log scale.

We break the processor load into three pieces: send, receive, and display (see Figure for the components in each piece). Figure depicts the predicted processor load in seconds on an SGI Personal Iris versus the number of soldiers. The graph reads from the bottom: the load of each piece is the sum of the pieces below it. Thus, the total processor load is indicated by the display line at the top. The horizontal lines are the maximum processor throughputs for SGI Personal Iris and SGI Indigo 2 workstations, as indicated.
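The composition of these pieces into a total load follows directly from the model: per-frame costs (including one dead reckoning computation per soldier) scale with the frame rate, while per-packet costs scale with the update rate and the number of other soldiers. The code below is a minimal illustration with placeholder component costs and assumed frame and update rates, not the measured values.

    /* A minimal sketch (placeholder costs and rates) of combining per-frame,
     * per-soldier, and per-packet loads into a total processor load. */
    #include <stdio.h>

    int main(void) {
        /* Placeholder per-occurrence costs in seconds; substitute the measured
         * micro-experiment values from the component table. */
        double read_s = 0.0, render_s = 0.0, display_s = 0.0;   /* per frame   */
        double reckon_s = 0.0;                                   /* per soldier */
        double send_s = 0.0, receive_s = 0.0, update_s = 0.0;    /* per packet  */

        double frame_rate  = 15.0;  /* assumed frame rate (placeholder)          */
        double update_rate = 3.0;   /* assumed average updates per second        */

        for (int soldiers = 10; soldiers <= 10000; soldiers *= 10) {
            double per_frame  = (read_s + render_s + display_s +
                                 reckon_s * soldiers) * frame_rate;
            double per_packet = send_s * update_rate +
                                (receive_s + update_s) * update_rate * (soldiers - 1);
            printf("%6d soldiers: %.3f seconds of load per second\n",
                   soldiers, per_frame + per_packet);
        }
        return 0;
    }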

The processor is the bottleneck. The SGI Personal Iris is unable to support even the minimal frame rate for any soldiers. Processors significantly more powerful than the SGI Personal Iris are needed to realize the army's goals of hundreds and thousands of distributed soldiers interacting in a simulation. Indigo 2s, more powerful workstations, can support adequate frame rates while communicating with s of other soldiers.

Quality

If we have a system with the more powerful processors required for an acceptable frame rate, we can investigate the predicted user perception of the Virtual Cockpit: what is the application quality? In order to apply our quality model we must determine the region of acceptable quality, predict jitter, predict latency, and predict data loss.

To determine the region of acceptable quality, we need to define acceptable limits along each of the latency, jitter, and data loss axes.

Latency is the time from when an update is sent until it is processed and displayed by the other simulators. The DIS steering committee has defined latencies of to milliseconds as acceptable for DIS applications. We use 300 milliseconds as the maximum acceptable latency for the Virtual Cockpit.

Jitter is the variance in the latency. We assume 10% jitter is the maximum acceptable for the Virtual Cockpit.

For data loss, we note that research in remote teleoperator performance states that task performance is virtually impossible below a threshold of frames per second. We use frames per second as the minimum acceptable frame rate. The conventional rate of frames per second would provide a more realistic simulation, possibly influencing training effectiveness. We assume that frame rates above frames per second provide no more useful data.

For missed updates, we assume the same thresholds of meter position and degrees of accuracy used in SimNet. We assume the simulation must be accurate to be acceptable.
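These limits carve out a region of acceptable quality, and a predicted configuration is acceptable only if it stays inside the region on every axis. The sketch below is a minimal illustration of that test; the 300 ms latency and 10% jitter limits come from the figure at the end of this chapter, while the frame rate and accuracy thresholds are placeholders for the values discussed above.

    /* A minimal sketch (assumed thresholds where the text leaves them
     * unspecified) of testing whether a predicted configuration falls inside
     * the region of acceptable Virtual Cockpit quality. */
    #include <stdio.h>

    typedef struct {
        double latency_ms;     /* predicted end-to-end update latency           */
        double jitter_pct;     /* predicted jitter as a percentage of latency   */
        double frame_rate;     /* predicted displayed frames per second         */
        double accuracy_pct;   /* predicted dead reckoning accuracy             */
    } Prediction;

    static int acceptable(const Prediction *p) {
        const double max_latency_ms = 300.0;  /* from the acceptable-quality figure */
        const double max_jitter_pct = 10.0;   /* from the acceptable-quality figure */
        const double min_frame_rate = 3.0;    /* assumed teleoperation threshold    */
        const double min_accuracy   = 95.0;   /* placeholder accuracy requirement   */
        return p->latency_ms   <= max_latency_ms &&
               p->jitter_pct   <= max_jitter_pct &&
               p->frame_rate   >= min_frame_rate &&
               p->accuracy_pct >= min_accuracy;
    }

    int main(void) {
        Prediction indigo2_ethernet = { 120.0, 5.0, 15.0, 99.0 };  /* illustrative */
        printf("acceptable: %s\n", acceptable(&indigo2_ethernet) ? "yes" : "no");
        return 0;
    }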

[Figure: log-log plot of quality versus soldiers, with the middle line the predicted application quality, upper and lower lines the best- and worst-case scenarios, and a horizontal line at the acceptable-quality limit.]

Figure: Virtual Cockpit Quality versus Soldiers. The middle line represents predicted application quality as the number of soldiers increases. The upper and lower lines represent best- and worst-case quality scenarios, respectively. The horizontal line marks the acceptable/unacceptable quality limit.

We predict jitter, latency, and data loss as in Section , Section , and Section , respectively. To compute application quality, we use the methods described in Section . We examine the application quality for Virtual Cockpit exercises with SGI Indigo 2s, because SGI Personal Irises were incapable of achieving the minimum acceptable frame rate. Figure depicts our quality predictions.

SGI Indigo 2s provide acceptable quality for Virtual Cockpit exercises with about soldiers. Notice that in Figure we had predicted that the Indigo 2s could support soldiers when we analyzed the frame rate alone. Application quality is determined by latency, jitter, and missed updates, as well as frame rate; latency, jitter, and missed updates all increase as the number of soldiers increases.

High-Performance Processors   The quality of the Virtual Cockpit was improved by using high-performance processors. Is there further benefit from higher-performance processors?

We assume the network has sufficient bandwidth to handle all necessary updates, in order to minimize the effects of the network. We compare the quality of the Virtual Cockpit with SGI Personal Irises and SGI Indigo 2s to the quality of the Virtual Cockpit with processors 15 times more powerful than the Indigo 2.[2]

Figure shows the quality predictions for the Virtual Cockpit with different processors. The top curve is an SGI Personal Iris. The second curve is an SGI Indigo 2. The bottom curve is a processor 15 times more powerful than the Indigo 2. The horizontal line represents the acceptable quality limit. The knee in the curve for the 15x processor is where the processor decreases the frame rate in order to handle the updates from the other soldiers.

High-performance processors are crucial for acceptable Virtual Cockpit quality. SGI Personal Irises are unable to deliver acceptable application quality. More powerful SGI Indigo 2s can deliver acceptable application quality for up to soldiers. 15x processors provide better application quality than Indigo 2s, and can deliver acceptable application quality for up to soldiers.

High-Speed Networks   With the Virtual Cockpit running on a processor 15 times more powerful than the SGI Indigo 2, a T1 network will become saturated while supporting just s of soldiers. How much quality benefit will then be gained from a high-speed network? We compare the quality of the Virtual Cockpit with an Ethernet to that of the Virtual Cockpit with an ATM network. The ATM network transmits the update packets faster: Mbits/second, versus Mbits/second for an Ethernet. Past work has found that jitter and missed updates in the ATM network

[2] Processor performance has approximately doubled every year for the last years. If this trend continues, the 15x processor will come along in about years.

[Figure: log-log plot of quality versus soldiers for the Personal Iris, the Indigo 2, and the 15x processor, with the acceptable-quality limit and the point where the processor decreases the frame rate marked.]

Figure: Virtual Cockpit Quality versus Soldiers. The three curves represent the quality predictions for three different processors. The top curve is an SGI Personal Iris. The second curve is an SGI Indigo 2. The bottom curve is a processor 15 times more powerful than the Indigo 2. The horizontal line represents the acceptable quality limit. Both the horizontal and vertical axes are in log scale.

[Figure: log-log plot of quality versus soldiers for an Ethernet and an ATM network, with the acceptable-quality limit and the points where the processor decreases the frame rate, the Ethernet saturates, and the ATM saturates marked.]

Figure: Virtual Cockpit Quality versus Soldiers. The two curves represent the quality predictions for an Ethernet and an ATM network. The horizontal line represents the acceptable quality limit. Both the horizontal and vertical axes are in log scale.

are the same as jitter and missed updates in the Ethernet. We assume jitter and missed updates remain the same in high-speed networks. In order to make the network bandwidth more significant, we assume the maximum rate of 17 updates per second.

Figure shows the quality predictions for the Virtual Cockpit with different networks. The top curve is the quality prediction for an Ethernet. The lower curve is the quality prediction for an ATM network. The steep increase in the Ethernet curve occurs when the Ethernet becomes saturated; at this point, the Virtual Cockpit begins to increasingly miss updates. The first bend in the ATM curve occurs when the processor must decrease the frame rate in order to process all updates. The second bend in the ATM curve occurs when the ATM network becomes saturated.

High-speed networks are unimportant for Virtual Cockpit quality until existing networks reach saturation. The quality prediction curves for the Ethernet and the ATM are indistinguishable until the Ethernet becomes saturated. At this point, the ATM network greatly increases scalability. The network, as the present bottleneck in application quality, has been removed by the DIS use of dead reckoning and multicast.

Summary

Multi-person distributed multimedia applications stress all parts of a computer system. We have developed a quality planning model for distributed multimedia applications that allows us to investigate potential bottlenecks in application quality. We have applied our model to the Virtual Cockpit, a flight simulator used in Distributed Interactive Simulation (DIS). DIS is a virtual environment within which simulators may interact through simulation at multiple networked sites. In order for DIS to be effective for military training, simulation exercises must support up to soldiers.

In applying our model to the Virtual Cockpit we discovered:

High-performance processors are crucial for acceptable Virtual Cockpit quality. Low-end processors are unable to deliver acceptable application quality. More powerful processors can deliver acceptable application quality for s of soldiers.

High-speed networks are unimportant for Virtual Cockpit quality until existing networks reach saturation. T1 and Ethernet networks will saturate after generations of processor improvement. The network as the present bottleneck in application quality has been removed by the DIS use of dead reckoning and multicast.

Dead reckoning greatly benefits the Virtual Cockpit quality, primarily due to its reduction of network load. The small update message size plus the infrequent updates reduce the network load over a scheme where all simulators update their position with each frame. With dead reckoning, a T1 can support s of simulators, an Ethernet can support s of simulators, and an ATM network can support s of simulators.

Multicast is crucial for maintaining reduced network bandwidth. Even at the average update rate, a Virtual Cockpit exercise using unicast rapidly exceeds the bandwidth available on T1 and Ethernet networks.

We depict the above conclusions in Figure . The figure shows our measure of quality for the Virtual Cockpit. The user-defined acceptable latency, jitter, data loss, and missed updates determine a region of acceptable application quality, depicted by the shaded region. Each point is an instantiation of the application and the underlying computer system. All points inside the shaded region have acceptable quality, while those outside the region do not. There are four -soldier Virtual Cockpit instantiations shown: SGI Personal Iris with Ethernet, SGI Personal Iris with ATM, SGI Indigo 2s with Ethernet, and SGI Personal Iris with Ethernet and unicast. Only the Indigo 2s with Ethernet have acceptable application quality. The ATM does not significantly improve the quality of the Virtual Cockpit with the Personal Iris. Using unicast instead of multicast greatly decreases the application quality for the Personal Iris with Ethernet. Note that the graph does not show the fourth axis of quality, missed updates, because of the difficulty in depicting four dimensions. We chose to eliminate the missed updates axis because the predicted number of missed updates is constant for all system configurations.

[Figure: three-axis plot of latency, jitter, and data loss, with limits of acceptable latency (300 ms), acceptable jitter (10%), and acceptable data loss bounding the acceptable region, and points for SGI Personal Iris with Ethernet, SGI Personal Iris with ATM (3 frames/sec), SGI Personal Iris with Ethernet and unicast, and SGI Indigo 2 with Ethernet.]

Figure: Virtual Cockpit Quality. The user defines the acceptable latency, jitter, and data loss. These values determine a region of acceptable application quality, depicted by the shaded region. All points inside the shaded region have acceptable quality, while those outside the region do not. Four Virtual Cockpit instantiations are shown as points in this space. Note that the graph does not show the fourth axis of quality, missed updates, because of the difficulty in depicting four dimensions.

Chapter

Conclusions

Most of the work people do is in groups. People communicate best when they can draw pictures and use voice and body language. Computers supporting collaborative work can provide realistic person-to-person communication by using multimedia. Multimedia applications can also provide collaboration in synthetic environments through virtual reality. Today's explosive growth in fast networks and powerful workstations has the potential to support and even enhance group work through multimedia.

There are at least four reasons why multimedia is an important area for computer science research:

Teleconferences: Multimedia has greatly enhanced the richness of computer-based conferencing. In the beginning, there was only email as a form of Internet communication. Soon after there was the tty-based talk, where two users could type synchronously on a split computer screen. Emoticons were developed to allow users to express their emotions in symbols in addition to words. Today, audio and video enable people to use voice inflection, facial expressions, and gestures to assist communication, making teleconferences more lifelike.

Environments: Multimedia can enable environments that are not available in the real world. For instance, imagine a virtual reality environment that allows scientists to explore the surface of Mars through multimedia data fed back from a robot on the planet's surface. And, as shown in Section , soldiers may soon train for combat on a virtual battlefield, reducing the costs and risks associated with live combat training.

Communication: Multimedia brings the opportunity for communication with computers where previously communication was impossible. For instance, multimedia can enable people with visual impairment to operate a computer through gesture recognition and voice synthesis.

Fun: Multimedia makes computers fun. Multimedia applications have much more widespread appeal than do traditional text-based applications. Users who would otherwise not deal with computers will be attracted by multimedia, bringing money into the computer industry and more researchers into computer science.

Despite the real and potential benefits of multimedia, there are several obstacles to overcome in designing multimedia applications and systems. Multimedia and multi-user applications are more resource intensive than traditional text-based, single-user applications. In addition, multimedia applications have different performance requirements than do text-based applications. Text-based applications are sensitive to delay and loss, while multimedia applications are sensitive to delay and jitter. The bottlenecks to text-based application performance might lie in those components that induce delay, while the bottlenecks to multimedia applications might lie in those components that induce jitter. New techniques must be developed to identify bottlenecks in multimedia application performance.

We have developed a quality planning method for distributed collaborative multimedia applications that allows us to investigate potential bottlenecks in application quality. At the heart of our method is a model that allows us to predict the application performance from the user's perspective. Our model takes into account the components fundamental to multimedia applications: delay, jitter, and data loss. Our model allows us to investigate application bottlenecks by being adjustable to:

Number of Users: Explicit support for collaboration in our computer tools has the potential to support and even enhance traditional group work. Our model allows us to see how quality changes as the number of collaborating users increases.

Applications: Our model can analyze new distributed multimedia applications by altering the user requirements, often without further micro or macro experiments.

Hardware and Architectures: Our model can be adapted to hardware and methods of processing that are not available for experimentation, either because of prohibitive costs or because they are not yet built. Changing the model architecture and hardware components does yield less accurate predictions than predictions based on our micro experiments. However, additional micro and macro experiments can be performed incrementally as new hardware or architectures are explored.

Quality Metrics: The quality metric we have presented in Section has several useful characteristics: it has the properties of a linear mathematical metric, and it incorporates the three fundamental multimedia quality components: latency, jitter, and data loss. However, there can be many possible quality metrics for a given application. Fortunately, the rest of our model is independent of the quality metric chosen. If new metrics are developed, they can be used in place of our quality metric.

When computing the amount of jitter in a multimedia stream, jitter researchers have used a variety of metrics. However, as our work in Section shows, nearly all previously used jitter metrics are statistically similar. Thus, jitter researchers can use the jitter metric most suitable to their research without fear of a different result being obtained from an alternate metric.

Using the model in our method, we can explore the performance tradeoffs for a variety of multimedia applications as the underlying computer systems change. Among the system changes we can examine are changes in capacity, such as faster processors, networks, or disks, and changes in scheduling, such as real-time priorities or delay buffering. We have applied our model to three emerging applications: audio conferences, flying, and the Virtual Cockpit. There are three general trends that we have identified in analyzing these applications:

Processors are King: Processors are the bottleneck in performance for many multimedia applications. Audio conferences, Flying, and the Virtual Cockpit all saw a dramatic increase in application performance with an increase in processor power.

Networks are Queen: Networks with more bandwidth often do not increase the quality of multimedia applications. For audio conferences, a faster network did very little to improve application quality. In Flying, more network bandwidth did increase the performance for one user, but once that user's requirements were met there was little benefit from more bandwidth. For the Virtual Cockpit, more bandwidth did not noticeably affect application quality at all, but it did allow more simultaneous users to train for combat when existing networks became saturated. Networks with more bandwidth do not benefit few-person multimedia applications, but serve only to increase the scalability of the applications by allowing more simultaneous users.

Share the Burden: Application capacity requirements are not equally distributed across computer systems. Performance for many multimedia applications can be improved greatly by shifting capacity demand from computer system components that are heavily loaded to those that are more lightly loaded. And shifting capacity demand is crucial as the number of application users increases. For audio conferences, silence deletion transferred load from the network to the processor; while this decreased application quality for two audio conference users, it greatly increased application quality for three or more users. For Flying, application performance was totally unacceptable unless capacity demand was shifted from the processor to specialized hardware. Local flying, which would have shifted capacity demand from the server to the clients, required far too much network bandwidth to be feasible in the foreseeable future. For the Virtual Cockpit, dead reckoning shifted capacity demand dramatically from the network, enabling current networks to support the tens of thousands of soldiers required for effective combat training.

Our objective in identifying application bottlenecks is to understand the system limits that will prevent applications from meeting users' needs. After identifying each bottleneck, we explore ways to reduce its effect through improving system resources. We then explore the new bottlenecks that arise in the enhanced system. Our analysis at each stage is likely to overstate system performance because we assume maximum possible performance of each system component. However, the bottlenecks we identify are likely to be bottlenecks in practice, and the design principles suggested by the analysis should ameliorate these bottlenecks in practice.

To conclude, the major contributions of this thesis are:

A multimedia application quality model and implementation method that enables the prediction of application performance and the evaluation of system design tradeoffs.

Detailed performance predictions for three distributed collaborative multimedia applications: an audio conference, a flying interface to a 3D scientific database, and a collaborative flight simulator called the Virtual Cockpit.

The effects of system improvements on the performance of these multimedia applications.

A demonstration, using measures of jitter that have been used by jitter researchers, that all reasonable choices of jitter metrics are statistically similar.

Experiment-based studies of the effects of high-performance processors, real-time priorities, and high-speed networks on jitter.

Predictions on the importance of application jitter reduction techniques in the future.

Chapter

Future Work

The research in this thesis focuses on a new, rapidly expanding area of computer science. As such, it paves the way into an exciting new area of computer science, opening access to several new areas for future research.

The taxonomy given in Section is an ongoing document. Moreover, there are several areas of the current taxonomy that support multimedia application performance research that we have not had the opportunity to examine. Most notably, we would like to see a more extensive investigation of the effects of scheduling and reservation on multimedia application performance. In addition, our taxonomy could support a cost analysis of the tradeoffs between network bandwidths, disk data rates, and processor throughputs. Such an economic analysis could determine what system configuration will be the most economical for improving computer support for multimedia.

For some applications, there is potential for interaction effects among the quality events. For example, 3D graphics applications have multiple factors affecting users' perception of objects, and different combinations of requirements may yield satisfactory results. Such applications may even have a non-convex region of acceptable quality. Future research into new quality metrics appropriate for these applications may be required.

Applications that have changing user requirements present another challenge. For example, users doing remote problem-solving via a video link may want to maximize frame rate at the expense of frame resolution while they are identifying the location of the problem. Once the problem is located, they may want to maximize frame resolution at the expense of frame rate, perhaps even wanting a still image to best identify the problem. As presented, our model does not allow specification of dynamic user requirements. One possible solution would be to apply a separate quality planning model to each set of user requirements specified. The model that had the poorest quality for a given system configuration could then be examined more closely to determine the application quality bottleneck within.

Systems that need to support more than one simultaneous application present another challenge. Our model currently assumes the computer system is dedicated to the application we are analyzing. We might further develop our techniques to determine where potential bottlenecks might arise when several applications are running at once.

For the Virtual Cockpit, an interesting processing decision arises in the handling of dead reckoning updates. Dead reckoning allows a tradeoff among three factors: network bandwidth, processor load, and simulation accuracy. After a dead reckoning computation, a simulator will have the perceived position of other simulators. This perceived position will be inaccurate, up to a preset threshold. Lower thresholds provide a more accurate representation of other simulators, but increase the number of updates required, causing more network load. Higher-order dead reckoning decreases the number of updates sent, but increases processor load. Our model could be used to determine the effect of the various dead reckoning parameters on application quality, possibly recommending a specific accuracy and algorithm to use.

For our predictions, we assume that processor performance will continue to reflect workstation performance. If other workstation components, in particular the data bus, do not continue to scale with processor performance, future work would include extending our model to discover which workstation components are bottlenecks to application quality. In addition, for some predictions, especially those for Flying (see Section ), we assumed specialized hardware to shift capacity demand from the processor. A careful study of the effects of flying hardware will determine bottlenecks in currently proposed flying systems.

We studied the effects of real-time priorities on jitter when used at both the sender and receiver. Real-time priorities may not significantly reduce jitter when used at only the sender or at only the receiver. If this is true, real-time priorities may only be effective in reducing jitter for a local area network without an intervening router. On a Wide Area Network, especially the Internet, real-time priorities may not be available on all routers, reducing the effectiveness of real-time priorities at the sender and receiver. In this case, buffering techniques will be needed.

Another possible jitter reduction technique would be a real-time network protocol. Future work includes experiments to measure the effects of a real-time network protocol on jitter. However, network delivery of data within strict jitter bounds does not significantly help the sender and receiver. They must still worry about internal jitter due to queuing in the operating system. Stamping out jitter in the network does not eliminate the need for jitter management code in the hosts.
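The simplest form of such host-side jitter management is a fixed playout buffer: each packet is played a constant delay after it was generated, so variation in network and operating-system queueing is absorbed by the buffer. The C sketch below uses an illustrative 100 ms delay and made-up arrival times; it is not a recommendation for any particular application.

    /* Sketch: fixed-delay playout buffering, the simplest host-side jitter
     * management.  Each packet is played PLAYOUT_DELAY_MS after it was
     * generated; packets arriving after their playout time are late.
     * The delay and the arrival times are illustrative assumptions. */
    #include <stdio.h>

    #define PLAYOUT_DELAY_MS 100.0

    /* Given a packet's generation (sender) timestamp, return when to play it. */
    static double playout_time(double generation_ms)
    {
        return generation_ms + PLAYOUT_DELAY_MS;
    }

    int main(void)
    {
        /* Packets generated every 20 ms but arriving with jittered delays. */
        double generated[] = {  0.0,  20.0,  40.0,  60.0,  80.0 };
        double arrived[]   = { 45.0,  62.0,  95.0, 110.0, 185.0 };
        int i;
        for (i = 0; i < 5; i++) {
            double play = playout_time(generated[i]);
            if (arrived[i] <= play)
                printf("packet %d: buffered %.0f ms, played on time\n",
                       i, play - arrived[i]);
            else
                printf("packet %d: late by %.0f ms, dropped or gap\n",
                       i, arrived[i] - play);
        }
        return 0;
    }

A larger playout delay tolerates more jitter at the cost of end-to-end latency; adaptive schemes such as those cited in the bibliography adjust the delay as observed jitter changes.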

In Section we made predictions on the magnitude of jitter on a LAN in the future. MOS studies or other appropriate techniques can be used to determine the actual threshold of human perception of jitter. The threshold for human perception of jitter in an audio stream is probably different than the threshold of human perception of jitter in a video stream, as the eye is better able to smooth over occasional glitches in video than the ear is to account for glitches in audio.

The layout of multimedia data on disks and disk scheduling strategies may be crucial for reducing jitter for some multimedia applications. Database management systems (DBMS) that manage multimedia data may further impact the amount of jitter seen by an application that accesses that data. Future work would involve exploring the effects of different disk strategies and DBMS techniques for reducing jitter.

This thesis presents a method for determining application quality given user requirements and system configurations. Although the analysis presented here took many years to complete, we can dream of packaging our knowledge into a fast, flexible tool usable by application developers and visionaries. Imagine our model wrapped up in a smooth, sexy graphical user interface. The user enters application performance parameters at an appropriate level of detail, perhaps by choosing a bit rate, or the frame rate, frame size, and color, or even by choosing the most pleasing picture from a multimedia sample. The user enters system parameters with similar ease, specifying raw processing power and network bandwidth or selecting a workstation and network from a list of computer system hardware. Our model, the heart of the tool, determines the quality that the application would have given the specified information. The output results appear as rich and varied as the graphs shown in this thesis, with users able to quickly and easily identify the potential bottlenecks in application performance by varying the hardware, number of users, and application requirements. The benefits of packaging our model in this way would be enormous. Our model would save developers countless amounts of time and money by quickly identifying potential performance bottlenecks before undertaking expensive, time-consuming system design and implementation. More importantly, our model would enable new forms of interaction for all computer users, including ourselves. We would immerse ourselves in computer applications rich with graphics, audio, and video, and interact with each other across both time and space, finally realizing the full potential of computers and multimedia.

Bibliography

Sparcstation audio programming. Technical report, Sun Microsystems.
Richard C. Allen, Chris Bottcher, Phillip Bording, Pat Burns, John Conery, Thomas R. Davies, James Demmel, Chris Johnson, Lakshmi Kantha, William Martin, Geoffrey Parks, Steve Piacsek, Dan Pryor, Tamar Schlick, M. R. Strayer, Verena M. Umar, Robert Voigt, Jerrold Wagener, Dave Zachmann, and John Ziebarth. The Computational Science Education Project (CSEP), January. The URL for this book is httpcsepphyornlgovcsephtml. This was the first complete textbook available on the World Wide Web.

David P. Anderson, Ralf Guido Herrtwich, and Carl Schaefer. SRP: a resource reservation protocol for guaranteed-performance communication in the Internet. Report UCB/CSD, University of California, Berkeley, Computer Science Division, Berkeley, CA, USA, February.
David P. Anderson, Shin-Yuan Tzou, Robert Wahbe, Ramesh Govindan, and Martin Andrews. Support for continuous media in the DASH system. In Proceedings of the International Conference on Distributed Computing Systems (ICDCS), Paris, France, May-June. IEEE.
Sally Apgar. New weapon in retailing: technology. Minneapolis Star Tribune, August.
Ronnie T. Apteker, James A. Fisher, Valentin S. Kisimov, and Hanoch Neishlos. Video acceptability and frame rate. IEEE Multimedia, Fall.
Barberis and Pazzaglia. Analysis and optimal design of a packet voice receiver. IEEE Transactions on Communication, February.
John Bates and Jean Bacon. Supporting interactive presentation for distributed multimedia applications. Multimedia Tools and Applications, March.
Salvador Bayarri, Marcos Fernandez, and Mariano Perez. Virtual reality for driving simulation. Communications of the ACM, May.
Bharat Bhargava, Enrique Mafla, and John Riedl. Communication in the Raid distributed database system. International Journal on Computers and ISDN Systems.
David R. Boggs, Jeffrey C. Mogul, and Christopher A. Kent. Measured capacity of an Ethernet: Myths and reality. In Proceedings of the SIGCOMM Conference, August.

M. Brown, J. Foote, G. Jones, K. Sparck Jones, and S. Young. Open-vocabulary speech indexing for voice and video mail retrieval. In Proceedings of the Fourth ACM International Multimedia Conference, Boston, MA, November. ACM.
Tim Browning. Capacity Planning for Computers. Academic Press.
Luis-Filipe Cabrera, Edward Hunter, Michael J. Karels, and David A. Mosher. User-process communication performance in networks of computers. IEEE Transactions on Software Engineering, January.
J. Carlis, J. Riedl, A. Georgopoulos, G. Wilcox, R. Elde, J. H. Pardo, K. Ugurbil, E. Retzel, J. Maguire, B. Miller, M. Claypool, T. Brelje, and C. Honda. A zoomable DBMS for brain structure, function and behavior. In International Conference on Applications of Databases, June.
Stephen Casner and Stephen Deering. First IETF Internet audiocast. ACM SIGCOMM.
Todd Cavalier, Ravinder Chandhok, James Morris, David Kaufer, and Chris Neuwirth. A visual design for collaborative work: Columns for commenting and annotation. In Proceedings of IEEE HICSS.
D. Clark, S. Shenker, and L. Zhang. Supporting real-time applications in an integrated services packet network: Architecture and mechanism. Computer Communication Review, July.

M. Claypool, J. Riedl, J. Carlis, G. Wilcox, R. Elde, E. Retzel, A. Georgopoulos, J. Pardo, K. Ugurbil, B. Miller, and C. Honda. Network requirements for 3D flying in a zoomable brain database. IEEE JSAC, Special Issue on Gigabit Networking, June.
Mark Claypool, Joe Habermann, and John Riedl. The effects of high-performance processors, real-time priorities and high-speed networks on jitter in a multimedia stream. Technical Report TR, University of Minnesota, Department of Computer Science, June.
Mark Claypool and John Riedl. Silence is golden: The effects of silence deletion on the CPU load of an audio conference. Technical Report TR, University of Minnesota, Department of Computer Science.
Mark Claypool and John Riedl. Silence is golden: The effects of silence deletion on the CPU load of an audio conference. In Proceedings of IEEE Multimedia, Boston, May.
Mark Claypool and John Riedl. Analysis of processor power versus network bandwidth. Technical Report TR, University of Minnesota.
Mark Claypool and John Riedl. A quality planning model for distributed multimedia in the virtual cockpit. In Proceedings of ACM Multimedia, November.
The DIS Steering Committee. The DIS vision: a map to the future of distributed interactive simulation. Technical report, Institute for Simulation and Training, May.

American Technology Corporation. HyperSonic Sound System (HSS): A New Method of Sound Reproduction, July. The URL for this document can be found at httpwwwatcsdcomHTMLsoundhtml.
Digital Equipment Corporation, Intel Corporation, and Xerox Corporation. The Ethernet: A local area network, data link layer and physical layer specification, September.
Standard Performance Evaluation Corporation. SPEC primer, July. The SPEC primer is frequently posted to the newsgroup comp.benchmarks. SPEC questions can also be sent to specncgacupportalcom.
H. J. Curnow and B. A. Wichmann. A synthetic benchmark. The Computer Journal.
Jay Devore and Roxy Peck. Statistics: The Exploration and Analysis of Data. Wadsworth Inc., second edition.
Stanford Diehl. Forging a new medium. Byte, July.
Spiros Dimolitsas, Franklin L. Corcoran, and John G. Phipps Jr. Impact of transmission delay on ISDN videotelephony. In Proceedings of Globecom, IEEE Telecommunications Conference, Houston, TX, November.
Dick Dobson. Make Access and the web work together. Byte, October.
Jack J. Dongarra. Performance of various computers using standard linear equations software. Technical Report CS, University of Tennessee, February. To obtain a postscript copy, send email to netlibornlgov with message body "send performance from benchmark".

Chip Elliot. High-quality multimedia conferencing through a long-haul packet network. In Proceedings of the First ACM International Conference on Multimedia, New York, NY.
T. T. Elvins. A survey of algorithms for volume visualization. Computer Graphics, August.
Hans Eriksson. MBONE: The multicast backbone. Communications of the ACM, August. A FAQ on MBONE can be found at ftpveneraisiedumbonefaqtxt.
D. Ferrari, A. Gupta, and G. Ventre. Distributed advance reservation of real-time connections. Lecture Notes in Computer Science.
Domenico Ferrari. Delay jitter control scheme for packet-switching internetworks. Computer Communications, July.
J. W. Forgie and C. W. McElwain. Some comments on NSC note: effects of lost packets on speech intelligibility. Technical Report, Network Speech Compression Note, MIT Lincoln Laboratory, March.
Daniel Frankowski and John Riedl. Hiding jitter in an audio stream. Technical Report, University of Minnesota, Department of Computer Science.
Krzysztof Frankowski. Definition by Professor Krzysztof Frankowski, Computer Science Department, University of Minnesota, May.
Steve Gillmor. Notes opens up to the web. Byte, October.

Timothy A. Gonsalves. Packet-voice communication on an Ethernet local computer network: an experimental study. Communications of the ACM.
Walter Goralski and Gary Kessler. FIBRE CHANNEL: Standards, Applications and Products, December. The URL for this document can be found at httpwwwhillcompersonnelgckbre channelhtml.
R. Govindan and D. Anderson. Scheduling and IPC mechanisms for continuous media. ACM Operating Systems Review, October.
Mathew Gray. Internet Statistics: Growth and Usage of the Web and the Internet. The URL for this document can be found at httpwwwmitedupeoplemkgraynet.
Rick Grehan. NT in real time. Byte, October.
Mike Guidry and Mike Strayer. Computational Science for the Physical and Life Sciences, January. The URL for this book is httpcsepphyornlgovguidryphysphysroothtml. This course is completely online, with lectures delivered electronically from the network and student exercises turned in as Web html documents.
J. L. Gustafson and Q. O. Snell. HINT: A new way to measure computer performance. In Proceedings of the Annual Hawaii International Conference on System Sciences. IEEE Computer Society Press, January. Technical report available via ftp from ftpsclameslabgov.
Joe Habermann and John Riedl. Using real-time priorities to eliminate jitter in a multimedia stream. Technical report, University of Minnesota, Department of Computer Science, January.

Matti Hamalainen, Andrew B. Whinston, and Svetlana Vishik. Money in electronic commerce: Digital cash, electronic fund transfer and ecash. Communications of the ACM, June.
Charles Hansen and Stephen Tenbrink. Impact of gigabit network research on scientific visualization. IEEE Computer, May.
John H. Hartman and John K. Ousterhout. The Zebra striped network file system. ACM Transactions on Computer Systems, August.
Edward P. Harvey and Richard L. Shaffer. Capability of the distributed interactive simulation networking standard to support high fidelity aircraft simulation. In Proceedings of the Interservice/Industry Training Systems Conference.
John L. Hennessy and David A. Patterson. Computer Architecture: A Quantitative Approach. Morgan Kaufmann Publishers, Inc.
William L. Hibbard, Brian E. Paul, David A. Santek, Charles R. Dyer, Andre L. Battaiola, and Marie-Francoise Voidrot-Martinez. Interactive visualization of earth and space science computations. IEEE Computer, July.
A. Hopper. Pandora: an experimental system for multimedia applications. ACM Operating Systems Review, April.
J. J. Garcia-Luna-Aceves and H. P. Dommel. Floor control for multimedia conferencing and collaboration, January.
Satoru Iai, Takaaki Kurita, and Nobuhiko Kitawaki. Quality requirements for multimedia communication services and terminals: interaction of speech and video delays. In Proceedings of Globecom, IEEE Telecommunications Conference, Houston, TX, November.

Magid Igbaria and Snehamay Banerjee. Computer capacity planning management: Definitions and methodology. Journal of Information Technology.
Raj Jain. The Art of Computer Systems Performance Analysis. John Wiley and Sons, Inc.
K. Jeffay, D. Stone, and D. Poirier. YARTOS: kernel support for efficient, predictable real-time systems. In Joint IEEE Workshop on Real-Time Operating System and Software and IFAC/IFIP Workshop on Real-Time Programming, May.
K. Jeffay, D. Stone, and F. Smith. Kernel support for live digital audio and video. Computer Communications.
K. Jeffay, D. Stone, and F. Smith. Transport and display mechanisms for multimedia conferencing across packet-switched networks. Computer Networks and ISDN Systems, July.
K. Jeffay, D. L. Stone, T. Talley, and F. D. Smith. Adaptive best-effort delivery of audio and video data across packet-switched networks. In Third International Workshop on Network and Operating System Support for Digital Audio and Video, November.
Sanjay K. Jha and Bruce R. Howarth. Capacity planning of LAN using network management. In Proceedings of Conference on Local Computer Networks, Washington, DC.
Saimin Jin, Dhadesugoor R. Vaman, and Divyendu Sina. A performance management framework to provide bounded packet delay and variance in packet-switched networks. Computer Networks and ISDN Systems, September.

Rick Jones. Netperf. Hewlett-Packard. The netperf home page can be found at httpwwwcuphpcomnetperfNetperfPagehtml.
Henry C. Lucas Jr. Performance evaluation and monitoring. Computing Surveys, September.
Kalkbrenner. Quality of service (QoS) in distributed hypermedia systems. In Proceedings of the Second International Workshop on Principles of Document Processing.
E. R. Kandel, J. H. Schwartz, and T. M. Jessel. Principles of Neural Science, 3rd Ed. Appleton & Lange, Norwalk, CT.
Jonathan Kay and Joseph Pasquale. Measurement, analysis, and improvement of UDP/IP throughput for the DECstation. Technical report, University of California at San Diego.
S. Khanna, M. Sebree, and J. Zolnowsky. Realtime scheduling in SunOS. In Proceedings of the Winter Usenix Conference.
Kleinrock and Naylor. Stream traffic communication in packet switched networks: Destination buffering considerations. IEEE Transactions on Communications, December.
Peter Kroon and Kumar Swaminathan. A high-quality multirate real-time CELP coder. IEEE Journal on Selected Areas in Communications, June.

Edward D. Lazowska, John Zahorjan, David R. Cheriton, and Willy Zwaenepoel. File access performance of diskless workstations. Transactions on Software Engineering, August.
S. Leffler, M. McKusick, M. Karels, and J. Quarterman. The Design and Implementation of the BSD Unix Operating System. Addison-Wesley Publishing Company.
Meng-Jou Lin, Jenwei Hsieh, David Du, and James MacDonald. Performance of high-speed network I/O subsystems: Case study of a Fibre Channel network. In Proceedings of Supercomputing, November.
Meng-Jou Lin, Jenwei Hsieh, David H. C. Du, Joseph P. Thomas, and James A. MacDonald. Distributed network computing over local ATM networks. ATM LANs: Implementation and Experience with An Emerging Technology.
Michael R. Macedonia, Michael J. Zyda, David R. Pratt, Paul T. Barham, and Steven Zeswitz. NPSNET: A network software architecture for large scale virtual environments. Presence, October.
Vahid Mashayekhi, Janet Drake, Wei-Tek Tsai, and John Riedl. Distributed collaborative software inspection. IEEE Software, September.
Vahid Mashayekhi, Chris Feulner, and John Riedl. CAIS: Collaborative Asynchronous Inspection of Software. In The Second ACM SIGSOFT Symposium on the Foundations of Software Engineering. Association of Computing Machinery, December.
Michael J. Massimino and Thomas B. Sheridan. Teleoperator performance with varying force and visual feedback. In Human Factors, March.
W. Dean McCarty, Steven Sheasby, Philip Amburn, Martin R. Stytz, and Chip Switzer. A virtual cockpit for a distributed interactive simulation. IEEE Computer Graphics and Applications, January.

F. M. McMahon. The Livermore Fortran kernels: A computer test of numerical performance range. Technical Report UCRL, Lawrence Livermore National Laboratory, University of California, December.
Evi Nemeth, Garth Snyder, and Scott Seebass. Unix System Administration Handbook. Prentice Hall.
Judith S. Olson, Gary M. Olson, and David K. Meader. What mix of video and audio is useful for remote real-time work? In Proceedings of CHI, the Conference on Human Factors in Computing Systems.
Davis Yen Pan. Digital audio compression. Digital Technical Journal, Spring.
Craig Partridge. Gigabit Networking. Addison-Wesley.
Pastmaster. Smilies Unlimited, June. The URL for this document can be found at httpwwwczwebcomsmilieshtm.
Ketan Patel, Brian C. Smith, and Lawrence A. Rowe. Performance of a software MPEG video decoder. In Proceedings of ACM Multimedia, Anaheim, CA.
Stephen Travis Pope. A taxonomy of computer music. Computer Music Journal. This article can be found at httpwwwmitpressmiteduComputerMusicJournalEdNotesTaxonomy.
L. R. Rabiner and M. R. Sambur. An algorithm for determining the endpoints of isolated utterances. The Bell System Technical Journal, February.
Lynda Radosevich. Retailer swings its partners to EDI. Computerworld, September.

Ramachandran Ramjee, Jim Kurose, Don Towsley, and Henning Schulzrinne. Adaptive playout mechanisms for packetized audio applications in wide-area networks. In Proceedings of the Annual Joint Conference of the IEEE Computer and Communications Societies on Networking for Global Communication, Los Alamitos, CA, USA. IEEE Computer Society Press.
Andy Reinhardt. Building the data highway. Byte, March.
John Riedl, Vahid Mashayekhi, Jim Schnepf, Mark Claypool, and Dan Frankowski. SuiteSound: A system for distributed collaborative multimedia. IEEE Transactions on Knowledge and Data Engineering, August.
Lawrence A. Rowe and Brian C. Smith. A continuous media player. In Third International Workshop on Network and OS Support for Digital Audio and Video.
Radhika R. Roy. Networking constraints in multimedia conferencing and the role of ATM networks. AT&T Technical Journal, July-August.
Thomas M. Ruwart and Matthew T. O'Keefe. Storage and interfaces. In Proceedings of the Storage and Interfaces, Santa Clara, CA, January.
James Schnepf, Vahid Mashayekhi, John Riedl, and David Du. Computer-supported participative media-rich education. Educational Technology Review.
H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson. RTP: A transport protocol for real-time applications. RFC, January.
Henning Schulzrinne. Voice communications across the internet: A network voice terminal. Technical report, University of Massachusetts, Department of Electrical Engineering, August.

Shashi Shekhar, Sivakumar Ravada, Greg Turner, Douglas Chubb, and Vipin Kumar. Load balancing in high performance GIS: Partitioning polygonal maps. In Proceedings of the International Symposium on Large Spatial Databases.
John F. Shoch and Jon A. Hupp. Measured performance of an Ethernet local network. Communications of the ACM, December.
Jaswinder Pal Singh, Anoop Gupta, and Marc Levoy. Parallel visualization algorithms: Performance and architectural implications. IEEE Computer, July.
B. M. Slotnick and C. M. Leonard. A Stereotaxic Atlas of the Albino Mouse Forebrain. Superintendent of Documents, US Government Printing Office, Washington, DC.
Ben Smith. The Byte Unix benchmarks. Byte, March.
Brian C. Smith and Lawrence A. Rowe. A new family of algorithms for manipulating compressed images. IEEE Computer Graphics and Applications, September.
Monica Snell. Internet-savvy capacity-planning software helps guide network growth. LAN Times, September. The URL for this journal can be found at httpwwwlantimescom.
Michael V. Stein and John Riedl. The effects of transport method on the quality of audio conferences with silence deletion. Technical report, University of Minnesota, Department of Computer Science, June.
Donald Stone and Kevin Jeffay. An empirical study of delay jitter management policies. Multimedia Systems.
Merryanna Swartz and Daniel Wallace. Effects of frame rate and resolution reduction on human performance. In Proceedings of IS&T's Annual Conference, Munich, Germany.

T. Talley and K. Jeffay. Two-dimensional scaling techniques for adaptive, rate-based transmission control of live audio and video streams. In Proceedings of the Second ACM International Conference on Multimedia, October.
Tanenbaum. Networks. Prentice Hall.
R. Terek and J. Pasquale. Experiences with audio conferencing using the X window system, UNIX and TCP/IP. In Proceedings of the Summer Usenix Conference.
Robert H. Thomas, Harry C. Forsdick, Terrence R. Crowley, Richard W. Schaaf, Raymond S. Tomlinson, Virginia M. Travers, and George G. Robertson. Diamond: A multimedia message system built on a distributed architecture. IEEE Computer, December.
Don Tolmie and Don Flanagan. HIPPI: It's Not Just for Supercomputers Anymore. Los Alamos National Laboratory. The URL for this document can be found at httpnoconelanlgovlanpHIPPI Data Commhtml.
Claudio Topolcic. Experimental internet stream protocol (ST-II). RFC, October.
Dinesh C. Verma, Hui Zhang, and Domenico Ferrari. Delay jitter control for real-time communication in a packet switching network. IEEE Computer.
Gregory K. Wallace. The JPEG still picture compression standard. Communications of the ACM, April.
Peter Wayner. Inside the NC. Byte, November.
R. P. Weicker. Dhrystone: A synthetic systems programming benchmark. Communications of the ACM, October.
Duminda Wijesekera and Jaideep Srivastava. Quality of service metrics for continuous media. Protocols for Multimedia Systems.
Brian L. Wong. Capacity Planning for Solaris Servers. Prentice Hall.
David K. Y. Yau and Simon S. Lam. Adaptive rate-controlled scheduling for multimedia applications. In Proceedings of the Fourth ACM International Multimedia Conference, Boston, MA.
J. A. Zebarth. Let me be me. In Proceedings of Globecom, IEEE Telecommunications Conference, Houston, TX, November.
Hong-Jiang Zhang, Chien Yong Low, and Stephen W. Smoliar. Video parsing and browsing using compressed data. Multimedia Tools and Applications, March.