A Software-Only Video Production Switcher for the MBone



Tina Wong Ketan Mayer-Patel David Simpson Lawrence A. Rowe

Computer Science Division

University of California, Berkeley

ftwong,kpatel,davesimp,[email protected]

ABSTRACT

In this paper, we describe the design and implementation of a software video production switcher, vps, that improves the quality of MBone broadcasts. vps is modeled after the broadcast television industry's studio production switcher. It provides special effects processing to incorporate audience discussions, add titles and other information, and integrate stored videos into the presentation. vps is structured to work with other MBone conferencing tools. The ultimate goal is to automate the production of MBone broadcasts.

1 INTRODUCTION

Live programs are produced and broadcast worldwide on the Internet MBone using IP Multicast [1] and the MBone [2] conferencing tools (e.g., vic [8], vat [7], wb [3], sdr [5], etc.). Some examples are the NASA Space Shuttle missions, conference presentations (e.g., the Sixth International WWW Conference), and live music performances. These broadcasts usually have audiences ranging from tens to hundreds of viewers distributed worldwide.

We have broadcast the weekly Berkeley Multimedia and Graphics Seminar on the MBone since early 1995. The seminar is produced using the LBL/UCB MBone tools vic and vat to capture and transmit video and audio streams, respectively. The shared whiteboard tool wb is used to distribute PostScript slides. A second wb is used by the broadcast director (hereafter, director) to communicate with participants in order to debug problems with the transmission and monitor video and audio quality. We are also testing a floor control tool, qb [6], to facilitate question asking. The current broadcast uses a single camera that mainly focuses on the speaker, and occasionally pans to show other materials such as slides on the overhead projector, a demo running on a workstation, or members of the local audience. This single camera approach is the most common configuration seen in low-budget, small-scale broadcasts on the MBone.

We are working on tools to improve the quality and simplify the production and control of MBone broadcasts. This paper describes the design and implementation of a software-only video production switcher (vps) that can be used to improve the quality of an MBone broadcast. vps is modeled after a studio production switcher [9] used in the broadcast television industry. A studio production switcher is a custom-designed hardware device that provides an array of real-time editing and special effects functions. A director can select one of several picture sources (e.g., cameras, videotapes, and still image displays) to be the output. Other sources are generated by the device by applying special effects processing to one or more streams, such as inserting titles into a picture, superimposing one picture on another, chroma-keying, and wiping or fading from one picture to another.

vps will enhance the quality of an MBone broadcast by providing effects available in a hardware switcher. Specifically, we want to display local and remote audience discussions and feedback, add titles and credits, integrate stored analog and digital videos into the presentation, and incorporate special effects to improve the visual images and retain audience attention. Ultimately, our goal is to automate the production process by integrating vps with a broadcast management system [13] that maintains room, equipment and broadcast configurations, observes event schedules, and launches and monitors the MBone tools required to produce a broadcast.

T. Wong is supported by a GAANN fellowship.

Figure 1: vps in an MBone Broadcast. [Diagram labels: production switcher (vps), drawing tool (wb), video capture (vic), video transcoding (rtpgw), audio capture (vat), floor control (qb), recording tool (mbr), Studio MBone, Internet MBone.]

Figure 1 shows how vps fits into the context of a typical MBone broadcast. The Studio MBone is a local domain network connecting the processes required to produce the broadcast. It can support high data rates (e.g., 5 to 30 Mb/s) and good quality video streams (e.g., MJPEG video). The public MBone is the Internet and it runs at a considerably lower speed.

A hardware production switcher is a highly developed technology that could be used in an MBone broadcast. However, this solution has several limitations. First, video sources must be converted to a switch-specific analog format before being passed to the switcher, then converted back to a digital format and encapsulated as RTP data [11] after processing so that they can be sent to the MBone. vps avoids these conversions by operating on video streams in the RTP representation. Second, a hardware system is not extensible. vps is designed in a modular manner to allow new effects to be added to the system. Third, vps can be controlled by other software to automate decisions by a director through reactive software heuristic technologies. Finally, a hardware switcher has only one user interface. vps can be operated by many GUIs, ranging from a simple interface designed for a speaker to a sophisticated interface designed for a skilled director. Moreover, interfaces can be customized for different users.

This paper describes the design of vps, including the GUI interface, and the implementation of the current system. It is organized as follows. Section 2 presents an example of vps in use. Section 3 describes the vps software architecture. The implementation of vps is described in Section 4. Section 5 discusses future work and Section 6 summarizes the paper.

2 AN EXAMPLE SCENARIO

We describe how vps can improve the quality of an MBone broadcast by illustrating its use through an example scenario.

Before beginning the scenario, we first describe the two GUI interfaces to vps: the director's console and the speaker's console, shown in Figures 2 and 3, respectively. The director's console produces the content of the broadcast. The main window of this console has an editor area at the top and a preview area at the bottom. The director uses the editor area to choose a specific effect editor and to configure the parameters of that effect. The preview area shows thumbnails of video sources, including the results of applying an effect. The director can click on a thumbnail to see more information about that video. The output window of the director's console shows the current video being broadcast. This window also describes the broadcast multicast session, if applicable. The speaker's console allows the speaker to incorporate stored videos into the lecture. It has a preview area similar to the director's console. The speaker clicks on a thumbnail to select and bring up a VCR-like player to play back a video.

Figure 2: The Director's Console.

The following shows how vps can be used in this scenario. Suppose a seminar is being conducted on the Berkeley campus. Students on campus attend the seminar in the lecture room, and remote viewers join in virtually by watching the broadcast on the MBone.

Figure 3: The Speaker's Console.

Viewing Sources and Monitoring the Broadcast

Suppose there are two cameras in the seminar room: one focusing on the speaker and one facing the audience. Suppose further that participants at several remote locations also have digital cameras attached to their workstations, and the speaker has several videos to accompany her lecture (these videos might be stored on a video file server or replayed on a VCR). The director uses these videos to produce the content of the broadcast. He previews them in the preview area of the director's console. He also monitors the broadcast with the output window. New video sources can be added at any time during the broadcast. For example, a remote viewer who joins late can still be a vps source and part of the lecture broadcast.

Beginning a Lecture

A short time before the lecture starts, the director switches from a still image that identifies the program to a picture showing the speaker. He uses the cut editor of the director's console to select this picture and switch sources. He then uses the subtitle editor to insert the seminar title and the speaker name onto the picture. After a minute or so, the director removes the titles by switching to the original picture. Figure 4 shows screen shots that illustrate the opening of the lecture.

Figure 4: Producing Opening Phase of Lecture.

Playing Stored Video

At some point in the lecture, the speaker wants to show a video. She uses the speaker's console to select and play the video. Depending on her pace, she can use the VCR-like controls to play, stop, rewind, fast-forward or restart the video. The director's console provides the same playback facilities so the director can assume this task. See Figure 5 for a screen shot of the VCR-like player.

Incorporating Audience Discussions

A local audience member raises his hand to notify the speaker that he has a question. A few remote viewers also indicate their desire to ask a question or comment by entering a request into the floor control tool. The floor moderator signals a remote viewer to ask her question. The director notices that a video of this remote viewer is available because it is being previewed on the director's console. The floor control tool might send a "grant floor" message to all tools.

Figure 6: Incorporating Audience Discussions.

Figure 5: VCR-like Player.

The director's or speaker's console can display a thumbnail in the preview area when it receives this message if it includes a video source or still image. vps could automatically switch to a stream that showed the speaker and questioner in side-by-side windows. This example illustrates automatic switching. This action can also be invoked manually by using the picture-in-picture (PIP) editor. See Figure 6 for screen shots.

Finishing a Lecture

At the end of the lecture, the director uses the fade editor to execute a fade transition from the speaker to a black screen. On the black screen, he uses the subtitle editor to put up acknowledgments and credits to thank the speaker and the people organizing the seminar, and an advertisement for the next event. Figure 7 shows an example.

3 SOFTWARE ARCHITECTURE

This section describes the vps architecture. vps is composed of multiple processes: a video file server, a broadcaster, one or more effects (fx) processors managed by an fx server, and two user interfaces. These processes exchange control messages and transmit video data to each other using RTP over IP Multicast on the Studio MBone, and receive data from the public MBone. Figure 8 illustrates the processes in this software architecture.

vps is decomposed into multiple processes in order to build a distributed system which can utilize more resources, facilitate future extensions in effects processing, and make modifications to the user interfaces. The system is implemented with the Continuous Media Toolkit (CMT) [4], which is described in more detail in Section 4.

3.1 Studio MBone

The Studio MBone is an RTP session [11] with a single multicast group and two port numbers on which vps processes transmit video data and control messages. The multicast address can be chosen through the session directory protocol [5] to avoid conflicting with other multicast groups; since the Studio MBone spans Berkeley, we only need to allocate a multicast address not in use on campus. This session is well known to the processes and can be configured as a command-line argument at system startup.

Figure 7: Producing Closing Phase of Lecture.

One port number is used for data and the other for control. Administrative scoping or the time-to-live (ttl) field in the RTP session is set to reach all processes in the system. For our broadcasts, the Studio MBone spans our building.

We designed the system so that control messages serve as an interface among the processes. Consequently, internal changes to a process do not affect other processes as long as these messages remain the same. Control messages provide coordination among vps processes. In the current system, messages flow from the user interfaces to the other processes. They request effects processing from the fx server, configure parameters at the broadcaster, notify the broadcaster to switch to another video source, and control video playback at the video file server. Table 1 lists the control messages used in the current system. Although these messages could be unicast to the appropriate destinations, we believe multicast will be more efficient when the system is integrated with other MBone tools.

For example, as the floor control tool grants the floor to an audience member, it multicasts a message on the Studio MBone so the responsible vps processes can react to the message by switching to the correct video to broadcast and/or requesting effects processing. The Studio MBone is connected to the public MBone by a multicast router. This way, streams from remote participants are automatically passed to the Studio MBone.

Available video streams are source videos and result streams from effects processing. They are multicast on the Studio MBone instead of unicast because multiple processes usually need the same video at one time. For example, the result of an fx processor is needed by the preview area at the director's console, another fx processor for other kinds of processing, and the broadcaster to output to the MBone, all at the same time.

3.2 Stored and Live Video Server

The video file server process serves stored digital videos to other vps processes. Stored video playback is controlled by the user interfaces, which send control messages to the server. The server plays a video by multicasting the appropriate streams on the Studio MBone.

Live videos originating from the local studio (e.g., camera feeds from the lecture room) or from other video feeds (e.g., cable or satellite receivers) are "served" by the Studio MBone in the sense that the streams are multicast on the associated RTP session. Live videos from remote participants (e.g., cameras attached to student workstations) are sent on a separate RTP session on the public MBone. These streams are multicast on different addresses so that local data and control messages are not forwarded to the public MBone, in order to avoid wasting valuable bandwidth. We distinguish each video source within an RTP session with the unique synchronization source identifier field (ssrc) specified in the RTP header.

3.3 Broadcaster

The broadcaster process multicasts the vps output to the public MBone at the address and port number advertised for the broadcast. The director, using the director's console, selects a video to be the output and sends a control message to tell the broadcaster to carry out the switching between streams.

3.4 FX Processor and FX Server

An fx processor manipulates one or more video streams to generate special effects. The effects supported by the current vps are fade, mix, picture-in-picture (PIP) and subtitle. The fade and mix effects are implemented in the compressed MJPEG domain, which means the streams are not fully decoded before being processed [12]. PIP and subtitle are implemented in the uncompressed YUV domain, which means the streams must be fully decoded before being processed. More details about the effects processing algorithms are presented in the next section.

There can be more than one fx processor, depending on the computing resources available. The collection of fx processors is managed by the fx server. It communicates with the other vps processes, accepting processing requests from the director's console and assigning them to an fx processor. Fx processors are scheduled using round-robin scheduling to ensure load balancing. The result of effects processing is multicast onto the Studio MBone so all vps processes can utilize it. For example, the director's console previews the result before it is switched to the output, and another fx processor uses the result as an input to a different effect.

This design was chosen so the fx server and fx processors can be easily extended without requiring the other to be significantly rewritten or affecting other vps processes. Changes in the load balancing policy in the fx server do not affect the internals of the fx processors. Likewise, modifications to the processors, such as implementing effects processing in the raw or compressed domain, are isolated.

To add a new type of effects processing, such as chroma-key, which is common in television weather forecasts, we would need to include the code to implement this processing in the fx processor, extend the control messages so that the director's console can request the effect, and implement a chroma-key editor so the director can control the effect.

3.5 User Interfaces

As described in the scenario, there are two user interfaces in vps. The director's console is the main control center, which provides a set of primitives to manipulate vps, and the speaker's console allows the lecturer to integrate stored video into the presentation.

Figure 8: vps Software Architecture. [Diagram labels: fx server, fx processors, vic, video file server, broadcaster, rtpgw, director's console (UI), speaker's console (UI), Studio MBone, Internet MBone; legend: video, control.]
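The round-robin assignment of effect requests to fx processors described in Section 3.4 can be illustrated with a minimal Python sketch. This is not the vps implementation (which is written in OTcl on top of CMT); the class names FxProcessor and FxServer and their methods are hypothetical, and the processors here only record requests instead of processing video.

```python
from dataclasses import dataclass, field
from itertools import cycle
from typing import List


@dataclass
class FxProcessor:
    """Hypothetical stand-in for one fx processor process."""
    name: str
    jobs: List[str] = field(default_factory=list)

    def run(self, effect: str) -> None:
        # A real fx processor would read its input streams from the Studio
        # MBone and multicast the result; this sketch only records the request.
        self.jobs.append(effect)


class FxServer:
    """Assigns each effect request to the next fx processor in turn."""

    def __init__(self, processors: List[FxProcessor]) -> None:
        if not processors:
            raise ValueError("at least one fx processor is required")
        self._next = cycle(processors)

    def request(self, effect: str) -> FxProcessor:
        processor = next(self._next)
        processor.run(effect)
        return processor


if __name__ == "__main__":
    server = FxServer([FxProcessor("fx0"), FxProcessor("fx1")])
    for effect in ["fade", "mix", "pip", "subtitle"]:
        print(effect, "->", server.request(effect).name)
```

Because the policy lives entirely in FxServer.request, it could be replaced (for example by a least-loaded policy) without touching the processors, which is the isolation property the design aims for.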

Sender               Receiver            Control name   Parameters
Director's Console   Fx Server           Processing     FX Name, Fx Params
Director's Console   Video File Server   Playback       File ID, Playback Speed
Speaker's Console    Video File Server   Playback       File ID, Playback Speed
Director's Console   Broadcaster         Switch         Video ID
Director's Console   Broadcaster         Configure      address/port/ttl

Table 1: Control messages.
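To make the role of the control messages concrete, the following Python sketch models the rows of Table 1 as a small message type. The paper does not specify a wire format, so the JSON encoding, the field names, and the identifiers such as "director" and "fx_server" are assumptions made only for illustration. Since control messages are multicast, every process sees every message and keeps only those addressed to it; the accepts function shows that filtering step.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict

# (sender, receiver, control name) triples taken from Table 1.
# The short identifiers are hypothetical names for the vps processes.
ALLOWED = {
    ("director", "fx_server", "Processing"),
    ("director", "video_file_server", "Playback"),
    ("speaker", "video_file_server", "Playback"),
    ("director", "broadcaster", "Switch"),
    ("director", "broadcaster", "Configure"),
}


@dataclass
class ControlMessage:
    sender: str
    receiver: str
    name: str
    params: Dict[str, Any]

    def encode(self) -> bytes:
        """Serialize to bytes; JSON is an assumed format, not the vps one."""
        if (self.sender, self.receiver, self.name) not in ALLOWED:
            raise ValueError("message is not listed in Table 1")
        return json.dumps(asdict(self), sort_keys=True).encode("utf-8")

    @staticmethod
    def decode(data: bytes) -> "ControlMessage":
        return ControlMessage(**json.loads(data.decode("utf-8")))


def accepts(process: str, message: ControlMessage) -> bool:
    """A process ignores multicast control messages addressed to others."""
    return message.receiver == process


if __name__ == "__main__":
    switch = ControlMessage("director", "broadcaster", "Switch",
                            {"video_id": 326232628})
    received = ControlMessage.decode(switch.encode())
    print(accepts("broadcaster", received), accepts("fx_server", received))
```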

Two separate interfaces are provided so that the production process and the lecture can go on in different rooms. They are written in Tcl/Tk [10] and OTcl [14] and are easy to modify to incorporate better UI designs, as we get more experience using them, and new editors when new effects are included.

The thumbnail previews in the director's and speaker's consoles are "optimized" in the sense that they are updated infrequently to avoid expensive decoding of each frame in the video. The current implementation displays one frame out of every one hundred frames of the video. When effects processing is requested by the director, the director's console sends a control message to the fx server to request the processing as described above. The resulting stream is sent back on the Studio MBone so the director's console can display it in the preview area. When a transition is requested, the director sends a control message to the broadcaster process to execute the switch.

3.6 Automating the Production Process

Several aspects of the production process can be automated. The director's console can follow a prepared script and send control messages for switching and effects processing at specified times. For example, the script shown in Table 2 can be used to automate a broadcast. The first part of the script defines variables to be used later on: startTime is the advertised starting time of the broadcast (April 15, 1997, 1:00 p.m.), endTime is the end time (3:00 p.m. the same day), and speakerStream is the camera facing the speaker (ssrc 326232628) in the RTP session 234.1.2.3/1234. The second part of the script automates the opening and closing phases of the broadcast.

It first tells the broadcaster process to switch speakerStream to the MBone at startTime. Then, it requests subtitle processing on speakerStream with a text string and assigns the resulting stream to the variable titledStream. At time startTime + 30 seconds, it switches to the stream specified by titledStream. It then switches back to the original video speakerStream at startTime + 60 seconds. Similar actions are executed in the closing phase. A potential problem here is that the estimated times are not always accurate, as the lecture can start late and run overtime. These situations should be accounted for in the design of the automation engine.

4 IMPLEMENTATION AND DISCUSSION

This section discusses our current implementation and related issues.

4.1 Status

vps is implemented using the Continuous Media Toolkit (CMT). CMT is a portable toolkit of reusable objects operating on media streams that simplifies the development of multimedia applications. The toolkit includes video file and playback objects in MJPEG and H.261 formats, communication objects for unicast UDP, multicast RTP, and blocking and non-blocking RPC, synchronization objects to control application behavior, and filter objects that implement effects processing on YUV and MJPEG data. Each process in vps is composed of CMT objects connected by the Tcl scripting language. The vps code is structured into a hierarchy of classes using OTcl, an object-oriented extension to Tcl developed at MIT. The Tk toolkit is used to build the user interface.

The current implementation is a prototype of the described system. We implemented effects processing in the YUV and MJPEG domains. The automation engine is in its design phase and works closely with the broadcast management system described in the introduction. To demonstrate the system's feasibility, this prototype sends video data over the network using unicast UDP and control messages via RPC. The implementation is being updated to use IP Multicast for both video data and control messages. At the writing of this paper, the system has gone through two major revisions. The most recent version has approximately 5000 lines of OTcl code.

4.2 Special Effects Processing

The PIP and subtitle effects are applied in the uncompressed domain, and have YUV streams as inputs and outputs. The mix and fade effects are generated in the compressed JPEG domain; they process MJPEG streams and output in the same format. The algorithms that manipulate images in the compressed domain are fully described in [12].

To investigate the performance of the current effects implementation, we conducted experiments to determine the latency of each effect on each frame. We also measured the performance of the MJPEG to YUV decoding operation. The MJPEG and YUV streams used in the measurements are CIF (320x240) sized videos, and are served from local disks to isolate the measurements from network overhead. The measurements were carried out on a 200 MHz Pentium Pro with 32 MB of memory and 2 GB of disk space.

The results are presented in Table 3. In our implementation, the generation of various effects is inexpensive; it takes approximately 15 to 20 ms to process each frame in YUV. The decoding from MJPEG to YUV is more computationally intensive and thus takes on average 45 ms. Adding the decoding times to the processing times, it takes about 65 ms to complete the YUV processing on each frame. When compared to the mix effect implemented in the compressed domain, we see that it takes an average of 65 ms for processing YUV data, but only 20 ms for processing in the compressed domain. Clearly, effects processing in the compressed domain is much faster than converting a stream to YUV, applying the transformation, and converting back to MJPEG [12]. These measurements only account for the raw computation time needed to generate the operations; we did not look into how other bottlenecks in the vps system can affect the performance of effects processing. Other possible I/O bottlenecks may exist in the kernel when the system transmits or receives streams over the network.

The effects offered in the current system are simple; they mainly involve memory copies and/or simple calculations. For more complex effects that require greater computation power, such as chroma-key, we need to resort to more sophisticated algorithms.

set vps [new VPS ...]
set startTime [new Time "4 15 1997" "13:00" GMT]
set endTime [new Time "4 15 1997" "15:00" GMT]
set speakerStream [new LiveStream "234.1.2.3" 1234 326232628]
...
at $startTime "$vps cut $speakerStream"
set introText "Berkeley Graphics ..."
at $startTime + 1 "set titledStream [$vps subtitle $speakerStream $introText]"
at $startTime + 30 "$vps cut $titledStream"
at $startTime + 60 "$vps cut $speakerStream"
...
at $endTime "$vps fade $speakerStream black"
set creditsText "Credits ..."
at $endTime + 1 "set endStream [$vps subtitle black $creditsText]"
at $endTime + 30 "$vps cut $endStream"

Table 2: OTcl script to automate production process.
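The timed structure of the script in Table 2 can be mirrored in a small Python sketch. It is only an illustration under stated assumptions: the Script class, its at, delay and run_until methods, and the opening_phase helper are hypothetical names, and each action merely appends a label to a log instead of sending a control message. The delay method shows one possible way to handle the problem noted in Section 3.6 that a lecture may start late: all pending events are shifted by the same amount.

```python
from datetime import datetime, timedelta
from typing import Callable, List, Tuple

Event = Tuple[datetime, int, Callable[[], None]]


class Script:
    """Hypothetical timed-action list, loosely mirroring the `at` lines of Table 2."""

    def __init__(self) -> None:
        self._pending: List[Event] = []
        self._count = 0

    def at(self, when: datetime, action: Callable[[], None]) -> None:
        # The counter preserves insertion order for events with equal times.
        self._pending.append((when, self._count, action))
        self._count += 1

    def delay(self, delta: timedelta) -> None:
        """Shift every pending event, e.g. when the lecture starts late."""
        self._pending = [(when + delta, n, act) for when, n, act in self._pending]

    def run_until(self, now: datetime) -> None:
        """Fire, in time order, all pending events scheduled at or before `now`."""
        due = sorted((e for e in self._pending if e[0] <= now), key=lambda e: e[:2])
        self._pending = [e for e in self._pending if e[0] > now]
        for _, _, action in due:
            action()


def opening_phase(start: datetime, log: List[str]) -> Script:
    """Build the opening phase of Table 2; actions only append to `log`."""
    script = Script()
    script.at(start, lambda: log.append("cut speakerStream"))
    script.at(start + timedelta(seconds=1), lambda: log.append("subtitle speakerStream"))
    script.at(start + timedelta(seconds=30), lambda: log.append("cut titledStream"))
    script.at(start + timedelta(seconds=60), lambda: log.append("cut speakerStream"))
    return script


if __name__ == "__main__":
    log: List[str] = []
    start = datetime(1997, 4, 15, 13, 0)
    script = opening_phase(start, log)
    script.run_until(start + timedelta(seconds=30))
    print(log)
```

In a real automation engine, run_until would be driven by a clock and the actions would issue the Switch and Processing messages of Table 1.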

Operation                 Latency (ms)   Std. dev. (ms)
MJPEG -> Decode -> YUV    44.62          9.64 x 10^-8
MJPEG -> Mix -> MJPEG     18.62          9.78 x 10^-7
YUV -> Mix -> YUV         20.65          2.42 x 10^-4
YUV -> PIP -> YUV         14.95          2.21 x 10^-4
YUV -> Subtitle -> YUV    19.62          5.96 x 10^-4

Table 3: Measurement results.
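The comparison drawn from Table 3 in Section 4.2 is simple arithmetic, reproduced in the short Python sketch below. The latencies are copied from the table; the function names are hypothetical. The frame-rate bound assumes a single processor handling frames strictly one after another, and the YUV path counts only decoding plus the effect, because re-encoding to MJPEG was not measured.

```python
# Per-frame latencies in milliseconds, copied from Table 3.
LATENCY_MS = {
    "decode": 44.62,        # MJPEG -> Decode -> YUV
    "mix_mjpeg": 18.62,     # MJPEG -> Mix -> MJPEG
    "mix_yuv": 20.65,       # YUV -> Mix -> YUV
    "pip_yuv": 14.95,       # YUV -> PIP -> YUV
    "subtitle_yuv": 19.62,  # YUV -> Subtitle -> YUV
}


def yuv_path_ms(effect: str) -> float:
    """Decode an MJPEG frame to YUV, then apply a YUV-domain effect."""
    return LATENCY_MS["decode"] + LATENCY_MS[effect]


def max_fps(latency_ms: float) -> float:
    """Upper bound on frames per second for strictly sequential processing."""
    return 1000.0 / latency_ms


if __name__ == "__main__":
    yuv_mix = yuv_path_ms("mix_yuv")
    print(f"mix via YUV:   {yuv_mix:.2f} ms/frame, at most {max_fps(yuv_mix):.1f} fps")
    mjpeg_mix = LATENCY_MS["mix_mjpeg"]
    print(f"mix via MJPEG: {mjpeg_mix:.2f} ms/frame, at most {max_fps(mjpeg_mix):.1f} fps")
```

Under these assumptions the YUV path for mix costs about 65 ms per frame, in line with the figure quoted in the text, while the compressed-domain mix stays under 20 ms.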

Also, for larger image sizes, such as SCIF video (640x480), it takes at least four times as long to process each frame for many effects. Section 5 discusses future work that will address the ability to execute more complex effects in a way that maintains an acceptable throughput data rate.

Another issue to consider is H.261 video [8]; this is the format almost exclusively used in public MBone broadcasts. Although streams originating in the Studio MBone are higher bitrate streams such as MJPEG, the fx processors need to be enhanced to handle H.261 streams from the public MBone as well.

5 FUTURE WORK

Future work with the software-only video production switcher will be concentrated on the fx server and fx processor. The current vps implementation is able to realize simple effects with objects written specifically to perform the desired effect on a particular format of video. This approach is inflexible and requires a new object to be written for each new video format and desired effect. In addition, the complexity of an object increases with the complexity of the desired effect.

A more flexible approach is to represent complex effects as a combination of simpler functions. Objects written to perform these basic functions can be reused and recombined into different complex effects. As an example, Figure 9 shows a mix effect represented as a combination of simple multiplication and addition functions.

Another requirement of the fx server in the vps system is the ability to maintain an acceptable level of throughput independent of the complexity of the desired effect. To meet this requirement, the fx server must exploit parallelism. Three types of parallelism are available: temporal, spatial, and functional decomposition.

Figure 9: Mix effect as combination of simple functions. [Diagram: Video 1 and Video 2 each pass through a Multiply (0.5) stage; an Add stage sums the two products into the Mixed Video.]
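The decomposition in Figure 9 can be written out as a short Python sketch that builds the mix effect from reusable multiply and add functions. It is an illustration only: it operates on uncompressed 8-bit samples held in nested lists, not on the compressed MJPEG data that the current vps mix uses [12], and the function names are hypothetical. The fade function is an assumption showing how a transition could reuse the same building blocks by varying the weight over time.

```python
from typing import Iterator, List

Frame = List[List[int]]          # rows of 8-bit samples
FloatFrame = List[List[float]]   # intermediate, unclamped values


def multiply(frame: List[List[float]], gain: float) -> FloatFrame:
    """Scale every sample by `gain` (the Multiply boxes of Figure 9)."""
    return [[sample * gain for sample in row] for row in frame]


def add(a: FloatFrame, b: FloatFrame) -> FloatFrame:
    """Sum two frames of equal size sample by sample (the Add box)."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def clamp(frame: FloatFrame) -> Frame:
    """Round and limit samples to the 8-bit range 0..255."""
    return [[max(0, min(255, int(round(s)))) for s in row] for row in frame]


def mix(v1: Frame, v2: Frame, alpha: float = 0.5) -> Frame:
    """Weighted mix: alpha * v1 + (1 - alpha) * v2; Figure 9 uses alpha = 0.5."""
    return clamp(add(multiply(v1, alpha), multiply(v2, 1.0 - alpha)))


def fade(v1: Frame, v2: Frame, steps: int) -> Iterator[Frame]:
    """Yield steps + 1 frames moving from v1 (alpha = 1) to v2 (alpha = 0)."""
    for i in range(steps + 1):
        yield mix(v1, v2, alpha=1.0 - i / steps)


if __name__ == "__main__":
    a = [[0, 100], [200, 255]]
    b = [[100, 100], [0, 255]]
    print(mix(a, b))
```

Because multiply and add are independent stages, each could in principle run on a separate fx processor, which is the functional decomposition discussed in this section.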

Temporal parallelism can be exploited by creating and controlling fx processors to process frames of a video stream independently. For example, the fx server may use one fx processor to process all odd numbered frames and another fx processor to process all even numbered frames. Spatial parallelism can be exploited by splitting the input video streams along spatial boundaries, utilizing separate fx processors to process each region independently, and recombining the resulting streams into one video stream. Functional decomposition takes advantage of representing complex effects as a combination of simpler functions. The fx server can decompose this representation into two or more stages and utilize separate fx processors for each stage.

The use and management of parallelism to realize complex effects will be internal to the fx server. The vps interfaces with the fx server only to specify what desired effect is required. The details of realizing the effect by exploiting the available parallelism are left to the fx server. The fx server must address two issues when executing a desired effect. First, how many fx processors will be required to execute the effect and achieve an acceptable throughput? Second, what type of parallelism should be exploited?

To arrive at answers for these questions, we plan on using a hierarchical configuration structure within the fx server. At the topmost level of the hierarchy, the desired effect and the input video streams are represented as they were provided by the vps system to the fx server. The fx server evaluates the desired effect to determine if a single fx processor can realize the effect with an acceptable throughput. If this is not possible, the fx server decomposes the desired effect into two or more separate pieces by exploiting one of the three types of parallelism available. Each sub-piece is then recursively treated as a new effect to be realized. The question of what type of parallelism is to be exploited is answered at each level of this configuration hierarchy, allowing for a hybrid mix of temporal, spatial, and functional decomposition. The question of how many fx processors are required is answered by the number of leaves in the configuration hierarchy.

A number of optimizations to improve performance must be made when constructing the configuration hierarchy and executing the effect. One optimization is to balance the time spent building the configuration hierarchy and the time spent actually performing the effect. In most cases, a desired effect will only be required for a finite period of time. For example, a fade effect may be used to transition between two different camera views and will only be required for a relatively short period of time. When an effect is requested, the fx server must evaluate a number of different possible configuration hierarchies that could be used to execute the effect. Essentially, this task amounts to searching the space of possible configurations and predicting the performance of each. The time spent in constructing and evaluating these hierarchies must be proportional to how long the effect will last. To this end, accurate heuristics and cost models to predict the performance of various configurations need to be developed.

Another optimization is to dynamically balance the number of fx processors used by several different configuration hierarchies. Several separate effects may be required by the vps at the same time. For example, the vps may be used to execute a fade transition between one camera view and the result of a chroma-key effect using a different camera view and a still image.

The chroma-key effect and the fade effect will be two separate effects that are active at the same time. Although each effect will have its own configuration hierarchy within the fx server, the computing resources available must be shared between them. As new effects are requested, the number of fx processors used by effects already being executed must be dynamically changed to accommodate the needs of the new effect. Similarly, as effects are completed, fx processors that are no longer in use must be distributed to effects that are still under way.

6 SUMMARY

The deployment of IP Multicast and MBone conferencing tools makes it possible to produce and broadcast live programs on the Internet. In this paper, we described the design and implementation of a software-only video production switcher, vps, that enhances the quality of MBone broadcasts. vps provides special effects processing to incorporate audience discussions, add titles and credits, and integrate stored videos into the presentation. We showed an example scenario of vps in use. We also discussed the software architecture of vps. The developed prototype has demonstrated the feasibility of vps. Future work will improve special effects processing.

ACKNOWLEDGMENTS

The first prototype of this system was designed and implemented as a class project by David Simpson and Richard Fromm. Soam Acharya implemented the mix and fade effects in the compressed domain.

References

[1] S. Deering and D. Cheriton. Multicast routing in datagram internetworks and extended LANs. ACM Transactions on Computer Systems (TOCS), 8(2):85-110, May 1990.

[2] H. Eriksson. MBone, the multicast backbone. Communications of the ACM, 37(8):54-60, August 1994.

[3] S. Floyd, V. Jacobson, C. Liu, S. McCanne, and L. Zhang. A reliable multicast framework for light-weight sessions and application level framing. In ACM SIGCOMM, pages 342-356, August 1995.

[4] Plateau Multimedia Research Group. The Berkeley Continuous Media Toolkit. Documentation available at URL http://bmrc.berkeley.edu/projects/cmt/cmt.html.

[5] J. Handley and V. Jacobson. SDP: Session description protocol. Internet Draft, work in progress, November 1995.

[6] R. Malpani. Floor control for large-scale MBone seminars. Master's thesis, University of California, Berkeley, CA, 1997. Submitted for publication to ACM MM '97.

[7] S. McCanne and V. Jacobson. vat: The LBNL audio conferencing tool. 1995. Available at URL ftp://ftp.cs.berkeley.edu/ucb/sggs/.

[8] S. McCanne and V. Jacobson. vic: A flexible framework for packet video. In ACM Multimedia, pages 511-522, San Francisco, CA, November 1995.

[9] G. Millerson. The Technique of Television Production. Focal Press, 12th edition, 1990.

[10] J. K. Ousterhout. Tcl and the Tk Toolkit. Addison-Wesley, 1994.

[11] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson. RTP: A transport protocol for real-time application, January 1996. RFC 1889. Available at URL ftp://ds.internic.net/rfc/rfc1889.txt.

[12] B. Smith and L. Rowe. Algorithms for manipulating compressed images. IEEE Computer Graphics and Applications, September 1993.

[13] A. Swan. An Internet MBone broadcast management system. Master's thesis, University of California, Berkeley, CA, 1997. Submitted for publication to ACM MM '97.

[14] D. Wetherall and C. Lindblad. Extending Tcl for dynamic object-oriented programming. In Proceedings of the Tcl/Tk Workshop '95, Toronto, July 1995.