
dc: A Live Webcast Control System

Tai-Ping Yu, David Wu, Ketan Mayer-Patel, and Lawrence A. Rowe

Berkeley Multimedia Research Center

University of California, Berkeley

{taiping, davidwu, kpatel, [email protected] erkeley.edu

This research was supported by NSF grant ANI-9907994, NSF equipment grant CDA-9512332, and contributions from Fujitsu, Intel, and NEC.

ABSTRACT

Webcasts can adopt techniques developed by television to obtain higher quality. Television production techniques, however, cannot be integrated into webcast production without considering the fundamental differences in the underlying infrastructure. We have developed a general webcast production model composed of three stages (i.e., sources, broadcast, and transmission) to address these differences. The Director's Console (dc) is a live webcast production application based on this model. It utilizes a distributed service architecture, which adapts to varying physical infrastructures and broadcast configurations.

1. INTRODUCTION

The development of IP Multicast, the Mbone Tools [11], and commercial systems (e.g., Apple QuickTime Streaming, Cisco IP/TV, Microsoft Windows Media, and Real Networks) has led to a rapid growth in the use of streaming media on the Internet. Several hundred live webcasts are produced each week and viewed by tens of thousands of viewers [20]. Webcasts are becoming an alternative to traditional television broadcasts.

The television industry has developed many techniques to produce high quality video programs that can be used to improve webcasts. One technique is to switch between different video sources. In a typical television news production, several video feeds are sent to a central studio. The director selects the stream that best portrays the content and broadcasts that stream to the audience. News broadcasts, for example, introduce a subject at the anchor desk and cut to different sources as needed to present the story (e.g., recorded material, a live-remote report, etc.). Another technique uses video effects to combine several video sources into one stream (e.g., composition effects), to place text on an image to describe who or what is being displayed (e.g., titling effects), and to transition between streams (e.g., fade effects). An interview show, for example, might display an interviewer in one window and a remote person who is being interviewed in a second window, composited side-by-side with titles to identify the program, the participants, or the location of the remote source.

Webcast producers are reluctant to integrate conventional broadcast television techniques into webcasts for several reasons. First, television production equipment is very expensive. Second, the technologies and models used in broadcast television typically require many people to operate the system. Third, most broadcast television programs produce a fixed data rate, single video stream with no interaction. Fourth, webcasting has a fundamentally different production model than television, and the tools and interfaces used to produce high quality webcasts must reflect the underlying infrastructure.

We have produced the "Berkeley Multimedia, Interfaces and Graphics (MIG) Seminar" webcast on the Internet Mbone since 1995. The seminar features invited lecturers who give formal presentations. The seminar takes place in a room equipped with a variety of audio/visual hardware, which is used to produce the webcast. We have used the Berkeley MIG Seminar webcast as a practical experiment to explore the problems associated with webcast production.

Our webcast is comprised of two video streams and one audio stream. One video stream focuses on the speaker, called the speaker stream, and a second stream focuses on the presentation material, called the content stream. Figure 1 shows a sample webcast from the remote viewer's perspective. The director can switch any of several video sources into either stream. For example, when a member of the local audience asks a question, the content stream can be switched to a view of the audience so that a remote viewer can follow the conversation. Or, if a remote person asks a question, the content stream can be switched to the remote viewer and the projector in the classroom can be switched from the presentation material to the remote person. These

examples illustrate the complexity of the webcast control problem. The director must control the production viewed by the students in the classroom and, at the same time, he must control the production viewed by a remote participant.

Figure 1: An example MIG Seminar webcast. The left panel shows a thumbnail for each stream in the session. The right panel displays selected streams with greater resolution and frame rate.

The MIG Seminar originates from a classroom with many audio/video sources, computer systems, and software processes as shown in figure 2. The classroom is equipped with several cameras (e.g., speaker and audience cameras), a VCR, and a document camera. These devices are connected to a conventional video matrix switch.¹ Two capture machines take video from any connected input device. Moreover, they can receive the same input simultaneously. The capture machines digitize the audio and video signals for transmission into two multicast sessions (i.e., one for video and one for audio), which we call the studio session.²

¹ A matrix switch is a crossbar switch that can send any input to multiple outputs. A typical switch routes both audio and video signals although the signals can be controlled independently. Matrix switches are also called routing switches.

² A multicast session contains only one media format. It is specified by a multicast address and port pair.

Figure 2: Infrastructure used to produce the MIG Seminar webcasts.

The studio session is analogous to the routing switcher and control room in a conventional television studio. As in television, webcast production sends all sources to one location, the studio session. The producer/director previews all sources and selects a subset of the streams that define the webcast. The selected streams are sent to the broadcast session. Thus, for a viewer to see the slides and the speaker, the producer sets the matrix switch to the correct configuration and broadcasts the two capture machine streams. Video special-effects can be added to the webcast by placing an effects processor on the edge of the studio session. It takes one or more streams from the studio session, renders the effect, and sends the new stream back into the studio session. Stored playback material (e.g., a predefined opening segment) and remote participant video are other examples of sources available in the studio session.

Production of a high quality webcast is a complex operation. In a previous paper, a Broadcast Manager application was described that automated the tasks required to initiate a webcast [23]. For example, the MIG Seminar webcast requires that more than ten processes be started on different hosts with appropriate arguments (e.g., multicast addresses and port numbers, media formats, bit-rates, and quality settings).

The Director's Console (dc), described in this paper, is designed to control a webcast during live production. It controls the webcast sources, adds effects to these streams, and determines the final output. Dc is implemented as a system of distributed processes organized with a service model, which adapts to different physical infrastructure and to different broadcast configurations.

The remainder of this paper presents the design and implementation of dc. Section 2 describes the webcast production model. Section 3 describes the system architecture. Next, the user interface is described in Section 4. Section 5 describes the implementation. Section 6 discusses our experiences developing and using the tool and suggests future work. Section 7 describes related research. And lastly, section 8 concludes the paper.

2. WEBCAST PRODUCTION MODEL

A webcast differs from traditional television on several important dimensions. This section describes some of these differences and develops a production model for webcasts. The design and implementation of dc reflects this production model. We begin by examining the types and number of data streams that may be involved in a webcast. Next, the source infrastructure used in a webcast is described. Finally, the receivers of a webcast are characterized. In each case, differences between the webcast environment and a conventional television broadcast environment are highlighted and used to motivate developing tools that meet these webcast-specific needs.

Unlike a television broadcast that is limited to one video and audio stream, a webcast can be comprised of multiple video and audio streams along with associated hypertext documents. The number of streams may also be dynamic. For example, during the webcast for the Berkeley MIG seminar, one video stream may be initially used when the speaker is presenting his material and another video stream showing the audience may be added to the webcast during the question and answer period at the end of the seminar. In a traditional television broadcast, switching between a view of the speaker and a view of the audience might be used during the question and answer period. This technique could also be used during a webcast, but providing both streams

simultaneously is another possibility because webcasts may have more than one video stream.

The infrastructure available to a webcast is an IP-based data network with cameras and microphones attached to PCs that capture, digitize, possibly compress, and transmit the media streams. A television broadcast, in contrast, requires expensive, special-purpose hardware operated by trained technicians. The sophistication of a webcast production environment may vary from a single video and audio stream to a situation like the Berkeley MIG seminar where multiple video and audio streams, an effects server, remote participants, and auxiliary hypertext streams are used. The tools used to produce a webcast are required to accommodate heterogeneity in the production infrastructure.

The receivers of a webcast are also heterogeneous. For example, in the Berkeley MIG Seminar, a high quality version of the webcast is sent to viewers on the local campus and Internet2 connected sites that have high bandwidth connections and experience little delay. A low quality version of the webcast is sent to Public Internet viewers that may have limited bandwidth connections. The director must manage two different transmissions of the same material. These transmissions may differ in bandwidth and quality and may also differ in number and type of streams sent. In a television broadcast, only one version of the program is transmitted and the capabilities of the receivers (i.e., television sets) are standardized.

Dc was developed to manage and control the production of live webcasts. It uses a general webcast model composed of three stages: source, webcast, and transmission as shown in figure 3. Sources are the streams available in the studio session. From this set of sources, a subset is selected for the broadcast. The speaker and content stream, for example, can be chosen from the video sources in the classroom or any source available in the studio session (e.g., speaker camera, audience camera, stored material, or presentation PC). Finally, multiple copies of the webcast, called transmissions, are produced using different technologies (e.g., Real Networks, Windows Media etc.) and transport parameters (e.g., bit-rates and formats). These transmissions are selected to match the expected capabilities of the audience.

Figure 3: The webcast production model.

3. INTERNET WEBCASTING SYSTEM ARCHITECTURE

The Internet webcasting system architecture is based on the premise that producing a webcast is too complex and computationally intensive for a single application or processor. A collection of distributed processes is required. The webcasting service model divides the processes into clients and services. Services provide functionality for the webcast. For example, one service can be responsible for controlling a camera while another service is responsible for computing video effects. Client applications are processes that use services.

Figure 4 shows the distribution of services used to produce the Berkeley MIG Seminar webcast. Notice the similarity between the software architecture shown in this figure and the physical infrastructure shown previously in figure 2. Dc is a client application that uses services to produce the webcast. The specific services shown in the figure are described below. Services send streams into the studio session. Dc receives the streams in the studio session, selects a subset for webcast, and transmits those streams into the broadcast session. Rtc services are transcoders that forward and optionally convert streams in the broadcast session to the transmission sessions.

Figure 4: Webcast Architecture.

The distributed service architecture has many benefits over a monolithic application. First, independent services can be developed and deployed rapidly since modifications are not required to existing services. Second, clients adapt to the existence of specific services. They discover services dynamically so they can accommodate a constantly changing infrastructure. Third, the services used by clients can change from one webcast to the next. Fourth, a client depends on the collection of services but not on one service alone. A fault with a single service will not cripple the client. And lastly, processing load can be distributed to a set of commodity machines instead of a single server.

The webcasting service model is an extensible architecture that supports integration and control of services. New services are integrated into the infrastructure through a discovery protocol. Clients follow this protocol to locate services. The model also specifies protocols to incorporate user interfaces that can be used to control the service.

3.1 Service Model

This section describes the protocols that embody the service model. First, a general description of the service architecture is presented. Second, the service discovery

protocol is explained. And lastly, we explain how a service provides a control interface to a client application.

The service model is composed of three components: service abstractions, a service discovery service, and a client API. Services are processes that provide useful functionality for clients. Clients are processes that use services. The Service Discovery Service (SDS) is a directory service that allows a client to locate desired services. For example, dc discovers available services when it is executed and uses services selected by the webcast. A simple webcast may only need a single camera service while the Berkeley MIG Seminar webcast requires two camera services, an archive playback service for the opening segment, and a special-effects service. The service model allows dc to configure the services required for the particular broadcast.

SDS is a service that lists available services in the infrastructure through a soft-state protocol [17]. A push-pull protocol is used to gather and retrieve information. Services push information to the SDS service and clients pull information from the SDS service. Idle services periodically announce their presence to a well-known multicast address. This information is cached by the SDS service. A timer is associated with the information when it is received. The timer is decremented every half-second. Periodic updates by the services reset the timer. If the timer goes to zero, the service is assumed to be unavailable (i.e., killed or busy).

Clients query the SDS service cache to locate a desired service. The query can request a specific service or a collection of services (e.g., "all services in 405 Soda"). The SDS process responds with data that satisfies the query. Clients periodically query the SDS service to discover new services.

The service information returned to the client includes the contact and attribute information required to access and use the service. Service attributes provide information about the service (e.g., location, functionality, etc.) and are used to distinguish service types. Contact information includes an IP address and port number pair that can be used by the client to initiate communication with the service.

A client application uses the IP address and port number pair to create a TCP connection with a service. An ad-hoc, non-blocking Remote Method Invocation (RMI) mechanism is implemented that allows the client to invoke methods on service objects.

After initializing and establishing communication with a service, the client application displays a user interface to control the service. The control interface must be provided to the client by the service. Otherwise, the introduction of a new service will require either: 1) that the client source code be modified or 2) the service control interface be limited to a few predefined interfaces. Our experience with webcasting is that the infrastructure varies from room-to-room and is constantly changing. Consequently, the service interface must be incorporated into the definition of the service.

Dc uses the GetUIControl method to obtain the interface code segment from the service. This code segment when evaluated will display the control user interface and handle input. Dc is implemented in Tcl/Tk so the code segment is a script that creates the interface and input event mappings. The code segment has two arguments: a window and a socket. The service interface widgets are displayed in the window, and the socket is the TCP connection to the service process. Dc supplies the arguments when it evaluates the code segment. The socket is used to handle user interface events. For example, a StartReplay method is invoked on a replay service object when the Start button is pressed in the replay dialog box shown in figure 7.

3.2 Webcast Services

This section describes the services listed in table 1 that we developed to produce the Berkeley MIG Seminar webcast. The Remote Video Capture service (rvc) controls a video capture device. This service is essentially a server version of the vic Mbone tool [12]. Rvc encodes analog video signals into RTP packets and provides format and bit-rate interface controls. For the MIG Seminar webcast, two rvc services are used to produce two streams. The studio classroom has two capture machines each connected to an AMX control system³ and an audio/video matrix switcher. Audio/video devices are connected to the AMX system and the matrix switcher. Rvc uses an RPC interface to issue commands through the AMX system to the matrix switch and media devices. For example, the rvc service can send commands to the AMX system to switch the video feed from the VCR to the speaker camera and commands to move the speaker camera to the right.

³ http://www.panja.com/integrator/amx/

The room405 service provides an interface to non-media services in the classroom that can be operated through the AMX system. The current services include: room lights, raising/lowering projection screens, and turning on/off an "On-Air" indicator light outside the classroom.

The replay service manager and replayproxy services allow archived material to be played into a webcast. A short opening video segment (approximately 20 seconds) is played at the beginning of the webcast. To play this opening, the director contacts the replay service manager and specifies the video to be played. The replay service instantiates a replayproxy service which is responsible for controlling this particular video. We currently use the MARS multimedia archive system to archive and play stored material on-demand [18]. The replayproxy service starts a MARS playback process, connects to dc, and acts as a proxy between dc and the MARS server. The proxy implements the VCR controls presented in the replayproxy control interface.

The specialfx service manager and specialfxproxy services are similar to the replay service manager and replayproxy services except they control video effects processes. The specialfx service manager starts the specialfxproxy service that starts video-effects processes [10]. The proxy service acts as an intermediary between the client application and the processes that implement the special effect.
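The manager/proxy arrangement used by the replay and specialfx services can be illustrated with a minimal sketch. It is written in Python rather than the Tcl/Tk used by dc, and every class, method, and field name in it (`ReplayManager`, `ReplayProxy`, `start_replay`, and so on) is a hypothetical stand-in rather than the actual service API; the MARS playback process and the network connection to dc are omitted. The sketch only shows the division of labour: the manager owns the catalog and creates one proxy per requested video, and each proxy carries the VCR-style controls for that video.

```python
"""Sketch of the manager/proxy split; all names are hypothetical."""
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ReplayProxy:
    """Controls playback of one archived video (stands in for replayproxy)."""
    proxy_id: int
    video: str
    state: str = "stopped"
    position: float = 0.0  # seconds into the video

    def play(self) -> None:
        self.state = "playing"

    def pause(self) -> None:
        self.state = "paused"

    def seek(self, seconds: float) -> None:
        if seconds < 0:
            raise ValueError("seek position must be non-negative")
        self.position = seconds


@dataclass
class ReplayManager:
    """Owns the archive catalog and creates proxies (stands in for replay)."""
    catalog: set
    proxies: dict = field(default_factory=dict)
    _ids: count = field(default_factory=count, repr=False)

    def start_replay(self, video: str) -> ReplayProxy:
        """Instantiate a new proxy that controls this particular video."""
        if video not in self.catalog:
            raise KeyError(f"unknown archive item: {video}")
        proxy = ReplayProxy(proxy_id=next(self._ids), video=video)
        self.proxies[proxy.proxy_id] = proxy
        return proxy
```

Because each call to `start_replay` returns an independent proxy, two replays of the same archived item can be paused or repositioned separately, which mirrors the statement that a proxy is responsible for one particular video.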

Service         Lines  Functionality
rvc              1768  Capture Machine. Digitizes analog audio and/or video, compresses it,
                       and frames the digital media into RTP packets. Provides controls for
                       both the digitization process and the device(s) connected to the
                       capture machine (e.g., camera, mixer, VCR).
room405           229  Room 405. Controls environment parameters in the room (e.g., light
                       controls, screen raise/drop).
replay            320  Archival Playback Manager. Archive catalog interface and playback
                       service management.
replayproxy       612  Archival Playback Service. Streams archived media into a session.
                       It provides controls to play, pause, and seek in the video.
specialfx         365  Special Effects Manager. Effects system manager and interface for
                       specifying effects services.
specialfxproxy   1402  Special Effects Service. Controls a particular effects service.
rtc               411  Transcoding Service. Transcodes media streams at different bit-rates
                       and formats from one session to another.
null service      202  Overhead needed by all services.

Table 1: Services with their line counts and functionality.
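Each service in Table 1 is found by clients through the SDS of Section 3.1. As a rough illustration of that soft-state cache, the Python sketch below models the three behaviours the text describes: an announcement stores or refreshes a record and resets its timer, a half-second tick decrements every timer and drops records that reach zero, and a client query returns the records whose attributes match. The class and method names, the attribute dictionary layout, and the default lifetime of ten ticks are assumptions made for the example; the multicast announcements themselves are not modelled.

```python
"""Sketch of a soft-state service cache; names and lifetime are assumed."""
from dataclasses import dataclass

TICK_SECONDS = 0.5  # from the text: the timer is decremented every half-second


@dataclass
class ServiceRecord:
    name: str
    address: str       # contact information: IP address ...
    port: int          # ... and port number
    attributes: dict   # e.g. {"room": "405 Soda", "type": "camera"}
    ticks_left: int


class SoftStateCache:
    def __init__(self, lifetime_ticks: int = 10):  # assumed lifetime
        self.lifetime_ticks = lifetime_ticks
        self._records = {}

    def announce(self, name, address, port, attributes):
        """Push path: store or refresh a record and reset its timer."""
        self._records[name] = ServiceRecord(
            name, address, port, dict(attributes), self.lifetime_ticks
        )

    def tick(self):
        """Run once per TICK_SECONDS: age every record, drop expired ones."""
        for name in list(self._records):
            record = self._records[name]
            record.ticks_left -= 1
            if record.ticks_left <= 0:
                # No announcement arrived in time: treat as unavailable.
                del self._records[name]

    def query(self, **wanted):
        """Pull path: records whose attributes match every requested value."""
        return [
            record for record in self._records.values()
            if all(record.attributes.get(k) == v for k, v in wanted.items())
        ]
```

A query such as `cache.query(room="405 Soda")` corresponds to the "all services in 405 Soda" example, and a service that stops announcing simply ages out of the cache without any explicit deregistration step.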

The rtc service is a transcoding service that is still under development. It will take streams from one multicast session, transcode them to different bit-rates and formats, and send them into another multicast session. Rtc is a server version of the rtpgw service [2].

4. DC APPLICATION

This section describes the user interface of dc. The interface organization is shown in figure 5. The main window contains four panels: sources, preview, broadcast, and transmission. The sources panel displays sources in the studio session. The preview panel allows the director to preview and prepare sources before they are webcast. Streams in the webcast are displayed in the broadcast panel. The transmission panel, which is in development, will control webcast transmissions. An example is displayed in figure 6 where a screen dump of dc is shown.

Figure 5: Dc user interface organization.

Figure 6: A screen dump of the Director's Console during a production.

This section begins with a discussion of the sources panel. Second, the video window abstraction used in the preview and broadcast panels is described followed by a description of these two panels. Lastly, we discuss the integration of video-effects into dc.

4.1 Sources Panel

The sources panel serves as a starting point for the webcast. It contains mechanisms to start and display sources transmitting into the studio session.⁴

⁴ In the television community, a source is defined as cameras and other devices that produce video signals. In webcasts, however, the medium of transport is packets rather than analog signals. Sources are therefore better defined to be packet streams. Under this definition, the capture machine in figure 2 is the source, not the cameras or other devices connected to the matrix switch.

A column of small, slowly refreshing thumbnail images represents sources as shown in figure 6. Below these thumbnails are buttons that represent control services. Services perform computations or actions on behalf of an application. These services can be media or control services. A media service produces a packet stream (e.g., audio or video). A control service provides an interface to control an entity (e.g., room lights, a floor control application, etc.).

The Service pull-down menu at the bottom of the sources panel allows the director to add a media or control service interface to the sources panel. The video sources displayed in the sources panel are called the Media Service Set. The control interfaces are called the Control Service Set. The Control Service Set shown in figure 6 includes three services (i.e., Room405, SpecialFx,

and Replay) that display dialog boxes to control entities or instantiate a media service. The Room405 button displays various room controls (e.g., light controls, projection screens). The SpecialFx button displays a dialog that instantiates a video-effect service and adds it to the Media Service Set.⁵ The Replay button, shown in figure 7, allows the director to add a stored material playback stream to the Media Service Set.

⁵ A service is created for each special-effect that can be applied during a webcast. Multiple instances of an effect can be created and used simultaneously.

Figure 7: An example control service dialog. This dialog allows the director to start an archive replay source. The two entry boxes specify the archive material and audio session.

Services are presented to the director through the Service pull-down menu, which displays a hierarchical menu based on the service location. The hierarchy has the following structures:

organization/building/room/service

and

Locationless/service

Figure 8 shows the hierarchical structure for two services.

Figure 8: Examples of service menu hierarchical structure. The hierarchy on the left shows a location dependent service and the hierarchy on the right shows a location independent service.

Selecting a menu item initiates the service. A media service transmits a stream to the studio session when initiated. If it is a video stream, a thumbnail computed from the stream is added to the sources panel. A control service adds a button to the Control Service Set displayed below the column of thumbnails.

The Service pull-down menu changes to reflect the current environment. If, for example, a remote camera is installed during a webcast, the service process associated with the camera will announce its existence. Dc will detect this new service through the service discovery protocol and add it to the Service pull-down menu.

4.2 Video Window

Thumbnails provide an excellent summary of available video sources, but a director typically wants to preview and prepare the stream before switching to it (e.g., position a replay). Consequently, dc has preview windows for streams being prepared. The previews are displayed as a video window in the preview panel immediately to the right of the sources panel. Video windows are also used to display streams in the broadcast panel.

A video window displays a CIF-sized image at higher frame rate than a thumbnail.⁶ The interface controls below the video window allow the director to control the stream and source being displayed. The specific control interface depends on the source. For example, the control interface for a computer-controlled camera includes focus and position controls (e.g., pan/tilt or left/right position). A playback source has VCR-controls (e.g., play, pause, position, etc.). Other video sources have different control interfaces depending on the source.

⁶ Thumbnails are updated once per second and video windows are updated as specified in the stream being displayed.

The control interface will change when the service state changes. Consider the capture machine service connected to the matrix switch in figure 2. It can produce a stream from any one of many devices, each of which may have its own controls. Consequently, the capture machine control interface needs controls for the video switch and controls for the currently selected video device. When the director switches the capture machine from the VCR to a camera, camera interface controls replace the VCR controls below the video window.

4.3 Preview and Broadcast Panels

To prepare a source for broadcast, the source is first displayed as a video window inside the preview panel. A source can be previewed by selecting a thumbnail with the mouse, dragging it to the preview panel, and dropping it. Dropping the thumbnail on an existing video window displays the new source in place of the previously displayed source. More than one video window can be displayed within the preview panel. Dropping the thumbnail in the preview panel but not over a video window instructs dc to create a new video window and display the source in it. A video window can be removed from the preview panel by selecting it with the mouse, dragging it outside the preview panel, and dropping it.

The broadcast panel has the same interface as the preview panel. A video window displayed in the broadcast panel, however, is a stream in the webcast. Data packets from streams in the broadcast panel are relayed through dc into the broadcast session where audiences view the webcast. Multiple streams are webcast by having several video windows within the broadcast panel. The director switches the source in the webcast stream by dragging and dropping a thumbnail or video window in the preview panel to the broadcast panel. The broadcast panel behaves the same way as the preview panel: dropping on a video window switches the stream and dropping elsewhere creates a new video window (i.e., a new stream in the webcast).

4.4 Video-Effects Sources

Television video-effects are implemented by a video production switcher (VPS). A VPS can switch from n inputs to one output that is typically the broadcast output. A VPS can apply effects on the input(s) as the data is routed to the output. Commercial systems use custom-designed hardware to meet the processing requirements of broadcast quality video. Some VPS systems can layer effects on top of other effects (e.g., fading from one stream to another with titling). However, the number of layers is bounded, and most VPS systems do not allow new effects to be added.

The Parallel, Software-only Video-effects Processing System (PSVP), which runs on a Network of Workstations, was developed by our research group to provide a low-cost, extensible video-effects system [10]. PSVP produces an effect by decomposing it into a collection of processes that read RTP packets from a multicast session, compute an effect, and write RTP packets back into the multicast session.

Dc invokes and controls an effect through the service infrastructure. Effects services (see figure 2) input streams from the studio session, render an effect, and output a new "effects stream." As far as dc is concerned, effects streams are ordinary sources with controls for the effect. Thus, effects can be cascaded by piping the output of an effect to the input of another effect service.

ticast. It provides a hierarchical naming system and a scalable session announcement protocol [16]. The SDS service uses SNAP to deliver reliable messages between clients, services, and SDS servers for the service architecture. In addition, SNAP provides containers for messages. Container names are organized into a hierarchical name space. Clients connect to a SNAP multicast session and register callback procedures for containers, which are called when a message is transmitted to that container. This facility supports a general-purpose filtering mechanism for messages sent to a multicast session. The webcasting service architecture uses UPDATE, QUERY REQUEST, and QUERY RESPONSE SNAP containers for service announcements, client queries, and SDS service query results, respectively.

[Figure: dc stream-processing diagram. Labels: Studio Session, Receiver/Demultiplexer, Decoder, Packet Duplicator, Render Thumbnail, Render Preview Window, Render Broadcast Window, Transmitter/Multiplexer, Broadcast Session; streams: Speaker, Audience, Archive, Slides.]

A titling e ect is shown in the Berkeley MIG Seminar

web cast shown in gure 1. Adding a new e ect to the

Figure 9: Dc Architecture.

web cast requires only that a new service b e installed in

the infrastructure.

The internal architecture of dc is shown in gure 9.

work agent receives packets from the video studio

5. DESIGN AND IMPLEMENTATION Anet

Dc and the service pro cesses are implemented as Mash

session and demultiplexes them to a handler for each

applications. Mash is a multimedia to olkit for develop-

stream. A stream may b e displayed in one or more win-

ing distributed collab oration applications [13]. It pro-

dows and, if it is in the broadcast set, the packets are

vides the basic ob jects needed to create multicast appli-

sent to the multicast session for the broadcast. A stream

cations (e.g., media stream receivers/transmitters, me-

is always displayed in a thumbnail so the packets must

dia co decs, etc). Mash is a split architecture software

be deco ded and rendered to a thumbnail window. If

system. Routines that are p erformance sensitive(e.g.,

the stream is also b eing displayed through one or more

media co ding and deco ding, network transmitting and

video windows, the deco ded packets are passed to addi-

receiving, etc.) are implemented in C++. User-interface

tional renderers for eachwindow. Packets for streams

abstractions and p erformance insensitivecodeisimple-

selected for broadcast are duplicated b efore they are

mented in the Tcl/Tk scripting language. Mash also

passed to the deco der. The duplicate packets are sent

uses the Ob ject Tcl (Otcl) extension for ob ject-oriented

to the broadcast session.

programming.

The dc application is approximately 2,800 lines of

Dc and the service programs make extensiveuse of

co de, which includes 430 lines of C++ and 2,400 lines of

the following packages written in Mash:

Tcl/Tk co de. C++ is used to duplicate packet bu ers

for broadcast and relay them to the transmitter. The

 Active ServiceFramework.

breakdown of the Tcl/Tk co de is shown in table 2.

The AS framework provides a programmable sub-

The co de length do es not include inherited ob jects and

strate on which to build arbitrary network ser-

consequently may app ear skewed. For example, the

vices [1]. These services reside in one or more

agent-broadcast ob ject inherits transmission function-

p o ols of active service no des. Client applications

ality from the MASH video-agent ob ject. So even

instantiate service agents on one or more no des.

though dc only shows 48 lines of co de, video-agent in-

The framework ensures fault-tolerance and load-

cludes over 1,000 lines of co de.

balancing for clients. The replayproxy and spe-

Services are implemented only with Tcl/Tk co de. Ta-

cialfxproxy services use the AS framework to in-

ble 1 ab oveshows the numb er of lines for each service.

stantiate MARS and PSVP system services, re-

service shows the co de needed by all ser- The null

sp ectively.

vices. It contains co de to interface with the SDS service

and clients but has no functionality. The proxy ser-  Scalable Naming and Announcement Protocol.

vices act as intermediaries b etween clients and backend SNAP is a framework for selective reliable mul-

Name                    Lines  Functionality
agent-broadcast            48  Network agent transmitting to the broadcast session.
agent-studio              119  Network agent responsible for receiving packets from the studio session and demultiplexing the packets into its constituent streams.
application-dc            224  Application object responsible for creating and organizing other objects in the dc.
session-dc                129  Objects used to discover services.
ui-dc                     118  The main window object of dc. Creates the panel objects that reside within it.
ui-dcbroadcast            212  Object responsible for broadcast panel.
ui-dcpreview              177  Object responsible for preview panel.
ui-dcthumbnail            475  Object responsible for thumbnail panel. It is responsible for new services.
ui-dcthunbnailservice     284  Control service objects.
ui-dcthunbnailvideo        89  Media service objects.
ui-dcvideo                248  Video Window Object.
dc-link                   324  RMI mechanism used to send commands between dc and services.

Table 2: Table listing dc code. Each object name is followed with code length and a summary of its function.
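As a quick cross-check of the totals quoted in the text, the following sketch sums the line counts listed in Table 2 and adds the 430 lines of C++ reported for dc. The dictionary and function names are illustrative only and are not part of dc; the numbers are copied from the table.

```python
# Line counts copied from Table 2 (Tcl/Tk objects in dc).
# The variable and function names here are hypothetical helpers.
TABLE2_LINES = {
    "agent-broadcast": 48,
    "agent-studio": 119,
    "application-dc": 224,
    "session-dc": 129,
    "ui-dc": 118,
    "ui-dcbroadcast": 212,
    "ui-dcpreview": 177,
    "ui-dcthumbnail": 475,
    "ui-dcthunbnailservice": 284,
    "ui-dcthunbnailvideo": 89,
    "ui-dcvideo": 248,
    "dc-link": 324,
}

CPP_LINES = 430  # C++ line count quoted in the text


def summarize(table, cpp_lines):
    """Return (Tcl/Tk total, Tcl/Tk plus C++ total)."""
    tcl_total = sum(table.values())
    return tcl_total, tcl_total + cpp_lines


tcl_total, overall = summarize(TABLE2_LINES, CPP_LINES)
print(f"Tcl/Tk lines listed in Table 2: {tcl_total}")
print(f"Tcl/Tk plus C++ lines:          {overall}")
```

The Tcl/Tk objects sum to 2,447 lines, which matches the rounded figure of 2,400, and adding the C++ code gives 2,877 lines, roughly in line with the approximate total quoted for the application.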

servers which explains why replayproxy can be written in only 612 lines. The majority of the replayproxy service is in the MARS backend process. This example shows the power of Tcl/Tk as a "glue language" for building interfaces between disparate applications.

6. DISCUSSION AND FUTURE WORK

A quick prototype for dc was implemented by Wu in the Fall, 1998. It provided a remote control interface to the AMX control system so the Berkeley MIG Seminar broadcast engineer could control which video source was routed to the capture machines. The need for a new dc that supported multiple streams, interfaces to more services, and dynamic extensibility was immediately obvious. The design and implementation of the current version of dc was completed during the Spring and Summer of 1999. This version was used to produce the seminar webcasts beginning in the Fall, 1999 semester. Although the system has performed well beyond initial expectations, it can still be improved. This section discusses several changes and improvements.

A webcast becomes more and more difficult to manage as the number of services increases. The special-effects service alone can double the number of input streams and interface controls. Our goal is to have one person produce several webcasts simultaneously. Consequently, the need for automation is apparent. Automation eliminates the need for the director to perform mundane manual tasks. One particularly time consuming chore is camera tracking. As the speaker moves about the stage, the camera must be moved to ensure he or she is correctly framed. Automatic camera tracking is a well-known technique for solving this problem that should be incorporated into the webcasting system [7].

A second example is stream switching. Take the case of a webcast composed of speaker and content streams. If an audience member asks a question, the content stream should be switched to the audience camera and the camera should be moved to show the person asking the question. The fact a question has been asked is known because the audio mixer detects that the input from an audience microphone is above a threshold. Our studio classroom has three audience microphones mounted in the ceiling on the left, center, and right sides of the room. The mixer adds the output of the speaker microphone to the room audio and routes it to the webcast audio stream. Consequently, the mixer can identify the side of the room from which the question comes. We need a mechanism for the mixer to signal the webcast control system that an audience question from a specific area of the room has occurred. An intelligent control system that implements a policy specified by the director can automatically execute the commands to switch to the audience camera and position the camera to the appropriate area in the room. In addition, the bandwidth allocation between the speaker and content stream can be changed to reflect this new configuration. Low-bandwidth transmissions allocate more bandwidth to the speaker because slides are often static relative to the speaker. However, bandwidth should be diverted from the speaker to the content stream when switching from slides to the audience. Lastly, automation can be used to log events, such as stream switching, to assist video query, summarization, and automated authoring of multimedia titles derived from a lecture [15].

Automation relies on two mechanisms: 1) the ability to monitor events produced by other processes in the webcast and 2) the ability to invoke commands in lieu of the director. The bandwidth-allocation agent, for example, monitors switching events and responds with commands to the transcoder to either increase or reduce bit-rates. The current dc architecture cannot support such agents because it uses unicast communication channels to services. Thus, communication, both events and commands, between clients and services cannot be monitored or forwarded to other participants. However, using multicast for the communication channel will remove this limitation. An appropriate logical decomposition of the events can be created using SNAP containers so automation agents can monitor only the events in which they are interested. The bandwidth-adjusting agent can wait for an event like "Switch ContentStream to Audience Camera" and respond with commands to reduce speaker bandwidth and increase content bandwidth.

A second area in which dc can be improved is security. We ignored many security issues during the development of dc because we used rapid prototyping to develop the system. In retrospect, we note many security weaknesses that will need to be addressed. First, the SDS service allows any process to present itself as a service. It also gives any process information about all existing services. Rather than duplicating effort, we plan to replace the prototype SDS service with the Ninja SDS System that is better designed for security [6]. A second security issue arises because Tcl/Tk code is transferred from services and evaluated in the client without checks. Thus, a malicious service can kill a client application by simply sending an exit command. Moreover, the Tcl/Tk code segment is difficult to write correctly. So, even a trusted service might cause harm through carelessness. One approach to solve these problems is to implement a certificate system that ensures the correctness and trust of the code. Another approach is to create a description language for user-interfaces similar to the document language proposed by Hodes and Katz [9]. Rather than sending code, services transfer a description of the user-interface. By restricting the description language, clients can limit the access to services and thereby protect themselves. A third approach is to sandbox the code as is done in Janus [8]. The code runs in a confined area with limited access to resources (e.g., socket communications can be limited to the parent service or file writes can be prohibited).

A third area in which dc can be improved is in audio control. Audio has been largely neglected by the current dc prototype. However, many of the design principles used for video also apply to audio. Audio devices can be viewed as a service that can be switched in conjunction with or independently from video. Audio effects can be applied just as in video. However, the biggest problem we must solve is to provide seamless interaction with remote participants. Solving this problem will require changes in the studio classroom to provide visual feedback or information about remote participants (i.e., to provide a sense of presence [4]) and an audio control system that can handle echo cancellation with long delays resulting from Internet communication.

A fourth area in which dc can be improved is controlling commercial webcast transmissions. We began simulcasting the Berkeley MIG Seminar using Real Networks because the unicast transport used in commercial systems is easier to deploy than the multicast transport used in the Mbone. Integrating these technologies into dc will enable a wider webcast audience. In addition, viewers will benefit from the flexibility and versatility of dc. For example, the Synchronized Multimedia Integration Language (SMIL) developed by the World Wide Web Consortium can be used to view webcasts with multiple streams like those produced by dc [22].

A fifth area in which webcasts produced by dc can be improved is interactivity. Many webcasts follow the television model of broadcasting in which there is no audience interaction. The Internet, however, is capable of audience participation. Nowhere is this capability more important than in distance learning where the student's ability to ask questions during a lecture is critical. Many mechanisms for student participation exist (e.g. chat sessions, video conferencing, and interactive response systems), but more research is needed to determine the most appropriate models for integration with live webcasts. Moreover, moderation techniques will need to be developed to handle non-trivial numbers of participants. For example, if 200 people want to ask a question simultaneously, who gets control and how is the interaction moderated?

Lastly, our experience suggests that future studio classrooms should use equipment directly connected to the Internet. Rather than connecting cameras and microphones to matrix switches and mixers and using a control system like the AMX, a better solution will connect the devices directly to the network. Switching and control are easier to implement. And, production control will be better if all devices are available at any time rather than just a subset of devices through a matrix switch.

7. RELATED WORK

This section describes related work on broadcast production and control.

Several projects have developed software interfaces to a video production system. NBC's GEnesis project monitors and controls program streams with a software system built with Tcl/Tk [3]. GEnesis provides a graphical user interface to control broadcast schedules that vary over time and region. For example, a live sports event has commercials that vary depending on the location. The system employs a database to store and retrieve schedules, and device controls are accessed with a string protocol over sockets. Operations are implemented by sending commands to conventional broadcast equipment. Both dc and GEnesis control remote devices, but they are solving different problems. Dc concentrates on the production of video streams rather than the scheduling of existing streams.

The Berkeley Multimedia Research Center developed several prototypes before dc. The Software-Only Video Production Switcher project introduced video-effects processes to our production environment [21]. It used remote processes that received and transmitted streams from a studio session. It employed a message protocol to control these remote processes. However, the user interface was hard-coded into the broadcast application and there was no notion of modular services. Moreover, only a single stream was produced from the system. This system was an early prototype for the PSVP effects system.

8. CONCLUSION

Fundamental differences between the underlying infrastructure of webcast and traditional television prohibit using a television production model for webcasts
without modification. We propose a webcast production model that addresses these differences. The model is composed of three stages: sources, broadcast, and transmission. We have developed the Director's Console, a webcast production tool, to demonstrate the effectiveness of this model. Dc is a distributed client/server system organized under a webcast service architecture. The service architecture provides fault tolerance and dynamic adaptation to a changing physical infrastructure.

9. ACKNOWLEDGMENTS

We would like to thank Irene Hsu, Peter Pletcher, Oliver Crow, Paul Huang, Tim Fitz, and the rest of the BMRC Researchers for their invaluable help during development of the project. Thanks also go to Richard Fateman, Chris A. Long, and other readers for their input on this paper.

10. REFERENCES

[1] E. Amir, S. McCanne, and R. Katz. An Active Service Framework and its Application to Real-time Multimedia Transcoding. In Proc. of ACM SIGCOMM '98, Vancouver, Canada, August 1998.

[2] E. Amir, S. McCanne, and H. Zhang. An Application-level Video Gateway. In Proc. of ACM Multimedia 95, San Francisco, CA, November 1995.

[3] S. Angelovich, K. Kenny, and B. Sarachan. NBC's GEnesis Broadcast Automation System: From Prototype to Production. In Proc. of the 6th Annual Tcl/Tk Conference, 1998.

[4] D. Boyer and M. Lukacs. The Personal Presence System - A Wide Area Network Resource for the Real Time Composition of Multipoint Multimedia Communications. In Proc. of the Second ACM International Conference on Multimedia '94, pages 453-460, San Francisco, CA, October 1994.

[5] D. Culler, et al. Parallel Computing on the Berkeley NOW. In 9th Joint Symposium on Parallel Processing, 1997.

[6] S. Czerwinski, B. Zhao, T. Hodes, A. Joseph, and R. Katz. An Architecture for a Secure Service Discovery Service. In MOBICOM '99, Seattle, WA, August 1999.

[7] D. B. Gennery. Visual Tracking of Known Three-Dimensional Objects. International Journal of Computer Vision, 7(3):243-270, April 1992.

[8] I. Goldberg, D. Wagner, R. Thomas, and E. A. Brewer. A Secure Environment for Untrusted Helper Applications: Confining the Wily Hacker. In 1996 USENIX Security Symposium, 1996.

[9] T. Hodes and R. Katz. A Document-based Framework for Internet Application Control. In 2nd USENIX Symposium on Internet Technologies and Systems, Boulder, CO, October 1999.

[10] K. Mayer-Patel and L. A. Rowe. Exploiting Spatial Parallelism for Software-only Video Effects Processing. In Proc. of The Sixth Annual ACM International Multimedia Conference, September 1998.

[11] S. McCanne. Scalable multimedia communication with internet multicast, light-weight sessions, and the mbone. IEEE Internet Computing, 3(2):33-45, 1999.

[12] S. McCanne and V. Jacobson. vic: A Flexible Framework for Packet Video. In Proc. of ACM Multimedia 95, San Francisco, CA, November 1995.

[13] S. McCanne, et al. Toward a Common Infrastructure for Multimedia Networking Middleware. In Proc. of NOSSDAV97, St. Louis, Missouri, August 1997.

[14] G. Millerson. The Technology of Television Production. Butterworth-Heinemann, 12th edition, March 1991.

[15] S. Mukhopadhyay and B. C. Smith. Passive Capture and Structuring of Lectures. In Proc. of ACM Multimedia 99, pages 477-487, Orlando, FL, November 1999.

[16] S. Raman and S. McCanne. Scalable Data Naming for Application Level Framing in Reliable Multicast. In Proc. of ACM Multimedia '98, Bristol, UK, September 1998.

[17] S. Raman and S. McCanne. A Model, Analysis, and Protocol Framework for Soft State-based Communication. In Proc. of ACM SIGCOMM '99, Cambridge, MA, September 1999.

[18] A. Schuett, S. Raman, Y. Chawathe, S. McCanne, and R. Katz. A Soft-State Protocol for Accessing Multimedia Archives. In Proc. of NOSSDAV 98, Cambridge, UK, July 1998.

[19] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson. RFC 1889, RTP: A Transport Protocol for Real-Time Applications, January 1996.

[20] The Standard. Streaming media numbers start to flow. http://www.thestandard.com/article/display/0,1151,8419,00.html.

[21] T. Wong, K. Mayer-Patel, D. Simpson, and L. A. Rowe. A Software-Only Video Production Switcher for the Internet MBone. In Proc. SPIE-Multimedia Computing and Networking 1998, San Jose, CA, January 1998.

[22] World Wide Web Consortium. Synchronized Multimedia Integration Language (SMIL), November 1999. http://www.w3.org/AudioVideo/.

[23] D. Wu, A. Swan, and L. A. Rowe. An Internet Mbone Broadcast Management System. In Proc. of Multimedia Computing and Networking, SPIE 99, San Jose, CA, January 1999.