dc: A Live Webcast Control System
Tai-Ping Yu, David Wu, Ketan Mayer-Patel, and Lawrence A. Rowe Berkeley Multimedia Research Center
University of California, Berkeley
ftaiping, davidwu, kpatel, [email protected] erkeley.edu t the story (e.g., recorded material,
ABSTRACT as needed to presen
alive-remote rep ort, etc.). Another television technique
Web casts can adopt techniques develop ed by television
uses video e ects to combine several video sources into
to obtain higher quality. Television pro duction tech-
one stream (e.g., comp osition e ects), to place text on
niques, however, cannot b e integrated into web cast pro-
an image to describ e who or what is b eing displayed (e.g.
duction without considering the fundamental di erences
titling e ects), and to transition b etween streams (e.g.,
in the underlying infrastructure. Wehavedevelop ed a
fade e ects). An interview show, for example, might dis-
general web cast pro duction mo del comp osed of three
playaninterviewer in one window and a remote p erson
stages (i.e., sources, broadcast, and transmission) to ad-
who is b eing interviewed in a second window comp os-
dress these di erences. The Director's Console (dc)
ited side-by-side with titles to identify the program, the
isaliveweb cast pro duction application based on this
participants, or the lo cation of the remote source.
mo del. It utilizes a distributed service architecture,
Web cast pro ducers are reluctanttointegrate conven-
which adapts to varying physical infrastructures and
tional broadcast television techniques into web casts for
broadcast con gurations.
several reasons. First, television pro duction equipment
ery exp ensive. Second, the technologies and mo dels
1. INTRODUCTION is v
used in broadcast television typically require manypeo-
The development of IP Multicast, the Mb one To ols
ple to op erate the system. Third, most broadcast tele-
[11], and commercial streaming media systems (e.g.,
vision programs pro duce a xed data rate, single video
Apple QuickTime Streaming, Cisco IP/TV, Microsoft
stream with no interaction. Fourth, web casting has a
Windows Media, and Real Networks) has lead to a rapid
fundamentally di erent pro duction mo del than televi-
growth in the use of streaming media on the Internet.
sion broadcasting and the to ols and interfaces used to
Several hundred liveweb casts are pro duced eachweek
pro duce high qualityweb casts must re ect the underly-
that are b eing viewed by tens of thousands of viewers
ing infrastructure.
[20]. Web casts are b ecoming an alternative to tradi-
Wehave pro duced the \Berkeley Multimedia, Inter-
tional television broadcasts.
faces and Graphics (MIG) Seminar" web cast on the In-
The television industry has develop ed manytechniques
ternet Mb one since 1995. The seminar features invited
to pro duce high quality video programs that can b e used
lecturers who give formal presentations. The seminar
to improve web casts. One technique is to switch be-
takes place in a ro om equipp ed with a variety of au-
tween di erent video sources. In a typical television
dio/visual hardware, which is used to pro duce the web-
news pro duction, several video feeds are sent to a cen-
cast. Wehave used the Berkeley MIG Seminar web cast
tral studio. The director selects the stream that b est
as a practical exp eriment to explore the problems asso-
p ortrays the content and broadcasts that stream to the
ciated with web cast pro duction.
audience. News broadcasts, for example, intro duce a
sub ject at the anchor desk and cut to di erent sources
Our web cast is comprised of two video streams and
one audio stream. One video stream fo cuses on the
This research was supp orted by NSF grant ANI-
sp eaker, called the speaker stream, and a second stream
9907994, NSF equipmentgrantCDA-9512332, and con-
fo cuses on the presentation material, called the content
tributions from Fujitsu, Intel, and NEC.
stream. Figure 1 shows a sample web cast from the re-
mote viewer's p ersp ective. The director can switchany
of several video sources into either stream. For example,
when a memb er of the lo cal audience asks a question,
the content stream can be switched to a view of the
audience so that a remote viewer can follow the con-
versation. Or, if a remote p erson asks a question, the
content stream can be switched to the remote viewer
and the pro jector in the classro om can b e switched from
the presentation material to the remote p erson. These
examples illustrate the complexityoftheweb cast con- two capture machine streams. Video sp ecial-e ects can
trol problem. The director must control the pro duction b e added to the web cast by placing an e ects pro ces-
viewed by the students in the classro om and, at the sor on the edge of the studio session. It takes one or
same time, he must control the pro duction viewed bya more streams from the studio session, renders the ef-
remote participant. fect, and sends the new stream backinto the studio ses-
sion. Stored playback material (e.g., a prede ned op en-
ing segment) and remote participant video are other
examples of sources available in the studio session.
Pro duction of a high quality web cast is a complex
op eration. In a previous pap er, a Broadcast Manager
application was describ ed that automated the tasks re-
quired to initiate a web cast [23]. For example, the MIG
Seminar web cast requires that more then ten pro cesses
b e started on di erent hosts with appropriate arguments
(e.g., multicast addresses and p ort numb ers, media for-
Figure 1: An example MIG Seminar web cast. The left panel
mats, bit-rates, and quality settings).
shows a thumbnail for each stream in the session. The right
The Director's Console (dc), describ ed in this pap er,
panel displays selected streams with greater resolution and
frame rate.
is designed to control a web cast during live pro duction.
It controls the web cast sources, adds e ects to these
streams, and determines the nal output. Dc is imple-
The MIG Seminar originates from a classro om with
mented as a system of distributed pro cesses organized
many audio/video sources, computer systems, and soft-
with a service mo del, which adapts to di erentphysical
ware pro cesses as shown in gure 2. The classro om is
infrastructure and to di erent broadcast con gurations.
equipp ed with several cameras (e.g., sp eaker and audi-
The remainder of this pap er presents the design and
ence cameras), a VCR, and a do cument camera. These
implementation of dc. Section 2 describ es the web cast
devices are connected to a conventional video matrix
pro duction mo del. Section 3 describ es the system archi-
1
switch. Two capture machines take video from any
tecture. Next, the user interface is describ ed in Section
connected input device. Moreover, they can receive the
4. Section 5 describ es the implementation. Section 6
same input simultaneously. The capture machines dig-
discusses our exp eriences developing and using the to ol
itize the audio and video signals for transmission into
and suggests future work. Section 7 describ es related
two multicast sessions (i.e., one for video and one for
research. And lastly, section 8 concludes the pap er.
2
audio), whichwe call the studio session.
2. WEBCAST PRODUCTION MODEL Classroom Sources Special-Effects
Speaker Camera System
Aweb cast di ers from traditional television on sev-
Audience Camera Video and Audio
t dimensions. This section describ es some Capture Machine eral imp ortan Matrix Presentation PC Switch
Video Studio Broadcast
of these di erences and develops a pro duction mo del Session dc Session
Elmo Capture Machine
for web casts. The design and implementation of dc re- VCR Stored
Material
ects this pro duction mo del. We b egin by examining Remote Viewer Moderated
Moderator
yp es and numb er of data streams that may b e in- Remote Viewer Session the t
Remote Viewer Video Audio
volved in a web cast. Next, the source infrastructure
used in a web cast is describ ed. Finally, the receivers of a
Figure 2: Infrastructure used to pro duce the MIG Seminar
web cast are characterized. In each case, di erences b e-
web casts.
tween the web cast environmentandaconventional tele-
vision broadcast environment are highlighted and used
to motivate developing to ols that meet these web cast-
The studio session is analogous to the routing switcher
sp eci c needs.
and control ro om in a conventional television studio. As
Unlike a television broadcast that is limited to one
in television, web cast pro duction sends all sources to
video and audio stream, a web cast can b e comprised of
one lo cation, the studio session. The pro ducer/director
multiple video and audio streams along with asso ciated
previews all sources and selects a subset of the streams
hyp ertext do cuments. The numb er of streams may also
that de ne the web cast. The selected streams are sent
b e dynamic. For example, during the web cast for the
to the broadcast session. Thus, for a viewer to see the
Berkeley MIG seminar, one video stream may be ini-
slides and the sp eaker, the pro ducer sets the matrix
tially used when the sp eaker is presenting his material
switch to the correct con guration and broadcasts the
and another video stream showing the audience maybe
1
A matrix switch is a crossbar switch that can send
added to the web cast during the question and answer
any input to multiple outputs. Atypical switch routes
p erio d at the end of the seminar. In a traditional televi-
b oth audio and video signals although the signals can
sion broadcast, switching b etween a view of the sp eaker
be controlled indep endently. Matrix switches are also
and a view of the audience might be used during the
called routing switches.
2
question and answer period. This technique could also
Amulticast session contains only one media format. It
b e used during a web cast, but providing b oth streams is sp eci ed bya multicast address and p ort pair.
simultaneously is another p ossibility b ecause web casts on the premise that pro ducing a web cast is to o complex
mayhave more than one video stream. and computationally intensive for a single application
or pro cessor. A collection of distributed pro cesses is re-
The infrastructure available to a web cast is an IP-
quired. The web casting service mo del divides the pro-
based data network with cameras and microphones at-
cesses into clients and services. Services provide func-
tached to PC's that capture, digitize, p ossibly compress,
tionality for the web cast. For example, one service can
and transmit the media streams. A television broadcast,
be resp onsible for controlling a camera while another
in contrast, requires exp ensive, sp ecial-purp ose hard-
service is resp onsible for computing video e ects. Client
ware op erated by trained technicians. The sophisti-
applications are pro cesses that use services.
cation of a web cast pro duction environmentmayvary
from a single video and audio stream to a situation like
Figure 4 shows the distribution of services used to
the Berkeley MIG seminar where multiple video and au-
pro duce the Berkeley MIG Seminar web cast. Notice
dio streams, an e ects server, remote participants, and
the similaritybetween the software architecture shown
auxiliary hyp ertext streams are used. The to ols used
in this gure and the physical infrastructure shown pre-
to pro duce a web cast are required to accommo date het-
viously in gure 2. Dc is a client application that uses
erogeneity in the pro duction infrastructure.
services to pro duce the web cast. The sp eci c services
The receivers of a web cast are also heterogeneous. For
shown in the gure are describ ed b elow. Services send
example, in the Berkeley MIG Seminar, a high quality
streams into the studio session. Dc receives the streams
version of the web cast is sent to viewers on the lo cal
in the studio session, selects a subset for web cast, and
campus and Internet2 connected sites that have high
transmits those streams into the broadcast session. Rtc
bandwidth connections and exp erience little delay. A
services are transco ders that forward and optionally con-
low qualityversion of the web cast is sent to Public In-
vert streams in the broadcast session to the transmission
ternet viewers that may have limited bandwidth con-
sessions.
nections. The director must manage two di erent trans-
missions of the same material. These transmissions may
Services
di er in bandwidth and quality and may also di er in specialfxproxy Medium
Quality
numb er and typ e of streams sent. In a television the specialfx dc
rtc Session ersion of the program is transmit- broadcast, only one v replay Broadcast
replayproxy Session
ted and the capabilities of the receivers (i.e., television room405
Studio rtc Low sets) are standardized. rvc Session Quality
Session
was develop ed to manage and control the pro duc-
Dc rvc
eweb casts. It uses a general web cast mo del
tion of liv Media Stream Communication Channel
comp osed of three stages: source, web cast, and trans-
mission as shown in gure 3. Sources are the streams
Figure 4: Web cast Architecture.
available in the studio session. From this set of sources,
a subset is selected for the broadcast. The sp eaker and
content stream, for example, can be chosen from the
The distributed service architecture has many b ene-
video sources in the classro om or any source available in
ts over a monolithic application. First, indep endent
the studio session (e.g., sp eaker camera, audience cam-
services can be develop ed and deployed rapidly since
era, stored material, or presentation PC). Finally,multi-
mo di cations are not required to existing services. Sec-
ple copies of the web cast, called transmissions, are pro-
ond, clients adapt to the existence of sp eci c services.
duced using di erenttechnologies (e.g., Real Networks,
They discover services dynamically so they can accom-
Windows Media etc.) and transp ort parameters (e.g.,
mo date a constantly changing infrastructure. Third, the
bit-rates and formats). These transmissions are selected
services used by clients can change from one web cast to
to match the exp ected capabilities of the audience.
the next. Fourth, a client dep ends on the collection
of services but not on one service alone. A fault with a
t. And lastly, pro-
Sources Broadcast Transmission single service will not cripple the clien
cessing load can b e distributed to a set of commo dity
machines instead of a single server.
Speaker Camera Audience Computer
The web casting service mo del is an extensible archi-
tecture that supp orts integration and control of ser-
vices. New services are integrated into the infrastruc-
Stored Material Content Computer Broadcast Set Medium Quality Low Quality
ture through a discovery proto col. Clients follow this
proto col to lo cate services. The mo del also sp eci es pro-
Figure 3: The web cast pro duction mo del.
to cols to incorp orate user interfaces that can b e used to
control the service.
3.1 Service Model
3. INTERNET WEBCASTING SYSTEM
This section describ es the proto cols that embody the First, a general description of the service
ARCHITECTURE service mo del.
The Internet web casting system architecture is based architecture is presented. Second, the service discovery
and handle input. Dc is implemented in Tcl/Tk so the proto col is explained. And lastly,we explain how a ser-
co de segment is a script that creates the interface and vice provides a control interface to a client application.
input event mappings. The co de segment has twoar-
The service mo del is comp osed of three comp onents:
guments: awindow and a so cket. The service interface
service abstractions, a service discovery service, and a
widgets are displayed in the window, and the so cket is
client API. Services are pro cesses that provide useful
the TCP connection to the service pro cess. Dc supplies
functionality for clients. Clients are pro cesses that use
the arguments when it evaluates the co de segment. The
services. The Service Discovery Service (SDS) is a di-
so cket is used to handle user interface events. For exam-
rectory service that allows a client to lo cate desired ser-
ple, a StartReplay metho d is invoked the on a replay
vices. For example, dc discovers available services when
service ob ject when the Start button is pressed in the
it is executed and uses services selected bytheweb cast.
replay dialog b oxshown in gure 7.
A simple web cast mayonly need a single camera ser-
vice while the Berkeley MIG Seminar web cast requires
wo camera services, an archiveplayback service for the
t 3.2 Webcast Services
op ening segment, and a sp ecial-e ects service. The ser-
This section describ es the services listed in table 1
vice mo del allows dc to con gure the services required
that wedevelop ed to pro duce the Berkeley MIG Semi-
for the particular broadcast.
nar web cast. The Remote Video Capture service (rvc)
SDS is a service that lists available services in the in-
controls a video capture device. This service is essen-
frastructure through a soft-state proto col [17]. A push-
tially a server version of the vic Mb one to ol [12]. Rvc
pull proto col is used to gather and retrieve information.
enco des analog video signals into RTP packets and pro-
Services push information to the SDS service and clients
vides format and bit-rate interface controls. For the
pull information from the SDS service. Idle services
MIG Seminar web cast, two rvc services are used to pro-
p erio dically announce their presence to a well-known
duce two streams. The studio classro om has two capture
3
multicast address. This information is cached by the
machines each connected to an AMX control system
SDS service. A timer is asso ciated with the information
and an audio/video matrix switcher. Audio/video de-
when it is received. The timer is decremented every
vices are connected to the AMX system and the matrix
half-second. Perio dic up dates by the services reset the
switcher. Rvc uses an RPC interface to issue commands
timer. If the timer go es to zero, the service is assumed
through the AMX system to the matrix switch and me-
to b e unavailable (i.e.- killed or busy).
dia devices. For example, the rvc service can send com-
Clients query the SDS service cache to lo cate a desired
mands to the AMX system to switch the video feed from
service. The query can request a sp eci c service or a
the VCR to the sp eaker camera and commands to move
collection of services (e.g., \all services in 405 So da").
the sp eaker camera to the right.
The SDS pro cess resp onds with data that satis es the
The room405 service provides an interface to non-
query. Clients p erio dically query the SDS service to
media services in the classro om that can be op erated
discover new services.
through the AMX system. The current services in-
The service information returned to the client includes
clude: ro om lights, raising/lowering pro jection screens,
the contact and attribute information required to access
and turning on/o an \On-Air" indicator light outside
and use the service. Service attributes provide informa-
the classro om.
tion ab out the service (e.g., lo cation, functionality, etc.)
The replay service manager and replayproxy services
and are used to distinguish service typ es. Contact in-
allow archived material to be played into a web cast.
formation includes an IP address and p ort numb er pair
A short op ening video segment (approximately 20 sec-
thatcanbeusedby the client to initiate communication
onds) is played at the b eginning of the web cast. To
with the service.
play this op ening, the director contacts the replay ser-
A client application uses the IP address and p ort
vice manager and sp eci es the video to b e played. The
number pair to create a TCP connection with a ser-
replay service instantiates a replayproxy service which
vice. An ad-ho c, non-blo cking Remote Metho d Invoca-
is resp onsible for controlling this particular video. We
tion (RMI) mechanism is implemented that allows the
currently use the MARS multimedia archive system to
clienttoinvoke metho ds on service ob jects.
archive and play stored material on-demand [18]. The
After initializing and establishing communication with
replayproxy service starts a MARS playback pro cess,
a service, the client application displays a user interface
connects to dc, and acts as a proxy b etween dc and the
to control the service. The control interface must be
MARS server. The proxy implements the VCR controls
provided to the client by the service. Otherwise, the
presentedinthe replayproxy control interface.
intro duction of a new service will require either: 1) that
The specialfx service manager and specialfxproxy ser-
the client source co de b e mo di ed or 2) the service con-
vices are similar to the replay service manager and re-
trol interface b e limited to a few prede ned interfaces.
playproxy services except they control video e ects pro-
Our exp erience with web casting is that the infrastruc-
cesses. The specialfx service manager starts the spe-
ture varies from ro om-to-ro om and is constantly chang-
cialfxproxy service that starts video-e ects pro cesses [10].
ing. Consequently, the service interface must b e incor-
The proxy service acts as an intermediary b etween the
porated into the de nition of the service.
client application and the pro cesses that implement the
Dc uses the GetUIControl metho d to obtain the inter-
sp ecial e ect.
face co de segment from the service. This co de segment
3
http://www.panja.com/integrator/amx/ when evaluated will display the control user interface
Service Lines Functionality
rvc 1768 Capture Machine. Digitizes analog audio and/or video, compresses it, and frames the digital media
into RTP packets. Provides controls for b oth the digitization pro cess and the device(s) connected
to the capture machine (e.g., camera, mixer, VCR).
room405 229 Ro om 405. Controls environment parameters in the ro om (e.g., lightcontrols, screen raise/drop).
replay 320 Archival Playback Manager. Archive catalog interface and playback service management.
replayproxy 612 Archival Playback Service. Streams archived media into a session. It provides controls to play,
pause, and seek in the video.
specialfx 365 Sp ecial E ects Manager. E ects system manger and interface for sp ecifying e ects services.
specialfxproxy 1402 Sp ecial E ects Service. Controls a particular e ects service.
rtc 411 Tranco ding Service. Transco des media streams at di erent bit-rates and formats from one session
to another.
nul l service 202 Overhead needed by all services.
Table 1: Table of services with their line numb ers and functionality.
rtc service is a transco ding service that is still
The Camera Control
under development. It will take streams from one mul-
ticast session, transco de them to di erent bit-rates and
to another multicast session. formats, and send them in Remote
Viewer
Rtc is a server version of the rtpgw service [2].
Auxillary
4. DC APPLICATION Audience
This section describ es the user interface of dc. The in- Titling Effects Control
Speaker
terface organization is shown in gure 5. The main win-
Media Service Set
dowcontains four panels: sources, preview, broadcast,
Content
and transmission. The sources panel displays sources
The preview panel allows the di-
in the studio session. Replay
rector to preview and prepare sources b efore they are
Effects
web cast. Streams in the web cast are displayed in the
broadcast panel. The transmission panel, which is in Control
Serivce
elopment, will control web cast transmissions. An
dev Set
example is displayed in gure 6 where a screen dump of
is shown.
dc Service Menu Archive Replay Control
Figure 6: A screen dump of the Director's Console during a
Main Window pro duction. Sources Panel Preview Panel Broadcast Panel Transmission Panel Video Video
Window Window
column of small, slowly refreshing thumbnail im-
Video Video A
t sources as shown in gure 6. Belowthese
Window Window ages represen
thumbnails are buttons that representcontrol services.
Services p erform computations or actions on b ehalf of
an application. These services can be media or con-
Figure 5: Dc user interface organization.
trol services. A media service pro duces a packet stream
(e.g., audio or video). A control service provides an in-
terface to control an entity (e.g., ro om lights, a o or
This section b egins with a discussion of the sources
control application, etc.).
panel. Second, the video window abstraction used in
The Service pull-down menuat the b ottom of the
the preview and broadcast panels is describ ed followed
sources panel allows the director to add a media or con-
by a description of these two panels. Lastly,we discuss
trol service interface to the sources panel. The video
the integration of video-e ects into dc.
sources displayed in the sources panel are called the
dia ServiceSet. The control interfaces are called the
4.1 Sources Panel Me
Control ServiceSet. The Control Service Set shown in
The sources panel serves as a starting p ointfor the
gure 6 includes three services (i.e.- Room405, SpecialFx,
web cast. It contains mechanisms to start and display
4
sources transmitting into the studio session.
rather than analog signals. Sources are therefore b etter
4
In the television community, a source is de ned as cam- de ned to b e packet streams. Under this de nition, the
eras and other devices that pro duce video signals. In capture machine in gure 2 is the source, not the cam-
web casts, however, the medium of transp ort is packets eras or other devices connected to the matrix switch.
preview and prepare the stream b efore switching to it
(e.g., p osition a replay). Consequently, dc has preview
windows for streams b eing prepared. The previews are
displayed as a video window in the preview panel imme-
diately to the right of the sources panel. Video windows
are also used to display streams in the broadcast panel.
Figure 7: An example control service dialog. This dialog
allows the director to start an archive replay source. The two
A video window displays a CIF-sized image at higher
6
entry b oxes sp ecify the archive material and audio session.
frame rate than a thumbnail. The interface controls b e-
low the video windowallow the director to control the
stream and source b eing displayed. The sp eci c control
interface dep ends on the source. For example, the con-
and Replay) that display dialog b oxes to control enti-
trol interfaces for a computer-controlled camera includes
ties or instantiates a media service. The Room405 but-
fo cus and p osition controls (e.g., pan/tilt or left/right
ton displays various ro om controls (e.g., lightcontrols,
p osition). A playback source has VCR-controls (e.g.,
pro jection screens). The SpecialFx button displays a
play, pause, p osition, etc.). Other video sources have
dialog that instantiates a video-e ect services and adds
di erentcontrol interfaces dep ending on the source.
5
it to the Media ServiceSet. The Replay button, shown
The control interface will change when the service
in gure 7, allows the director to add a stored material
state changes. Consider the capture machine service
playback stream to the Media Service Set.
connected to the matrix switch in gure 2. It can pro-
Services are presented to the director through the
duce a stream from any one of many devices, each of
Service pull-down menu, which displays a hierarchi-
which may have its own controls. Consequently, the
cal menu based on the service lo cation. The hierarchy
capture machine control interface needs controls for the
has the following structures:
video switchandcontrols for the currently selected video
organization/building/room/s ervi ce
device. When the director switches the capture machine
and
from the VCR to a camera, camera interface controls re-
Locationless/service
place the VCR controls b elow the video window.
Figure 8 shows the hierarchical structure for two ser-
vices. 4.3 Preview and Broadcast Panels
To prepare a source for broadcast, the source is rst
yed as a video window inside the preview panel. A Berkeley Berkeley displa
Locationless Soda Locationless
source can b e previewed by selecting a thumbnail with McLaughlin 405
Audience Camera ArchiveReplay the mouse, dragging it to the preview panel, and drop-
517
Dropping the thumbnail on an existing video Speaker Camera Special-Effects ping it. 523
Presentation PC window displays the new source in place of the previ-
ously displayed source. More than one video window
can b e displayed within the preview panel. Dropping
Figure 8: Examples of service menu hierarchical structure.
the thumbnail in the preview panel but not over a video
The hierarchy on the left shows a lo cation dep endent service
window instructs dc to create a new video window and
and the hierarchy on the right shows a lo cation indep endent
service.
display the source in it. A video window can b e removed
from the preview panel by selecting it with the mouse,
dragging it outside the preview panel, and dropping it.
Selecting a menu item initiates the service. A media
The broadcast panel has the same interface as the
service transmits a stream to the studio session when
preview panel. A video window displayed in the broad-
initiated. If it is a video stream, a thumbnail computed
cast panel, however, is a stream in the web cast. Data
from the stream is added to the sources panel. A con-
packets from streams in the broadcast panel are relayed
trol service adds a button to the Control Service Set
through dc into the broadcast session where audiences
displayed b elow the column of thumbnails.
view the web cast. Multiple streams are web cast byhav-
The Service pull-down menuchanges to re ect the
ing several video windows within the broadcast panel.
currentenvironment. If, for example, a remote camera
The director switches the source in the web cast stream
is installed during a web cast, the service pro cess asso ci-
by dragging and dropping a thumbnail or video window
ated with the camera will announce its existence. The
in the preview panel to the broadcast panel. The broad-
dc will detect this new service through the service dis-
cast panel b ehaves the same way as the preview panel
covery proto col and add it to the Service pull-down
- dropping on a video window switches the stream and
menu.
dropping elsewhere creates a new video window (i.e., a eb cast).
4.2 Video Window new stream in the w
umbnails provide an excellent summary of avail-
Th 4.4 Video-Effects Sources
able video sources, but a director typically wants to
5 6
A service is created for each sp ecial-e ect that can b e Thumbnails are up dated once per second and video
applied during a web cast. Multiple instances of an e ect windows are up dated as sp eci ed in the stream b eing
can b e created and used simultaneously. displayed.
Television video-e ects are implemented by a video ticast. It provides a hierarchical naming system
pro duction switcher (VPS). A VPS can switch from n and a scalable session announcement proto col [16].
inputs to one output that is typically the broadcast out- The SDS service uses SNAP to deliver reliable
put. A VPS can apply e ects on the input(s) as the messages b etween clients, services, and SDS servers
data is routed to the output. Commercial systems use for the service architecture. In addition, SNAP
custom-designed hardware to meet the pro cessing re- provides containers for messages. Container names
quirements of broadcast quality video. Some VPS sys- are organized into a hierarchical name space. Cli-
tems can layer e ects on top of other e ects (e.g., fading ents connect to a SNAP multicast and session reg-
from one stream to another with titling). However, the ister callback pro cedures for containers when a
number of layers is b ounded, and most VPS systems do message is transmitted to that container. This
not allow new e ects to b e added. facility supp orts a general-purp ose ltering mech-
The Parallel, Software-only Video-e ects Pro cessing anism for messages sent to a multicast session.
System (PSVP), which runs on a Network of Worksta- The web casting service architecture uses UPDATE,
REQUEST, and QUERY RESPONSE SNAP con- tions, was develop ed by our research group to provide QUERY
tainers for service announcements, client queries, alow-cost, extensible video-e ects system [10]. PSVP
and SDS service query results, resp ectively. pro duces an e ect by decomp osing it into a collection
of pro cesses that read RTP packets from a multicast
session, compute an e ect, and write RTP packets back
into the multicast session.
dc Render Thumbnail
Dc invokes and controls an e ect through the ser- Packet Decoder Render Preview Window Duplicator
Render Broadcast Window
vice infrastructure. E ects services (see gure 2) input
Speaker
session, render an e ect, and streams from the studio Speaker Audience Decoder Render Thumbnail
Audience Studio Receiver/ Transmitter/ Broadcast
output a new \e ects stream." E ects streams are ordi- Session Demultiplexer Multiplexer Session Archive Archive Render Thumbnail
Decoder
dc is concerned with controls for nary sources as far as Slides Render Preview Window
Slides
the e ect. Thus, e ects can b e cascaded by piping the Render Thumbnail Packet Decoder Duplicator
output of an e ect to the input of another e ect service. Render Broadcast Window
A titling e ect is shown in the Berkeley MIG Seminar
web cast shown in gure 1. Adding a new e ect to the
Figure 9: Dc Architecture.
web cast requires only that a new service b e installed in
the infrastructure.
The internal architecture of dc is shown in gure 9.
work agent receives packets from the video studio
5. DESIGN AND IMPLEMENTATION Anet
Dc and the service pro cesses are implemented as Mash
session and demultiplexes them to a handler for each
applications. Mash is a multimedia to olkit for develop-
stream. A stream may b e displayed in one or more win-
ing distributed collab oration applications [13]. It pro-
dows and, if it is in the broadcast set, the packets are
vides the basic ob jects needed to create multicast appli-
sent to the multicast session for the broadcast. A stream
cations (e.g., media stream receivers/transmitters, me-
is always displayed in a thumbnail so the packets must
dia co decs, etc). Mash is a split architecture software
be deco ded and rendered to a thumbnail window. If
system. Routines that are p erformance sensitive(e.g.,
the stream is also b eing displayed through one or more
media co ding and deco ding, network transmitting and
video windows, the deco ded packets are passed to addi-
receiving, etc.) are implemented in C++. User-interface
tional renderers for eachwindow. Packets for streams
abstractions and p erformance insensitivecodeisimple-
selected for broadcast are duplicated b efore they are
mented in the Tcl/Tk scripting language. Mash also
passed to the deco der. The duplicate packets are sent
uses the Ob ject Tcl (Otcl) extension for ob ject-oriented
to the broadcast session.
programming.
The dc application is approximately 2,800 lines of
Dc and the service programs make extensiveuse of
co de, which includes 430 lines of C++ and 2,400 lines of
the following packages written in Mash:
Tcl/Tk co de. C++ is used to duplicate packet bu ers
for broadcast and relay them to the transmitter. The
Active ServiceFramework.
breakdown of the Tcl/Tk co de is shown in table 2.
The AS framework provides a programmable sub-
The co de length do es not include inherited ob jects and
strate on which to build arbitrary network ser-
consequently may app ear skewed. For example, the
vices [1]. These services reside in one or more
agent-broadcast ob ject inherits transmission function-
p o ols of active service no des. Client applications
ality from the MASH video-agent ob ject. So even
instantiate service agents on one or more no des.
though dc only shows 48 lines of co de, video-agent in-
The framework ensures fault-tolerance and load-
cludes over 1,000 lines of co de.
balancing for clients. The replayproxy and spe-
Services are implemented only with Tcl/Tk co de. Ta-
cialfxproxy services use the AS framework to in-
ble 1 ab oveshows the numb er of lines for each service.
stantiate MARS and PSVP system services, re-
service shows the co de needed by all ser- The null
sp ectively.
vices. It contains co de to interface with the SDS service
and clients but has no functionality. The proxy ser- Scalable Naming and Announcement Protocol.
vices act as intermediaries b etween clients and backend SNAP is a framework for selective reliable mul-
Name Lines Functionality
agent-broadcast 48 Network agent transmitting to the broadcast session.
agent-studio 119 Network agent resp onsible for receiving packets from the studio session and
demultiplexing the packets into its constituent streams.
application-dc 224 Application ob ject resp onsible for creating and organizing other ob jects in
the dc .
session-dc 129 Ob jects used to discover services.
ui-dc 118 The main window ob ject of dc. Creates the panel ob jects that reside within
it.
ui-dcbroadcast 212 Ob ject resp onsible for broadcast panel.
ui-dcpreview 177 Ob ject resp onsible for preview panel.
ui-dcthumbnail 475 Ob ject resp onsible for thumbnail panel. It is resp onsible for new services.
ui-dcthunbnailservice 284 Control service ob jects.
ui-dcthunbnailvideo 89 Media service ob jects.
ui-dcvideo 248 Video Window Ob ject.
dc-link 324 RMI mechanism used to send commands b etween dc and services.
Table 2: Table listing dc co de. Each ob ject name is followed with co de length and a summary of its function.
is known b ecause the audio mixer detects that the in- servers which explains why replayproxy can be writ-
put from an audience microphone is ab ove a threshold. ten in only 612 lines. The ma jorityofthereplayproxy
Our studio classro om has three audience microphones service is in the MARS backend pro cess. This exam-
mounted in the ceiling on the left, center, and right ple shows the p ower of Tcl/Tk as a \glue language" for
sides of the ro om. The mixer adds the output of the building interfaces b etween disparate applications.
sp eaker microphone to the ro om audio and routes it to
eb cast audio stream. Consequently, the mixer can
6. DISCUSSION AND FUTURE WORK the w
identify the side of the ro om from which the question
A quick prototyp e for dc was implemented by Wu
comes. We need a mechanism for the mixer to signal the
in the Fall, 1998. It provided a remote control inter-
web cast control system that an audience question from
face to the AMX control system so the Berkeley MIG
a sp eci c area of the ro om has o ccurred. An intelligent
Seminar broadcast engineer could control which video
control system that implements a policy sp eci ed by
source was routed to the capture machines. The need
the director can automatically execute the commands
for a new dc that supp orted multiple streams, interfaces
to switch to the audience camera and p osition the cam-
to more services, and dynamic extensibilitywas imme-
era to the appropriate area in the ro om. In addition, the
diately obvious. The design and implementation of the
bandwidth allo cation b etween the sp eaker and content
currentversion of dc was completed during the Spring
stream can b e changed to re ect this new con guration.
and Summer of 1999. This version was used to pro-
Low-bandwidth transmissions allo cate more bandwidth
duce the seminar web casts b eginning in the Fall, 1999
to the sp eaker b ecause slides are often static relativeto
semester. Although the system has p erformed well b e-
the sp eaker. However, bandwidth should be diverted
yond initial exp ectations, it can still b e improved. This
from the sp eaker to the content stream when switching
section discusses several changes and improvements.
from slides to the audience. Lastly, automation can b e
Aweb cast b ecomes more and more dicult to man-
used to log events, such as stream switching, to assist
age as the number of services increases. The sp ecial-
video query, summarization, and automated authoring
e ects service alone can double the number of input
of multimedia titles derived from a lecture [15].
streams and interface controls. Our goal is to have one
p erson pro duce several web casts simultaneously. Con-
Automation relies on two mechanisms: 1) the ability
sequently, the need for automation is apparent. Au-
to monitor events pro duced byother pro cesses in the
tomation eliminates the need for the director to p erform
web cast and 2) the abilitytoinvoke commands in lieu
mundane manual tasks. One particularly time consum-
of the director. The bandwidth-allo cation agent, for
ing chore is camera tracking. As the sp eaker moves
example, monitors switching events and resp onds with
ab out the stage, the camera must b e moved to ensure
commands to the transco der to either increase or re-
he or she is correctly framed. Automatic camera track-
duce bit-rates. The current dc architecture cannot sup-
ingisawell-known technique for solving this problem
p ort such agents b ecause it uses unicast communication
that should b e incorp orated into the web casting system
channels to services. Thus, communication, b oth events
[7].
and commands, b etween clients and services cannot b e
monitored or forwarded to other participants. However, A second example is stream switching. Take the case
using multicast for the communication channel will re- of a web cast comp osed of sp eaker and content streams.
move this limitation. An appropriate logical decomp o- If an audience member asks a question, the content
sition of the events can b e created using SNAP contain- stream should b e switched to the audience camera and
ers so automation agents can monitor only the events the camera should be moved to show the p erson ask-
in whichtheyareinterested. The bandwidth-adjusting ing the question. The fact a question has been asked
television mo del of broadcasting in which there is no agent can wait for an eventlike \Switch ContentStream
audience interaction. The Internet, however, is capa- to Audience Camera" and resp ond with commands to
ble of audience participation. Nowhere is this capa- reduce sp eaker bandwidth and increase content band-
bility more imp ortant than in distance learning where width.
the student's ability to ask questions during a lecture
A second area in which dc canbeimproved is secu-
is critical. Manymechanisms for student participation
rity. We ignored many security issues during the de-
exist (e.g. chat sessions, video conferencing, and inter-
velopmentof dc b ecause we used rapid prototyping to
active resp onse systems), but more research is needed to
the system. In retrosp ect, we note many securityweak-
determine the most appropriate mo dels for integration
nesses that will need to b e addressed. First, the SDS
with live web casts. Moreover, mo deration techniques
service allows any pro cess to present itself as a service.
will need to b e develop ed to handle non-trivial numb ers
It also gives any pro cess information ab out all exist-
of participants. For example, if 200 p eople wanttoask
ing services. Rather than duplicating e ort, weplanto
a question simultaneously, who gets control and howis
replace the prototyp e SDS service with the Ninja SDS
the interaction mo derated?
System that is b etter designed for security [6]. Asec-
Lastly, our exp erience suggests that future studio class-
ond security issue arises b ecause Tcl/Tk co de is trans-
ro oms should use equipment directly connected to the
ferred from services and evaluated in the client without
Internet. Rather than connecting cameras and micro-
checks. Thus, a malicious service can kill a client appli-
phones to matrix switches and mixers and using a con-
cation by simply sending an exit command. Moreover,
trol system like the AMX, a b etter solution will connect
the Tcl/Tk co de segment is dicult to write correctly.
the devices directly to the network. Switching and con-
So, even a trusted service might cause harm through
trol are easier to implement. And, pro duction control
carelessness. One approach to solve these problems is
will be better if all devices are available at any time
to implement a certi cate system that ensures the cor-
rather than just a subset of devices through a matrix
rectness and trust of the co de. Another approachisto
switch.
create a description language for user-interfaces simi-
lar to the do cument language prop osed by Ho des and
[9]. Rather than sending co de, services transfer
Katz 7. RELATED WORK
a description of the user-interface. By restricting the
This section describ es related work on broadcast pro-
description language, clients can limit the access to ser-
duction and control.
vices and thereby protect themselves. A third approach
Several pro jects have develop ed software interfaces
is to sandb ox the co de as is done in Janus [8]. The
to a video pro duction system. NBC's GEnesis pro ject
co de runs in a con ned area with limited access to re-
monitors and controls program streams with a software
sources (e.g., so cket communications can b e limited to
system built with Tcl/Tk [3]. GEnesis provides a graph-
the parent service or le writes can b e prohibited).
ical user interface to control broadcast schedules that
A third area in which dc can b e improved is in audio
vary over time and region. For example, a live sp orts
control. Audio has b een largely neglected by the current
event has commercials that vary dep ending on the lo-
dc prototyp e. However, many of the design principles
cation. The system employs a database to store and re-
used for video also apply to audio. Audio devices can b e
trieveschedules, and device controls are accessed with
viewed as a service that can b e switched in conjunction
a string proto col over so ckets. Op erations are imple-
with or indep endently from video. Audio e ects can b e
mented by sending commands to conventional broad-
applied just as in video. However, the biggest problem
cast equipment. Both dc and GEnesis control remote
we must solve is to provide seamless interaction with
devices, but they are solving di erent problems. Dc
remote participants. Solving this problem will require
concentrates on the pro duction of video streams rather
changes in the studio classro om to provide visual feed-
than the scheduling of existing streams.
back or information ab out remote participants (i.e., to
The Berkeley Multimedia ResearchCenter develop ed
provide a sense of presence [4]) and an audio control sys-
several prototyp es b efore dc. The Software-Only Video
tem that can handle echo cancellation with long delays
Production Switcher pro ject intro duced video-e ects pro-
resulting from Internet communication.
cesses to our pro duction environment [21]. It used re-
A fourth area in which dc can be improved is con-
mote pro cesses that received and transmitted streams
trolling commercial web cast transmissions. We b egan
from a studio session. It employed a message proto col
simulcasting the Berkeley MIG Seminar using Real Net-
to control these remote pro cesses. However, the user
works b ecause the unicast transp ort used in commercial
interface was hard-co ded into the broadcast application
systems is easier to deploy than the multicast transp ort
and there was no notion of mo dular services. More-
used in the Mb one. Integrating these technologies into
over, only a single stream was pro duced from the sys-
dc will enable a wider web cast audience. In addition,
tem. This system was an early prototyp e for the PSVP
viewers will b ene t from the exibility and versatilityof
e ects system.
dc. For example, the Synchronized Multimedia Interac-
tion Language (SMIL) develop ed by the World Wide
eb Consortium can be used to view web casts with
W 8. CONCLUSION
multiple streams like those pro duced by dc [22].
Fundamental di erences between the underlying in-
A fth area in which web casts pro duced by dc can frastructure of web cast and traditional television pro-
b e improved is interactivity. Manyweb casts follow the hibits using a television pro duction mo del for web casts
[10] K. Mayer-Patel and L. A. Rowe. Exploiting Spatial without mo di cation. We prop ose a web cast pro duction
Parallelism for Software-only Video E ects Pro- mo del that addresses these di erences. The mo del is
cessing. In Proc. of The Sixth Annual ACM Inter- comp osed of three stages: sources, broadcast,andtrans-
national Multimedia Conference, September 1998. mission. Wehavedevelop ed the Director's Console, a
web cast pro duction to ol, to demonstrate the e ective-
[11] S. McCanne. Scalable multimedia communication
ness of this mo del. Dc is of a distributed client/server
with internet multicast, light-weight sessions, and
system organized under a web cast service architecture.
the mbone. IEEE Internet Computing, 3(2):33{45,
The service architecture provides fault tolerance and dy-
1999.
namic adaptation to a changing physical infrastructure.
[12] S. McCanne and V. Jacobson. vic: A Flexible
ramework for Packet Video. In Proc. of ACM Mul-
9. ACKNOWLEDGMENTS F
We would like to thank Irene Hsu, Peter Pletcher,
timedia 95, San Francisco, CA, Novemb er 1995.
Oliver Crow, Paul Huang, Tim Fitz, and the rest of
[13] S. McCanne, et al. Toward a Common Infrastruc-
the BMRC Researchers for their invaluable help during
ture for Multimedia Networking Middleware. In
development of the pro ject. Thanks also go to Richard
Proc. of NOSSDAV97, St. Louis, Missouri, August
Fateman, Chris A. Long, and other readers for their
1997.
input on this pap er.
[14] G. Millerson. The Technology of Television Prodcu-
. Butterworth-Heinemann, 12th edition, March
10. REFERENCES tion
1991.
[1] E. Amir, S. McCanne, and R. Katz. An ActiveSer-
[15] S. Mukhopadhyay and B. C. Smith. Passive Cap-
vice Framework and its Application to Real-time
ture and Structuring of Lectures. In Proc. of
Multimedia Transco ding. In Proc. of ACM SIG-
ACM Multimedia 99, pages 477{487, Orlando, FL,
COMM '98,Vancouver, Canada, August 1998.
Novemb er 1999.
[2] E. Amir, S. McCanne, and H. Zhang. An
[16] S. Raman and S. McCanne. Scalable Data Naming
Application-level Video Gateway.InProc. of ACM
for Application Level Framing in Reliable Multi-
Multimedia 95, San Francisco, CA, Novemb er 1995.
cast. In Proc. of ACM Multimedia '98, Bristol, UK,
[3] S. Angelovich, K. Kenny, and B. Sarachan. NBC's
Septemb er 1998.
GEnesis Broadcast Automation System: From
[17] S. Raman and S. McCanne. A Mo del, Analysis, and
Prototyp e to Pro duction. In Proc. of the 6th An-
Proto col Framework for Soft State-based Commu-
nual Tcl/TK Conference, 1998.
nication. In Proc. of ACM SIGCOMM '99, Cam-
[4] D. Boyer and M. Lukacs. The Personal Presence
bridge, MA, Septemb er 1999.
System { A Wide Area Network Resource for the
[18] A. Schuett, S. Raman, Y. Chawathe, S. McCanne,
Real Time Comp osition of Multip oint Multimedia
and R. Katz. A Soft-State Proto col for Accessing
Communications. In Proc. of the Second ACM In-
Multimedia Archives. In Proc. of NOSSDAV 98,
ternational Conference on Multimedia '94, pages
Cambridge, UK, July 1998.
453{460, San Francisco, CA, Octob er 1994.
[19] H. Schulzrinne, S. Casner, R. Frederick, and V. Ja-
[5] D. Culler, et al. Parallel Computing on the Berke-
cobson. RFC1889,RTP: A Transport Protocol for
ley NOW. In 9th Joint Symposium on Paral lel Pro-
Real-Time Applications, January 1996.
cessing, 1997.
[20] The Standard. Streaming media numb ers start
[6] S. Czerwinski, B. Zhao, T. Ho des, A. Joseph, and
to ow. http://www.thestandard.com/article/dis-
R. Katz. An Architecture for a Secure Service Dis-
play/0,1151,8419,00.html.
covery Service. In MOBICOM '99, Seattle, WA,
August 1999.
[21] T. Wong, K. Mayer-Patel, D. Simpson, and L. A.
Rowe. A Software-Only Video Pro duction Switcher
[7] D. B. Gennery. Visual Tracking of Known Three-
for the Internet MBone. In Proc. SPIE-Multimedia
Dimensional Ob jects. International Journal of
Computing and Networking 1998, San Jose, CA,
Computer Vision, 7(3):243{270, April 1992.
January 1998.
[8] I. Goldb erg, D. Wagner, R. Thomas, and E. A.
[22] World Wide Web Consortium. Synchronized Mul-
Brewer. A Secure Environment for Untrusted
timedia Integration Language (SMIL), November
Help er Applications: Con ning the Wiley Hacker.
1999. http://www.w3.org/AudioVideo/.
In 1996 USENIX Security Symposium, 1996.
[23] D. Wu, A. Swan, and L. A. Rowe. An Internet
[9] T. Ho des and R. Katz. A Do cument-based Frame-
Mb one Broadcast Management System. In Proc. of
work for Internet Application Control. In 2nd
Multimedia Computing and Networking, SPIE 99,
USENIX Symposium on Internet Technologies and
San Jose, CA, January 1999.
Systems, Boulder, CO, Octob er 1999.