The Evolution of From the

to InterDomain Multicast to Deployment

Kevin C Almeroth

Department of Computer Science

University of California

Santa Barbara CA

almerothcsucsbedu

Octob er

Abstract

Without a doubt multicast communicationthe onetomanyormanytomany delivery of

datahas b ecome a hot topic It is of interest in the research community among standards

groups and to network service providers For all the attention multicast has received there

are still issues that have not b een completely resolved One result is that proto cols are still

evolving and some standards are not yet nished From a deployment p ersp ective the lackof

standards has slowed progress but eorts to deploymulticast as an exp erimental service are in

fact gaining momentum The question nowishow long it will b e b efore multicast b ecomes a true

Internet service The goal of this pap er is to describ e the past present and future of multicast

Starting with the Multicast Backb one MBone we describ e how the emphasis has b een on

developing and rening intradomain proto cols Starting in the middle to

late s particular emphasis has b een placed on developing interdomain multicast routing

proto cols We provide a functional overview of the currently deployed solution The future

of multicast may hinge on several research eorts that are working to make the provision of

multicast less complex by fundamentally changing the multicast mo del We briey survey these

eorts Finally attempts are b eing made to deploy nativemulticast routing in b oth Internet

networks and the commo dity Internet We examine howmulticast is b eing deployed in these

networks

Intro duction

Without a doubt multicast communicationthe onetomanyormanytomanydelivery of data

has b ecome a hot topic It is the fo cus of intense study in the researchcommunity It has b ecome

a highly desired feature of manyvendors network pro ducts It is growing into a true deployment

challenge for Internet engineers It is evolving into a highly touted service b eing oered by some

Internet Service Providers ISPs And nally it is starting to b e used byanumb er of companies

oering largescale Internet applications and services From almost all p ersp ectives multicast is

developing into one of the most interesting Internet services

For all the p otential multicast has and for all the advocacy multicast has received there are still

some concerns First by Internet standards multicast is an old concept yet by most measures

deployment has been very slow To put deployment in p ersp ective compare multicast to the

World Wide Web WWW and the Hyp erText Transfer Proto col HTTP IP multicast was rst

intro duced in Steve Deerings PhD dissertation in and tested on a wide scale during an

audio cast at the Internet Engineering Task Force IETF meeting in San Diego The

rst WWW browser was written in and in there were ab out sites on the WWW So

while multicast and the WWW are roughly the same age multicast is considered to b e in the early

stages of evolution while the WWWs success inuence and use seem totally p ervasive Second

IP multicast is one of the rst services to be deployed which requires additional intelligence in

the network Multicast requires a nontrivial amount of state and complexity in b oth core and edge

routers These requirements are at o dds with the longstanding b elief that intelligence should b e

pushed to the edges of the network While many in the Internet community realize that the new

generation of network services will put demands on the network the dicultyisindeploying and

managing these services in an infrastructure that has a lengthy history of only oering b esteort

unicast service

With these concerns in mind the image of multicast may seem somewhat tarnished Is multicast

then more trouble than its eciency gains and economies of scale are worth This question is

esp ecially relevantifmulticast is to b e used as a moneymaking enterprise for commercial companies

The challenges are to dene elegant proto cols to supp ort an infrastructure on top of which new

applications can be develop ed to continue to investigate new ways of increasing eciency and

reducing complexity Doing multicast the rightway is a noble endeavor and an appropriate long

term research topic but the demand for working multicast has created an environment in which

even shortterm functional solutions are very attractive

In this pap er we attempt to describ e the past present and future of multicast The history

of multicast should help the reader understand how multicast has evolved into its current state

Relevant topics include a description of the Multicast Backb one MBone and an overview of the

common intradomain multicast routing proto cols More recently multicast evolution has been

primarily fo cused in the area of interdomain proto col development Multicast in the present can

be characterized as an eort to deploy multicast on a wide scale using a triumvirate of routing

proto cols These deployments have b een carried out in the two Internet backb one networksthe

very high sp eed Backb one Network Service vBNS and Abileneas well as in the commo dity

Internet so designated in order to distinguish it from Internet networks The future of multicast

is ro oted in the continued development evaluation and standardization of new proto cols However

unlike current eorts which are fo cused primarily on routing future eorts are likely to include

other issues such as address allo cation management and billing We are already starting to see

some eorts in these areas

The remainder of this pap er is organized as follows Section describ es the early evolution of

multicast in particular the development of intradomain multicast The fo cus of Section is on

interdomain multicast including the b est current practices and several of the eorts to dene the

next generation of proto cols Section details interdomain deployment eorts in the commo dity

Internet and in Internet networks Section is the conclusion of the pap er

Evolution of IntraDomain Multicast

From the rst Internetwide exp eriments in to the middle of standardization and de

ploymentinmulticast fo cused on a single at top ology This top ology is in contrast to the Internet

top ology which is based on a hierarchical routing structure The initial multicast proto col re

search and standardization eorts were aimed at developing routing proto cols for this at top ology

Beginning in when the multicast community realized the need for a hierarchical multicast

infrastructure and interdomain routing the existing proto cols were categorized as intradomain

proto cols and work began on standardizing an interdomain solution In this section we describ e

the standard IP multicast mo del and the evolution and characterization of intradomain multicast

proto cols

The Standard IP Multicast Mo del

Stephen Deering is resp onsible for describing the standard multicast mo del for IP networks This

mo del describ es how end systems are to send and receive multicast packets The mo del includes

both an explicit set of requirements and several implicit requirements An understanding of the

mo del will help the reader understand part of the evolutionary path multicast has taken The

mo del is as follows

 IPStyle Semantics A source can send multicast packets at any time with no need to

register or to schedule transmission IP multicast is based on UDP not TCP so packets are

delivered using a b esteort p olicy

 Op en Groups Sources only need to knowa They do not need to know

group memb ership and they do not need to be a member of the multicast group to which

they are sending A group can haveanynumb er of sources

 Dynamic Groups Multicast group memb ers can join or leave a multicast group at will

There is no need to register synchronize or negotiate with a centralized group management

entity

The standard IP multicast mo del is an endsystem sp ecication and do es not discuss require

ments for how the network should p erform routing The mo del also do es not sp ecify anymechanisms

for providing quality of service security or address allo cation

Birth of the Multicast Backb one

Interest in building a multicastcapable Internet motivated by Deerings work b egan to achieve

critical mass in the late s This work led to the creation of multicast in the Internet and

the creation of the Multicast Backb one MBone In March the MBone carried its rst

worldwide event when sites received audio from the meeting of the Internet Engineering Task

Force IETF in San Diego While the conferencing software itself represented a considerable

accomplishment the most signicantachievement here was the deploymentof a virtual multicast

network The multicast routing function was provided byworkstations running a daemon pro cess

called mrouted pronounced mrouted which received unicastencapsulated multicast packets on

an incoming interface and then forwarded packets over the appropriate set of outgoing interfaces

Connectivity among these machines was provided using pointtop oint IPencapsulated tunnels

Each tunnel connected twoendpoints via one logical link but could cross several Internet routers

Once a packet is received it can b e sent to other tunnel endp oints or broadcast to lo cal memb ers

Routing decisions were made using the Distance Vector Multicast Routing Proto col DVMRP

An example of connectivity provided via a virtual top ology is shown in Figure In this earliest

phase of the MBone all tunnels were terminated on workstations and the MBone top ology was

such that sometimes multiple tunnels ran over a common physical link Multicast routing in the

early MBone was actually a controlled form of o o ding The rst versions of mrouted did not

implement pruning It was not until several years later that pruning was deployed

Figure Generic tunnelbased top ology representative of the early MBone

The original multicast routing proto col DVMRP creates multicast trees using a technique

known as broadcastandprune Because of the way the tree is constructed byDVMRP it is called

a reverse shortest path tree The steps to creating this typ e of tree are as follows

The source broadcasts each packet on its lo cal network An attached receives the

packet and sends it on all outgoing interfaces

Each router that receives apacket p erforms a Reverse Path Forwarding RPF check That

is each router checks to see if the incoming interface on whicha multicast packet is received

is the interface the router would use as an outgoing interface to reach the source In this way

a router will cho ose to only receive packets on the one interface that it believes is the most

ecient path back to the source All packets received on the prop er interface are forwarded



on all outgoing interfaces All others are discarded silently

Eventually a packet will reach a router with some number of attached hosts This leaf router

will check to see if it knows of any group memb ers on any of its attached subnets A router

discovers the existence of group memb ers by p erio dically issuing Internet Group Management

Proto col IGMP queries If there are memb ers the leaf router forwards the mul

ticast packet on the subnet Otherwise the leaf router will send a prune message toward the

source on the RPF interface ie the interface the leaf router would use to forward packets

to the source

Prune packets are forwarded backtoward the source and routers along the way create prune

state for the interface on which the prune message is received If prune messages are received



In reality the action for a packet that fails an RPF check dep ends on the proto col Some proto cols tell all

upstream routers except the RPF router to stop forwarding packets

on all interfaces except the RPF interface the router will send a prune message of its own

toward the source

In this way reverse shortest path trees are created These trees can be constructed even on

a virtual top ology like the MBone Broadcastandprune proto cols are also known as dense mode

proto cols b ecause they are designed to p erform b est when the top ology is densely p opulated with

group memb ers Routers assume there are group members downstream and so forward packets

Only when explicit prune messages are received do es a router not forward multicast trac If a

group is densely p opulated routers are unlikely to ever need to prune The key disadvantage of

dense mo de proto cols is that state information must be kept for each source at every router in

the network regardless of whether downstream group members exist If a group is not densely

p opulated signicantstate must b e stored in the network and a signicant amount of bandwidth

maybewasted

Evolution of IntraDomain Multicast

Since the MBone has grown tremendously It is no longer a simple virtual network sitting

on top of the Internet but is rapidly b ecoming integrated into the Internet itself In addition to

simple DVMRP tunnels b etween workstations the MBone nowhasnative multicast capability ie

routers are capable of handling multicast packets see Figure Furthermore ongoing research

has led to the development and deployment of two additional dense mo de proto cols These are

describ ed b elow

MOSPF Multicast Extensions to OSPF MOSPF uses the Op en Shortest Path First OSPF

proto col to provide multicast Basically MOSPF routers o o d an OSPF area with information

ab out group receivers This allows all MOSPF routers in an area to have the same view of group

memb ership In the same way that each OSPF router indep endently constructs the unicast rout

ing top ology each MOSPF router can construct the shortest path tree for each source and group

While group memb ership rep orts are o o ded throughout the OSPF area data is not MOSPF is

something of an o ddity in terms of classication It is considered a dense mo de proto col b ecause

memb ership information is broadcast to each MOSPF router but it is also considered an explicit

join proto col b ecause data is only sent to those receivers that sp ecically request it The key to

understanding MOSPF is to realize that it is heavily dep endent on OSPF and its link state routing

paradigm

PIMDM Proto col Indep endent Multicast PIM has b een split into two proto cols a dense

mo de version called PIMDM and a sparse mo de version called PIMSM PIMDM is very

similar to DVMRP there are only two ma jor dierences The rst is that PIM b oth dense

mo de and sparse mo de uses the unicast routing table to p erform RPF checks While DVMRP

maintains its own routing table PIM uses whatever unicast table is available The name PIM router-to-mrouted tunnel

native multicast links

Multicast Router Unicast Router Multicast Link router-to-router

Unicast Link Link w/ w/ Tunnel Tunnel tunnel

Figure Example multicast top ology with a combination of tunnels and nativemulticast links

is derived from the fact that the unicast table can be built using any unicast routing algorithm

PIM simply requires the unicast routing table to exist and so is independent of the algorithm

used to build it The second dierence between PIMDM and DVMRP is that DVMRP tries to

avoid sending unnecessary packets to neighb ors who will then generate prune messages based on a

failed RPF check The set of outgoing interfaces built by a given DVMRP router will include only

those downstream routers that use the given router to reach the source successful RPF check

PIMDM avoids this complexity but the tradeo is that packets are forwarded on all outgoing

interfaces Unnecessary packets are often forwarded to routers which must then generate prune

messages b ecause of the resulting RPF failure

The next evolutionary step in intradomain routing was to develop proto cols that addressed the

disadvantages of dense mo de proto cols A new class of proto cols called sparse mode proto cols was

created Instead of optimizing only for the case when a group has many memb ers sparse mo de

proto cols are designed to work more eciently when there are only a few widely distributed group

members Instead of broadcasting trac and triggering prune messages receivers are exp ected to

send explicit join messages These join messages are sent to a router acting as a core Sources

are exp ected to send their data trac to this same no de The use of a core as a meeting place

for sources and receivers facilitates creation of the multicast tree Two of the most p opular sparse

mo de proto cols are describ ed b elow

CBT The Core Based Trees CBT proto col was rst discussed in the research community

and is now b eing standardized by the IETF CBT uses the basic sparse mo de paradigm to create

asingleshared tree used by all sources The tree is ro oted at a core All sources send their data to

the core and all receivers send explicit join messages to the core There are two dierences b etween

CBT and PIMSM First CBT uses only a shared tree and is not designed to use shortest path

trees Second CBT uses bidirectional shared trees but PIMSM uses unidirectional shared trees

Bidirectional shared trees involve slightly more complexity but are more ecient when packets

traveling from a source to the core cross branches of the multicast tree In this case instead of

only sending trac up to the core packets can also be sent down the tree While CBT has

signicant technical merits and is on par technically with PIMSM few routing vendors provide

supp ort for CBT

PIMSM PIMSM is much more widely used than CBT It is similar to PIMDM in that

routing decisions are based on whatever underlying unicast routing table exists but the tree con

struction mechanism is quite dierent PIMSMs tree construction algorithm is actually more

similar to that used by CBT than to that used by PIMDM In the following description of sparse

mo de proto col op eration we use PIMSM as our example



A core called a rendezvous point RP in PIM terminology must be congured Dierent

groups may use dierent routers for RPs but a group can only have a single RP

 Information ab out which routers in the network are RPs and the mappings of multicast

groups to RPs must b e discovered by all routers

 RP discovery is done using a b o otstrap proto col However b ecause the RP discovery

mechanism is not included in the PIMSMv sp ecication eachvendor implementation

of PIMSMv has its own RP discovery mechanism For PIMSMv the b o otstrap

proto col is included in the proto col sp ecication

 The basic function of the b o otstrap proto col in addition to RP discovery is to provide

robustness in case of RP failure The b o otstrap proto col includes mechanisms to select

an alternate RP if the primary RP go es down

Receivers send explicit join messages to the RP Forwarding state is created in each router

along the path from the receiver to the RP A single shared tree ro oted at the RP is formed

for each group As with other multicast proto cols the tree is a reverse shortest path treejoin

messages followa reverse path from receiverstotheRP

Each source sends multicast data packets encapsulated in unicast packets to the RP When

an RP receives one of these register packets a number of actions are p ossible First if the

RP has forwarding state for the group ie there are receivers who have joined the group

the encapsulation is stripp ed o the packet and it is sent on the shared tree However if the

RP do es not have forwarding state for the group it sends a registerstop message to the RP

This avoids wasting bandwidth b etween the source and the RP Second the RP may wish to

send a join message toward the source By establishing multicast forwarding state between

the source and the RP the RP can receive the sources trac as multicast and avoid the

overhead of encapsulation

These steps describ e the basic mechanism used by sparse mo de proto cols in general and PIM

SM in particular In summary the basic goal is to use the RP as a meeting place for sources and

receivers Receivers explicitly join the shared tree and sources register with the RP



Deciding how manyRPsto have and where to place them in the network is anetwork planning issue and is

beyond the scop e of this pap er Arecent b o ok oers some discussion on this topic

Sparse mo de proto cols have a number of advantages over dense mo de proto cols First sparse

mo de proto cols typically oer b etter scalability in terms of routing state Only routers on the path

between a source and a group member must keep state Dense mo de proto cols require state in all

routers in the network Second sparse mo de proto cols are more ecient b ecause the use of explicit

join messages means multicast trac only ows across links that have b een explicitly added to the

tree

Sparse mo de proto cols do have a few disadvantages These are mostly related to the use of

RPs First the RP can be a single p oint of failure Second the RP can b ecome a hot sp ot for

multicast trac Third having trac forwarded from a source to the RP and then to receivers

means that nonoptimal paths may exist in the multicast tree The rst problem is mostly solved

with the b o otstrap router proto col The second and third problems are solved in CBT by using

bidirectional trees PIMSM solves these problems by providing a mechanism to switch from a

shared tree to a shortest path tree This change o ccurs when a leaf router sends a sp ecial message

towards the source Forwarding state is changed so trac ows directly to the receiver instead of

rst through the RP This action o ccurs when a trac rate threshold is violated

Finally not only has progress b een made in proto col development but MBone growth has led

to increased user awareness of multicast which in turn has led to demand for new applications and

b etter supp ort for realtime data Improvements have b een made in transp ort layer proto cols For

example the RealTime Proto col RTP assists loss and delaysensitive applications in adapting

to the Internets b esteort service mo del With resp ect to applications the MBone has seen an

increasingly diverse set of media typ es Originally the MBone was considered a research eort

and its evolution was overseen by memb ers of the MBone community Co ordination of events was

handled almost exclusively through the use of a global session directory to ol originally called sd

but now called sdr As multicast deployment has continued and as multicast has b een integrated

into the Internet as a native service the informal use agreements and guidelines have faded Even

though sdrbased sessions remain at the core of Internet multicast events their p ercentage of the

total is shrinking Other applications are b eing deployed that do not co ordinate sessions through

sdr or use RTP This p otp ourri of to ols has enriched the diversity of applications available but

has stressed the ability of the network to provide multicast according to the standard IP multicast

mo del

For clarity it is worth summarizing the key multicast terminology Multicast proto cols use

either a broadcastandprune or an explicit join mechanism Broadcastandprune proto cols are

commonly called dense mo de proto cols and always use a reverse shortest path tree ro oted

at a source Explicit join proto cols commonly called sparse mo de proto cols can use either a

reverse shortest path tree or a shared tree A shared tree uses a core or a rendezvous point to

bring sources and receivers together

Problems with Multicast

As the MBone has grown it has suered from an increasing numb er of problems and these problems

have b een o ccurring with increasing frequency The most imp ortant reason for this is the growing

diculty of managing a at virtual top ology The same problems exp erienced with classbased

unicast routing have manifested themselves in the MBone As the MBone has grown its size has

b ecome a problem in terms of both routing state and susceptibility to miscongurations As a

result the multicast community has realized the need to deploy hierarchical interdomain routing

In particular the MBone faces problems of scalability and manageability

Scalability Large at networks are inherently unstable Exacerbating this problem are organi

zational mechanisms which do not provide signicant route aggregation For these two reasons the

MBone has exp erienced substantial scalability problems At its p eak the MBone had almost

routes Unfortunately most of these routes had long prexes b etween and whichmeant

that very few hosts could b e represented in each routing table entry These scalability problems are

not new As the Internet has grown unicast routing had to be fundamentally changed to enable

continued growth and stability The solutionsroute aggregation and hierarchical routinghave

proven successful and the issue nowishow to apply them to multicast

Manageability As the MBone has grown it has b ecome harder to manage The MBone has

no central management and most tasks have b een handled on a p ersite basis Most co ordination

takes place via the MBone mailing list Because the MBone is a virtual top ology and new sites can

b e connected anywhere there should b e a formal pro cedure for adding new sites Because no such

mechanism exists the MBone has grown randomly and there are many ineciencies Two typ es

of ineciency commonly observed are

 Virtual Top ology Tunnel Management The MBone is characterized as a set of

multicastcapable islands connected by tunnels The goal has always b een to connect these

islands in the most ecient manner but over time suboptimal tunnels have been created

Tunnels are often set up in very inecient ways see Figure for several examples This

b ehavior was observed very early in the history of the MBone esp ecially with regard to the

MCI Backb one To avoid the growing tangle of tunnels engineers at MCI underto ok the

dicult task of enforcing a p olicy that tunnels through or into the MCI network would have

to b e terminated at designated b order p oints The goal was to resolve the observed problem

of single physical links b eing crossed by several up to tunnels The work of the MCI

engineers set an example that help ed keep the MBone reasonably ecient for a number of

years

 InterDomain Policy Management Domain b oundaries are another source of problems

when trying to manage a at top ology The mo del in to days Internet is to establish Au

tonomous System AS b oundaries b etween Internet domains ASes are commonly managed

or owned by dierent organizations Entities in one AS are typically not trusted byentities

in another AS As a result exchange of routing information across AS b oundaries is handled

very carefully Peering relationships among ASes are provisioned using the Border Gateway

Proto col BGP whichprovides routing abstraction and p olicy control As a result

of widescale use of BGP there is a commonly accepted pro cedure when two ASes wish to

communicate Because the MBone do es not provide suchaninterdomain proto col it oers

no protection across domain b oundaries When there is a single at top ology connected using

tunnels routing problems can easily spread throughout the top ology

To summarize the rst problem is the complexity and instability of a large at top ology The

second problem is that there are no proto col mechanisms to build a hierarchical multicast routing

top ology The need to solve these two problems created the rst attempts to deployinterdomain

multicast

Evolution of InterDomain Multicast

Interdomain multicast has evolved out of the need to provide scalable hierarchical Internetwide

multicast Proto cols that provide the necessary functionalityhave b een develop ed but the technol

ogy is relatively immature These proto cols are b eing considered by the IETF while simultaneously

b eing evaluated through extensive deployment The particular interdomain solution in use is con

sidered nearterm and is possibly only an interim solution While the solution is functional it

lacks elegance and longterm scalability As a result additional workisunderway to nd longterm

solutions Some of these prop osals are based on the standard IP multicast mo del Others attempt

to rene the service mo del in hop es of making the problem easier

NearTerm Solution

The nearterm solution for interdomain multicast routing has three parts The rst is a straight

forward extension of the interdomain unicast route exchange proto col BGP The second and third

are additional proto cols needed to build and interconnect trees across domain b oundaries

Carrying Multicast Routes in BGP

The rst requirement follows from the need to make multicast routing hierarchical in the same

manner as unicast routing Route aggregation and abstraction as well as hopbyhop p olicy routing

are provided in unicast using the Border Gateway Proto col BGP BGP oers substantial

abstraction and control among domains Within a domain a network administrator can run any

routing proto col desired Routing to hosts in an external domain is simply a matter of cho osing

the b est external link

BGP supp orts interdomain routing by reliably exchanging network reachability information

This information is used to compute an endtoend distancevectorstylepathofASnumb ers Each

AS advertises the set of routes it can reach and an asso ciated cost Each border router can then

compute the set of ASes that should b e traversed to reachany network The use of a distance vector

algorithm together with full path information allows BGP to overcome many of the limitations of

traditional distance vector algorithms Packets are still routed on a hopbyhop basis but less

information is needed and b etter routing decisions can b e made

The functionality provided by BGP and its wellundersto o d paradigm for connecting ASes are

imp ortant catalysts for supp orting interdomain multicast Aversion of BGP capable of carrying

multicast routes would not only provide hierarchical routing and p olicy decisions but would also

allow a service provider to use dierent top ologies for unicast and multicast trac

The mechanism by which BGP has b een extended to carry multicast routes is called Multipro



to col Extensions to BGP MBGP MBGP is able to carry multiproto col routes by adding

the Subsequent Address Family Identier SAFI to two BGP messages MP REACH NLRI and

MP UNREACH NLRI Sp ecically for multicast the SAFI eld can sp ecify unicast multicast or

unicastmulticast With MBGP instead of every router needing to know the entire at multicast

top ology each router only needs to know the top ology of its own domain and the paths to reach

each of the other domains Figure shows an example of several domains connected together by

MBGP sessions In one case two domains are connected together using dierent connections for

unicast and multicast

There is some confusion over exactly what functionality MBGP provides Tobeclearweoer

the following example If one domain advertises reachability for multicast the message will say

I have a path to sources on the networks listed in this message MBGP messages do not carry

information ab out multicast groups ie class D addresses are never carried in an MBGP message

Recall that multicast trees are constructed using a reverse path back to the source Therefore

MBGP information is used when a join message is sentfromanRPorreceiver toward the source

This join message needs to know the best reverse path toward the source MBGP provides this

nexthop information between domains If all unicast and multicast top ologies were assumed to

be the same the reverse path join could simply followthe same next hop that any unicast trac

would follow MBGP allows a network administrator to sp ecify a dierentreverse path for the join



There is some ambiguity over terminology here First multiproto col BGP is sometimes also referred to as

BGP Second some think that MBGP stands for Multicast BGP All three terms refer to the same proto col Intra-Domain Intra-Domain Cloud Cloud

Intra-Domain Multicast-Capable Cloud Border Router (running MBGP)

Intra-Domain Unicast-Only Cloud Border Router

(running BGP only)

Figure Example interdomain multicast top ology running BGP andor MBGP

to follow and subsequently a dierent forward path when data is sent

While MBGP is the rst step toward providing interdomain multicast it alone is not a com

plete solution MBGP is capable of determining the next hop to a host but it is not capable of

providing multicast tree construction functions More sp ecically what is the format of the join

message When should join messages be sent and how often Supp ort for this functionality is

not provided by MBGP a true interdomain multicast routing proto col is needed Furthermore

conventional wisdom suggests that this proto col should not use the broadcastandprune metho d

of tree construction The nearterm solution b eing advo cated is to use PIMSM to establish a

multicast tree b etween domains containing group members

The Multicast Source Discovery Proto col

To summarize various intradomain routing proto cols exist there is a route exchange proto col

to supp ort multicast and PIMSM is to be used to connect receivers and sources across domain

b oundaries But there is still one function missing from the nearterm solution This function

is needed when trying to connect sparse mo de domains together Given that PIMSM is the

only sparse mo de proto col that has seen signicant deployment this function tends to be heavily

inuenced by PIMSM The problem is basically how to inform an RP in one domain that there are

sources in other domains The underlying assumption here is that a group can nowhavemultiple

RPs However the reality is that there is still only one RP p er domain but nowmultiple domains

may be involved The approach adopted is largely motivated by the p erceived needs of the ISP

community In fact the decision to havemultiple RPs rather than a single ro ot is what dierentiates

the nearterm solution from other prop osed solutions

A problem arises when group memb ers are spread over multiple domains There is no mechanism

to connect the various intradomain multicast trees together While trac from all the sources for a

particular group within a particular domain will reach the groups receivers any sources outside the

domain will remain disjoint Why is this the case Within a domain receivers send join messages

toward one RP and sources send register messages to the same RP However there is no way for

an RP in one domain to nd out ab out sources in other domains using dierent RPs There is no

mechanism for RPs to communicate with each other when one receives a source register message This problem is summarized in Figure

Sources register w/ the RP in their domain Receivers send joins toward the RP in their domain

S R R S S R RP

R MBGP RP S connection

R S R

No way for receivers in Domain A to receive traffic from sources in Domain A Domain B and vice versa Domain B

Multicast Tree Link

Non-Tree Multicast Link

Figure The problem of connecting sources and receivers across two sparse mo de domains

The decision to maintain a separate multicast tree and RP for each domain is driven by the

need to reduce administrative dep endencies b etween domains Two p otential problems are avoided

in this way

It is not necessary for two domains to coadminister a single sparse mo de cloud Relevant

administrative functions include identifying candidate RPs and establishing the groupRP

mapping

It b ecomes p ossible to avoid second and thirdparty dep endencies in whichmulticast delivery

for sources and groups in one or more domains is dep endent on another domain whose only

function is to provide the RP Dep endencies can o ccur when all sources and receivers in the

RPs domain leave or b ecome inactive The domain with the RP has no group memb ers and

yet is still providing the RP service Dep ending on how multicast and interdomain trac

billing is handled this could b e particularly undesirable

The nearterm solution adopted for this problem is a new proto col appropriately named the

Multicast Source Discovery Proto col MSDP This proto col works byhaving representatives in

each domain announce to other domains the existence of active sources MSDP is run in the same

router as a domains RP or one of the RPs MSDPs op eration is similar to that of MBGP in

that MSDP sessions are congured b etween domains and TCP is used for reliable session message

exchange MSDP op eration is describ ed b elow with each step shown in Figure

When a new source for a group b ecomes active it will register with the domains RP

The MSDP p eer in the domain will detect the existence of the new source and send a Source

Active SA message to all directlyconnected MSDP p eers

MSDP message o o ding

 MSDP p eers that receive an SA message will p erform a peerRPF check The MSDP

peer that received the SA message will check to see if the MSDP peer that sent the

message is along the correct MSDPp eer path These p eerRPF checks are necessary

to prevent SA message lo oping

 If an MSDP p eer receives an SA message on the correct interface the message is for

warded to all MSDP p eers except the one from which the message was received This is

called peerRPF ooding

Within a domain an MSDP p eer also the RP will check to see if it has state for any group

memb ers in the domain If state do es exist the RP will send a PIM join message to the

source address advertised in the SA message

If data is contained in the message the RP then forwards it on the multicast tree Once

group members receive data they may cho ose to switch to a shortest path tree using PIM

SM conventions

Steps are rep eated until all MSDP peers have received the SA message and all group

memb ers are receiving data from the source

This ends the description of the shortterm interdomain multicast routing solution The solu

tion is referred to with the abbreviations for the three relevant proto cols MBGPPIMSMMSDP

However while the given description is relatively complete there are a number of details which

are not discussed And as with any system most of the complexity is in the details Furthermore

we have not yet discussed the limitations of the current solution in any detail In particular a

qualitative assessment of the scalability complexityandoverall quality of the proto cols would b e

valuable

The MBGPPIMSMMSDP solution is relatively straightforward once a p erson understands

all the abbreviations and understands the motivating factors that drove the design of the proto cols

While some argue that the current set of proto cols is not simple it really is no more complex

than many other Internet services such as unicast routing The key advantage of MBGPPIM

SMMSDP is that it is a functional solution largely built on existing proto cols Furthermore it is R R RP/ R MSDP MSDP RP/ 2 Between RPs MSDP 2 2 1 R 5 5 S 3 5 R R 3 R MBGP 3 Between Domains

3 5 4 4 3 RP/ 5 5 RP/ MSDP R

MSDP

Figure MSDP op eration including ow of Source Active SA messages

already b eing deployed with a fair amount of success The key disadvantage is that as a longterm

solution the MBGPPIMSMMSDP proto col suite may be susceptible to scalability problems

Further discussion of two particular problems follows

MSDP and Dynamic Groups When multicast sources begin to transmit the network is

required to create some typ e of routing state to control packet ow We have already discussed

how dierent typ es of multicast routing proto cols accomplish this function However in the case

of MSDP information ab out the existence of sources must rst b e transmitted b efore routing state

can be created This extra complexity increases the overhead of managing groups When groups

are dynamic due to either bursty sources or frequent group memb er joinleaveevents the overhead



of managing the group can b e signicant A formidable task would b e created for networks that

must establish and remove information for thousands of sources and receivers scattered around the

world Two sp ecic problems related to dynamic groupssources are

 Join Latency Because SA messages are only sent p erio dically there may be a signicant

delay between when new receivers join and when they hear the next SA message To solve

this problem MSDP peers may be congured to cache SA messages Anoncaching MSDP

p eer can send an SARequest message to an MSDP p eer that do es p erform caching This

gives MSDP p eers a mechanism to actively determine source thereby reducing join latency

The tradeo is the extra state and complexityofmaintaining the cache

 Bursty Sources This typ e of source can be characterized as sending short packet bursts

separated by silent p erio ds on the order of several minutes One example is when a to ol like

sdr to p erio dically advertise a session A problem o ccurs when trying to establish a multicast



Again it should b e noted that b ecause no formal study of MBGPPIMSMMSDP p erformance has b een con

ducted many of these statements are hyp othetical

tree for this kind of source The problem b egins when one or a few packets are sent to the RP

The RP will hear the packet and o o d an SA message and RPs in other domains will send

join messages back to the source However because no multicast forwarding state existed

when the packet was originally sent and b ecause it takes time to forward SA messages and

have other RPs establish forwarding state the original burst will not reach new receivers

Once state is established all subsequent packets should reach these receivers The problem

o ccurs when the p erio d of silence b etween packet bursts exceeds the forwarding state timeout

value typically minutes Because no packets are sent the forwarding state is discarded

When another session announcementissent the same pro cess of establishing state but losing

the initial burst is rep eated In this way no packets from bursty sources ever reach group

memb ers The solution sp ecied in the MSDP proto col is to have SA messages carry the

rst n data packets This is not a particularly elegant solution but it do es solve the problem

The lack of elegance is making the proto col harder to standardize Because data packets are

delivered via SA messages which are delivered over TCP connections some in the multicast

community wonder if this will have undesirable side eects or break assumptions of higher

layer proto cols As a result recent discussions in the MSDP working group have generated

prop osals which allow data to b e carried in either GRE or UDP packets The nal decision

on which data delivery options to supp ort has not b een made

MSDP Scalability The issue of scalability is an imp ortant one to consider for MSDP Because

of the way MSDP op erates if multicast b ecomes tremendously successful the overhead of MSDP

may b ecome to o large The limitation o ccurs if multicast use grows to the p oint where there are

thousands of multicast sources The number of SA messages plus data b eing o o ded around

the network could b ecome very large The generallyagreedup on conclusion is that MSDP is not

a particularly scalable solution and will likely be insucient for the long term But given that

longterm solutions are not ready to be deployed MSDP is seen as an immediate solution to an

immediate need

LongTerm Prop osals

While MBGPPIMSMMSDP is a recognized nearterm solution there is still a need to develop

longterm solutions Numerous eorts are b eing undertaken in this direction These eorts can b e

broken down into two groups eorts based on the standard IP multicast philosophy and eorts

whichlooktochange this mo del in hop es of simplifying the problem Eorts in each of these areas

are describ ed next

Border Gateway Multicast Proto col

The Border Gateway Multicast Proto col BGMP was rst prop osed as a longterm solution for



Internetwide interdomain multicast The key idea of BGMP is to construct bidirectional



BGMP should not b e confused with MBGP After reading this pap er the dierences should b e obvious but the

similarity of the names and abbreviations has led to constant confusion Furthermore BGMP has recently b een

renamed It was previously known as Grand Unied Multicast GUM

shared trees b etween domains using a single ro ot One of the functions of BGMP is then to decide

in which particular domain to ro ot the shared tree BGMP relies on the b elief that interdomain

dep endencies can b e avoided by using a strict address allo cation scheme Such an address allo cation

scheme allows domains to own sp ecic addresses or sp ecic ranges of addresses The b elief is that

if a particular domain owns the address for a particular group the domain will be signicantly

involved in the multicast service Finally this means dep endency problems even though there is

a single ro ot should b e highly unlikely For example a videoondemand application will likely b e

ro oted at the server a video conference group will b e ro oted at the primary source or at a session

co ordinator The b elief is that no matter the typ e of session one domain will always b e the logical

choice for the ro ot domain

As a result of a proto col like BGMP there is a need for a strict address allo cation scheme

Strict means that ownership must b e clearly dened and that there cannot b e collisions There

fore the sdr mechanism of randomly cho osing an address is not sucient Because of BGMP as

well as demands from ISPs and application writers work is b eing conducted to develop the neces

sary address allo cation schemes Before discussing two of the prop osals for address allo cation it is

worthwhile to maketwopoints First BGMP is relatively exible and can use anyschemeaslong

as it provides strict address allo cation Second indep endent of BGMP there is a need for b etter

address allo cation The sdr mechanism is not particularly scalable and is no longer sucienteven

for the current MBGPPIMSMMSDP solution Prop osals usable in b oth the current mo del and

with BGMP are b eing considered by the IETF They are describ ed b elow

MASC The Multicast AddressSet Claim MASC proto col supp orts address allo cation b etween

domains MASC includes mechanisms to guarantee that address collisions are immediately re

solved From a more abstract p ersp ective MASC provides the functionality required at the highest

layer of a more general addressing scheme called the Multicast Address Allo cation Architecture

MAAA MASC and its supp orting proto cols are sp ecic instances of proto cols that meet the

requirements of the MAAA sp ecication In MAAA there are three levels of address allo cation at

the domain level within a domain and b etween hosts and the network Work to develop proto cols

at eachlevel is underway in the IETF MASC would act as a toplevel address allo cation proto col

and op erate between domains the multicast Address Allo cation Proto col AAP would allo

cate addresses within a domain and the Multicast Address Dynamic Client Allo cation Proto col

MADCAP would b e used by hosts to request addresses from a Multicast Address Allo cation

Server MAAS

GLOP Another much simpler prop osal is to statically allo cate multicast addresses to each AS

A glop of addresses is assigned to eachASTheASnumb er is enco ded as part of the address

The rst version of GLOP is b eing evaluated with only part of the address range Only the

address range is b eing used As a result the rst o ctet is static the next two o ctets enco de

the AS numb er and the nal o ctet provides a range of addresses to b e allo cated This prop osal is

gaining in p opularitybutit has two limitations First because only bits or addresses are

available to each AS there is likely to b e an insucientnumb er of addresses p er AS This problem

could be solved by using more of the Class D address space or by switching to IPv addressing

The second problem is that GLOP do es not sp ecify a mechanism bywhich addresses are allo cated

within the domain This problem could b e solved by using a simple administrative pro cedure by

using a dynamic proto col like AAPMADCAP or by using an mo died intradomain version of

sdr

Ro ot Addressed Multicast Architecture

In resp onse to the p erceived complexity of MBGPPIMSMMSDP and BGMP and to the need to

address additional multicastrelated issues like security billing and management some memb ers

of the multicast community are lo oking to make fundamental changes in the multicast mo del One

class of prop osals b eing oered is called the Ro ot Addressed Multicast Architecture RAMA

The premise for RAMAstyle proto cols is that most multicast applications are singlesource or have

an easily identiable primary source By making this source the ro ot of the tree the complexity

of core placement in other multicast routing proto cols can be eliminated This tradeo raises a

numb er of imp ortant issues which are describ ed at the end of this section There are two primary

RAMAstyle proto cols b eing discussed Express Multicast and Simple Multicast The key

asp ects of these two proto cols are

Express Multicast Express is designed sp ecically as a singlesource proto col The ro ot of

the tree is placed at the source and group memb ers send join messages along the reverse path to

the source Express also provides mechanisms to eciently collect information ab out subscrib ers

The proto col is sp ecically designed for subscrib erbased systems that use logical channels Rep

resentative applications include TV broadcasts le distribution and any singlesource multimedia

application The key advantages of Express are that routing complexity can b e reduced and that

closed groups can b e oered

Simple Multicast Simple Multicast and Express Multicast are similar but Simple Multicast

has the added exibility of allowing multiple sources per group A particular source must be

chosen as the primary and the tree is ro oted at this no des rsthop router Receivers send join

messages to the source and a bidirectional tree is constructed Additional sources send packets

to the primary source Because the tree is bidirectional as so on as packets reach a router in the

tree they are forwarded b oth downstream to receivers and upstream to the core The advantages

and disadvantages of this prop osal are b eing heavily debated but the prop osals authors b elieve

that it eliminates the address allo cation problem and the need to place and lo cate RPs Address

allo cation is done by using the core address and the multicast group address to uniquely identify a

group By routing on this pair of addresses each ro otcoresource can allo cate without collision



up to addresses

The Express and Simple multicast prop osals have received signicant attention in b oth the

research community and the IETF There is another question in addition to that of the merits

of these new proto cols If these proto cols are standardized will they be exp ected to replace all

existing proto cols or will they work in parallel with the existing multicast infrastructure If the

RAMAstyle proto cols are exp ected to work in co op eration with existing proto cols there will b e

yet another set of proto cols to deploy evaluate and interop erate with This do es not make the

provision of Internetwide multicast easier If RAMAstyle proto cols are exp ected to replace the

current set of proto cols the question b ecomes whether they have enough exibility to supp ort all

typ es of multicast applications The b ottom line is that these new proto cols are still prop osals and

it is uncertain what their future will b e

InterDomain Multicast Deployment

The successful deployment of multicast or lack thereof was one of the original motivations for

developing interdomain routing proto cols In this section we describ e eorts to deploy these

proto cols Our description is divided into two parts a discussion of the commo dityInternet and

a discussion of the Internet architecture

Deployment in the Commo dity Internet

Measuring the success of interdomain deployment either from a qualitative point of view or by

taking a count of connected hosts is a dicult problem Published studies have so far only dealt

with the MBone although several studies that distinguish b etween the MBone and interdomain

multicast are currently underway It is beyond the scop e of this pap er to oer any quantitative

results However it is p ossible to describ e the plan now b eing implemented to transition from the

MBones at virtual top ology to a true interdomain multicast infrastructure

Nowthatinterdomain multicast routing is p ossible the issue is how to deal with the MBone

While the rest of the Internet is working to deploy interdomain multicast the challenge is how

to bring MBone users into the new infrastructure The solution has been to make the MBone

its own AS called AS All MBone tunnels and sites connected by tunnels are relegated to

AS Connectivity between AS and other multicastcapable ASes is provided at the

NASA Ames Multicastfriendly Internet eXchange MIX The NASA Ames MIX provides

connectivity between the MBone AS and all other ASes that have deployed MBGPPIM

SMMSDP The deploymentofinterdomain multicast can continue to grow while the at routing

top ology that is the MBone is eliminated Sites on the MBone will hop efully transition to native

multicast by deploying whatever interdomain solution is appropriate When this o ccurs these sites

will no longer need their old MBone tunnels Observational analysis suggests that this transition

pro cess is indeed o ccurring Because of the dierences in route aggregation b etween MBGP routes

and MBone routes it is dicult to quantify this assertion However the numb er of routes in the

MBone has decreased dramatically and the numb er of MBGP routes has increased dramatically

Deployment in Internet

For Internet the plan has always b een to try and do multicast the right way to the extent

p ossible given the currently available set of proto cols As a result Internet multicast deployment

is following guidelines set forth bytheInternet Multicast Working Group Briey these guidelines

require all multicast deployed in Internet to b e native and sparse mo de No tunnels are allowed

and all routers must supp ort interdomain multicast routing using MBGPMSDP To date Inter

net has exp erienced a reasonable amount of success in deploying multicast This success includes

backb one deployment connecting other highsp eed networks connecting memb er institutions and

running several highbandwidth on the order of Mbps multicast applications

There are twoInternet backb ones in the United States One is the vBNS and the other

is Abilene The vBNS has b een in existence since and from a very early stage has had basic

dense mo de capability During the Internet Memb er Meeting in San Francisco the inherent

problems of dense mo de proto cols were painfully realized when tens of megabits of trac were

o o ded across the network As a result vBNS engineers worked hard to transition the network to

PIMSM and MBGPMSDP As of mid the network had successfully deployed interdomain

multicast and was in the pro cess of establishing MBGP and MSDP p eering relationships with

other networks Figure shows the top ology of the vBNS including the existing MBGP and

MSDP p eering relationships As vBNS engineers gain exp erience in using MBGP and MSDP and

as other network op erators also gain exp erience the rate at which new MBGPMSDP p eerings are

added will increase A number of additional networks including several international highsp eed

networks are planning to connect to the vBNS in the very near future

The other Internet backb one is the Abilene network Because Abilene is a newer network

and has only recently February b ecome op erational the state of interdomain multicast in

Abilene is not nearly as advanced as it is in the vBNS However Abilene has b een running PIMSM

since mid and has b egun to establish its rst set of interdomain p eering relationships The

challenge has b een to climb the learning curve and establish multicast capability in the backb one

Now that the rst MBGPPIMSMMSDP p eering relationships have b een established additional

p eerings are b eing added rapidly The current top ology is shown in Figure

Figure vBNS MBGP and MSDP p eering top ology

Figure Abilene multicast map

Conclusions

In this pap er we have presented a tutorialstyle overview of multicast We have covered the

early development of intradomain routing proto cols the evolution of the MBone the needs and

current solutions for interdomain multicast the set of nextgeneration proto cols currently under

investigation and the current state of deployment in the Internet and Internet Whatever the

future holds for multicast it is likely to present ma jor challenges for b oth research and deployment

Acknowledgements

This pap er would not have b een p ossible without the exp ertise of manymany p eople It would b e

quite imp ossible for any list to do justice to them all However sp ecic technical and qualitative

suggestions were oered by Dave Meyer Dave Thaler Mostafa Ammar Christophe Diot Brian

Levine and Ben Chinowsky Also engineers from b oth the vBNS and Abilene contributed top olo

gies for their resp ective networks These individuals included Kevin Thompson Steven Wallace

and BrentSweeny

References

S Casner and S Deering First IETF Internet audio cast ACM Computer Communication Review pp

July

K Almeroth A longterm analysis of growth and usage patterns in the Multicast Backb one MBone in IEEE

InfocomTel Aviv ISRAEL March to app ear

C Diot B Levine J Lyles H Kassem and D Balensiefen Deployment issues for the IP multicast service

and architecture IEEE Network

S Deering Multicast routing in a internetwork PhD Dissertation

S Deering and D Cheriton Multicast routing in datagram internetworks and extended LANs ACM Trans

actions on Computer Systems pp May

S Deering Host extensions for IP multicasting Internet Engineering Task Force IETF RFC August

S Casner Frequently Asked QuestionsFAQ on the Multicast BackboneMBone USCISI Decemb er

Available from ftpftpisiedumb onefaqtxt

H Eriksson The multicast backb one Communications of the ACMvol pp

D Waitzman C Partridge and S Deering Distance vector multicast routing proto col DVMRP Internet

Engineering Task Force IETF RFC Novemb er

W Fenner Internet group management proto col version Internet Engineering Task Force IETF RFC

Novemb er

B Cain S Deering and A Thyagara jan Internet group management proto col version Internet Engineering

Task Force IETF draftietfidmrigmpvtxt February

J Moy Multicast extensions to OSPF Internet Engineering Task Force IETF RFC March

J Moy OSPF version Internet Engineering Task Force IETF RFC April

S Deering D Estrin D Farinacci V Jacobson G Liu and L Wei PIM architecture for widearea multicast

routing IEEEACM Transactions on Networking pp Apr

S Deering D Estrin D Farinacci V Jacobson A Helmy D Meyer and L Wei Proto col indep endent

multicast version dense mo de sp ecication Internet Engineering Task Force IETF draftietfpimvdm

txt Novemb er

D Estrin D Farinacci A Helmy D Thaler S Deering M Handley V Jacobson C Liu P Sharma and

L Wei Proto col indep endentmulticast sparsemo de PIMSM Proto col sp ecication Internet Engineering

Task Force IETF RFC June

T Ballardie P Francis and J Crowcroft Core based trees CBT An architecture for scalable multicast

routing in ACM Sigcomm San Francisco California USA pp Septemb er

A Ballardie Core based trees CBT version multicast routing Internet Engineering Task Force IETF

RFC Septemb er

B Williamson Developing IP Multicast Networks VolumeIFundamentals Indianap olis Indiana USA Cisco

Press

H Schulzrinne S Casner R Frederick and J V RTP A transp ort proto col for realtime applications

Internet Engineering Task Force IETF RFC January

C Huitema Routing in the Internet Englewo o d Clis New Jersey Prentice Hall Inc

Y Rekhter and T Li A b order gateway proto col BGP Internet Engineering Task Force IETF RFC

March

PTraina Exp erience with the BGP proto col Internet Engineering Task Force IETF RFC March

T Bates R Chandra D Katz and Y Rekhter Multiproto col extensions for BGP Internet Engineering

Task Force IETF RFC February

D Farinacci Y Rekhter P Lothb erg H Kilmer and J Hall Multicast source discovery proto col MSDP

Internet Engineering Task Force IETF draftfarinaccimsdptxt June

S Kumar P Radoslavov D Thaler C Alaettinoglu D Estrin and M Handley The MASCBGMP archi

tecture for interdomain multicast routing in ACM SigcommVancouver CANADA August

M Handley D Thaler and D Estrin The internet multicast address allo cation architecture Internet Engi

neering Task Force IETF draftietfmallo carchtxt Decemb er

M Handley Multicast address allo cation proto col AAP Internet Engineering Task Force IETF draftietf

mallo caaptxt August

B Patel M Shah and S Hanna Multicast address dynamic client allo cation proto col MADCAP Internet

Engineering Task Force IETF draftietfmallo cmadcaptxt February

D Meyer and P Lothb erg Static allo cations in Internet Engineering Task Force IETF draftietf

mb onedstaticallo cationtxt May

T Ballardie J Crowcroft C Diot C Lee R Perlman and Z Wang On extending the standard IP multicast

architecture tech rep University College London Research Rep ort RN Octob er

H Holbro ok and D Cheriton IP multicast channels EXPRESS supp ort for largescale singlesource applica

tions in ACM Sigcomm Cambridge Massachusetts USA August

R Perlman CY Lee J Crowcroft Z Wang C Diot J Tho o and M Green Simple multicast A design for

simple lowoverhead multicast Internet Engineering Task Force IETF draftp erlmansimplemulticasttxt

February

H LaMaster S Shultz J Meylor and D Meyer Multicastfriendly internet exchange MIX Internet Engi

neering Task Force IETF draftietfmb onedmixtxt June

J Jamison and R Wilder vBNS The internet fast lane for research and education IEEE Communications

January

J Jamison R Nicklas G Miller K Thompson R Wilder L Cunningham and C Song vBNS Not your

fathers internet IEEE Spectrum July