<<

THE BEST OF 2008 BBC iPlayer EBU Audio over IP EBU HD Codec Tests EBU P2P Media Portal SVC Compression DAB Open Source Handheld DAB Open Source Technical Review Technical

EBU TECHNICAL REVIEW – 2008

Welcome 3

Catch-up Radio & TV

Evolution of the BBC iPlayer 4 Anthony Rose (Controller, Vision and Online Media Group, BBC)

Open Source Handhelds

Open source Handhelds – a broadcaster-led innovation for 16 BTH services François Lefebvre (Project Leader), Jean-Michel Bouffard and Pascal Charest (Communications Research Centre, Canada)

Video compression

SVC – a highly-scalable version of H.264/AVC 27 Adi Kouadio (EBU Technical); Maryline Clare and Ludovic Noble (Orange Labs, France Telecom R&D); Vincent Bottreau (Thomson Corporate Research)

HD Production Codecs

HDTV production codec test 44 Massimo Visca (RAI) and Hans Hoffmann (EBU Technical), EBU Project Group P/HDTP

Webcasting

EBU P2P media portal 54 Franc Kozamernik (EBU Technical)

Audio over IP

Streaming audio contributions over IP – a new EBU standard 62 Lars Jonsson (Swedish Radio) and Mathias Coinchon (EBU Technical)

2008 – EBU TECHNICAL REVIEW

The best 2008of

Welcome to the “Best of 2008” selection of articles which have in terms of portability, weight, readability, quality, price and previously been published electronically in EBU Technical (last, but not least) battery life! Review at http://tech.ebu.ch As there is still considerable value in paper publication, the Since 1998, the Technical Review has been published on-line, EBU decided to re-publish some of the articles in an annual four times per year. This venture has been very successful printed edition of EBU Technical Review. Whereas the on- because it has expanded the readership. It has been very line version is available only in English, the annual edition gratifying to meet people who have “discovered” the on-line is also available in French. version. Some of these new readers are not directly connected with the EBU but they work in related areas, such as suppliers If you enjoy reading the articles in this publication, remember of hardware or to the broadcasting industry. to consult the on-line version of EBU Technical Review at http://tech.ebu.ch/ Our electronic publishing service includes an archive of the Technical Review, dating back to 1992. You can access Finally, do not forget to give this URL to your friends and these archive articles by clicking on the “Archive / Thematic colleagues! Index” link, in the navigator frame to the left of the screen. This archive contains a wealth of excellent articles on many different topics. Recently we added a “Hot Topics” section to the on-line version which brings together relevant articles from the archive on topics such as “HDTV in Europe” and “Broadcasting to Handhelds”.

The on-line version also includes a list of abbreviations used in the Technical Review over the past 16 years or so. New Lieven Vermaele Director terms are continually being added to the list which you can EBU Technical Department download as a PDF file by clicking on the “Abbreviations” link in the navigator frame.

At the beginning of the year 2000, the EBU abandoned the printed version of EBU Technical Review. Nevertheless, it was recognised that electronic publishing could not entirely replace “hard copies”. This state of “Nirvana” will not arrive until we have computers that can match this paper publication

2008 – EBU TECHNICAL REVIEW 3 CATCH-UP RADIO & TV

Evolution of the BBC iPlayer

Anthony Rose Controller, Vision and Online Media Group, BBC

For more than ten years, EBU Members have been developing and refining their websites in order to enhance and augment their core radio and television broadcasting activities. The web is no longer merely an information medium (providing textual and pictorial information) but has become an audiovisual content-distribution medium for the internet- connected PC user – for both linear (scheduled) programmes (“channels”) as well as for non-linear (“on-demand”) programmes.

The BBC’s development of the iPlayer is undoubtedly one of the best examples of how broadcasters can exploit the internet as a new media delivery mechanism. It can thus serve as a blueprint for other broadcasters to develop their broadcast services on the internet.

This article is based on a series of phone-calls in August 2008 between Franc Kozamernik (EBU Technical) and Anthony Rose, BBC Controller Vision & Online Media Group, which includes the iPlayer.

For the uninitiated, some background channels as well as VoD and catch-up FK: Our provisional idea is that our information on the iPlayer is provided in TV. What advice could you give to the system will be fully automated. The the box on Page 5. group? system will allow users to find content via a variety of categories and other Anthony Rose (AR): The biggest problem criteria. The metadata used will be Franc Kozamernik (FK): There is a lot in developing services such as the iPlayer broadcast via DVB-SI and TV Anytime, of interest among EBU Members in is typically not so much the website and as appropriate. the BBC’s iPlayer developments. The media playout, or even the transcoding EBU Delivery Management Committee system, but rather the metadata and the AR: There are a number of important (DMC) set up a Project Group D/WMT content ingestion. questions which think need to be (Web Media Technologies) chaired by addressed before one starts a project Paola Sunna (RAI), in order to develop In the case of the proposed EBU project, such as this. For example, is it the and evaluate a similar development a key design question is whether it will EBU’s intention that each broadcaster termed the “EBU Media Player”, which be an automated system that will capture creates their own website to where users will be capable of delivering all kinds of content from satellite or another source, can access this captured content, or content including the streams received or whether there will be a team who will the EBU provide a so-called white from satellite, terrestrial, cable and IPTV manually and ingest content. label solution, which means that the

4 EBU TECHNICAL REVIEW – 2008 CATCH-UP RADIO & TV

EBU develops a fully working website, It is more likely that Redux can be made that interests them. We did not want the which each broadcaster can then “skin” available to the EBU for testing rather iPlayer to become a regular video-sharing or brand to make it look like their own than the iPlayer, given the sheer amount of site like YouTube or a music store like site? Will broadcasters need to arrange resources that have been spent on making iTunes, where people would need to sort rights clearances for each territory, or the iPlayer a viable commercial product. through thousands of programmes to can the EBU arrange this on behalf of all? Many European broadcasters approach us find one of interest. This is a different Where will you source detailed metadata with an interest in licensing the iPlayer. use case. The reason why people like to from (e.g. actor names, full programme The question is whether they want a come to the iPlayer site is because it allows descriptions, etc.)? Would you pull it complete end-to-end system or whether them to find a particular programme that from the DVB feed or will editors log in they want individual pieces of the iPlayer they missed on TV or radio. They want separately to apply enhanced metadata? production system, playout system or to catch up with what they know exists website. We spent several million pounds but were unable to enjoy at the time of Perhaps the EBU project is more similar of taxpayers’ money and could not give broadcast. It is possible that over the to the Redux project developed by BBC away that technology readily. However, coming months and years, the iPlayer will Research rather than the iPlayer? Redux in the case of Redux, the investment is become a general browsing proposition, is a technical trial using a fully-automated substantially lower and the technology with demand driven by you or your media ingest and capture system, is largely could perhaps be more readily available friends rather than by the linear broadcast built on open-source technologies, and to 3rd-party broadcasters. schedule. However, today it is focused on does not use DRM. Redux is being used catching up with regularly-scheduled BBC within the BBC as a means for transcoding FK: We would like you to focus on radio and TV programmes. and providing content to BBC platforms. the iPlayer . What was the BBC’s It is a very convenient and flexible input motivation to develop the iPlayer? When we launched iPlayer streaming at system. Christmas 2007, the home page had only AR: Our motivation for designing the six featured programmes and that was all. In contrast, the BBC iPlayer is a well-staffed iPlayer has been to develop a consumer The BBC marketing team chose these six 24/7 operation with significant viewer proposition to satisfy the end user, i.e. the featured programmes. If you liked one of traffic. We make sure that a comprehensive BBC listener and viewer, in an age where these programmes, you were in luck as this metadata scheme is exactly right. An people are acquiring their entertainment was exactly what you could easily find, essential asset of iPlayer is the right level of from the internet, not just from their TV right on the home page. The problem content protection for files and streams, as set. What does the user really want? They was if you did not want one of those well as geo-protection, to address licence do not care about codecs and metadata programmes, you had to do a bit of work fee and content owner issues. taxonomy, they want to find content and browse by category, by day or search

The BBC iPlayer in a nutshell

The iPlayer is a web application – available at www..co.uk/iplayer – that allows internet users in the UK to download and stream BBC television and radio programmes for up to 7 days after the broadcast.

Users are able to download and stream programmes as soon as they have been broadcast on BBC TV and Radio. Users can keep downloads and watch them as many times as they like during the following 30 days.

For selected series, all episodes of the series are available for up to 13 weeks, known as Series Stacking. The iPlayer will in due course allow users to subscribe to a programme series and automatically download each programme after it is broadcast.

Recently, simulcast streaming was added, allowing users to watch TV live in addition to the on-demand catchup services.

The iPlayer services can be accessed on broadband internet-connected devices such as PCs, Apple Macs and computers as well as Apple iPhone, Nintendo Wii and Sony PS3 gaming consoles, Nokia N96 mobile phones, Windows Media compatible portable media players, and set-top boxes.

To follow new developments of the iPlayer, go to www.bbc.co.uk/blogs/bbcinternet/iplayer/.

The iPlayer now has over 1 million users per day, and up to 1.7 million stream and download requests each day. The iPlayer should reach the 300 million play-request milestone early in 2009.

2008 – EBU TECHNICAL REVIEW 5 CATCH-UP RADIO & TV

by name and so on. That might have BBC programmes would not add much It is interesting that the average viewing been a complex (or even unsuccessful) value for the iPlayer user. In one sense, time per programme did not change. We operation, so we tried to make it easier the programmes are all pretty good and found that people watch a programme to find a programme. marketed for different demographics. they chose for an average of 22 minutes. We also found that, on average, people The first home page design was essentially However, we need to develop more watched two programmes per day, giving “the BBC chooses what you watch”. personal recommendations – which an average viewing time of about 40 Then we added a “most popular” zone programmes are good for “you”. When minutes per person per day. About 35% on the home page – this was about what we changed the site by adding the above of programmes are viewed all the way to other viewers (rather than the BBC) selection mechanisms such as most the end. This is an excellent outcome, recommended that you should watch. popular, we made it much easier to find because our programmes are usually 30 And then we also added a “just in” programmes. Before launching these or 60 minutes long. feature for those items that have just features, we asked ourselves whether: arrived and “the last chance” feature for FK: What is the editorial relationship items that would disappear soon. Finally, l people would watch more pro- between the BBC website and iPlayer? we also added a “more like this” option grammes because they can find more How are they differentiated? as a sort of recommendation system programmes; (similar to that used by Amazon). These l they would make fewer page views AR: The iPlayer is a destination within content-selection mechanisms proved to (because navigation is better); the BBC website. In many cases a given be extremely useful and popular among l they would make more page views programme is available both within iPlayer users. (because they may browse more, as iPlayer and elsewhere on the BBC site, browsing is easier); allowing users to discover and view FK: The iPlayer does not use any ratings, l people would watch more programmes the programme in the context in which as opposed to ZDF’s Mediathek in but would watch for less time (they they were browsing the BBC site. For . Why? may see recommendations for other example, most people used the BBC sports programmes and would just click on site rather than iPlayer for the Beijing AR: Indeed, we have considered adding something else before finishing the Olympics. We’re promoting the iPlayer a rating mechanism, but we feel it’s only current programme). as the home for long-format content. useful where applying a rating is a means The sports site, the news site and other of recommending that programme to your Before we introduced these recom- BBC sites are typically focused on shorter friends, rather than rating the programme mendation changes, there were about formats, like news clips and programme in the way it’s done on YouTube. If you ten web-page views for every programme trailers. They also cover live events such have a video website with a million , played. After these changes were as the Opening Ceremony at Beijing: live possibly uploaded by the users themselves introduced, the number of pages viewed streaming was watched by over 100,000 and often of mediocre quality, then you dropped by 30% while the number of simultaneous users on the www.bbc. need a rating system so that users can say programmes played went up by 30%. co.uk website. A total stream capacity which are worth watching and which are These numbers showed that our changes of 45 Gbit/s was provided by the Akamai not. In contrast, when you only have actually helped people to find their content distribution network (CDN). For 600 programmes of professional quality, programmes more easily. Finally, the video coding, the On2 VP6 Flash format it adds little value to invite viewers to number of page views per programme was used. rate them. For example, how do you watched settled to about five and stayed rate a Parliamentary channel? Rating there. The consumption of Olympic programmes on the iPlayer was also very good. Many people who could not watch the Olympic Abbreviations events while broadcast on terrestrial, cable or satellite networks were able CDN RTMP (Adobe) Real-Time to use the iPlayer and watch those CPU Messaging Procol programmes delayed. For example, the DRM Digital Rights Management RTMPE RTMP – Encrypted Opening Ceremony was the most-viewed DVB-SI DVB - Service Information RTSP Real-Time Streaming programme on iPlayer. It added more HTTP HyperText Transfer Protocol Protocol than 20 percent to the iPlayer traffic after IP Internet Protocol VoD Video-on-Demand the event 1. ISP Internet Service Provider WMV (Microsoft) Windows LLU Local Loop Unbundling Media Video FK: How would you describe the structure PoP Point of Presence of the iPlayer system? Which are the principal layers?

6 EBU TECHNICAL REVIEW – 2008 CATCH-UP RADIO & TV

1) This interview was held during the Olympic Games. During the second week, as the Games moved into the final stage, the iPlayer consumption even increased by about 40%.

2008 – EBU TECHNICAL REVIEW 7 CATCH-UP RADIO & TV

AR: The iPlayer basically contains four layers, as follows: l iPlayer destination portal site – this is what everybody sees; l embedded media player – a Flash player which is used for media playout both in iPlayer and across the BBC site; l media production – to create the content that can be used by the Flash player and is invisible to most people; l a media distribution system.

FK: Could we start perhaps with the latter one first, please?

AR: For On2 VP6 streaming, we currently use the Akamai CDN, whereas for H.264 streaming we currently use the Level 3 CDN, which is one of the biggest CDNs in the USA (in August 2008, Akamai did not provide for H.264 streaming).

FK: Why does the BBC iPlayer not use a Peer-to-Peer solution? or two, P2P may again be the preferred in some cases, particularly if you have a choice. few programmes or a few files which are AR: The BBC has explored a range of downloaded by many people, because then distribution solutions, but P2P does not Of course, we know about Octoshape, there is a good peering efficiency. It does currently provide the optimal proposition Rawflow and a few others, and we not work well if you have an enormous for streaming. First, viewers do not want have investigated using them for iPlayer catalogue, because the downloaded file to install any specific plug-ins. Currently distribution. But we are very happy with only resides with a few peers. to use P2P you need to install extra our current CDN-based streaming sys- software. Second, P2P uses a computer’s tem; you click on play and the stream For the Kangaroo project 2, P2P might CPU and bandwidth, and most users starts to render in about 300 ms. The only work well for the 50 most popular generally do not like it. reason not to use a direct streaming facility programmes, but it will not be optimal could be cost and potential savings. to use P2P for a catalogue with lots of If you are going to download some items. content via BitTorrent, you may agree to For downloading, we currently use the use P2P, and many people are happy to Kontiki P2P system which currently We believe that the right approach is trade their bandwidth for free content. gives us a bandwidth saving of about 60 not P2P but caching at the edge of the But in the case of the BBC, where people percent, so it halves our bandwidth bill network. We only have 500 hours a week have to pay a licence fee of £130 a year, for downloads. But we have to run a very of video content, which means that one some are less than happy if we require complex server farm to make up the cost TB of storage is enough to store our entire that they use their bandwidth and install associated with it. catalogue. This can be more efficiently special software. This is especially true done by simply putting a caching service for people with low bandwidth and Actually the BBC is running a massive in our network. those who pay additional charges if they server farm itself, with over 200 computers, exceed a certain download limit. There and we have 92 percent free peering. In It may be a solution that, for the primary were definite and substantial benefits fact, our bandwidth really does not cost proposition, the user need not install any from using P2P two years ago, but in that us very much, at least not for downloads. plug-ins. But you can have a secondary time the price of bandwidth has declined If you look at all these various pieces, proposition which could offer better dramatically, such that today the use you wonder about the benefits of P2P. quality (say, high-definition TV). In this of P2P no longer provides substantial We believe that P2P works really well case, the use of a P2P plug-in may be benefits. Of course nothing stands still in the technology world and, in a year 2) Wikipedia article: http://en.wikipedia.org/wiki/Kangaroo_(video_on_demand)

8 EBU TECHNICAL REVIEW – 2008 CATCH-UP RADIO & TV

as the UK statistics indicate, that only 5 percent of users are multicast enabled. It is probably not worthwhile to put much effort into making a multicast system for such a small number of IP multicasting- enabled users. It is really a chicken and egg situation.

Nevertheless, we are considering in the forthcoming months to use JavaScript or other means to detect if users are multicast-enabled, and if so, we may be able to give these users a higher quality stream. If they are not multicast-enabled, they would only get a lower quality stream. In this way, both ISPs and the users would have an incentive to introduce multicasting. The users are likely to choose those ISPs that have been able to upgrade their routers and can offer higher quality streams.

FK: There has been recently a lot of noise in the UK about the increase in network load caused by the iPlayer justified, because distribution costs for but we need to build an agile architecture traffic. It seems that some ISPs have filed HD streaming are very high and could be that allows different transport layers to complaints with the telecom regulator? significantly lower by using P2P. be plugged in. We should separate the delivery layer from the content delivery AR: The press largely misrepresented the FK: How about a combination of P2P and formats, the DRM and the download situation by saying that due to the iPlayer, CDN, which is now increasingly used by manager, so that we can flexibly glue in the internet will collapse and everything both CDN and P2P providers? different propositions at short notice as will come to an end. Of course, this is needed. not true. We spent a lot of time talking to AR: With the iPlayer we have a bandwidth ISPs and we continue to meet with them bill which is not insignificant, but it We are of course monitoring the regularly. The reality is that about 7% is something we can afford. We do developments of Tribler and other future- of peak UK internet usage is due to the something like 100 TB per day of streaming generation P2P approaches, as well as iPlayer. So, the iPlayer service is only a traffic. This is a fairly significant amount hybrid systems where P2P is backed to a small fraction of the overall traffic and will of traffic. The cost of bandwidth is caching box, for example. certainly not cause internet failure. falling very rapidly and there is a lot of competition between the CDNs. FK: The BBC is renowned for its trials on In the UK, there are three classes of ISP IP multicasting. Could that be an option delivery networks: cable (example: Virgin At the moment the cost is not too for iPlayer distribution too? Media), LLU (Local Loop Unbundling) excessive. But imagine in a year or two and IP stream. when we have a TV set-top box with an AR: BBC Research has been trialling IP integrated iPlayer and millions of people multicasting for a while. Different parts The cost of reaching the end user with using it, and each of them consuming 1.6 of the BBC may have slightly different cable is very low. In the case of LLU, the Mbit/s for a TV stream. The bandwidth objectives. In our case, we just want the ISPs invested a lot of money in putting required would be 10 times what it is now. iPlayer to work for everybody: go to the some equipment in the local exchange, Obviously, if this happens we will have iPlayer site, find a programme on the resulting in a very low cost-per-bit. The a problem. The question is, what is the home page, click it and play it. Other third class, so-called IP stream, is a rented best solution for this problem. Is it P2P, parts of the organization, such as BBC bandwidth from BT Wholesale. should we make this new box with P2P Research, look further into the future, or shall we build an edge-caching solution and would like ISPs to build IP Multicast If you are looking for some figures, in conjunction with other broadcasters in their networks. Of course, we would there are in total about 5000 points of 2) Wikipedia article: http://en.wikipedia.org/wiki/Kangaroo_(video_on_demand) and ISPs? We do not know the answer like this as well, but the reality today is, presence (POPs) around the UK. About

2008 – EBU TECHNICAL REVIEW 9 CATCH-UP RADIO & TV

1500 of them are LLU enabled. About 30% of users are on cable. For cable and LLU the cost is relatively low, while for IP stream the cost of bandwidth is very high. This hurts those ISPs. There is no problem with the amount of bandwidth as the iPlayer is no way near reaching the bandwidth limit. However, our audience statistics show that iPlayer usage peaks in the hours between 6 and 11 p.m., which is also peak traffic for ISPs. The ISPs license the bandwidth for IP stream, based on peak usage. For this reason, iPlayer traffic is costing those ISPs. It is not just iPlayer, all traffic from YouTube, Facebook and other services is costing them. Our statistics indicate that this traffic is even larger than the iPlayer’s traffic.

The situation is quite complicated as some ISPs like Virgin Media (cable) are offering 50 Mbit/s packages. This encourages people to use more bandwidth. Virgin Media is happy with the iPlayer and higher bandwidth consumption. Other ISPs that offer an IP Stream service are iPlayer service for, say, £10 a month but during the night. If you leave your less happy because the iPlayer traffic is for £20, a much better iPlayer quality computer on and if, for example, you costing them more. would be available. watched Dr Who last week and the week before, it is likely that you will want to FK: So the situation is very complex, isn’t If we can create iPlayer in tiers, then ISPs watch Dr Who next week. For ISPs, it? How do you plan to resolve it? will be able to work out how to sell that. peak bandwidth is very expensive, but Every content provider should create it is cheap during the night. We know AR: The future lies in tiered services. such quality tiers and then ISPs will be that our top 20 programmes account for What we need to do is to create the iPlayer able to build business models around about 70 percent of all our bandwidth. In services at different quality levels and these propositions. This can lead to win- this way, most of our programmes could then let ISPs offer different bandwidth win situations and ISPs will see video be delivered during the off-peak hours, propositions to users. For example, services as a profit centre rather than a downloaded and stored on the user’s the user who enjoys higher bandwidth cost burden. local hard drive. Thus, peak bandwidth connections would pay more, and those usage could be significantly reduced. who are satisfied with lower bandwidth FK: Which bitrates are actually being This is really a mixed economy where connections would pay less. Of course, used for streaming and downloading? the difference between streaming and nobody should get a worse experience downloading is getting blurred. than today. We were offering streaming AR: Back at Christmas 2007, we started initially at 500 kbit/s. Today we are also with 500 kbit/s for live streaming and 1.2 In this scenario, our programmes will all offering 800 kbit/s and in three months Mbit/s for downloads coded in WMV be DRM’d and you will be able to either time we might be offering 1.5 Mbit/s. (Windows Media Video). Now, we stream them or download them. A person have introduced 800 kbit/s as well. In with a good network connection will be Some people will stay with 500 kbit/s, so the future there should be no difference able to stream, whereas the user with a they will not be able to experience our between downloads and streams but we poorer connection speed will download it high-quality streams. If you sign up with are going to make a range of different and watch after the download completes Virgin, you will be on a 20 Mbit/s plan bitrates, for example, 500, 800 and 1500 or even during downloading. and you can download a film in 6 minutes, kbit/s. rather than in one hour if you only have The prime user experience is and will a 2 Mbit/s line. So we could introduce The other thing we are going to do is always be the iPlayer website. Imagine a new scalable business model. For pre-booking. The user will be able to you go to the iPlayer website and you example, the user can get a good quality download automatically a programme want to play something. Of course, you

10 EBU TECHNICAL REVIEW – 2008 CATCH-UP RADIO & TV

should not look at your hard drive to find more successfully. With some older are potentially interested in using it for out what is on it, your web page should computers there is a problem. But with internet delivery. Does the iPlayer have now be smart enough to find out whether newer computers, again if you choose any plans to migrate to Dirac? the programme is already stored on your wisely, you can actually get a better local hard disk, and if it is, play it from experience. AR: At the moment we believe Dirac is there, rather than from the BBC server. probably better focused on high-quality This complete seamless integration of on- MPEG-2 is old and no longer in the video encoding rather than on internet line and local playout is what we would running, as bitrate requirements are transmission. If you look at what is like to implement in 2009. Another far too high. Two other candidates needed for successful internet trans- advantage is that users can simply unplug for encoding are Microsoft VC-1 and mission and for putting in the production their computer and watch the downloaded On2 VP6 or indeed On2 VP7. Many workflow (using TeleStream, AnyStream programme offline, for example, while on people have evaluated these, and other or some workflow software), you need a an airplane. codecs, and the outcome is that H.264 codec that you can put in the workflow is generally thought to be the winner. software. Then, you need a streaming FK: Recently the BBC introduced the But it is not always that clear cut. For server with a CDN that understands that H.264 codec for the iPlayer and some lower-end computers, On2 VP6 is the particular codec format. You also need to users complained about poor accessibility. best choice. On the other hand, if you have a rights protection model (DRM) Why? are targeting Windows computers and and the user’s computer needs a plug-in full-screen playback, I think Microsoft with a good renderer that can do frame- AR: H.264 requires more processing has done a really good job with the rate adjustments and so on. So, there are power and better graphics cards. We Windows renderer, so that VC-1 plays actually quite a lot of pieces that need to have spent quite some time looking at back beautifully, even on lower-end come together. this problem. There are a few H.264 Windows machines, but it does not work compression settings that produce brilliant well for the Mac. Currently Dirac is a stand-alone encoder results but which require a high-end and has not yet been worked into the computer and graphics card. If you have Microsoft Silverlight is a cross-platform different workflows. The Dirac player is a dual-core processor with a high-end application but it does not yet have not quite apt for real time on lower-end graphics card, it looks fantastic, you can the hardware-rendering capability that machines. There is no integration with do HD at 4 Mbit/s. However, if you have has, which is CDNs and no plug-in has been developed, a low-end portable computer, the quality unfortunate. as of yet. Therefore it is premature for is terrible, with the video running at 10 Dirac to be a consumer proposition at the frames per second or less. So you need FK: Is this issue the reason why the BBC moment but that will come with time. to carefully select the profile you use to also considers Adobe AIR? ensure the video plays back seamlessly on FK: How important is Digital Rights a wide variety of target computers. AR: Adobe AIR works fine with H.264 Management (DRM) for the iPlayer? and is a clear candidate for the download H.264 allows for three profiles – Base, solution with its DRM system, partially AR: It is too narrow to look only at Main and High – and for each profile because we have a requirement to be fully streaming and downloading. For general you can turn on different features. We cross-platform, and AIR runs on PC, Mac analogue or digital broadcast we do not have gone for Main profile and we and Linux. have any DRM or any obfuscation, so also turned on hardware scaling for people can do what they want, when- full-screen playback, as the default. In FK: Broadcasters often face the problem ever, with the content received. Live fact, we have now found that H.264 of codec licensing. What is your broadcasting is readily recordable and does not use more CPU power for the experience? there is no attempt to prevent people from configuration we have chosen, compared recording it. to the On2 VP6 codec. Rather, the AR: iPlayer is now using H.264 and the contrary is true in full screen mode and, question of licensing does not arise. If you As far as streaming on the internet is because we use hardware acceleration, use Flash, Adobe’s agreements cover the concerned, we do not use DRM (in the it uses less CPU power. The answer is playback licence fees. The BBC believes conventional sense of the word) but we that, if you are not careful, H.264 is that there is no H.264 per-stream fee use some stream obfuscation technologies. unplayable on low-end machines, but involved. Essentially, a stream must remain a stream, if you choose carefully, H.264 could it must not become a download. So if a be a pretty good user proposition. It is FK: BBC Research is developing an stream remains a stream, we believe we do bit more complicated than that because open source, licence-free codec called not need to DRM it. In order to prevent the older Mac computers have problems “Dirac”. The EBU plans to evaluate its a stream from turning into a download, with H.264 and can play On2 VP6 technical merits, as many EBU Members we use technologies such as RTMP or

2008 – EBU TECHNICAL REVIEW 11 CATCH-UP RADIO & TV

other technologies that make sure a stream probably pay less if there was no DRM, as AR: The open source community criticises remains a stream. the content would be available elsewhere. us for using Microsoft DRM and tells This is the main reason why the rights us we should use an open-source DRM FK: What experience with using RTMP holders demand DRM. In addition, it solution. We have done a lot of due do you have? is a requirement of the BBC Trust (the diligence and we have investigated all BBC governing body) that files are only the viable DRM solutions. We have met AR: If you link to a media file served available for 30 days after download and with companies that develop them and we from an HTTP server, your media player seven days after being broadcast. So these looked at the technologies themselves and will pop up and begin playing it and that are the reasons why we have to apply evaluated them. The reality is that, until is called a progressive download. The DRM to downloads. quite recently, Microsoft was the only played file would probably end up in viable one. It is free, secure and approved your browser’s cache and it would be Not all content owners however demand by Hollywood labels and approved by very easy to copy this link and place it in DRM. For example, we do not need DRM rights holders. It is easy to put on servers another application which lets you save it. for our parliamentary channel. However, and clients. The problem is, however, that The problem with this approach is that it with time and usage restrictions still in it is Windows only. becomes easy to save a file that is meant force, we do need to apply it. We have, to be streamed only. So we do not do that. of course, the open source community Other companies with DRM, for instance Instead, a lot of companies offer streaming saying that we should not use DRM at all. Apple, do not give access to the DRM solutions which do not let you easily save system. The only way to allow content the file. It will let your media player FK: You clarified why DRM should or to be available using Apple DRM is to throw away the segments of the file after should not be used for the iPlayer put content on the iTunes store and that they’ve been played, rather than allowing content, but then the question is which really means disaggregating our content. them to be saved to your hard drive. DRM do you use to control iPlayer usage? Therefore, we do not have BBC iPlayer

Microsoft has a solution and the product is called MMS. Then there is RTSP (Real Time Streaming Standard) which is an open standard, and Adobe has a proprietary standard called RTMP (Real Time Messaging Protocol) and another one, RTMPE, which is an encrypted version. The latter one offers better protection but requires more CPU power on the user’s machine. Currently we do not see the need for it, as there is no widespread evasion or hacking. We monitor regularly whether content hacking occurs and, at the moment, this is not the case. Also, as the same pro- gramme was broadcast in the clear the evening before, the cost benefit is not there and we do not really see the need to DRM our streaming content.

Now, for downloading our position is different. For downloading, we have to DRM our files for two reasons. First, the rights holders expect that the content will be available in the UK only. Second, content must only be available for a limited amount of time, so it can be commercially exploited, as is the case with BBC Worldwide’s licensing of the Top Gear programme. Broadcasters in the USA who pay BBC Worldwide millions of pounds for broadcast rights would

12 EBU TECHNICAL REVIEW – 2008 CATCH-UP RADIO & TV

content available in the iTunes store. AR: The answer is pretty simple. We use vendors to make sure that you are in the Apple would like us to give them our look-up tables of UK IP addresses, stored UK, even if Vodafone UK has a roaming content and put it in a bucket with a in a Quova database. These lists are agreement with France. million other programmes. For us that is regularly updated. We check the user’s equivalent to the BBC taking the content IP address and if it is located in the UK FK: What kind of arrangements do you of BBC 1 programmes and giving it to it is good and, if not, we say “sorry you have with the ISPs to provide you with competitors to put on their sites. This can’t have it”. the users’ IP numbers? is clearly not acceptable. We have asked Apple for access to the DRM but so far Why do we not use the Akamai AR: Quova makes those arrangements and they have not given us access. Geolocation database? First, it would lock it regularly updates the look-up tables. us into exclusively using Akamai and we There is a way for ISPs to also update The good news however is that other do not want to use Akamai for all services. this information. We are quite happy companies like Adobe are developing In fact, H.264 content is now being with these arrangements; there is 99.9% cross-platform DRM products. Adobe distributed via Level 3 Communications effectiveness. AIR now has DRM available for the PC, Inc. It is strategically better that we have Mac and Linux. We hope to have a cross- our own central control system. Second, FK: You have ported the content to platform solution by the end of this year we need to maintain the whitelists and mobile devices such as the Nokia N96 based on Adobe AIR and Adobe DRM. blacklists, so for example sometimes we mobile phone. Can you please outline want to set up a proxy to try and access the process for doing this? FK: iPlayer services are not available iPlayer outside the UK, so we need the outside the UK. At my home in means to control this ourselves and not AR: We have addressed the content I received a message “Not to rely on Akamai. creation not only for PCs but also for available in your area”. Why do you some portable devices. Previously, if constrain iPlayer to the UK territory? Another reason for not relying on a CDN you used Windows Media Video (WMV) company’s geo-location service is that we files and downloaded them onto your AR: Two reasons: one is the rights reason. really want to alert the user that the video , the WMV files may Licence holders sell their content in each won’t be available to them as soon as they either have been refused by the device or territory. Traditional broadcasts are view the iPlayer web page, rather than they were played with significant frame geographically targeted by the transmitter waiting for them to click the Play button dropping. As of September, we are now radiation and TV is generally very short and receiving a streaming error. creating content specifically for Windows range. But on the internet, streams can Media compatible mobile devices. We are go anywhere. Licensing models change We really need to know the geo-location at creating a special low-resolution version dramatically, they are still limited by, the time we render the web page, so that which is small enough to download and or are working within, a TV broadcast we can give the user a nice message saying play nicely on these devices. framework. The BBC is licensed to that the content is not applicable to the broadcast in the UK and these are the user: “Sorry you are not in the UK, you The standard resolution on these devices licence rights we typically acquire. cannot play TV but you can play radio”. is 320 by 240 pixels. At the moment If we just relied on the CDN company’s the resolution of our main PC profile The other reason is less obvious: public streaming service to enforce the geo- is 720 by 544 non-square pixels, which services are funded by licence-fee payers location, then the user would receive a gives best quality on a PC but it is not in the UK. As there is always a distribution stream error message and no explanation suitable for small devices, so we plan cost on the internet, it is not fair for why they cannot see the content. to make a number of special encoded a licence payer in the UK to pay for formats for these portable devices. As distribution to someone in the USA Things are getting more complex now with far as downloading these formats is watching the content. Even in cases where 3G access. For example, you may have a concerned, we will offer a number of we have rights to broadcast outside the UK roaming arrangement with Vodafone UK. different options. These formats may still or make content available outside the UK, If you are in France, our system may think be primarily available from the iPlayer site we would not do it in such a way that UK that you are still in the UK, even though intended for a PC, but we will develop licence payers fund the distribution. BBC you are actually in France. This is a new several custom websites intended for Worldwide might fund it or may cover challenging area. It is not a widespread downloading content to different mobile the distribution costs or may have ads to problem yet because roaming access is and portable devices, such as the Nokia support the model. For these two reasons, so expensive that it would probably cost N96 etc. we need geo-locking. you a fortune to receive BBC programmes abroad via a mobile phone and hence For certain devices which we think offer FK: Which geolocation system do you use few people try. However, we will need a great user experience, we plan to design and how effective is it? to tweak the IP lists and work with 3G a special version of the site. Such a site

2008 – EBU TECHNICAL REVIEW 13 CATCH-UP RADIO & TV

Anthony Rose is Controller of the Vision & Online Media Group at the BBC, where he heads a team of over 200 people who are responsible for the BBC iPlayer, embedded media player, social media, syndication, programme websites and other projects within the BBC’s Future Media & Technology division.

Mr Rose joined the BBC in Sep 2007, prior to which he was at /Altnet. During his six years with them, he worked on a host of projects and patents covering P2P networks, DRM-based content publishing and social networking services.

Prior to joining Kazaa/Altnet, Anthony Rose was Vice President for Technology at Sega Australia New Developments, developing real-time 3D animation and 3D graphics engines. will tailor the content automatically to the answer is broadly that we would like hard to get onto these STBs. In other the characteristics of the the iPlayer content to be there. To the words, it creates a huge amount of work (screen size, resolution, etc). The first of extent that device manufacturers are able for us but few consumers would use it. those devices was the iPhone. So if you to offer the iPlayer site experience, we The cost benefit really does not work out, go to the iPlayer site on an iPhone, you get would like to work with them. But, to which is why we are currently not working a nicely tailored web version. The BBC the extent that they would like to take the on this project. Today there are quite a will produce a custom version of the site BBC programming and put it in their own few different providers and the market is for a selected number of other mobile/ interface, broadly speaking, that does not still relatively small, say, several hundred portable devices, so that the media will work for us. It is not acceptable for the thousand subscribers. But this may change play automatically in the right format for BBC to just give away its content to other in the future. that device. websites that can then build a consumer business proposition around it. FK: Thank you for clarifying the most FK: Do you plan to bring the iPlayer to burning issues relating to the iPlayer. STBs and consumer devices such as TV If you “BBC IPTV”, you will see I am sure that the EBU Members will sets? announced plans to work on IPTV set-top find this article very interesting and boxes that are already open and available useful. Should they have any further AR: The answer is yes. The challenge is to either everyone or selected parties. questions, could they approach you that these devices often try to aggregate This is a very good second-generation directly? different content into one portal. To the IPTV proposition. One of the problems extent that the box is just a playout device is that often there are not many of these AR: Sure. You can give them my email like Windows Media Extender devices3, boxes on the market and it is really very address if appropriate.

3) Extender is a set-top box which is configured to connect via a network link to a computer running XP Media Center Edition or Windows Vista to stream the computer’s media center functions to the Extender device. This allows the Media Center and its features to be used on a conventional television or other display device. The household’s Media Center can be physically set up in a location more appropriate for its role, instead of being in the living room. Additionally, with an Extender, the Media Center can be accessed at the same time by several users. The Xbox 360 gaming console is a very popular example of a Media Center Extender.

14 EBU TECHNICAL REVIEW – 2008 CATCH-UP RADIO & TV

Note from FK: Since the interview (which the TV audience. The iPlayer is popular took place in August 2008), the iPlayer during office hours through the day but, has been under-going constant software as viewership peaks in the evening around development, with new features and 9 pm, heavy usage typically continues for functionality added almost every week. an hour longer than TV viewing.

At the Microsoft Professional Developers The BBC’s user data shows that the Conference in October, Anthony Rose iPlayer is used by a range of ages. 15- to successfully demonstrated the syncing 34-year-olds account for 37% of viewers of iPlayer content across computers and 35- to 54-year-olds account for 43%. and mobile devices, using a Microsoft A further 21% of users are aged 55 or desktop application called Live Mesh over and Huggers credited the iPlayer’s cloud, which is based on cross-platform popularity to it being easy to use. Silverlight technology. The application automatically synchronizes downloaded The priority is to make the iPlayer shows across all the iPlayer-compatible available on as many digital platforms devices on the person’s Mesh network. as economically possible. PC users still That includes Mac computers, which also account for the vast majority of iPlayer have a download client for iPlayer. viewers with 85% of the audience, with Nintendo Wii and Linux both accounting The new prototype iPlayer also featured for 1%. The popularity of the iPhone several social-networking features, such as and iPod Touch had taken the BBC future lists of the most popular shows watched media and technology team by surprise. by friends on the MSN Messenger list Apple Mac users now account for 10% and updates on which shows each of the of iPlayer viewers, while iPhone and iPod contacts had watched and downloaded. Touch owners account for a further 3%. iPlayer users will also be able to rate scenes from the show as they go along – using the Here are three conclusive quotes from “Lovemeter” – which shows the parts of Erik Huggers: shows that people like the most. “ The situations we’re seeing are Erik Huggers, the BBC’s Director of interesting – mum and dad are future media and technology, recently watching linear TV in the living stated 4 that the success of the iPlayer is room but the kids are watching in a proof that the corporation is right to bet different way ... on the iPhone, iPod its future on the internet. He stated that Touch or a laptop. ” the online TV catch-up service has served 248 m items of content since it launched “ Having seen all this and understanding officially on Christmas Day 2007. The more about the success of the service, iPlayer service that is available through the sort of users, when they watch it Virgin Media’s cable service alone has and what they watch ... I think the served 49 m videos since June 2008. The BBC is absolutely betting on internet soap drama, EastEnders, which pulls in protocol in a way where it’s not just an average of 18.9 million TV viewers for the distribution side of what the each month on BBC1 and BBC3, attracts internet enables. ” around 457,000 viewers on the iPlayer. The CBBC digital channel programme, “ We are completely re-engineering MI High, has a far higher proportion of the way in which we make fantastic viewership on the iPlayer: it has a TV programming. ” audience of 145,000, while 30,000 watch it on the iPlayer. Huggers insisted that the online audience did not cannibalise First published: 2008-Q4

4) Guardian story: www.guardian.co.uk/media/2008/nov/07/bbc-erikhuggers

2008 – EBU TECHNICAL REVIEW 15 OPEN SOURCE HANDHELDS

Open source

Handhelds– a broadcaster-led innovation for BTH services

François Lefebvre (Project Leader), Jean-Michel Bouffard and Pascal Charest Communications Research Centre, Canada

Emerging Broadcasting to Handhelds (BTH) technologies could be used to convey much more than the usual audio or video programming. For a long time now, broadcasters have imagined and standardized many new multimedia and data applications which, deplorably, did not succeed in the market.

In the first part of this article, we suggest that the open source handhelds which have become prominent as a consequence of recent technological trends, could also bring the emergence of broadcaster-led applications on mobile devices. In the second part, we will introduce the Openmokast project and describe how the CRC was able to produce, with very limited resources, the first open mobile-phone prototype, capable of receiving and presenting live broadcasting services.

There is a growing enthusiasm today This interest has led to the development are emerging in the quest to reach mobile for BTH services as presented in EBU of many new standards for BTH services users wherever they are with a maximum Technical Review by Weck & Wilson [1]. in the context of DAB: MOT transport, throughput. These services can either use broadcast Broadcast Web Site (BWS), SlideShow, standards such as DAB/DMB and ATSC- DLS, TopNews, EPG and TPEG. But only It appears as if BTH is standing at M/H or the standards proposed by the a few of these standards have actually been a juncture between broadcasters and mobile telecommunications industry implemented on commercial receivers. mobile network operators (MNOs). such as DVB-H or MediaFLO. These For example, BWS, which could be used Their business models are in conflict. technologies provide efficient delivery to provide very attractive information Broadcasters are naturally inclined to mechanisms for up-to-date information, services, is generally not supported on pursue and extend their current free-to- popular media and valuable data services current receivers. In the remainder of this air (FTA) services which are monetized to mobile users. In the present interest article, we will refer to those unsuccessful by public funding, licence fees and for rich media convergence, BTH and applications as the missing apps. advertising. MNOs, on the other hand, mobile technologies have complementary plan to deploy BTH services to generate features. BTH infrastructures promise The stagnation of BTH technological new revenue streams through cable-like to transmit large volumes of valuable advances could be explained by the subscriptions and pay-per-view models. content in one-to-many communications innovative and competitive wireless while wireless telecommunications communications ecosystem that is The Internet is also challenging the networks could provide the channels for thriving today. Several kinds of new traditional broadcasting model. It has one-to-one exchanges. wireless communications technologies lead to rapid innovation cycles that

16 EBU TECHNICAL REVIEW – 2008 OPEN SOURCE HANDHELDS

produce direct and disruptive benefits to reach profitability. This results in adds significant value to the software to end-users. For example, webcasting, steep barriers to entry for newcomers development chain. Individual peer-to-peer streaming and podcasting all in the field. Fortunately, the relentless and organizations work represent new and attractive alternatives push of Moore’s law has permitted the at innovating and solving issues with a to broadcasting. implementation of generic, compact “let’s-not-reinvent-the-wheel” approach. and powerful integrated circuits. These Instead, they build together a solid We suggest here that one of the key limiting components can be produced affordably common core which sustains their factors for broadcaster-led innovation lies and run on little power. This benefits respective business models. in the broadcasters’ limited control over smart phones too. the implementation of their standards In our opinion, the qualifying term into receivers. The market for broadcast Inside today’s mobile devices, specialized “open source” is not in itself very receivers is horizontal. Implementing hardware components are increasingly significant. The important factor broadcaster-led standards into mobile being replaced by generic ones. Flexible is to apply the proper rights to the phones is an even bigger challenge because application processors (APs) can be code, once it is written. This is where these devices are part of a vertical market. re-used in the design of many types of software licensing comes into play. That is, MNOs have a firm control over devices, thereby lowering the overall Projects are said to be free/libre Open which feature sets get implemented into production costs associated with specific Source Software (FLOSS) when their “their” devices. Understandably, they will applications. licensing terms offer flexibility in the likely promote their own feature sets before usage rights while remaining accessible those of broadcasters. As a consequence, Functionality goes to the largest possible community. The we can expect that BTH innovations that Foundation classifies are broadcaster-led are less likely to be software software licences and promotes implemented into mobile phones. Von In early-generation mobile phones, user those that are, in accordance with Hippel’s theory [2] would suggest that as applications were performed by low-level their criteria, genuinely open. The empowered “users” of the mobile phone software loaded directly on the device’s Open Source Initiative (OSI) is also technology, MNOs stand in a much better permanent storage. Only basic tasks a recognized reference body that position to innovate and compete. were supported: dialler, phonebook, leads the reviewing and approving device settings or ringtone selection. of licences that conform to the Open The following section will introduce There was no room for new applications. Source Definition [3]. emerging trends that could create Nowadays, generic and powerful APs new opportunities for broadcaster-led coupled with large storage capacities There is an obvious trend in motion innovations in the BTH space. Please see have led to better capabilities. This has now. Industry has adopted many open Appendix A: “Anatomy of a Handheld” for also significantly raised the importance source software solutions. Several OSS an overview of the components and termi- of software in handhelds. products are widely deployed: GNU/ nology that will be referred to throughout Linux, Apache, OpenOffice, Firefox, As- the remainder of this article. As a consequence, functionality features terisk, Eclipse and MySQL. Companies – based on software – present much using these tools see several benefits in Note: Throughout this paper, the term lower barriers to entry than for hardware the form of flexibility, independence, handhelds is used to refer to generic because of the fundamental properties of ability to fix, to enhance and to tweak pocket-sized computing devices which software: the software that they need. For all may provide connectivity to any kind of those involved, OSS is a key to of network. On the other hand, mobile l it can be duplicated at no cost; success. phones are a specialized category of l it can be distributed instantly and at handhelds that connect to MNOs’ no cost; The popularity of OSS has even reached infrastructures. l it can be developed with low-cost or one of the technology-sector bastions. free tools; We can now purchase consumer Trends towards l it can be modified, fixed and enhanced electronics (CE) products that do “run” with no impact on the manufacturing on OSS. Here are some important broadcaster-led chain. examples of such devices: innovation Software goes open l The Linksys WRT54G Wi-Fi wireless Specialized hardware router [4] is a product for which the The software industry is experiencing GNU/Linux was released. goes generic changes as open source software (OSS) Since this release, users have enhanced The manufacturing of handhelds is is gradually entering the domain. With its functionality to enterprise-grade expensive. Mass-production is imperative OSS, the global level of collaboration routers.

2008 – EBU TECHNICAL REVIEW 17 OPEN SOURCE HANDHELDS

l Neuros [5], an Internet set-top box manufacturer of motherboards for Canadian company, announced that it manufacturer, will shortly release its personal computers. Early on, the will port Android (another OSS distri- next product called OSD2. Neuros company made the bet that their best bution introduced in the next section) to relies partly on users and external option to become competitive with their the FreeRunner by the end of November developers to create or integrate new new products was to open 2009. This shows how organic and applications for their platforms which up a complete software stack that would efficient an OSS ecosystem can be. Just also use OSS. enable and leverage user innovation. a few months after its introduction, the In July 2007, FIC Inc. released its first FreeRunner device can already host l The DASH Express [6] is a new type “developer preview” prototype called the several new software platforms. of GPS car navigation system with Neo 1973 with the Openmoko software a two-way telecommunication data stack. Android channel. It can receive real-time traffic data information and can also To many developers, the Neo represents Android [9] was originally conceived offer Internet search for points of the first truly open mobile phone to be the fundamental building block interest. The DASH device came platform. The Neo incorporates several of new mobile devices. It is a software on the market a year ago and seems interesting connectivity options: GSM, distribution that includes an operating to be very successful in the USA. GPRS, GPS and Bluetooth. Interest- system, a middleware layer and some Interestingly, DASH is based on the ingly, the Neo 1973 initially shipped with key applications. It is being developed FIC Inc. Openmoko first hardware a primitive software stack that did not by the Open Handset Alliance (OHA), release. The DASH is a good example even allow the completion of phone calls. a consortium of 34 members including of a vertically-integrated device. It is We had to wait a few more months before Google, HTC, T-Mobile and other a great case study of how OSS can be the “phoning feature” became available important players in the field. used with the right licensing scheme through software updates. on closed business models. The Android Java-based software The Neo FreeRunner (GTA02), was development kit (SDK) was released in l Some developers have also enabled the next version of the device. It was November 2007 to allow application OSS frameworks on top of closed released in June 2008 with enhanced development long before any Android CE devices. The Rockbox project usability, hardware improvements and device was even produced. Since then, [7] creates open source replacement Wi-Fi connectivity. an application development contest firmware for several brands of sponsored by Google was initiated portable digital audio players like Openmoko was conceived to enable to stimulate the development of new Apple, and . Audio various business models. Openmoko Android applications. A total of US$5 codecs are amongst the numerous Inc., the company, does sell devices for million was awarded to 50 submissions features added: FLAC, WavPack, AC3 profit. Independent developers can sell featuring the most innovative applications (A/52), AAC/MP4 and WMA. These applications thanks [10]. The first Android-based mobile were not available on the iPod models to the LGPL licence which covers the phone (T-Mobile G1) was released in sold by Apple. Openmoko software stack. With the October 2008 in selected markets. DASH Express, FIC Inc. has also hinted Handhelds go open that vertically integrated devices could Initially, the degree of openness of be constructed on Openmoko. With Android was limited despite the fact source such a framework, both closed and open that the SDK was available for free. The applications are on a level playing field OHA recently decided to release the Today, there is an important push towards for competition. In this environment, Android open source project [11] which OSS on handhelds promoted by many users are just “one click away” from one will encompass most components of the industry players. Their interest to or the other type of applications. The Android platform. With the Android OS participate in the mobility value chain is end-user will decide which best fits his/ and virtual machine becoming open, new motivated by the current trends described her needs. exciting development projects are now above. Some key projects with the possible. This could even lead to the potential to impact BTH are described in Another interesting fact is that shortly creation of new hardware components. this section. after the release of the FreeRunner, Details to come about the licensing another developer community was able and the governance of the project will Openmoko to successfully port the Qtopia OSS ultimately reveal to what extent Android distribution onto the device. Qtopia had will be open. But in its current configura- Openmoko [8] is an OSS project that been developed some years before by tion, Android still presents a compelling was initiated by First International Trolltech, a company acquired by Nokia option for the design of open source Computer, Inc. (FIC Inc.), an important in 2008. At the current time, Koolu, a handhelds.

18 EBU TECHNICAL REVIEW – 2008 OPEN SOURCE HANDHELDS

Other open mobile works and reference implementations. common SBBs found in those packages. platforms Symbian, with a market share currently In order to build a final product, we had surpassing 50%, is still the platform to find a DAB reception platform and Other open platforms which do not deployed on most in the integrate it into the prototype. Other include mobile phone network interfaces world. major software components such as are available to create new handhelds as the receiver control unit, the bitstream well. These devices represent potential demultiplexer and the decoder had to be candidates for broadcast reception. The Openmokast project developed from scratch. We even had to None of the open software frameworks manufacture a physical extension to the The Nokia Internet tablet computers and devices encountered in our study did original handset to be able to embed the based on Maemo [12] are such devices. support digital broadcasting hardware. small USB DAB receiver and its required The Maemo platform provides lots of Since BTH is a main field of research antenna into our prototype. functionalities and shows great potential for our group at the CRC, we were for many useful usage scenarios. Maemo motivated to explore the possibility of The Openmokast is a software platform that is based on integrating broadcast functionality into OSS projects such as Debian GNU/Linux such an open device. When FIC Inc. software platform and GNOME. announced in February 2007 the launch CRC-DABRMS is a stable software of their open mobile phone prototype platform developed previously at the The Ubuntu Mobile Edition is another using the Openmoko framework, we CRC to control commercial computer- example of a GNU/Linux based software decided to initiate the Openmokast based DAB receivers. This original stack that could fulfil the requirements (“OPEN MObile broadKAST-ing”) project effort provided access to raw DAB for new handhelds. This effort was in our lab. bitstreams on typical personal computers. launched by Canonical Inc. to support CRC-DABRMS can decode signalling the development of an OSS distribution The aim of the project was to integrate a information contained in the fast for mobile internet devices. DAB receiver in a fully-functional mobile information channel and dispatch desired phone. We would design, build and sub-channels to various types of outputs. Another potential major advance for test, with a live DAB signal, a prototype This in turn permitted the demonstration OSS on mobile phones could come capable of decoding typical DAB audio and testing of new applications geared from Nokia. The company announced services as well as some of the missing toward DAB but not yet standardized. the creation of the Symbian Foundation apps. Based on our previous experience This system had been implemented for and the release of its Symbian OS as an in the lab, we chose to work with GNU/ Windows and GNU/Linux platforms. open source software [13]. This OS was Linux and other OSS packages. We have designed for mobile devices and comes learned that using OSS accelerates the The GNU/Linux version of CRC-DABRMS with libraries, frame- integration of a prototype by reusing was ported to the Openmoko platform

OPENMOKAST Raw Sub-Channel IP External Sub- h(s) UDP Wrapper D ecoder ETI FIC ETI File D ecoder

Raw Sub-Channel MP2 Audio File, File Writer Stream Data Raw, Sub-C h(s) O utput D AB Signal DMB Video File, ... FIC M anager

Sub-Ch(s) Sub-C h(s) RDI RDI FIC MOT Decoder + D ecoder FIC MOT Application: DAB Receiver: Packet Data Decoder Bosch U SB Journaline, R eceiver Slide Show, ...

D R Box 1 USB Rx Command (DCSR) C ontrol FIC Mtech UDR A3L Rx Notification (DCSR) D ecoder IP RTP Wrapper Media Player

DMB Transport IP Media Player DAB D ecoder Ensem ble Output Selection R x Com m and R x C om m and Rx Notification Info DAB IP Tunneling Rx Notification IP Packets

Telnet D ecoder + Ens. Info Request User Command Packet Data Decoder Ensemble Info Interpreter Output Selection Openmokast U ser Output Libraries

Figure 1 Architecture of the Openmokast middleware

2008 – EBU TECHNICAL REVIEW 19 OPEN SOURCE HANDHELDS

(a) (b) (c) Figure 2 Screenshots of a running Openmokast device

and renamed Openmokast. Porting it Two data applications were integrated Openmoko kernel. This should have involved recompiling the application for by re-using source code made available made the integration task easier but the new target AP. It also meant adapting under the OSS project called Dream [14]. unfortunately, the firmware for the unit the code while verifying that all required Dream is an SDR receiver for DRM which could not be uploaded correctly. We libraries would be available on the new includes an MOT transport protocol had similar experiences with other USB platform at runtime. The architecture decoder, as well as two of the missing receivers that did not succeed our initial of the Openmokast middleware is shown apps: Journaline and Slideshow (Fig. test runs. in Fig. 1. 2c). The packet mode decoder had to be developed in-house. Since we were able An alternative in our quest for the The original interface of Openmokast was to re-use code from other OSS projects, appropriate receiver was to obtain the command line. Later we developed these applications could be developed development kits offered by chipset a GUI using Openmoko’s GTK libraries. quickly and efficiently. This experience makers. We contacted many companies Screenshots of a running Openmokast are reinforced our views that OSS carries but none could provide the components shown in Fig. 2. At start-up, Openmokast significant benefits for developers. required. In general, their offerings presents a menu where the input device did not meet the needs of a research must be selected (Fig. 2a). The system The DAB receiver development project like ours. Those can also accept, as input, a locally-stored companies were unwilling to support our DAB multiplex file. This feature enables A fundamental component required in initiative which would likely not lead to off-line testing and development of new the design of the prototype was a DAB significant device sales. Furthermore, the applications without the need for a receiver. Our early analysis suggested development kits offered are expensive physical receiver and a live signal. that a USB receiver could be a good and the USB device drivers provided fit for the unit. First, the FreeRunner are usually built for Windows instead of Different applications were either has one USB port available. We also GNU/Linux, and the usage of those kits developed or were direct integrations of knew that the FreeRunner’s AP was not requires the signature of non-disclosure existing OSS projects. The standard DAB powerful enough to perform DAB signal agreements. We did not want to follow radio application was done with an HTTP demodulation in real-time. We therefore such a path of development and were wrapper which forwards MP2 audio to searched for a USB component where stalled for a while. the Mplayer media player. A package for this task could be performed efficiently Mplayer was readily available. The DAB+ on silicon. We continued our research and found an application was constructed in the same off-the-shelf product which could fulfil manner except that the transport protocol We first tested the USB Terratec DrBox. our requirements. The MTECH USB key had to be removed prior to forwarding the The device drivers (DABUSB) for this model UDR-A3L had all the features we AAC+ stream to Mplayer. device were already included in the needed. The libusb library was chosen

20 EBU TECHNICAL REVIEW – 2008 OPEN SOURCE HANDHELDS

to communicate with it. Libusb is a Unix suite of user-mode routines that control data transfer between USB devices and the host systems. We were able to establish basic communications between the MTECH key and the FreeRunner USB port. A reference to this key was added as a new input source in Openmokast (hardware 1 in Fig. 2a).

Prototype construction

The openness practices promoted by the Openmoko project went beyond our initial expectations. We found that even the CAD drawings for the FreeRunner mechanical casing were released to the public and available freely for download. Using those drawings for reference, we Figure 3 Figure 4 then modified the original mechanical ABS extension for Comparison of thickness between the design and produced a clip-on plastic FreeRunner original device and the Openmokast extension which provides the extra prototype space inside the device to embed the stripped down USB key. Fig. 3 shows this extension.

To manufacture the physical extension part, we contracted the services of a 3D printing shop. We sent them the modified Pro/Engineer formatted 3D file and received the finished ABS part within 48 hours for the price of US$90. We were happy to find that just like with OSS, even mechanical hardware components of a prototype benefit from the “democratization” of production means. The CRC will release the schematics of this extension under a non-restrictive creative commons li- cence.

Fig. 4 shows the thickness of the final Openmokast prototype. Figure 5 Figure 6 Inside the Openmokast prototype The final Openmokast The MTECH receiver was connected prototype directly onto the internal USB test points on the printed circuit board. In this Fig. 5 shows the prototype’s internals device was put together by a small team in configuration, it could draw its power while Fig. 6 shows the final product. a short period of time and it has performed from the device’s main battery. This setup Notice the colourful “skin” with our well since its release. The form factor also had the advantage of freeing the organization’s brand as well as the of the device is appreciated by current external USB connector on the handset. Openmokast logo. testers. Some DAB missing apps which This port remains the most convenient are usually not available on commercial way to recharge the device. Once Some results receivers could be demonstrated. disassembled, the FreeRunner and exten- sion exposed enough internal free space to We are satisfied with the overall test results The Openmokast prototype was install the dedicated L-Band antenna. of the Openmokast prototype so far. This introduced at the IBC 2008 exhibition

2008 – EBU TECHNICAL REVIEW 21 OPEN SOURCE HANDHELDS

and was hailed as the first open mobile the CRC-mmbTools LiveCD transmitter for two audio decoding scenarios broadcasting handset. [15] or from standard commercial (Table 1). equipment. The overall performance of the receiver Table 2 shows the power autonomy component is good. The MTECH The total CPU computational load estimates based on measurements made provided good signal reception in the measured for real-time DAB and DAB+ during three different usage scenarios. L-Band and Band III frequency ranges, audio decoding on the FreeRunner was under various conditions. The device low. The GNU/Linux tops utility was The Openmokast software was installed could receive signals coming from either used to estimate the processing cycles and tested on typical GNU/Linux PCs.

Codec Bitrate Mplayer Openmokast DAB+ Total type (kbit/s)

Scenario 1 Musicam 192 12.70% 1.00% 13.70% OPEN SOURCE HANDHELDS Scenario 2 HE-AACv2 64 14.30% 0.60% 0.60% 15.50% Table 1 CPU usage for two audio decoding scenarios (%)

Source Measured consumption Estimated autonomy

Scenario 1 Receiver in Band III 650-670 mA 1h49 Scenario 2 ETI file 220-230 mA 5h20 Scenario 3 None 190 mA 6h19 Table 2 Openmokast power consumption and autonomy with 1200 mAh battery

Abbreviations

ABS Acrylonitrile Butadiene Styrene (thermoplastic) GPS Global Positioning System AP Application Processor GSM Global System for Mobile communications ATSC Advanced Television Systems Committee GTK Tool Kit www.atsc.org GUI Graphical User Interface BTH Broadcasting To Handhelds HTTP HyperText Transfer Protocol BWS (DAB) Broadcast Web Site IP Intellectual Property CAD Computer-Aided Design LGPL (GNU) Lesser General Public Licence CPU Central Processing Unit www.gnu.org/licenses/lgpl.html DAB Digital Audio Broadcasting (Eureka-147) MNO Mobile Network Operator www.worlddab.org MOT (DAB) Multimedia Object Transfer DLS (DAB) Dynamic Label Segment OHA Open Handset Alliance DMB Digital Multimedia Broadcasting www.openhandsetalliance.com www.t-dmb.org OS DRM Digital Radio Mondiale OSI Open Source Initiative www.drm.org www.opensource.org DVB Digital Video Broadcasting OSS Open Source Software www.dvb.org SDK Software Development Kit EPG Electronic Programme Guide SDR Software Defined Radio FLOSS Free/Libre Open Source Software TPEG Transport Protocol Experts Group FTA Free-To-Air www.tisa.org GPRS General Packet Radio Service

22 EBU TECHNICAL REVIEW – 2008 OPEN SOURCE HANDHELDS

The code could execute successfully on the non-IP-free components must be led handhelds. If they did, chipset an emulator of the Neo 1973 available extracted from the overall distribution. manufacturers could be encouraged to through the Openmoko project. In A carved-up framework can be widely participate in the process. In fact, the this setup, both Openmoko as well as distributed along with the IP-free modules. Openmokast framework could easily be the Openmokast GUI could be tested. The non-IP-Free components have to be adapted to support other technologies We could not repeat a similar test for distributed separately with provision for such as DVB-T or ATSC-M/H by exploiting the FreeRunner configuration since no the correct attribution of the licensing current building blocks redundancy. emulator for this device exists to date. fees. We plan to use this approach for the Openmokast project. The framework In an open device with both BTH and Openmokast’s software could also be could be distributed openly without the mobile telecommunications network tested directly on two different GNU/ module that requires special licensing interfaces, we could hope that some Linux distributions, without the need for (MP2 and HE-AACv2 decoding, etc.. ). synergies would happen. It is a simple an Openmoko device emulator. We can matter of providing software on the hand- conclude from these experiments that Conclusions held to bridge those two networks. With we could deploy Openmokast on other Openmokast, we can claim at last that we platforms and operate DAB devices in In this article, we have identified trends have reached convergence. those environments. shaping the current environment that can impact the development of new BTH In an attempt to support our vision of Opening Openmokast services. For a long time now, Moore’s convergence and the new opportunities law has been the driver for advances in that arise from it, the CRC plans to During this project, we had to face the hardware and functionality that previously release, as open source software, several issue of Intellectual Property (IP) in was provided by specialized circuits – but tools which were developed for the relation to OSS implementations. In fact, is now done on generic chips. The Openmokast project (www.openmokast. it is important to realize that implementing shift to software in the implementation org). most of today’s international open of powerful handhelds is also a well- broadcast standards implies using established trend. With software and its Acknowledgements proprietary IP. As a consequence, an flexibility, developers have been able to implementation like Openmokast has create in a few months, thousands of new The authors would like to thank their to include proprietary algorithms or innovative applications for an Android or colleague Martin Quenneville for his work techniques to fully implement such a an iPhone. on the Openmokast prototype. Particular standard. Most of the time, licence thanks should go to Karl Boutin for his fees must be paid to the rightful holder We believe that a transformational force thorough review and to André Carr, René for each implementation of technology may be lying beyond the possibilities Voyer, Bernard Caron, Silvia Barkany “sold”. created by Moore’s law and flexible and Bruce Livesey for their comments software. The most important trend and support. Consequently, standards including that we have identified during our study proprietary IP appear to be a barrier to is that Open Source Software, once References OSS implementations for two reasons. integrated on handhelds, carries an OSS projects are given away for free enormous potential for innovation. This [1] C. Weck and E. Wilson: and do not generate revenues. It is also is a revolution in the making that could Broadcasting to Handhelds – impossible to control the distribution reach its true potential once the right mix an overview of systems and of such software. Therefore, it would of collaboration and openness is found. services be very hard to collect any licence fees. Broadcasters, if they choose to follow this EBU Technical Review No. 305, Some countries have recognized the need trend, could find new opportunities to January 2006 for free standards and are promoting the promote their technologies. This could [2] E. von Hippel: Democratizing adoption of new standards that are freely catapult broadcaster-led applications Innovation available and that can be implemented at in the foreground and push further the MIT Press, 2006 no charge. The issue of open source and deployment of new BTH networks. [3] www.opensource.org/docs/osd standardization was discussed in a report [4] http://en.wikipedia.org/wiki/ commissioned by ETSI in 2005 [16]. In this article, we presented several Wrt54g open handset projects, similar consumer [5] http://wiki.neurostechnology. One possible solution for implementations electronics devices and the Openmokast com/index.php/OSD2.0_ of non-IP-free standards in OSS is to design prototype developed at the CRC. Development frameworks with licensing schemes that We believe, following our study, that [6] www.dash.net allow the integration of IP-free as well as broadcasters have an opportunity now to [7] www.rockbox.org non-IP-free modules. With this approach, sponsor the development of broadcaster- [8] www.openmoko.com or

2008 – EBU TECHNICAL REVIEW 23 OPEN SOURCE HANDHELDS

www.openmoko.org Appendix A: such as Digital Signal Processors (DSPs). [9] http://code.google.com/android Anatomy of a handheld The AP just does not have the processing [10] http://code.google.com/android/ power needed. Besides, even if it could adc_gallery manage the computational tasks, the [11] http://source.android.com Fig. 7 depicts typical hardware and software related energy consumption requirement [12] http://maemo.org components on today’s handhelds. The would be prohibitive. Instead, DSPs are [13] www.symbianfoundation.org hardware building blocks (HBBs) operate used and these tasks can be efficiently [14] http://drm.sourceforge.net around an application processor (AP). accomplished while consuming less [15] P. Charest, F. Lefebvre: Software Given the portable nature of handhelds, energy. DAB Transmitter on Live CD special features support mobility: wireless Published in the IASTED WOC connectivity, GPS and accelerometers are One such set of DSPs are the wireless 2008 conference proceedings, such examples. Hardware Building Blocks (HBBs) found Québec, QC, May 2007 on the receiver presented in Fig. 7. As [16] Rannou & Soufron: Open Source Although APs can run most processing an example, the broadcast receiver front- Impacts on ICT Standardization tasks, some of the “heavy lifting” has to end filters the signal then digitizes and Report – Analysis Part – VA6, ETSI be performed by specialized processors translates it. The signal then emerges at

François Lefebvre joined the Broadcast Technologies research branch at the Communications Research Centre, Canada, in 1999 to lead its Mobile Multimedia Broadcasting team. Since then, he has contributed to numerous national and international standardization efforts and R&D projects. His recent work has focused on creating and developing open software building blocks for next-generation mobile broadcasting networks, devices and applications.

Mr Lefebvre graduated from Laval University in Electrical Engineering where he also completed his M.A.Sc. in 1989. He then moved to Europe where he worked for ten years as an engineer in R&D laboratories and as a freelancer on several multimedia and Internet projects in Germany.

Jean-Michel Bouffard graduated in Computer Engineering (B.A.Sc.) in 2003 from Sherbrooke University, Canada. He then joined the Broadcast Technologies research branch at the Communications Research Centre, in Ottawa, where he is involved in projects related to mobile multimedia broadcasting systems.

Mr. Bouffard’s expertise in multimedia communications and his interest in convergence led to studies about the applications of the participatory Web concepts to broadcasting and mobility. In the same vein, his current occupation focuses on enabling and promoting broadcasting on open mobile devices. He is also completing his M.A.Sc in Systems and Computer Engineering at Carleton University.

Pascal Charest graduated in Computer Engineering (B.A.Sc.) in December 2000 from Laval University, Quebec City, Canada. He then joined the Broadcast Technologies research branch at the Communications Research Centre, in Ottawa, as a Research Engineer. He has been involved in several research projects in the area of mobile multimedia broadcasting.

Mr Charest’s recent work has focused on system building blocks and open source software, with emphasis on encoding, decoding, modulation, device control and system integration. In addition, he is a member of the Club Linux Gatineau where he has served both as the Vice-President and President in the past.

24 EBU TECHNICAL REVIEW – 2008 OPEN SOURCE HANDHELDS

Figure 7 Typical hardware and software components of a contemporary handheld

2008 – EBU TECHNICAL REVIEW 25 OPEN SOURCE HANDHELDS

an intermediate frequency or directly at for pay services. Fortunately, since this between the layers and some software baseband. At this point, a DSP performs is not a requirement for FTA services, building blocks (SSBs) could actually the demodulation to produce a bitstream the design of broadcaster-led handhelds overlap two or three of those layers. A that can be processed by the AP. is simplified. complete im-plementation, including all SBBs required to operate a device, is Media processing normally requires some The rapid evolution of APs makes us often referred to as a software stack or a degree of powerful yet efficient processing believe that in the foreseeable future, distribution. It provides the device with and is performed using DSPs. Most software defined radios (SDRs) will all of the basic software needed to operate. current systems include specialized media perform the demodulation of broadcast processing units to decode media streams and other signals in real-time, with In the software stack described above, such as H.264 video and AAC audio. versatile wideband front-ends and A/D the lower layer components provide their converters on the AP itself. “Application Programming Interfaces” Another typical hardware component (APIs) to the upper layer components. that needs consideration in the design Three main levels of software are also Device drivers present the APIs of HBBs of a handheld is the conditional access depicted in Fig. 7: the operating system to upper software components. HBB. This function usually relies on (OS), the middleware and the application hardware to perform access management layer. There is no clear separation First published: 2008-Q4

26 EBU TECHNICAL REVIEW – 2008 VIDEO COMPRESSION

SVC – a highly-scalable version of H.264/AVC

Adi Kouadio EBU Technical

Maryline Clare and Ludovic Noblet Orange Labs, France Telecom R&D

Vincent Bottreau Thomson Corporate Research

Scalable Video Coding (SVC) is a recent amendment to the ISO/ITU Advanced Video Coding (H.264/AVC) standard, which provides optional but efficient scalability functionalities on top of the high coding efficiency of H.264/AVC. In addition to bringing a cost-efficient solution to the delivery of different formats of the same content to multiple users, it can be used to provide a better viewing experience (enhanced content portability, device power / content-quality adaptation, fast zapping times and fluid forward / rewind functions, efficient error retransmission, etc.).

This article describes the potential of SVC, in terms of applications and performance. A brief overview of SVC functionalities, as well as practical use cases, are given in the following sections. Different performance evaluations, based on test results, are also described.

1. Introduction All these new video applications Communication) and outside of the are becoming a reality, thanks to home (3G, 4G, WiMax). Providing content everywhere is a major developments in transmission, storage goal for video service providers. In and compression technologies. Another Nevertheless, providing content addition to legacy broadcast TV, consumer enabler for this diversity of services everywhere in such an environment, while video applications today span: is the strong penetration of end-user achieving cost efficiency, is still a challenge devices such as HDTV flat-panel for video service providers. Enabling such l IPTV (over managed networks and the displays, portable multimedia players services implies the implementation of open Internet); (PMPs) and 3G mobile devices – and content-repurposing bricks in the service l catch-up TV; the availability of broadband Internet architecture, for transcoding between: l Video-on-Demand services (VoD); access ... providing high bandwidth l Mobile TV; connectivity into the home (xDSL, l multiple image formats (QCIF, CIF, l Web 2.0 content (including user- FTTH), within the home, between QVGA, VGA, SD, HD); generated) and media platforms, etc. devices (Ethernet, Wi-Fi, Power-Line l multiple bitrates (variable or constant

2008 – EBU TECHNICAL REVIEW 27 VIDEO COMPRESSION

according to the access networks); Scalable Video Coding provides the standardization bodies (MPEG, JVT) to l multiple frame rates (50 Hz, 25 Hz, appropriate tools to efficiently implement extend the standard and provide more 12.5 Hz) and; content scalability and portability. It is clarity on SVC performances. l different delivery platforms (with the latest scalable video-coding solution, different coding schemes). and has been standardized recently as 2. SVC overview an amendment to the now well-known Repurposing (transcoding) the content and widespread H.264/AVC standard [1] Scalable coding consists of compressing at any point in the chain generates extra by the Joint Video Team1 (JVT). Other a digital video into a single bitstream in cost, either for the service provider, video scalability techniques have been such a way that other meaningful and the consumer or the network operator. proposed in the past, (even standardized consistent streams can be generated by It also alters the user experience by as optional modes for MPEG-2 [3] and discarding parts of the original compressed being an additional hurdle to content MPEG-4 - Part 2 [4]) but they were less stream. Those sub-streams can be directly portability, which does not necessarily efficient and more complex; moreover, interpreted at different bitrates, different preserve the included DRM (Digital because of the (then) lack of market need resolutions or different time scales. Rights Management). Other alternatives, for scalability (video services limited to such as content simulcasting, would result standard-definition broadcasts), they were SVC organizes the compressed file into in higher bitrate requirements. never used. a base layer (BL) that is H264/AVC encoded, and enhancement layers (EL) Fortunately, while many different video In the following sections, we will first that bring additional information about codecs were used in the past (depending give a brief technical overview of SVC quality, resolution or (see on the targeted device for a given service), functionalities. The second part will Fig. 1). This implies that SVC base-layer there is today a general trend towards outline different practical use-cases streams can be decoded by H.264/AVC H264/AVC [1]. This codec has not only of the standard while the third part products (set-top boxes, PMPs), thus been widely implemented in set-top will describe preliminary performance ensuring backward compatibility for boxes, it is also soon to be generalized evaluation results. The fourth and last consumers not having the SVC upgrade. in mobile devices as well as in Portable part will describe ongoing work and More information about H264/AVC can Media Players (PMPs); it has even been the action taken by the EBU and other be found in [1]. introduced recently in the Adobe Flash 9 and Apple QuickTime media players.

With H264/AVC generalization, transcoding will soon be limited to format and bitrate adaptation, and there will be less need for several output codecs. This is a key point in considering the SVC scalable extension of H264/AVC [2].

Considering the continuously increasing number of possible combinations between formats and bitrates, smart content adaptation today becomes a key issue for achieving the “content everywhere” target.

Because of all these considerations, scalability and flexibility are key points for the near future of video services, whether these are new services or the evolution of existing services. Such scalability is needed not only at the architecture and infrastructure levels, but also at the Figure 1 content level. Overview of SVC layer structure (EL = enhancement layer)

1) The Joint Video Team is a joint working group from the ITU VCEG group and the ISO/IEC MPEG group at the origin of the H.264/AVC standard.

28 EBU TECHNICAL REVIEW – 2008 VIDEO COMPRESSION

SVC provides spatial, quality and temporal scalability types (see Section Source video Multi-target compressed video 2.1) that can be combined at each level. The enhancement layers can be fully hierarchical, or not: SVC encoder l a layer can be generated (“predicted”) from another layer and also be a prediction for yet another one; l or a layer can be a prediction for two layers that are not hierarchically inter- 2 SVC enhancement layers dependent. AVC-compatible AVC-compatible base layer base layer Such encoding parameters depend on the targeted application. 1 SVC enhancement layer AVC-compatible base layer A compressed video bitstream is made up of Network Abstraction Layer (NAL) units, and each enhancement layer corresponds to a set of identified NAL units. A NAL unit is a packet with a header of a few bytes (containing Figure 2 information about the payload) and a Spacial scalability payload corresponding to the compressed information. A set of successive NAL units, sharing the same properties, forms The SVC standard’s ability to embed 2.1.2. Temporal scalability a NAL access unit. 4:3 and 16:9 picture aspect ratios is, for example, a very important spatial Temporal scalability defines the difference Depending on the context, the en- scalability feature, typically when in the number of images per second hancement layers may (or may not) be considering SD/HD broadcast. It should (expressed in Hz). Typical frequencies in transmitted by the network, and may be noted that, depending on the standard Europe are 50 Hz, 25 Hz or 12.5 Hz. (or may not) be decoded by the end user profile in use, the ratio between layers can device. In the first case, the network be fixed to a restricted set of values (see SVC extends the tools already provided integrates some adaptation modules – Section 2.2). by H264/AVC (hierarchical P or B slices deciding what to transmit and what to filter coding), structuring the bitstream into a (for instance, depending on the network Spatial scalability is provided by hierarchy of images, thus allowing the bandwidth characteristics). In the latter filtering / upsampling mechanisms and easy removal of the lower level(s) in the case, the terminal extracts the layers it can inter-layer predictions (Motion data hierarchical description. exploit. Such an adaptation mechanism prediction, intra-texture prediction and is based on packet selection / dropping. residual signal prediction). Each spatial Temporal scalability can typically be enhancement layer (EL) is referred to as used in video transmissions over mobile Setting up a service based on SVC a dependency representation. A predicted networks where bandwidth capacity can technology implies two important EL always indicates the reference layer change very often, or if the target terminal considerations, decision and adaptation, representation where it was originally has very low CPU capacities: in such cases, which are further discussed in Section 2.3. predicted. EL macroblocks are predicted it is interesting to drop the enhancement from reference layer macroblocks. They layers and send only the base layer (which 2.1. Scalability inherit motion vector values and other could, for example, contain only half the prediction data (texture and residual) number of images per second). 2.1.1. Spatial scalability from the appropriate reference layer macroblocks, after normative scaling and 2.1.3. Quality scalability Spatial resolution gives the video horizontal merging processes. and vertical dimensions in pixels, resulting Quality scalability is often referred to as in several well-known “video formats” Spatial scalability can typically be used for SNR (Signal-to-Noise Ratio) scalability such as QCIF (176 x 144 pixels), CIF (352 transmission of the same video bitstream and is intended to give different levels x 288), SD (720 x 576) and HD (1280 x to PCs and portable devices (see Fig. 2), of detail and fidelity to the original 720 up to 1920 x 1080). or to SD and HD television sets. video, while having the same spatial

2008 – EBU TECHNICAL REVIEW 29 VIDEO COMPRESSION

and temporal definitions. In an SVC- compressed bitstream, each spatio- Source video Multi-target compressed video temporal layer can have different levels of quality – each of them bringing additional SVC detail and accuracy. encoder

It is up to the encoding process to decide whether more detail will be added to random parts of the video images, or to some specific parts of given images. Differences in quality levels can be me- Client in mobility, Reached a WiFI dium (MGS, Medium Grain Scalability) 3G netw ork or large (CGS, Coarse Grain Scalability). access point CGS provides a quality difference of about AVC-compatible 1 SVC enhancement layer 25% between two layers while MGS base layer AVC-compatible base layer offers 10%. MGS uses a modified high- level signalling, which allows a switching between different MGS layers in any access unit, and the so-called key picture concept, which allows the adjustment of a suitable trade-off between drift and enhancement layer coding efficiency for Figure 3 hierarchical prediction structures. Quality scalability

With the MGS concept, any enhancement 2.1.4. Interlaced and entropy coding, etc.) that can be used to layer NAL unit can be discarded from a progressive scalability build up the stream, while levels specify quality-scalable bitstream, and thus packet- the constraints on key coding parameters, based quality-scalable coding is provided. SVC inherits interlace tools from the such as the number of macroblocks, the The possibility of very fine granularity H.264/AVC system: Paff (Picture adaptive bitrate. (FGS, Fine Grain Scalability), resulting frame/field) and Mbaff (Macroblock in bitstreams that can be truncated adaptive frame/field). SVC specifies four The SVC extension specifies three new anywhere, has been considered during types of scanning-mode scalability – with scalable profiles, which are closely related the joint MPEG/ITU standardization some subjected to coding restrictions, to H.264/AVC profiles: process, but was not finally selected – as depending on the interlaced coding mode the finer the quality scalability is, the of the base and enhancement layers: l Scalable Baseline Profile: aimed at low more complex the encoding/decoding complexity applications; process. Alternatively, the MGS concept l progressive to interlaced; l Scalable High Profile: aimed at allows the EL transform coefficients to be l interlaced to interlaced; broadcasting and video storage distributed among several slices. Thus, l interlaced to progressive; applications; the information for a quality refinement l progressive to progressive. l Scalable High Intra Profile: aimed at picture that corresponds to a certain professional applications. quantization step size can be distributed More information on scanning-mode over several NAL units corresponding to scalability can be found in [5]. different quality refinement layers. The levels are the same as the H.264/ AVC levels. However, the characteristic Quality scalability can typically be used 2.2. Profiles and Levels number of macro-blocks per second in for: an SVC stream is calculated according to Profiles define the set of coding tools the number of layers in the stream (see l HD transmission to customers that (for example, arithmetic or run length formula below). are eligible for HD (full quality) and people not eligible for HD quality, but still equipped with HD screens (top enhancement layer is dropped); l or for extra refinements when the bandwidth increases in mobile environments (see Fig. 3).

30 EBU TECHNICAL REVIEW – 2008 VIDEO COMPRESSION

Tools\Profiles Scalable Baseline Scalable High Scalable High Intra Base Layer profile AVC baseline AVC High AVC High intra CABAC and 8x8 transform Only for certain levels Yes Yes Layer spatial ratio 1:1 , 1.5:1 or 2:1 Unrestricted Unrestricted I, P and B slices Limited B slices All Only I slices Interlaced Tools No Yes Yes (Mbaff & Paff) Table 1 SVC profiles – main differences

2.3. Overview of a service Source Reception architecture with SVC application application Once an SVC bitstream is generated, its scalability properties enable it to best D ecision & match the transmission conditions over a network path to a given location and Adaptation end-user device. One or more adaptation mechanisms can be implemented somewhere in the content delivery C ustom er N etw ork network – between the video streaming statistics source and the end-user device. Such parameters an adaptation process needs to obey Service a decision mechanism, based on the constraints complete context at the time the service is offered (terminal properties, subscription description flow content flow characteristics, available bandwidth, error rate, DRM, etc.). Typically, in an IMS/ Figure 4 TISPAN environment, it shall be noticed Bitstream adaptation that previously-mentioned contextual information can be processed in very close unicast session, or by using scalability present a few that are consensually relationship with not only access-control- properties for providing a better trick considered as potentially providing related functions of IMS/TISPAN, but also mode user experience; much benefit. with user-descriptive data. l At a network node – by reorganizing the video units to allow video 3.1. Quality of Service Depending on the application, the decision streaming over a sub-network with and adaptation processes might not be different characteristics; and enhanced user implemented in the same equipment. l At the edge – where a DSLAM can experience Data flows in decision and adaptation dynamically select video units (i.e. mechanisms are illustrated in Fig. 4. Such packets) for QoS, channel change, or 3.1.1. Fast zapping, fluid a decision can be static (made once only, at just eligibility management; forward and rewind the start of the transmission) or dynamic, l At the home gateway – for adaptively and can be implemented in different parts taking into account home networking The goal here is to improve the customer of the service architecture, for example: conditions. experience when using functions such as channel change or video fast forward / l At encoding – by splitting the video 3. Envisaged use cases rewind. If the current video is displayed units into different bitrate levels, all at a given bitrate and is decomposed on, assigned a different output stream, As mentioned earlier, the current at least, a base layer and an enhancement and sent to user groups having audiovisual landscape and its layer, then switching to the lowest layer heterogeneous network / terminal permanent evolution generate a strong allows us to either quickly visualize the capacities; need for scalability. We have already next channel (TV channel switch) or have l At a VoD server – by adapting the identified a few use cases where SVC a more fluid fast forward mechanism (VoD video transmission bitrate in a would be of some benefit; now we function).

2008 – EBU TECHNICAL REVIEW 31 VIDEO COMPRESSION

This is explained by the fact that less information is needed to describe lower- fast forward without SVC: n images/second at full resolution layer images, so the available bandwidth is used to transmit more, but smaller, images (see Fig. 5). Of course, the new decoded stream (a new channel in the case of a switch, or further sequences of the video in the case of a fast forward) offers a lower quality until both layers can finally be sent ... lower resolution images are, in a first phase, received and enlarged artificially to fit to the screen dimensions. It is, however, demonstrated that the human with SVC : n*4 images/second, at lower resolution: visual system needs a couple of seconds fast forward is more fluid before it is really sensitive to quality, so having a black image (channel switch) or => HD images not transmitted because of lack of bandwidth a non-fluid forward is more annoying than quickly seeing information – even if this leads to a poor quality image for up Figure 5 to two seconds. SVC for faster forward at lower resolution

Abbreviations

3GPP 3rd Generation Partnership FGS Fine Grain Scalability PLT Power-Line Transmission/ Project FTTH Fibre To The Home Telecommunication, also ATSC Advanced Television GoP Group of Pictures written PLC, BPL ... Systems Committee (USA) HDTV High-Definition elevisionT PMP Portable Multimedia Player AVC (MPEG-4) Advanced Video HHI Heinrich Hertz Institut PoP Point Of Presence Coding, part 10 (aka (German R&D lab) PSNR Peak Signal-to-Noise Ratio H.264) IC Integrated Circuit QCIF Quarter Common BL Base Layer IMS IP Multimedia Subsystem Intermediate Format CABAC Context-Adaptive Binary IPTV Internet Protocol (176x144 pixels) Arithmetic Coding Televison RWTH Rheinisch-Westfälische CAS Conditional Access System JSVM Joint Software Technische Hochschule CDN Content Delivery Network Verification Model SAMVIQ Subjective Assessment CGS Coarse Grain Scalability JVT (MPEG/VCEG) Joint Video Methodology for Video CIF Common Interchange Team Quality Format (352x288 pixels) LCD Liquid Crystal Display SDTV Standard-Definition DLNA Digital Living Network Mbaff Macroblock adaptive Television Alliance www.dlna.org frame/field SNR Signal-to-Noise Ratio DMB Digital Multimedia MGS Medium Grain Scalability SSIM Structural Similarity Metric Broadcasting MOS Mean Opinion Score STB Set-Top Box www.t-dmb.org NAB National Association of SVC (MPEG-4) Scalable Video DRM Digital Rights Management Broadcasters (USA) Coding DSLAM Digital Subscriber Line www.nab.org SWOT Strengths, Weaknesses, Access Multiplexer NAL Network Abstraction Opportunities, Threats DSP Digital Signal Processor / Layer TISPAN Telecoms & Internet Processing Paff Picture adaptive frame/ converged Services and DVB Digital Video Broadcasting field Protocols for Advanced www.dvb.org PLC Power-Line Networks EL Enhancement Layer Communication, also WG-IPTV Walled Garden IPTV written PLT, BPL ...

32 EBU TECHNICAL REVIEW – 2008 VIDEO COMPRESSION

3.1.2. Integration of retransmissions at constant bitrate

In the case of frequent transmission errors, different mechanisms are set up to correct them and improve the quality (UURUVGHWHFWHGEDVHHQKDQFHPHQW 1RHUURUVGHWHFWHGERWK of service. One of these mechanisms DIIHFWHGE\SDFNHWORVV (UURUVGHWHFWHG EDVHDQGHQKDQFHPHQW EDVHUHWUDQVPLWWHGSDFNHWV consists of retransmitting those packets OD\HUVUHFHLYHG RIEDVHLQIRUPDWLRQ 14 14 B itrate B itrate that never reached the terminal side. Stream Bitrate Stream Bitrate However, such mechanisms either require R etries R etries 12 12 Congestion extra bandwidth or they slow down No Congestion the transmission rate because of the 10 Maxim um B andwidth 10 Maxim um B andwidth information overhead. 8 8

An interesting feature of SVC is the 6 6 ability to use the enhancement layer as Stream and Retries 4 4 a retransmission layer: in the case of are adapted errors, retransmission packets will be 2 Retransmissions without 2 R etransm issions with SVC : SVC : congestion no extra bandwidth needed inserted in such layers, thus providing Tim e Tim e 0 0 error correction at constant bitrate (see Fig. 6). The only price to pay is Figure 6 to accept a loss of quality since less SVC for integrating retransmissions at constant bit-rate enhancement data is then received – some enhancement data being replaced by resent base-layer data. Such a available bandwidth/ mechanism guarantees reception of a quality minimum level of information quality, by providing maximal protection to this information.

tim e 3.2. Continuity of service Figure 7 in mobile environment SVC for maintaining service in mobility

Mobile reception cannot rely on a stable bandwidth (it can drop when the 3.3. Portability for Video An immediate illustration of this network cell is overloaded or when the : Open characteristic is the naturally-convergent customer experiences a hand-over), but video platform called the Internet ! You customers are highly annoyed by service Internet illustration can access its portals from PCs, but also ruptures. The advantage of having a video now (and soon even more) from mobile decomposed into layers is that they can SVC offers an extremely simple way phones. As shown in Fig. 8, this results in simply be dropped and later reinserted, of using the “same video” source on an obvious need to watch given content on depending on the available bandwidth different terminals. No operation such a platform that is not decided in advance (see Fig. 7). as transcoding is necessary, so there is (i.e. at downloading time). no extra computing time and no loss of Such capacity is not only used in the case quality. This becomes extremely useful 3.4. Cost optimization of difficult transmission over a single when you don’t know in advance where network, it can also be used for the same and how you will finally use the video for on-demand long tail video transmitted through different types content you are interested in. SVC allows management of networks and towards different types you to buy the content, start watching it on of terminals: the same video is sent with a given terminal and finish watching it on As on-demand catalogues become larger a base layer only to mobile phones, but another. A very important consequence and larger in terms of available titles / PCs connected via 3G cards to the same for content owners is the fact that DRM is references, the costs for preparing network can receive the full video quality preserved, which is an essential advantage and repurposing such content become (base + enhancement). when compared to transcoding. higher and higher, even when using

2008 – EBU TECHNICAL REVIEW 33 VIDEO COMPRESSION

3.5. Fine adaptation to xDSL residential Get//use PC version eligibility from/for TV Transfer to G et TV m obile version 3.5.1. Optimal quality for a given eligibility level: illustration with HD screen without HD eligibility

G et PC G et m obile version version SVC helps here to define intermediate ADSL eligibility levels, to which specific quality can be offered. This can be used, for instance, to provide good SD programme quality to customers who cannot reach HD bitrates, but who can still have a much better picture quality than the one provided by the SD eligibility threshold. Extract mobile version from PC version The programme can be SVC-encoded, Figure 8 with an H264/AVC SD base layer, plus SVC for easy content portability management a first quality and / or resolution en- hancement layer – at a bitrate that is halfway between SD and HD bitrates automated workflows. Moreover, Content preparation and checking are – and a second enhancement layer for this cost inflation is also related to important costs within this process. HD customers (see Fig. 9). This would the target devices and bitrates (access Multiplying the instances to be generated, help improve the bad quality noticed by networks). in order to address multiple devices through customers viewing programmes on their multiple access networks, increases the HD screens if the quality of the video they While there are known solutions to overall on-demand processing costs, receive is not really HD. the cost model for the most popular especially for the long-tail content – even content, the question of efficiently with automated workflows. Such a feature can also apply to FTTH amortizing the preparation costs of the deployments. There will indeed be “long tail” (less popular content) still We believe that using SVC instead of a period of time when FTTH is not remains, even though the associated multiple H.264/AVC instances can available everywhere. SVC can provide a owners’ rights are less important than reduce the and operational natural way to deliver the same content at for block-busters. expenditures for the aforementioned different bitrates, qualities and resolutions, content processing. so when FTTH is present, an ultimate The above-mentioned costs correspond enhancement layer can be dedicated to to the following tasks, needed for the Moreover, we believe that SVC also the FTTH bandwidth capacities, thus general process of providing on-demand allows more efficient schemes for content allowing premium HD content to homes content: management within a content delivery that can receive it. network (CDN). Actually, it allows us l content capture (from tape, turn- to use access-network characteristics 3.5.2. Mutual dynamic access around, file, etc.) and editing may to manage the content distribution to bandwidth inside a home require indexation means as well as down to the edge. In that context, the metadata extraction / generation; combination of spatial (multiple devices) SVC SNR scalability provides an efficient l content preparation (encoding and SNR scalabilities (access-network means for achieving a better simultaneous and transcoding) and metadata capabilities) are probably the most Walled Garden-IPTV (WG-IPTV) and processing; anticipated features. Internet experience for the end user by l content integrity checking; adapting the Walled Garden IPTV video l content packaging (including Finally, storage costs are lowered because bitrate, depending on the bandwidth protection); the SVC global version is leaner than necessary for achieving the correct user l content provisioning; the cumulated AVC files of embedded experience on the Internet side within l content ingest. versions. reasonable limits. Moreover, if the

34 EBU TECHNICAL REVIEW – 2008 VIDEO COMPRESSION

: Distances to DSLAM who do not have SVC functionality to => different eligibility still see their usual programmes (SDTV => best matching quality level or HDTV [1080i/25 or 720p/50]) if the latter is provided as the base layer of the SVC stream. SVC-compatible user will be able to access, for example, higher quality (1080p/50) signals stored in the enhancement layer as a premium HD non eligible, yet HD screens => 1st enhancement layer with service. resolution improvement HD eligible zone, HD It is worth mentioning that the support of screens : => 2 nd enhancement interlaced/progressive is very important layer with better for the broadcasting industry in order DSLAM quality to enable a smooth (for consumers) and efficient (for broadcasters) transition from an SD to an HD (hopefully completely progressive) audiovisual landscape. SD eligible zone only => AVC base layer, SD resolution 3.6. Other use cases

Figure 9 Even though they appeared to us as being SVC for fine eligibility level management of lower priority at the moment, other identified use cases should also be further investigated. internet video is encoded with SVC SNR between both programmes, providing scalability, it can also be adapted so that optimal trade-off in quality to each, l Premium content incentive (“teasing”): the impact over the WG-IPTV stream is depending on their properties and on the show free content that is missing not perceptible (including with fade-in/ target terminal characteristics. essential information (e.g. players fade-out mechanisms at the transitions). in a soccer game) but introduce this This adaptation can be either dynamically 3.5.3. Vector for efficient information as soon as content is paid managed on the network side, or under Premium HDTV Services for. full user control. With single-layer enabling l P2P streaming: introduce SVC as a video technologies, the implementation tool to take advantage of information of such dynamic mechanisms would The consumer adoption of H.264/AVC multiple-source distribution. require a transrating operation to be set-top boxes is still in progress. Bringing l User generated content: typical performed somewhere – and probably a completely new scalable system to content accessed by heterogeneous at the edge – which obviously does not market would imply a new set of products terminals and through different seem to be interesting in terms of network and interoperability issues with existing networks, which could make the most architecture and density for massive systems. Simulcasting several streams of SVC. processing and cost-efficiency. would require more bandwidth than using l Provisioning and video preparation: SVC on its own and would also require the how to analyse, index, pre-process, An extension of this use case is bandwidth user to tune in to the channel that provides check and assign DRMs to a single dynamic adaptation, for example if a the required quality/resolution. video within its different layers. second TV set (SDTV) is turned on when l Video mail server optimization: allow another (HDTV) is already being used. SVC base layer compatibility with access to only the lower version of In this case, bandwidth can be shared H.264/AVC would enable consumers the video when transmitted through

Definitions

720p/50 High-definition progressively-scanned TV format of 1280 x 720 pixels at 50 frames per second 1080i/25 High-definition interlaced TV format of 1920 x 1080 pixels at 25 frames per second, or 50 fields (half frames) every second 1080p/50 High-definition progressively-scanned TV format of 1920 x 1080 pixels at 50 frames per second

2008 – EBU TECHNICAL REVIEW 35 VIDEO COMPRESSION

a mobile network, and full resolution more levels included, the better adaptation/extraction to be performed when viewed via a residential access, gain compared to separate on it, somewhere between the stream or appropriate intermediate versions compressions, but the more production area and the end device. according to network and terminal diverse the spatial layers, and l For VoD services, we can compare conditions (SVC saves storage of extra information is required. SVC streams with the sum of the intermediate versions). encoded stored files (storage of the l H andheld terminal battery Assessing a new technology means yellow-coloured streams compared optimization: switch to lower defining tests to evaluate its performance to storage of the light blue-coloured resolution if autonomy (battery life) against alternative existing solutions in stream, noting that the indexing is is lower than a predefined threshold the same context. Defining SVC tests easier in the “multiple-versions-in-a- (and inform the customer that he or is not that easy because it is not a new single-file” solution). she still has a given number of minutes compression standard that you can l For content portability, we can left for visualization at the current compare to another older one, but a new compare the SVC capability with the resolution, and a bigger number of way to compress multiple representations transcoding applied to a first encoded minutes at a lower resolution). of the same information. Next, once the version of the video. l Heterogeneous terminals and access target application is chosen, we need networks for videoconferences, to define the embedded representation SVC performance evaluations on e-learning and video surveillance: SVC of the information, i.e. the different particular use cases have been conducted allows us not to impose the lowest ways the video can be exploited: spatial by different research labs, industries, and network and terminal capacities on enhancements (at which resolutions?), the JVT to quantify SVC performances the rest of the participants. quality enhancements (up to which both objectively (metric-based) and l Efficient signal monitoring on quality?), temporal enhancements (which subjectively (visual quality). A summary contribution links by decoding only a frequencies?) ... or a mix of all these of those test results is described in the low resolution instance of the signal enhancements? following paragraphs. instead of decoding the full video stream. All of this implies that the comparison 4.1. Performance depends on the targeted application: evaluations 4. Technology evaluation l For multicast service environments, we can compare an SVC stream 4.1.1. Visual quality The MPEG committee defined a transmission to the sum of the assessment test by Orange “requirement” document for SVC prior simulcasted information encoded with Labs to the actual standardization process, in H264/AVC as illustrated in Fig. 10. which the most important requirements Simultaneous transmission of the Orange decided to state the criteria that were: yellow-coloured streams (on the left were essential for current audiovisual of the diagram) has to be compared services and to evaluate difficult situations l To provide a standard compatible with with transmission of the unique in order to have a low anchor and the state-of-the-art, i.e. H264/AVC. blue-coloured stream on the right – recognize that only better results can be This point has been fully addressed keeping in mind that it requires stream expected in the near future. since each SVC file base layer is H264/ AVC encoded; l To compress, in a single stream, different versions (e.g. different SD resolutions or different qualities) of the same video: – more efficiently than if the SD different versions are separately +

H264/AVC-encoded, then used HD HD together (Simulcast) – this point is addressed; – with 10% maximum of additional 1 unique bitstream : information compared with an 2 separate AVC SVC bitstreams : multiple data amount for same H264/AVC encoded stream of instances information maximum resolution/quality. This point depends on the use Figure 10 case scenarios,. Basically, the H264/AVC single layer vs. SVC single stream

36 EBU TECHNICAL REVIEW – 2008 VIDEO COMPRESSION

Orange chose the “ADSL constant eligibility They used MPEG-ITU JSVM (Joint visual quality than the single-layer level” criteria as a mandatory point for Software Verification Model) to generate HD AVC stream for bitrates greater introducing new technologies; in other both the H264/AVC and SVC files. than 7 Mbit/s. Transmitting both SD words, Orange decided to evaluate what it and HD as single-layer AVC would would mean to provide SVC-encoded TV The visual quality assessment was require a higher bitrate (the sum of and video at the current service bitrates. performed on a 46-inch LCD display, the respective streams’ bitrates) than Orange then had to compare a single layer following the SAMVIQ (Subjective the SVC stream. H264/AVC encoded-decoded HD file Assessment Methodology for Video For an HD AVC single-layer bitrate (bottom left yellow file of Fig. 10) with Quality) method [6][7]. of less than 7 Mbit/s, SVC needs, for the completely encoded-decoded HD SVC providing a similar visual quality, an file containing an embedded SD base layer Fig. 11 and Fig. 12 show both sets of test extra bitrate corresponding to the SD (blue file of Fig. 10). results. AVC single-layer one. l In critical cases, the loss between It is important to notice that, unlike Several conclusions can be drawn from H264/AVC single representation and evaluating the “classical” compression these tests: SVC maximum representation can method where we compare the bitrate reach up to 10 MOS (Mean Opinion difference for a given fixed quality, here l The SVC stream (with embedded SD Score) points, which is a noticeable the quality loss at constant bitrate is being and HD) provides similar to better difference. evaluated. Beware, it doesn’t mean that SVC decreases the quality when compared 100 to previous methods (otherwise, what is the point of considering it to replace previous methods!). It means that this is 80 an evaluation of the side effects caused by scalability: the extra cost required for the “maximum” version (no attention being 60

given to the fact that “minimum” versions Fair Good Excellent are then intrinsically transmitted too since Scores 40 they are embedded in the file) compared

to the way that such a version is encoded Poor with other methods. Subjective quality scores 720p50_AVC 20 for all the scenes 720p50_SVC Explicit Ref. Tests have been performed at different Bad Hidden Ref. bitrates: 12 Mbit/s, 10 Mbit/s, 8 Mbit/s, 0 6 Mbit/s and 4 Mbit/s. 3 4 5 6 7 8 9 10 11 12 13 Bitrate (Mbit/s) For testing bitrate n, an H264/AVC file was encoded completely at n Mbit/s, and 100 the SVC file had a base layer for SD always encoded at 2 Mbit/s, plus an enhancement 80 layer encoded at n-2 Mbit/s. Scores More precisely, the tested scenarios 60 were: Fair Good Excellent a) H264/AVC (720p/50) versus SVC 40 (576i/25, 720p/50) b) H264/AVC (1440x1080i/25 2) versus Poor Subjective quality scores SVC (576i/25, 1440x1080i/25) 20 1080i_AVC for all the scenes 1080i_SVC Explicit Ref. Bad These figures are summarized in Table 2. Hidden Ref. 0 3 4 5 6 7 8 9 10 11 Bitrate (Mbit/s)

2) 1440 samples per line are subsampled Figure 11 (upper) Figure 12 (lower) from a 1920 samples/line source signal. Results for tests A Results for tests B

2008 – EBU TECHNICAL REVIEW 37 VIDEO COMPRESSION

H264/AVC SVC bitrate (Mbit/s) bitrate (Mbit/s) Base Enhancement Global

4 2 2 4 6 2 4 6 1280x720p/50 8 2 6 8 10 2 8 10 12 2 10 12 4 2 2 4 1440x1080i/25 6 2 4 6 8 2 6 8 10 2 8 10 Table 2 SVC/AVC bitrate figures for visual comparisons l In non-critical cases, the difference is be decoded by those having an H264/AVC AVC base layer is either to encode it as always noticeable, but not penalizing, currently-deployed decoder and no SVC 720p/50-60 or 1080i/25-30? Thomson since the range of appreciation is decoder. So we simulate here a base that Corporate Research has performed a generally maintained (e.g. “excellent”, fulfils current TV service requirements. first objective assessment using the SVC “good”). reference software and looking at rate l SVC performs better if the base layer is 4.1.2. Performance evaluation distortion curves. of good quality since all enhancement by Thomson’s Corporate predictions rely on it. Research Test conditions l If a sequence is originally interlaced, then SVC performs better with Since January 2005, H.264/AVC-based We have used the JSVM software for a interlaced-to-interlaced scalability HD broadcasting has been rolling out in 2-layer case: than when mixing interlaced with the market with millions of receivers being progressive layers. shipped (DirecTV, Echostar, BSkyB,...). l either 720p50/1080p50: progressive- l If a sequence is originally progressive, The HD formats are either 720p/50-60 to-progressive inter-layer prediction; then SCV performs better with or 1080i/25-30. l or 1080i25/1080p50: interlace-to- progressive-to-progressive scalability progressive inter-layer prediction than when mixing interlaced with For premium services (e.g. Sports), using coded field pictures. progressive layers. broadcasters want to broadcast 1080p/50- 60 while maintaining the existing customer The encoder settings were as follows: Please note that all these results have been base. Broadcasting of 1080p/50-60 l Scalable High profile; obtained with only the publicly-available SVC bitstreams enhances video resolu- l Hierarchical GoP size 4 (IbBbP coding versions of SVC software (MPEG/ITU tion from 720p/50-60 or 1080i/25-30. structure); JSVM). Needless to say, these versions Enhanced set-top boxes could be provided l Intra period = 32; are not as optimized as future in-dustrial to premium customers, with additional l 100 first frames; implementations will surely be when 1080p/50-60 SVC (and H.264/AVC) l H.264/AVC base layers encoded at available. capability – without the need to exchange fixed (constant) bitrates (CBR) recently shipped set-top boxes. – R0 = 4 Mbit/s; The tests were designed with the very – R1 = 6 Mbit/s; precise goal of identifying the impact on a) 720p/1080p versus 1080i/1080p – R2 = 8 Mbit/s; quality when replacing H264/AVC with comparison – R3 = 10 Mbit/s; SVC at constant eligibility (i.e. constant l For each rate, SVC enhancement layers bitrate, CBR) for current services. This Today, broadcasters are mainly using were encoded using quantization constant bitrate affects the highest level, 1080i/25-30 for HDTV. Tomorrow, parameter QpEL = QpBL + {0, 4, 8, here HD resolution, since we impose source capture will be more and more 12}. that the SD base layer within the SVC 1080p/50-60. SVC enables a potential file is encoded at today’s SD eligibility migration towards 1080p/50-60 using Results level, i.e. 2 Mbit/s. This is indeed the a backward-compatible solution with way to ensure compatibility with existing H264/AVC (base layer). The main Some results of these tests are given in Figs services because the SD layer can simply remaining question for the H.264/ 13 to 18 on the next two pages.

38 EBU TECHNICAL REVIEW – 2008 VIDEO COMPRESSION

Figure 13 Results for sequence EBU_dance at rate R0

Figure 14 Results for sequence EBU_dance at rate R3

Figure 15 Results for sequence SVT_CrowdRun at rate R0

2008 – EBU TECHNICAL REVIEW 39 VIDEO COMPRESSION

Figure 16 Results for sequence SVT_CrowdRun at rate R3

Figure 17 Results for sequence SVT_ParkJoy at rate R0

Figure 18 Results for sequence SVT ParkJoy at rate R3

40 EBU TECHNICAL REVIEW – 2008 VIDEO COMPRESSION

Conclusions critical sequences it needs up to 10% more configured to allow read access only, bitrate to achieve similar or even better using the parameters specified in Table 3. Using an H.264/AVC 720p/50 or 1080i/25 quality than the single-layer stream. For Write access to the JSVM software base layer with a 1080p/50 enhancement further information, please refer to the server is restricted to the JSVM software layer provides approximately the same following document [8]. coordinators group. PSNR (Peak Signal to Noise Ratio) results, in terms of compression effi- 4.2. Implementation and 5. Standardization ciency (a difference of less than 0.5 dB). Furthermore, it can be noted that the gain complexity information in bitrate compared to simulcast is shown and confirmed (~30%). At the moment, only the reference As the audiovisual ecosystem has evolved software implementation, JSVM, of the in multiple dimensions (services, terminals, Knowing that PSNR doesn’t always well standard (now in version 9.12) is freely network access) and the services may also correlate with visual perception, further available. Research labs such as HHI or be rendered through retail devices, the subjective visual quality assessments the industry are constantly developing op- need today for interoperability has never should be conducted on a wider set of test timized versions of the encoders. been so critical. Standardization bodies sequences to assess the PSNR results. become interested in a technology when Encoder-complexity evaluations still need many service operators and industrial 4.1.3. Performance evaluation to be done, even if it doesn’t seem be an companies begin to take an interest in it. by JVT issue with the software version. Thus, many standardization bodies and open forums have announced working Early this year, the JVT group released a Additional decoder complexity in groups on SVC: report on SVC performance evaluation comparison with H.264/AVC is believed over a set of typical test cases [8]. The to be limited. Indeed, the SVC design l DVB: DVB is of considerable influence tests included: specifies a single motion compensation in the world of audio/video codecs, loop (by imposing constrained intra and the DVB H264/AVC groups l objective quality evaluation using prediction within reference layers). Thus, have included SVC in their work the PSNR and SSIM (Structutral the overhead in decoder complexity for programmes for mobile broadcast SIMilarity metric) methods; SVC compared to single-layer coding (DVB-H), 1080p and IPTV. The IPTV l subjective quality evaluation using two is smaller than that for prior video- groups are now considering SVC in assessment methods based on ITU-R coding standards, which all require their content-delivery task forces. Recommendation BT.500. [9] multiple motion compensation loops l 3GPP: SVC will be presented for its at the decoder side. Additionally, each impact on architecture. It has to be noted that the different test quality or spatial enhancement layer NAL l Open IPTV Forum: since the first scenarios only considered progressive-to- unit can be parsed independently of the phase relies on currently available progressive image-format scalability, while lower layer NAL units, which may further codecs and is near to being completed, discarding interlaced-to-progressive (SDI help in reducing the decoder complexity. a second phase will be started soon to 720p/50 or 1080p/50), or progressive- and we have offered to include SVC in to-interlace (less probable), or even In order to keep track of the changes working topics for both device codecs interlaced-to-interlaced (SDi to 1080i/25) in software development and to always and IPTV infrastructure. scalability, which might be valuable for the provide an up-to-date version of the JSVM l ITU Focus Group IPTV: liaisons are broadcasting industry. software, a CVS server for the JSVM active with DVB on the codec topics. software has been set up at the Rheinisch- l EBU: the Delivery Management The overall conclusion of the test was that Westfälische Technische Hochschule Committee (DMC) started a project SVC will provide a 17 to 34% bitrate gain (RWTH) in Aachen, Germany. The CVS group D/SVC in June 2008 to compared to simulcast, depending on the server can be accessed using WinCVS investigate the objective and subjective application. On the visual quality side, for or any other CVS client. The server is performance of SVC in broadcasting. The group is open to both EBU authentication: pserver Members and the industry. host address: garcon.ient.rwth-aachen.de path: /cvs/jvt There is still a need to see if: user name: jvtuser password: jvt.Amd.2 l DLNA (Digital Living Network Alliance, module name: jsvm or jsvm_red home network standardization Table 3 activities) would be interested in CVS access parameters including SVC as part of a work item.

2008 – EBU TECHNICAL REVIEW 41 VIDEO COMPRESSION

Adi Kouadio obtained a B.Sc. and M.Sc. in communication systems at the Swiss federal institute of Technology (EPFL) in Lausanne, Switzerland. After working for Fastcom Technology SA on several projects relating to video-based object recognition, he joined the EBU Technical Department in 2007. Here, he is heavily involved in studies and evaluations of compression systems for various broadcasting applications (HDTV production, contribution and distribution).

On behalf of the EBU, Mr Kouadio liaises with the ISO/IEC JTC1/SG26/WG1 (JPEG) and ISO/IEC JTC1/SG26/WG11 (MPEG) working groups.

Maryline Clare graduated from INSA (Institut National des Sciences Appliquées, an engineering school in France) in 1989. She gained 10 years of still-image coding experience at Canon Research France and was heavily involved in the JPEG2000 standardization effort, during which time she was both “Head of the French Delegation” and the “Transform - Quantization - Entropy coding” group co-chair.

Ms Clare has continued working in the advanced video compression domain, first in Canon Research France then in Orange Labs, France Telecom R&D in Rennes. Starting with the AVC standard (Advanced Video Coding, MPEG-4 - Part 10, ITU H.264), she then closely followed its scalable extension SVC (Scalable Video Coding), on which she is currently leading a project in Orange Labs.

Ludovic Noblet received an Electronics and Computing Systems Dipl.-Ing. from the École Polytechnique de l’Université de Nantes, France in 1992. He started his career at Alcatel where he was in charge of introducing internet technologies within the Alcatel private network (Alcanet). He then moved to Thomson Corporate Research in October 1994 where he was involved in designing and contributing to the development and success of three successive generations of MPEG-2 encoders and one generation of MPEG-2 decoder. In 2002, he started working on the very first H.264/MPEG-4 AVC encoding implementations for Thomson’s first generation of SDTV and HDTV AVC encoders.

In 2004, Mr Noblet joined France Telecom as an IPTV architect and senior technical advisor for the introduction of H.264/MPEG-4 AVC and high-definition within the Orange TV service. In September 2006, he was appointed head of the “Advanced Video Compression” team at Orange Labs.

Since December 2004, Mr Noblet has been the Orange representative at DVB, in both commercial and technical ad hoc groups for defining the use of A/V codecs in DVB applications. He also represents Orange at the Open IPTV Forum on the same topics.

Vincent Bottreau received the french state degree of Physics Engineer from the École Nationale Supérieure de Physique Strasbourg (ENSPS), France, in 1999. He also received the DEA (diploma validating the first year of a Ph.D. programme) in Photonic and Image Processing from the Université Louis Pasteur, Strasbourg, France, in 1999.

From 1999 to 2003, Mr Bottreau was with Philips Research in Suresnes, France where he worked as a Research Scientist in the Video and Communication group. In 2003 he has joined IRISA as a Research Expert in the TEMICS team. Since 2005 he has been an R&D engineer at Thomson R&D, France. His research activities include video coding, with a special focus on motion estimation, scalability and transcoding. He is in particularly involved in the ITU and MPEG standardization activities and, more specifically, on SVC.

42 EBU TECHNICAL REVIEW – 2008 VIDEO COMPRESSION

We believe so, since today there is a to SVC (i.e. simulcasting, multiple Video Quality considerable lack of standardization description coding, transcoding). Report by EBU Project Group B/VIM for retail devices. DLNA might be l Reference test sequences and full (Video In Multimedia), May 2003. a good candidate since it deals with source-quality video should be [7] F. Kozamernik, V. Steinman, P. Sunna home networking matters. provided in all common media formats and E. Wyckens: SAMVIQ – A New l TISPAN would need to investigate (CIF, QCIF, SDTV, HDTV, etc.). EBU Methodology for Video Quality if and how SVC may be of impact l Visual assessments should be redone Evaluations in Multimedia over the next generation network when optimized encoders are available IBC 2004, Amsterdam, pp. 191 – architecture it is defining, mostly in because there is obviously room for 202. terms of flows between functional improvement. blocks, not only in the content traffic l Encoder complexity evaluation. [8] N9577: SVC Verification Test plane, but also in the control one. Report l ATSC (US) and DMB (Korea) have 7. References ISO/IEC JTC 1/SC 29/WG 11, Antalya, announced doing some work on SVC, TR – January 2007. or have included requirements for [1] Thomas Wiegand, Gary J. Sullivan, scalability functionalities. Gisle Bjontegaard and Ajay Luthra: [9] ITU-R Recommendation BT.500-11: Overview of the H.264/AVC Video Methodology for the Subjective 6. Conclusions Coding Standard Assessment of the Quality of IEEE Transactions on Circuits and Television Pictures SVC seems to be of interest and very Systems for Video Technology, Vol. Question ITU-R 211/11, Geneva, promising for current and future 13, No. 7, pp. 560-576, July 2003. 2004. audiovisual services, especially in the area of: [2] Heiko Schwarz, Detlev Marpe and [10] TD 531 R1 (PLEN/16), Draft new Thomas Wiegand: Overview of the ITU-T Rec. H.264 (11/2007) | ISO/ l Video-on-Demand, to reduce the Scalable Video Coding Extension IEC 14496-10 (2008) Corrigendum 1: costs otherwise associated with the of the H.264/AVC Standard Advanced video coding for generic generation of multiple formats of the IEEE Transactions on Circuits and audiovisual services: Corrections same video, especially for the long-tail Systems for Video Technology, Special and clarifications management. Issue on Scalable Video Coding, Vol. l Vector for efficient dissemination of 17, No. 9, pp. 1103-1120, September premium HDTV services (1080p/50- 2007, Invited Paper. First published: 2008-Q2 60). l Mobile video transmissions, to ensure [3] ITU-T Recommendation H.262 a better continuity of service. and ISO/IEC 13 818-2 (MPEG-2): l Finer adaptation to home parameters Generic Coding of Moving Pictures such as device characteristics (HD and Associated Audio Information screens) and available network access – Part 2: Video bandwidth. ITU-T and ISO/IEC JTC 1, 1994. l Quality of experience such as channel- switching time, fast forwards and [4] ISO/IEC 14 496-2 (MPEG-4 Visual rewinds. Version 1): Coding of audio-visual l Dynamic bandwidth access objects - Part 2: Visual management. ISO/IEC, Apr. 1999. l Open Internet IPTV applications. [5] Edouard Francois, Jérome Viéron and SVC is a good candidate technology Vincent Bottreau: Interlaced coding for achieving the “content anytime, in SVC anywhere” target. However, further IEEE Transactions on Circuits and studies need to be conducted to further Systems for Video Technology, Vol. assess its efficiency. Among them we 17, No. 9, pp 1136-1148 September include: 2007. l Other visual assessments to provide [6] EBU BPN 056: SAMVIQ – Subjective comparisons with alternative solutions Assessment Methodology for

2008 – EBU TECHNICAL REVIEW 43 HD PRODUCTION CODECS

HDTVproduction codec tests

Massimo Visca (RAI) 1 and Hans Hoffmann (EBU) EBU Project Group P/HDTP

To address the need for more efficient HDTV studio compression systems, vendors have recently introduced new HDTV studio codecs. In 2007, an EBU project group investigated these codecs and this article describes the methodology used for the tests and summarizes the results obtained.

The introduction of HDTV in Europe l Panasonic (AVC-I); 1. Scope of the tests requires that broadcasters renew l Sony (XDCAM HD422). their production equipment. Whilst The EBU codec tests focused on the HDTV equipment in the past has Existing legacy systems were also included image quality which the individual targeted tape-based solutions, the in the evaluations in order to gather compression algorithm would provide user requirements for modern HDTV an understanding of the improvements after multi-generation processing. This production workflows are file-based, achieved by new technology over legacy codec performance is certainly only a part non-linear and non-real-time – with systems. of the overall equipment performance of shared access via networks and servers a recorder/server or camcorder device, – and, last but not least, they have to be All tests on the new codecs were performed camera, etc. But, in particular for HDTV cost-effective. These requirements put with each of the vendor’s products with its inherent demand for high quality challenges on the video compression individually (i.e. non-comparative tests) images, it is a very important parameter. format applied in mainstream HDTV and the test plan and the results obtained The multi-generation codec assessment production equipment – particularly on were discussed between the EBU project – by means of introducing pixel shifts the trade-off between the data rate and group and the individual vendors. after each generation – simulates how video-quality headroom. the production chain affects the images The tests were carried out between spring as a result of multi-compression and To meet these requirements, the industry and August 2007 and were conducted by decompression stages. has offered several codec solutions for several key EBU Members of the P/HDTP mainstream HDTV production and the project. The AVID tests were performed For SDTV, the agreed method of multi- EBU decided to test these codecs in its by the IRT (Germany), the GVG/Thomson generation testing was to visually compare P/HDTP (High Definition Television and Sony tests by the RAI (Italy), and the the 1st, 4th and 7th generations (including Production) project group. New codecs Panasonic tests by the EBU (Switzerland) pixel shifts after each generation) with the were provided for these tests by: with the assistance of RTVE (Spain). A original image under defined conditions representative of the vendor concerned (reference video monitor, particular l AVID (DNxHD); was present at each test and at the viewing environment settings, and expert l GVG/Thomson (Infinity J2K); subsequent expert viewing sessions. viewers). The same method was adapted to the current HDTV codec tests. In addition, an objective measurement – 1) Massimo Visca acted as project coordinator and team leader for the codec tests. the PSNR – was calculated to give some

44 EBU TECHNICAL REVIEW – 2008 HD PRODUCTION CODECS

general trend indication of the multi- the time of writing this article, are now It should be noted that this article generation performance of the individual commercially available. provides only the information about these codecs. algorithms that is necessary for a general 2.1. “Legacy” algorithms understanding of their functioning. The following Tables provide some basic 2. Algorithms under test The main features of the video compression information (bitrate, bit depth, etc.) but, algorithms employed in the “Legacy” for a complete understanding, the reader equipment are summarized in Table 1. should refer to the bibliography and to The framework of the tests was aimed any official information provided by the at investigating the performance of 2.2. “New” algorithms manufacturers. Moreover, it is worth practically all the HDTV compression underlining that whilst these parameters algorithms available on the market or The new HDTV systems – AVID provide some objective information about under development in the manufacturers’ (DNxHD), GVG/Thomson (Infinity the system resources (e.g. the bitrate vs. laboratories. The test plan included J2K), Panasonic (AVC-I) and Sony storage capacity and network bandwidth), both the so-called “Legacy” algorithms, (XDCAM HD422) in alphabetical order their correlation with the available picture applied in the most widely used systems – employ a wide range of compression quality is much more difficult, if not since the start of HDTV production, and algorithms, differing both in terms of impossible, to determine from them. This the “New” algorithms that were planned bitrate and in the mathematical tools is the reason for the large effort expended for market launch in the shorter term used to perform the compression by the P/HDTP group in testing real at the time of testing: some of these, at itself. implementations of the algorithms.

Video Bitrate Bit Subsampling Compression Format SMPTE (Mbit/s) depth standard HDCAM 116.64 8 1440 Y DCT based 1080i/25 SMPTE

480 Cb /Cr (Intra) 1080p/25 367M-368M DVCPRO 100 8 1440 Y DV based 1080i/25 SMPTE

480 Cb /Cr (Intra) 1080p/25 370M-371M 100 8 960 Y DV based 720p/50

480 Cb /Cr HDCAM-SR 440 10 NO MPEG-4 SP 1080i/25 SMPTE (Intra) 1080p/25 409-2005 720p/50 HDCAM@35 35 8 1440 Y MPEG-2 1080i/25 –

720 Cb /Cr (GoP) 1080p/25 4:2:0 Table 1 – Video compression algorithms employed in the “Legacy” equipment

2.2.1. AVID (DNxHD)

Name Video Bitrate Bit Subsampling Compression Format SMPTE (Mbit/s) depth standard DNxHD 120 8 NO DCT based 1080i/25 SMPTE (Intra) 1080p/25 VC-3 115 8 NO DCT based 720p/50 SMPTE (Intra) VC-3 DNxHD 185 8 NO DCT based 1080i/25 SMPTE (Intra) 1080p/25 VC-3 175 8 NO DCT based 720p/50 SMPTE (Intra) VC-3 DNxHD 185 10 NO DCT based 1080i/25 SMPTE (Intra) 1080p/25 VC-3 175 10 NO DCT based 720p/50 SMPTE (Intra) VC-3 Table 2 – AVID DNxHD video compression codec parameters

2008 – EBU TECHNICAL REVIEW 45 HD PRODUCTION CODECS

2.2.2. GVG/Thomson (Infinity J2K)

Name Video Bitrate Bit Subsampling Compression Format Compression (Mbit/s) depth standard Infinity 50 10 NO Wavelet based 1080i/25 JPEG2000 (Intra) 1080p/25 720p/50 Infinity 75 10 NO Wavelet based 1080i/25 JPEG2000 (Intra) 1080p/25 720p/50 Infinity 100 10 NO Wavelet based 1080i/25 JPEG2000 (Intra) 1080p/25 720p/50 Table 3 – GVG/Thomson video compression codec parameters

2.2.3. Panasonic (AVC-I)

Name Video Bitrate Bit Subsampling Compression Format Compression (Mbit/s) depth standard AVC-I 54.272 10 1440 Y AVC 1080i/25 High 10 Intra

720 Cb /Cr (Intra) 1080p/25 Profile 4:2:0 54.067 10 960 Y AVC 720p/50 High 10 Intra

480 Cb /Cr (Intra) Profile 4:2:0 AVC-I 111.820 10 NO AVC 1080i/25 High 4:2:2 Intra (Intra) 1080p/25 Profile 111.616 10 NO AVC 720p/50 High 4:2:2 Intra (Intra) Profile Table 4 – Panasonic AVC-I video compression codec parameters

Name Video Bitrate Bit Subsampling Compression Format SMPTE (Mbit/s) depth standard XDCAM HD50 50 8 NO MPEG-2 1080i/25 – GoP 1080p/25 L=12 M=3 720p/50 Table 5 – Sony MPEG-2 Video Compression codec parameters

2.2.4. Sony (XDCAM HD422) 3. Methodology

This algorithm employs a Long GoP In order to evaluate the performance of compression algorithms in the (Group of Pictures) with L=12 and M=3, of the different HDTV algorithms SDTV production environment, i.e. the GoP structure is IBBPBBPBBPBBI. in a production environment, the starting from Digital Betacam up to This feature has some important very classical approach of the multi- the more recent IMX and DVCPRO implications on the testing of multi- generation (cascading) test was used. systems. It is considered by the generation performance, as described This method has been extensively used broadcast community to be a reliable in detail in paragraph 3.2.3. in all EBU tests since the introduction methodology which is able to provide

46 EBU TECHNICAL REVIEW – 2008 HD PRODUCTION CODECS

repeatable and stable results for easy l different kinds of content, i.e. natural test of the present 1.5 Gbit/s scenario interpretation. and artificial objects, texture, skin with the future 3 Gbit/s. tones, etc. The method is simply based on the Some sequences were singled out from performance of different compression- Moreover, it is better if at least a subset of material originally shot in one single decompression steps using the algorithm the test material is brand new, in order to HDTV format only. Other sequences under test, in order to simulate the avoid any kind of “optimization” of the were converted from film, or from cumulative effect of the artefacts new compression algorithm using familiar rendered graphics or even taken from the introduced by the compression algorithm sequences. archive in PAL format. All these sequences on the picture quality. were converted into the three HDTV As such a library was not available at formats included in the tests to obtain This method was originally devised to the time of testing, it was necessary an even larger portfolio of 10s-long test stress traditional tape-based equipment, to shoot brand new sequences. The sequences. where each copy implied a decompression shooting was performed using state-of- and compression step. It could perhaps the-art technology (Canon FJS Series Note: All the technical details (equipment, be considered as not fully reflecting the prime lens, Sony HDC 1500 camera and software, etc.) relevant to the conversion workflow of a future IT production-based uncompressed storage on a DVS server via process are available from the authors, infrastructure, where the necessity to HD-SDI). The result was a large portfolio upon request. perform a cascading of compression could of test sequences, each of 10s duration, be reduced. Nevertheless, considering satisfying the above-mentioned criteria. For the 1080i/25 and 720p/50 formats, that in the present production scenario, fourteen 10s-long test sequences were cascading is still a common process, the All the sequences were shot in different concatenated one after the other (no method allows the evaluation of the so- formats – i.e. 1080p/50, 1080i/25, gray or black frames included) to form called “quality headroom” in the system. 1080p/25 (with and without shutter) a single clip. There is a good knowledge base in the and 720p/50 – making the best effort in evaluation of results using this method of order to guarantee the same conditions of For the 1080p/25 (shutter on) formats, assessment and it was agreed to continue lighting and exposure. only eight 10s long test sequences were to use it for analysis of the performance then concatenated to form a single clip. of compression algorithms implemented Note: The 1080p/50 sequences were down The total amount of test material in the HDTV equipment under test. converted to obtain 1080i/25 or employed, its different origins and the 720p/50 versions; they were therefore criteria of selection, guaranteed the 3.1. Selection and not directly used in these tests but they completeness and formal correctness of will provide a library that is available for the tests. Some frames extracted from the shooting of test future needs, such as the comparative sequences are shown in Fig. 1. sequences

The selection of the test sequences to be used in any test is always a very critical issue; even ITU-R Rec. BT.500 – the most important reference document, containing all the procedures for picture quality evaluation based on subjective assessment – provides only a simple guideline, stating that the sequences have to be “critical, but not unduly so”. Obviously, a “biased” selection of test sequences can be used to “drive” the test. The only way to solve this problem is to select a very large amount of material, in order to include sequences that cover all the possible features in terms of: l high-frequency content (details), motion portrayal, colorimetric, contrast, etc.; Figure 1 l indoor and outdoor shooting; Still shots captured from four of the test sequences used

2008 – EBU TECHNICAL REVIEW 47 HD PRODUCTION CODECS

3.2. Standalone chain

The “standalone chain” comprised a Frame Number n n+1 n+2 n+3 n+4 n+5 n+6 n+7 n + 8 n+9 n+10 n+11 n+12 cascading of several compression and Original Sequence decompression processes of the same Compression & I I I I I I I I I I I I I algorithm; each pair of compression Decompression and decompression processes is usually First Generation referred to as a single “generation”. In Compression & Decompression I I I I I I I I I I I I I order to simulate different production scenarios and to investigate different Second Generation Compression & features of the algorithm under test, the Decompression I I I I I I I I I I I I I chain may or may not include processing Third Generation between each generation, as explained (The process continues up to the seventh generation) below. As already mentioned, each Compression & I I I I I I I I I I I I I algorithm under test was subjected to a Decompression multi-generation process up to the seventh S eventh Generation generation.

3.2.1. Standalone chain Figure 2 without processing Standalone chain (without spatial shift) for INTRA codec

The standalone chain without processing simply consisted of several cascaded Frame Number n n+1 n+2 n+3 n+4 n+5 n+6 n+7 n + 8 n+9 n+10 n+11 n+12 generations of the codec under test, Original Sequence without any other modifications to the Compression & I I I I I I I I I I I I I picture content apart from those applied Decompression by the codec under test, as summarized First Generation in Fig. 2. Spatial Shift S S S S S S S S S S S S S

First Gen. with ONE Spatial SHIFT This process accurately simulates the Compression & I I I I I I I I I I I I I effect of a simple dubbing of the sequence Decompression Second Gen. With and is usually not very challenging for the ONE Spatial SHIFT compression algorithm. In fact, the most Spatial Shift S S S S S S S S S S S S S important impact on the picture quality Second Gen. with should be incurred at the first generation, TWO Spatial SHIFT (The process continues up to the seventh generation) when the encoder has to eliminate Compression & I I I I I I I I I I I I I some information, but the effect of the Decompression Seventh Gen. With subsequent generations should be minimal SIX Spatial SHIFT as the encoder should basically eliminate the same information already deleted in Figure 3 the first compression step. Nevertheless, Standalone chain (with spatial shift) for INTRA codec this simple chain can provide useful information about the performance of the processes are currently feasible only in the The shifts were applied variously using sub-sampling filtering that is applied, or uncompressed domain, the effect of the software or hardware, but the method about the precision of the mathematical processing is simulated by spatially shifting used was exactly the same for all the implementation of the code. the image horizontally (pixel) or vertically algorithms under test. The shifts are (lines) in between each compression step, summarized in Table 6 and the process is 3.2.2. Standalone chain with as summarized in Fig. 3. summarized in Fig. 4. processing Obviously, this shift makes the task of the For the horizontal shift (H), a “positive” In a real production chain, several coder more challenging, especially for shift means a shift towards the right, manipulations are applied to the picture those algorithms based on a division of “negative” towards the left. Only even to produce the master, such as editing, the picture into blocks (e.g. NxN DCT shifts are performed to take into account zoom, NLE and colour correction. block), as in any later generation the the chroma subsampling of the 4:2:2 Therefore, a realistic simulation has to content of each block is different to that format. take into account this issue. As all these in the previous generation.

48 EBU TECHNICAL REVIEW – 2008 HD PRODUCTION CODECS

For the vertical shift (V), a “positive” Shift between Spatial Shift shift means a down shift, “negative” an up shift. The shift is applied on a frame First and second generation +4H and +4V basis and is always an even value. For Second and third generation +2V progressive formats, the whole frame is Third and fourth generation –2H shifted by a number of lines corresponding Fourth and fifth generation –2H to the vertical shift applied, while for Fifth and sixth generation –2V interlaced formats each field is shifted by Sixth and seventh generation –4V a number of lines corresponding to half Table 6 – Spatial (vertical and horizontal) applied between each generation the shift applied. For example, a shift equal to +2V means two lines down for progressive formats and 1 line down for each field of an interlaced format. Original Frame Shifted Frame 1920 x 1080 1920 x 1080 Black lines The shift process introduces black in serted pixels on the edges of the frame if/when necessary. SHIFT + 150H 3.2.3. Standalone chain for + 100 V GoP-based algorithms

SHIFT As shown in Table 5 (on page ??), the Black pixels - 150H XDCAM HD50 system exploits the in serted - 100 V C ro p p ed MPEG-2 motion compensation tools area In this example the shift is exaggerated and, in particular, Long GoP coding with (150 pixels and 100 lines) to make L=12 and M=3 (i.e. IBBPBBPBBPBBI); evident the introduction of black even if each MPEG-2 encoder applies a pixels and lines and the cropping of the rather sophisticated strategy to allocate image on the borders. its bitrate resources on the different kinds of pictures, all the MPEG-2 algorithms usually guarantee the best quality for the Intra picture (I), a reduced quality for Figure 4 the Predicted picture (P) and, in the same The shift process in the standalone chain manner, an even lower quality for the Bidirectional picture (B).

Therefore, the GoP structure has some important implications on the way the Frame Number n n+1 n+2 n+3 n+4 n+5 n+6 n+7 n + 8 n+9 n+10 n+11 n+12 standalone chain has to be realized, and

Original Sequence introduces a further variable in the way the multi-generation can be performed Compression & I B B P B B P B B P B B I Decompression – depending on whether the GoP First Generation alignment is guaranteed between each Compression & generation (GoP aligned) or not (GoP Decompression I B B P B B P B B P B B I mis-aligned). Second Gen. With GOP Alignment

Compression & Decompression I B B P B B P B B P B B I As explained in Fig. 5, the GoP is

Third Gen. With considered to be aligned if one frame of GOP Alignment the original picture that is encoded at the (The process continues up to the seventh generation) first generation using one of the three Compression & I B B P B B P B B P B B I Decompression possible kinds of frame – Intra, Predicted Seventh Gen. With or Bidirectional – is again encoded SIX Spatial SHIFT using that same kind of frame in all the following generations: for example, if Figure 5 frame n of the original sequence is always Standalone chain with GoP alignment (without spatial shift) for INTER codec encoded as Intra and frame n+6 as Pre-

2008 – EBU TECHNICAL REVIEW 49 HD PRODUCTION CODECS

dicted. It is therefore possible to have only one multi-generation chain with “GoP alignment”. Frame Number n n+1 n+2 n+3 n+4 n+5 n+6 n+7 n + 8 n+9 n+10 n+11 n+12

Original Sequence On the contrary, if this condition is Compression & I B B P B B P B B P B B I not guaranteed, several conditions Decompression of GoP mis-alignment are possible; First Generation considering the GoP length L=12, Compression & B I B B P B B P B B P B B for the second generation 11 different Decompression Second Gen. With GoP mis-alignments are possible, then GOP Alignment for the third generation 11 by 11 and Compression & Decompression B B I B B P B B P B B P B so on, making the testing of all the Third Gen. With possible conditions unrealistic. It was GOP Alignment (The process continues up to the seventh generation) therefore agreed to apply one “temporal Compression & P B B P B B I B B P B B P shift” equal to one frame between each Decompression generation, as explained in Fig. 6, so Seventh Gen. With SIX Spatial SHIFT that the frame that is encoded in Intra mode in the first generation is encoded Figure 6 in Bidirectional mode in the second Standalone chain without GOP alignment (without spatial shift) for INTER codec generation and, in general, in a different mode for each following generation. It is interesting to underline that the Frame Number n n+1 n+2 n+3 n+4 n+5 n+6 n+7 n + 8 n+9 n+10 n+11 n+12 alignment of the GoP in the different generations was under control (not Original Sequence Compression & random) and that this was considered Decompression I B B P B B P B B P B B I the likely worst case as far as the mis- First Generation alignment effect is concerned, and was Spatial Shift S S S S S S S S S S S S S referred to in the documents as “chain First Gen. with without GoP alignment”. ONE Spatial SHIFT Compression & Decompression I B B P B B P B B P B B I

Taking into account also the necessity to Second Gen. With simulate the effect of manipulation by ONE Spatial SHIFT means of the spatial shift, it was agreed for Spatial Shift S S S S S S S S S S S S S the GoP-based system (XDCAM HD50) Second Gen. with TWO Spatial SHIFT to consider and to realize four different (The process continues up to the seventh generation) possible standalone chains up to the Compression & I B B P B B P B B P B B I seventh generation, as follows: Decompression Seventh Gen. With SIX Spatial SHIFT l Multigeneration chain with GoP alignment (without spatial shift) (see Figure 7 Fig. 5) Standalone chain with GoP alignment AND with spatial shift for INTER codec l Multigeneration chain without GoP alignment (without spatial shift) (see Fig. 6) Note: For the XDCAM @35 chain it and the Sony XDCAM HD50 was l Multigeneration chain with GoP was not possible to obtain the GoP tested by the RAI using a prototype alignment AND spatial shift (see alignment control and therefore this encoder. Engineers from the respective Fig. 7) multi-generation has to be considered manufacturers attended the tests for at l Multigeneration chain without GoP “random” in terms of GoP alignment) least the first day to ensure the correct alignment AND spatial shift (see working of the equipment. Fig. 8) The tests on the AVID DNxHD codec were performed by the IRT, the tests All the abovementioned chains were on the GVG/Thomson codec were carried out on the “Legacy” systems and performed by the RAI using an Infinity on the “New” sys-tems. The resultant prototype camera, the tests on the sequences were all stored in YUV 10-bit Panasonic AVC-I codec were performed format. by the EBU using a prototype encoder,

50 EBU TECHNICAL REVIEW – 2008 HD PRODUCTION CODECS

All the sequences (original and those Frame Number n n+1 n+2 n+3 n+4 n+5 n+6 n+7 n + 8 n+9 n+10 n+11 n+12 subjected to compression) were stored Original Sequence in uncom-pressed form (YUV10 format) Compression & on two video servers that were able to Decompression I B B P B B P B B P B B I be run in parallel. The sequences were First Generation displayed simultaneously in a vertical Spatial Shift S S S S S S S S S S S S S split-screen condition (original on one First Gen. with side, those subjected to compression ONE Spatial SHIFT on the other side). During the expert Compression & B I B B P B B P B B P B B Decompression viewing, the T bar of the that Second Gen. With ONE Spatial SHIFT provided the split-screen effect (a Spatial Shift S S S S S S S S S S S S S Panasonic type AV-HS300) was moved

Second Gen. with to compare the same areas of different TWO Spatial SHIFT versions of the same sequences, as (The process continues up to the seventh generation) was found necessary. The position Compression & P B B P B B I B B P B B P Decompression of the split was made more evident Seventh Gen. With by applying a just-noticeable vertical SIX Spatial SHIFT white bar at the transition between the Figure 8 two images; a real example is shown Standalone chain without GoP alignment AND with spatial shift for INTER codec in Fig. 9.

The viewing distance was marked on 4. Analysis of the where: ori (i,j) = original frame, the floor and was set to 3H, where performance of the cod(i,j) = manipulated frame, H defines the vertical dimension of Ncol = horizontal resolution in pixels, the display, and the observers were algorithms Nlin = vertical resolution in pixel and asked to respect this viewing distance. Vpeak = 210 – 1 = 1023. Sometimes a closer viewing distance, The analyses of the performance of the e.g. 1H, was used to closely observe algorithms was performed both using The results are expressed in dB. small details and artefacts and, when objective measurements (PSNR) and used, this condition was clearly noted visual scrutiny of the picture (i.e. expert It is well known that PSNR does not in the report. viewing), as described in the following correlate accurately with the picture sections. These two methods provide quality and thus it would be misleading It should be made clear that this method different kinds of information and they to directly compare PSNR from very allows the evaluation of very small are considered to be complementary. different algorithms. On the other hand, impairments, near to the visibility this parameter can provide information threshold, and it must be considered a very 4.1. Objective about the behaviour of the compression severe analysis of the picture quality. algorithm through the multi-generation measurements process. As mentioned, the tests focused on the performance of the algorithms at the The PSNR has been computed via software 4.2. Expert viewing first, fourth and seventh generations, and obviously applied a procedure to comparing the picture quality with the reestablish the spatial alignment between Analysis of the algorithm performance (from original (headroom evaluation) or with the original and the de-compressed version the point of view of picture quality) was legacy system (improvement provided of the test sequence. Moreover, it skipped carried out using so-called “expert viewing”. by the new system). A test list, which 16 pixels on the edges of the picture to Even if this method is not formally included summarized in the form of tables all avoid taking measurements on the black in any ITU Recommendation, it is very often the different comparisons planned, was pixels introduced during the shift. used as it provides fast and consistent results. prepared and discussed in advance. Under the generic name of “expert viewing” The formula used to evaluate the PSNR are included several kinds of analysis of the Each expert was provided with a paper via software was: picture quality evaluation. copy of the test list, so she/he was always completely aware of what sequence was For the purposes of these P/HDTP tests, it displayed in each part of the screen. was agreed in advance with manufacturers to use the following interpretation of what For example, in the case of a comparison “expert viewing” would entail. between the “Original” sequence and the

2008 – EBU TECHNICAL REVIEW 51 HD PRODUCTION CODECS

Compressed (Impaired) version the tests, only a deep analysis of these (e.g. Seventh generation vs. Ori_B (Original 4:2:2) documents can provide a complete with spatial shift) appreciation of the results.

4.2.1. Display

The following displays were used during the tests:

l CRT 32” Sony: Type BVM-A32E1WM l CRT 20” Sony: Type BVM-A20F1M l P lasma Full HD 50”: Type TH50PK9EK Panasonic l LCD 47”: Type Focus

The displays were connected through HD-SDI interfaces. The displays were T-bar line aligned according to the conditions described in ITU-R BT.500-11 and the room conditions were set accordingly. T-bar m ovem ent The final assessment was always done Figure 9 while considering the quality perceived Standalone chain with GoP alignment AND with spatial shift for INTER codec on the CRT displays. version subjected to seven generations of visual analysis of a sequence and in this Nevertheless, there was a general agreement algorithm “A”, including a spatial shift case the sequence was repeated. If even in that the flat-panel displays, both LCD and between each generation, the expert was this case an agreement was not possible, the plasma, magnified the impairments. provided with the following table: situation was noted in the report. 5. Results Compressed (Impaired) version vs. Ori_A (Original 4:2:2) (e.g. Seventh generation with spatial shift) It was agreed between the EBU project Ratio: loss due to seven generations with post-processing group and the vendors to make the reports about the test details available to EBU The expert was aware that “A” meant a Members only. In late 2007, the results specific vendor and was given full details Best efforts were made to guarantee that of the test were published as BPN076 to of each test condition (bitrate, GoP aligned the panel of experts was comprised of the BPN079, as noted above. or not, etc.). The table also included a same people during the different expert short sentence describing the “rationale” viewing sessions; this condition was Note: Due to the importance of the subject, of the test and, in order to avoid any readily met during each single day and vendors and the EBU agreed to misunderstanding, the position (right or almost perfectly so during different days. provide some preliminary results in left) of the sequences was the same on the One day of expert viewing was dedicated a PowerPoint presentation given at paper table and on the display; c.f., the to each manufacturer, and representatives the IBC-2007 conference, before the Original sequence on the right side and of the individual vendor took part in the actual BPN reports became available the processed sequence on the left side in expert viewing as well. to EBU Members. This PowerPoint the above table. The results of the expert viewings were presentation contained a short summary collected in a series of EBU BPN documents; of the test results in tabular form. The All the experts were formally requested BPN 076 (results for Avid DNxHD), BPN published reports BPN076 to BPN079 to refrain from expressing their opinions 077 (results for GVG/Thomson JPEG2000), contain a much larger framework during the sessions, in order to avoid BPN 078 (results for Panasonic AVC-I) of test conditions than shown in biasing other people. Only after the and BPN 079 (results for Sony XDCAM the IBC-2007 PowerPoint. Neither complete analysis of the test sequence (for HD50). These documents are pub-lished the test reports nor the PowerPoint its full length) was a discussion started to by the EBU exclusively for its Members. tables are intended, or suited, for summarize in a few sentences the opinions comparative studies. The tabular form of the different experts. Sometimes it was The reader should be aware that, due of the PowerPoint presentation did not not possible to get an agreement on the to the complexity and framework of include information about whether the

52 EBU TECHNICAL REVIEW – 2008 HD PRODUCTION CODECS

Abbreviations

720p/50 High-definition progressively-scanned TV ITU International Telecommunication Union format of 1280 x 720 pixels at 50 frames per www.itu.int second ITU-R ITU - Radiocommunication Sector 1080i/25 High-definition interlaced TV format of 1920 x www.itu.intpublications/sector. 1080 pixels at 25 frames per second, i.e. 50 fields aspx?lang=en§or=1 (half frames) every second JPEG Joint Photographic Experts Group 1080p/25 High-definition progressively-scanned TV www..org format of 1920 x 1080 pixels at 25 frames per MPEG Moving Picture Experts Group second www.chiariglione.org/mpeg/ 1080p/50 High-definition progressively-scanned TV NLE Non-Linear Editing format of 1920 x 1080 pixels at 50 frames per PSNR Peak Signal-to-Noise Ratio second SMPTE Society of Motion Picture and Television AVC (MPEG-4) Advanced Video Coding, part 10 (aka Engineers (USA) H.264) www.smpte.org DCT Discrete Cosine Transform YUV The luminance (Y) and colour difference (U and V) GoP Group of Pictures signals of the PAL colour television system

tests were conducted with or without l If the production/archiving format programme exchange pixel shift. is to be based on I-frames only, the ITU-R Geneva, 2002. bitrate should not be less than 100 [3] SMPTE 296M: 1280 x 720 Scanning, The EBU Production Management Com- Mbit/s. Analogue and Digital Representation mittee then subsequently concluded in a l If the production/archiving format is and Analogue Interface recommendation (EBU R124-2008)2 that: to be based on long-GoP MPEG-2, SMPTE New York, 2001. the bitrate should not be less than 50 [4] SMPTE 292M: Bit-Serial Digital For acquisition applications an HDTV Mbit/s. Interface for High-Definition format with 4:2:2 sampling, no further Television Systems horizontal or vertical sub-sampling should Furthermore, the expert viewing tests SMPTE New York, 1998. be applied. The 8-bit bit-depth is sufficient have revealed that: [5] SMPTE 274M: 1920 x 1080 Scanning for mainstream programmes, but 10- and Analogue and Parallel Digital bit bit-depth is preferred for high-end l A 10-bit bit-depth in production is Interfaces for Multiple Picture acquisition. For production applications of only significant for post-production Rates mainstream HD, the tests of the EBU has with graphics and after transmission SMPTE New York, 1998. found no reason to relax the requirement encoding and decoding at the [6] Sony: www.sonybiz.net and search placed on SDTV studio codecs that “Quasi- consumer end, if the content (e.g. for XDCAM-HD 422 transparent quality” must be maintained graphics or anima-tion) has been [7] Panasonic: www.panasonic.com/ after 7 cycles of encoding and re-coding generated using advanced colour business/provideo/p2-hd/white- with horizontal and vertical pixel-shifts grading, etc. papers.asp applied. All tested codecs have shown l For normal moving pictures, an 8-bit [8] Avid: www.avid.com/dnxhd quasi-transparent quality up to at least bit-depth in production will not [9] Thomson Grass Valley: http:// 4 to 5 multi-generations, but have also significantly degrade the HD picture thomsongrassvalley.com/products/ shown few impairments such as noise or quality at the consumer’s premises. infinity/ loss of resolution with critical images at the 7th generation. Thus EBU Members are Acknowledgements required to carefully design the production 6. Bibliography workflow and to avoid 7 multi-generation The authors wish to acknowledge steps. [1] ITU-R BT.500-11: Methodology for the considerable work carried out by the subjective assessment of the Giancarlo De Biase (RAI), Dagmar The EBU recommends in document R124- quality of television pictures Driesnack (IRT), Adi Kouadio (EBU) and 2008 that: ITU-R Geneva, 2003. Damian Ruiz Croll (RTVE) during the [2] ITU-R BT.709-5: Parameter values codec testing. for the HDTV standards for 2) R124-2008: http://tech.ebu.ch/docs/r/r124.pdf production and international First published: 2008-Q3

2008 – EBU TECHNICAL REVIEW 53 WEBCASTING

Technical trial of the

EBU P2Pmedia portal

Franc Kozamernik Senior Engineer, EBU

The EBU P2P Media Portal (EBUP2P) was developed as a demonstration tool for EBU Member organizations to show their television and radio channels internationally.

A 6-month trial of the portal was set up by EBU Project Group D/P2P in the first half of 2008 to evaluate P2P (Peer-to-Peer) technology – via the example provided by the company Octoshape. This article reports on the outcome of the trial.

EBU Project Group D/P2P (Peer-to-Peer), The Group was encouraged by the large EBU Members should come together chaired by Frank Hoffsümmer (SVT), was attendance at a seminar organized by the and set up a common Internet portal set up in 2006 in order to: EBU Technical Department in February through which Members’ TV and radio 2006 in Geneva, which identified channels could be made available to l evaluate newly-emerging peer-to-peer significant interest among EBU Members the general public worldwide. The technologies; in studying this new technology. portal – if ever established permanently l formulate potential EBU requirements as a non-commercial collaborative EBU for such systems; The initiative for an EBU Media Portal project – could mean much more than l to carry out technical experiments and came from the Group D/P2P at its meeting just a technically advanced and innovative trials on a working P2P system. in June 2007. The Group proposed that project: it could become a one-stop shop

What is Peer-to-Peer?

In this article, P2P is considered as an Internet media distribution system which relies on end-users’ computers to propagate content through existing computer networks. Such a P2P system has nothing to do with napsterization or illegal content- sharing. Quite the contrary, P2P offers an attractive possibility for broadcasters to distribute their content efficiently across the Internet.

As P2P does not require any special emission infrastructure to be installed, the investments and maintenance costs are significantly lower than those of more traditional Content Distribution Networks (CDNs) which may use several thousands of special streaming servers. In addition, P2P does not have a single point-of-failure, so its service reliability is very high.

On the downside, P2P is a relatively new technology which requires a lot of further studies and hands-on experience in order to turn an interesting technical innovation into a viable business proposition.

54 EBU TECHNICAL REVIEW – 2008 WEBCASTING

window of Members’ programming P2P may change this paradigm radically. l low distribution cost (ideally achievements. Video can be streamed via P2P at a independent of location, time, cost that is below € 0.05 per GB1. As a quality and number of users); Prior to the advent of P2P, EBU Members result of the competition from P2P, the l reliable delivery (no glitches or were extensively using different server- cost of CDN services has also dropped interruptions, reasonable end-to-end client approaches such as CDNs (Content significantly but it is still much more latency, fast zapping); Distribution Networks), IP Multicasting expensive than P2P by an order of l high quality levels – SD, even HD and a great variety of proprietary magnitude. (including multichannel audio if streaming solutions from Microsoft, Real required); Networks and QuickTime. A common In addition to the cost factor, P2P l large channel capacity (in principle, characteristic of all these systems is that technologies have many advantages there are no frequency spectrum they are relatively expensive ... because compared to other Internet distribution constraints as in conventional broadcasters must pay an ISP (Internet systems: no central server streaming broadcasting); Service Provider) for each connected “farm” is required and there is no central l the largest number of concurrent stream (i.e. user). Thus, the more users point of failure (assuming a decentralised, users possible (several hundreds of there are, the higher the ISP’s costs will distributed tracker). However, P2P is still thousands of concurrent P2P users be. Effectively, broadcasters pay more in its infancy and many challenges are still have been successfully demon- because of their own success. to be resolved. Setting up EBUP2P is only strated). the first step in the direction of gathering As an example, in 2005, the Eurovision experience and solving potential issues Song Contest was distributed via the relating to P2P. Internet using a CDN system serviced by The P2P trial Akamai. Being a highly popular event, Requirements for it generated a lot of interest worldwide In order to make the whole operation (several tens of thousands of Internet an Internet media- manageable, we had to limit the number users). The calculated cost of video distribution system of Member participants to about ten (see streaming the event was about one CHF Table 1). The P2P system chosen for per user per hour, which amounted to Broadcasters have the following the trial was provided by Octoshape, around CHF 100’000 (€ 65’000) for the requirements for any Internet distribution already described in an earlier edition 3-hour show. system they might wish to use: of EBU Technical Review2.

Channel Organization URL Comment

TV 1 HR Hessischer Rundfunk (HR) www.hr-online.de TV 2 DW - TV Deutsche Welle (DW) www.dw-world.de TV 3 TV SLO 1 RTV Slovenia (RTVSLO) www.rtvslo.si TV 4 TV SLO 2 RTV Slovenia (RTVSLO) www.rtvslo.si TV 5 24H tve RTV Spain (RTVE) www.rtve.es TV 6 DOCU tve RTV Spain (RTVE) www.rtve.es Discontinued TV 7 TV Ciencia on-line TV Portugal www.tvciencia.pt Since May 08 TV 8 iTVP Polish TV (TVP) www.itvp.pl Since May 08 TV 9 Taiwan TV Since June 08 Radio 10 radio-suisse jazz SRG-SSR www.srg-ssr.ch MP3, 192 kbit/s Radio 11 radio-suisse pop SRG-SSR www.srg-ssr.ch MP3, 192 Radio 12 radio 3 RNE www.rne.es WM, 96 Radio 13 radio classica RNE www.rne.es WM, 128 Radio 14 youfm HR www.hr-online.de MP3, 160 – up to the end of April 08 Radio 15 RSI RTVSLO www.rtvslo.si WM, 192 Radio 16 Val 202 RTVSLO www.rtvslo.si WM, 192 Table 1 Member participants in the EBU P2P Portal trial

1) Octoshape states a price of 2 Eurocents per GB on its website: http://www.octoshape.com 2) Visit: http://tech.ebu.ch/docs/techreview/trev_303-octoshape.pdf

2008 – EBU TECHNICAL REVIEW 55 WEBCASTING

Basic portal description On the bottom of the page we included The trial plan a temporary link “Your Comments”, During the trial, there was a highly-visible allowing the users to send in reports The first issue in the process of establishing vignette “P2P Media Portal Trial” on the and comments about their viewing an operational EBU Media Portal was to EBU’s home page, which took the user to experiences. During the trial we received identify a suitable Internet distribution the portal’s own home page 3. Access to more than a hundred comments which technology and to agree some operational the portal was open to all users worldwide were all consistently positive about the parameters. and Fig 1 shows an earlier design of the user experience. home page. A technical trial Group consisting of ten The page also included information about EBU Members held a kick-off meeting on During the trial, the design of the what the user needed in order to access 10 July 2007 in Geneva, in order to define web portal underwent continuous the portal content. the technical and operational parameters improvements, both graphically and in based on peer-to-peer (P2P) technology4. terms of accessibility and user friendliness. These are as follows: Once these parameters had been set, the Fig. 2 shows the current graphical design trial could start informally in autumn of the portal, as produced by Nathalie l a suitable broadband connection; 2007 but officially it started in January Cullen from the EBU’s Communication l a PC running Windows, Mac or Linux; 2008. It then continued for six months Service. l Internet Explorer (IE) or Firefox until the end of June 2008. browser; Each Member participant is represented l Active X (in the case of IE); The Trial Group was coordinated by the by an icon which emulates their logo. On l Windows Media Player (video and Author and operated under the auspices of the top right side, there are eight icons audio); the D/P2P Group. It held three meetings representing TV channels, while below l MP3 player; in order to supervise the development and them are six radio channels. In this new l Octoshape plug-in. monitor the technical quality of the portal. design, users during the trial had the flexibility of using either an embedded player (which starts playing automatically when you first open the page) or Windows Media Player (which opens in a separate adjustable-size window).

Underneath each logo we put a link to the Member’s URL, allowing the users to consult the schedule of programmes and access some additional information about the channel concerned.

Figure 1 Figure 2 EBU P2P Media Portal – Windows Media Final design of the EBU P2P Media Portal (Courtesy: Nathalie Cullen, EBU) – which uses Player only (Courtesy: Nathalie Cullen, EBU) either Windows Media Player or an embedded player in the browser

3) The deep link to the Media Portal can still be found here: http://www.ebu.ch/members/EBU_Media_portal_Trial_1.php 4) Octoshape P2P technology was selected on the basis of some comparative tests conducted by Project Group D/P2P during IBC 2006 and also as a result of Eurovision Song Contest events from 2005 onwards, for which Octoshape was very successfully used.

56 EBU TECHNICAL REVIEW – 2008 WEBCASTING

Test Description Participants

1 Accessibility of contents from Working links, smooth zapping of channels, only one stream at all the EBU web site a time 2 Overall quality performance Different upstream and downstream capacity, different network all (continuity and reliability) under load (resulting in packet loss and jitter) different broad-band network conditions 3 Scalability: 200 and 700 kbit/s Number of concurrent users all Different computer platforms/ PC, Mac, Linux IRT operating systems 5 Different browsers IE, Firefox, etc IRT 6 Geolocation (dynamic) Displaying the messages for non-availability of streams ? 7 Rights Management Different DRM system including MS DRM, DVB CPCM ? 8 Audience statistics Immediate return of data for each broadcaster All, EBU 9 Pre-roll advertising Octoshape inserts a up to 3 s pre-roll ad. Content of add should HR, TVP be mutually agreed by channel owner and Octoshape 10 Audio Watermarking a Embed about 100 bit/s ? 11 Flash codec a In addition to WM, we should test Flash codec ? 12 On-demand delivery a Video on-demand play out of files ? Table 2 a) To be tested in the second phase (subject to agreement with Octoshape) Technical test planned

The principal objective of this trial was Players: Windows Media and embedded Distribution via Internet: P2P and to assess whether or not the Octoshape Octoshape player. HTTP. P2P technology is an efficient and reliable platform for live streaming of Members’ television and radio services. Video We also seized the opportunity to try Codec Bitrate Aspect ratio & some peripheral services such as the resolution graphical design and accessibility of the MS Windows Media about 700 kbit/s 4:3 480 x 360 px; 16:9 Portal (e.g. how the user accesses the site Television 520 x 360 pixels and changes channels – zapping), along Audio with video encoding, geolocation, pre-roll Codec Bitrate Stereo/Mono advertising, etc MS Windows Media 9 64 kbit/s 48 kHz, stereo (A/V) 1-pass CBR Based on the above system and the Audio operational requirements, a test plan was Codec Bitrate Stereo/Mono developed, as shown in Table 2. Radio MS Windows Media 9 96/128/192 kbit/s 44.1/48 kHz, stereo (A/V) 1-pass CBR Mpeg Layer 3 192 kbit/s Technical specification of Table 3 Technical parameters used in the EBU P2P Media Portal the trial The agreed technical parameters for the audio and video streams is given in Table 3.

Abbreviations CBR Constant Bit-Rate EPG Electronic Programme Guide P2P Peer-to-Peer CDN Content Delivery Network FS Full Scale SDI Serial Digital Interface CE Consumer Electronics HTTP HyperText Transfer Protocol WM (Microsoft) Windows Media DRM Digital Rights Management ISP Internet Service Provider XML eXtensible Markup Language

2008 – EBU TECHNICAL REVIEW 57 WEBCASTING

Distribution of work l Ensured a balanced (non-discrimin- Members performed editorial control atory) visibility to all the channels of the ad content. The following is the list of tasks that each involved; participant had to accomplish in order l Provided all the information Legal matters to make the trial a successful operation. required for the end user to access all the channels and other information Throughout the course of the trial, Octoshape required; legal matters – particularly copyright l Enabled the user to download the issues – played an important role in our latest version of the required discussions. Initially, EBUP2P thought Octoshape provided the following services media players and the Octoshape we could be considered a kind of re- to the Portal: plug-in; broadcaster of EBU Members’ content and l Reported regularly to the various be treated as a cable network. To this end, l Information to the EBU to enable EBU EBU bodies on the progress of the the EBU (as owner of the portal) needed Members to encode their streams in technical trial; explicit permission from Members to WM; l Ensured some publicity for the re-broadcast their channels. If required, l Octoshape-specific Source Signal portal in order to maximize the use the EBU also needed to clear the rights Solution (SSS) software to all Members of the site; issues with the collecting societies. All for encoding their material; l Promoted the portal at relevant participants were required to sign a rights l If required by a Member, performed events, EBU seminars, conferences clearance form that there were no legal encoding (or asked third party to do and trade shows; obstacles for the EBU to make the TV it); l Discussed common strategies for channel(s) available to the general public, l Information to EBU to inform EBU pre-operational and regular services free of charge and in unchanged form Members how to send streams to with Octoshape, i.e. the future steps and simultaneously with the terrestrial Octoshape; and business model options to be broadcasts of these channel(s). l User’s plug-in (Octoshape-specific) considered. with regular updates; A later discussion showed that, in practive, l Powered the Portal by providing P2P EBU Members – Trialists the end user who clicks on the icon services for live streaming; on the EBU website to access a certain l Provided audience statistics on request The EBU Members that participated TV channel is merely redirected to the to all participants; in the trial coordinated their activities original stream of the actual content l Provided geolocation services (if through the EBU Technical Department. pro-vider. Therefore, the EBU is not re- required); Participants conducted the following transmitting the stream. It was agreed to l Provided pre-roll advertising (if activities: publish a disclaimer which reads: required). l Provided the content, i.e. selected It should be noted that the users are EBU staff the TV/radio channels or any other redirected to the original stream of streaming content they wished to the actual content provider and The EBU had the following responsibil- publish on EBU website 5; the EBU is not re-transmitting the ities: l Granted rights to the EBU for their stream. content to be published on the EBU l Ensured that the technical, legal website; Another solution for a future portal would and programming interests of EBU l Encoded their streams (by using be for the EBU to simply provide the links Members were fully respected and appropriate technologies); to the Members’ web pages containing taken into account; l Forwarded their streams to Octoshape; embedded players. l Coordinated the evaluation process; l Defined the coverage constraints (if l Developed a dedicated website in and when required) and instructed Evaluation results coordination with Octoshape, and Octoshape how to apply geolocation according to Octoshape requirements, filtering; and provided a link to the EBU home l Provided a link to the EBU website Audience page; on their own website, so that users in l Ensured a proper design of the web their country could easily access other The table below shows how many unique page, with constant improvements to EBU Members’ content; users were able to join the trial across all the look and feel; l In the case of pre-roll adverts, channels offered during the January - June

5) Generally all webcasts should be available for 24 hours a day but it is up to the individual Members to decide on their webcast times. In the latter case, information should be given about the broadcast times.

58 EBU TECHNICAL REVIEW – 2008 WEBCASTING

2008 period. It also shows the aggregate time (in hours) of user media consumption for each month.

Month Unique users Total hours Jan 9’072 32’375 Feb 8’157 39’777 March 8’652 44’898 April 7’031 41’519 May 3’808 26’247 June 2’936 24’184

Fig. 3 shows an example of the audience variations on two Hessicher Rundfunk channels, one TV and one radio, during Figure 3 the political campaign before the elections Audience variations on HR’s radio and TV channels at Hessen, Germany. Members of political party SPD were able to watch the TV stream via our P2P stream, as they did not Overall quality Resolutions: have TV sets in their offices. performance 4:3 – 512 x 384 pixels 16:9 – 512 x 288 pixels Please note that the Octoshape system There has been no evidence that network was limited to serve no more than 600 traffic load, asymmetricity or last- De-interlacing problems should be concurrent users, as the broadcaster did mile issues affected the overall quality handled in the hardware (encoder card). not notify Octoshape in advance (see the performance of the Octoshape system Complexity coding should be enabled. If section on “Scalability” below). in any significant manner. It should be possible, performance could be enhanced stressed however that only large-scale by a 2-step encoding. Use of SDI sources Accessibility laboratory tests, allowing for controlled is highly recommended while composite repeatability of results, could yield sources (to reduce cross-colour and cross- All web links were working satisfactorily, scientifically-valid results. According luminance interference) is to be avoided zapping of channels was smooth and to the reports received, our experience only one stream at a time was available about the service quality was positive – we Audio level: 0 dB FS level (full scale). (as required). Some occasional glitches can deduce that Octoshape performed (e.g. missing sound, lip-sync problems) correctly on all networks. Scalability occurred due to poor encoding. The Spanish Documentary channel is not The service quality therefore mainly On a major Internet event in May, available outside of Spain, due to depends on the encoding quality. We HR experienced a service breakdown. copyright, and the geolocation filtering detected some errors performed by Octoshape explained that the 600 6 user was applied to implement this. The end Members in encoding video material, in limit was configured on the Octoshape user was informed about this via a message particular regarding the correct aspect P2P network by default (which means that popped up “This stream cannot be ratio when the source material was that the service was effectively cut down viewed in your country”. produced in HDTV (16:9). when more than 600 peers joined the network). This was explained as a normal Some users had problems with down- Should there ever be a regular service, precautionary measure, as Octoshape loading the Octoshape plug-in. Some the question of correct encoding does not know the network (ISP) limits people, particularly those located in requires extremely careful consideration. which may vary from one network to large corporations (including some large Differences between different sources another. Octoshape is however able to broadcasting organizations) could not down- should be avoided, so that zapping from scale bandwidth to the ISP limits. load the plug-in at all, and consequently one channel to another does not result in were not able to access the Portal services. level and other differences. Broadcasters Such a service breakdown could have been This element of the Octoshape system should adopt a common set of coding avoided if Octoshape had been informed needs to be considered, as it represents an parameters. Square pixels should be in advance of an event where it was likely obstacle to audience acceptance. consistently used. that a larger number of peers may join.

6) Octoshape explained that the number 600 is really arbitrary and can well be set to a much higher value if required (especially for live events).

2008 – EBU TECHNICAL REVIEW 59 WEBCASTING

Franc Kozamernik graduated from the Faculty of Electrotechnical Engineering, University of Ljubljana, Slovenia, in 1972.

He started his professional career as an R&D engineer at Radio-Television Slovenia. Since 1985, he has been with the EBU Technical Department and has been involved in a variety of engineering activities covering satellite broadcasting, frequency spectrum planning, digital audio broadcasting, audio source coding and the RF aspects of various audio and video broadcasting system developments, such as Digital Video Broadcasting (DVB) and Digital Audio Broadcasting (DAB).

During his years at the EBU, Mr Kozamernik has coordinated the Internet-related technical studies carried out by B/ BMW (Broadcast of Multimedia on the Web) and contributed technical studies to the I/OLS (On-Line Services) Group. Currently, he is the coordinator of several EBU R&D project groups including B/AIM (Audio in Multimedia), B/VIM (Video in Multimedia) and B/SYN (Synergies of Broadcast and Telecom Systems and Services). He also coordinates EBU Focus Groups on Broadband Television (B/BTV) and MultiChannel Audio Transmission (B/MCAT). Franc Kozamernik has represented the EBU in several collaborative projects and international bodies, and has contributed a large number of articles to the technical press and presented several papers at international conferences.

Octoshape can scale down the quality Geolocation the Octoshape system would be able to from 700 kbit/s to 200 kbit/s 7 either interface to any other geolocation system automatically or manually. Throughout the trials, the Group gave including Akamai or Quova if required. very serious consideration to issues Members agreed that EBUP2P should In order to implement the scalability and relating to geolocation filtering, as meet the highest level of geolocation ensure continuous services (even when the this tool is essential to limit coverage performance and security as required. number of users increases), it is necessary to specific areas, mainly for copyright to encode the streams in two (or three) reasons. The geolocation system must In the case when geofiltering is applied, bitrates. In this way, Octoshape can obey very strict requirements in terms of copyrighted content is only available perform an automatic switch to a lower security, accuracy and reliability, in order within the copyrighted zone, whereas bitrate, as soon as the number of peers to prevent any leakage of content outside outside this zone a message board should reaches a certain limit. the granted zone. be displayed on the screen. The text on the message board may say the It should be pointed out that dual bitrate The Octoshape geolocation system is following: encoding can be implemented by a single an advanced commercial product called PC. “IP2location”. This system enables “Due to copyright restrictions, the identification of the geographic location currently broadcast programme Octoshape has already demonstrated on and Internet domain name by means of is only available within the a number of occasions that it is a very an IP address. The IP2location database authorized zone. Your location scalable system, e.g. for the Eurovision is used to match an incoming IP address lies outside this zone, therefore at Song Contest during which it was able to the country, region, city, latitude, the moment you have no access to to support around 45’000 simultaneous longitude, zip code, Internet Service the content. We apologize for any streams without any problems. Provider (ISP), time zone, network speed inconvenience.” and domain name of the Internet user. Computer platforms and Octoshape merely provides an interface Preferably the above message is shown to this database. Octoshape believes that in the national language and in English; browsers the geolocation system they are using other languages may be added of course. There was no evidence of any problems provides excellent accuracy and security. Geofiltering can only be applied to video, resulting from the use of different In the many years that they have been while leaving the audio available. computer platforms, operating systems using this system, no difficulties have been and browsers. experienced whatsoever. If necessary, The of geofiltering could be two-fold: [a] pre-scheduled (pre- determined start and end times) or [b] 7) The lower limit of 200 kbit/s has been increased to 350 kbit/s to improve flexible (time stamps or manual activation the lower quality limit. of geofiltering).

60 EBU TECHNICAL REVIEW – 2008 WEBCASTING

Rights management l EBUP2P is future proof and will be Required future work extended towards CE (consumer Digital Rights Management (DRM) was electronics) devices. Running a Media Portal is a complex not tested: it is independent of the P2P issue and technologies evolve very rapidly. system in use. In spite of the very limited human It is not enough to show that the P2P and financial resources available for distribution system functions correctly Pre-roll advertising conducting the EBU P2P Media Portal and according to our expectations. For trial, we brought the trial to a successful a possible future commercial portal, the A pre-roll system is an example of a end, according to the schedule planned. following additional issues may also be business model in which the end user The participants in the trial performed a considered: receives all video and audio streams large number of technical and operational for free. For the testing of EBUP2P, tests. l Watermarking (and Fingerprinting) – Hessischer Rundfunk used a 5s pre-roll optionally embed audio watermarking service. Octoshape confirmed that they The number of participating EBU signals to detect the originator of the were able to provide a pre-roll advertising organizations was restricted to about ten. content (if required); technology which is very similar to the We could not accept more participants, l Allow for both customized media Zattoo one. However, if pre-roll is to be as our logistic resources are so limited. players (which open in a separate used for an operational service, several Also the number of end users were quite page) and embedded players; related open questions apply: modest, as we did not carry out any l Optionally embed DRM in the stream significant promotion of the portal. (if required to control consumption of l Who provides the pre-roll ad con- the media received) 8; tent? These EBU tests cannot be considered l Develop an XML-based template for l Should the ad content be adapted to rigorous scientific tests. They were EPGs and optionally provide an EPG the destination market? more akin to “proof of concept” and (daily, weekly) for each channel; l Is the ad content both channel- and “experience-gathering” evaluations. l Provide additional coverage of special zone-specific (leading to different Octoshape is not the only commercial events (if required); pre-rolls for different markets)? P2P system available in the market but l Extension of portal services to embrace we selected it because our previous content downloading and VoD (on- experience with this system was positive. demand) services (documentaries, Preliminary conclusions archives, recorded sports events, etc); The main conclusion of the trial is l Hybrid TV receivers with broadband The principal conclusions of this trial can that Octoshape is an excellent Internet (Ethernet or Wi-Fi) connection: be summarized as follows: distribution system for carrying audio embedded P2P client in commercial and video streams to PC users. The TV sets and set-top boxes (such as in l EBUP2P represents a state-of-the- system is scalable, reliable, easy to manage DVB); art technical solution and fulfils all and interoperable with a number of the tested technical and operational codecs, operating systems, browsers and requirements – in terms of the service geolocation systems. In the course of Acknowledgements quality, scalability, video and audio the project a large number of issues were quality, accessibility, security and user- successfully resolved, although several The EBU would like to thank all participants friendliness. issues were left open for future activities for their very kind cooperation in this field l EBUP2P has no technical limitations (see the next section). trial. Particular thanks should go to the regarding the number of radio and Octoshape staff – Stephen Alstrup, Theis TV channels to be accommodated in The P2P Media Trial has shown that Rauhe and Lasse Riis – for their fast and a Portal. Members can flexibly join in Octoshape can be used by our Members efficient resolution of all problems that and opt out at any time. as a viable and technically appropriate may have arisen during the tests. Many l EBUP2P can fulfil our requirements system for the distribution of audio and thanks also go to Nathalie Cullen from concerning copyright, by applying video streams across the Internet. It can the EBU Communication Service for the territorial filtering (ge-olocation) and be used either as a standalone distribution attractive and functional web design of watermarking. system or in combination with some CDN the portal. And last (but not least), thanks l EBUP2P enables a number of business or IP Multicasting technologies. should also go to the EBU management models. for providing continuous support and encouragement for conducting this project.

8) Participants will be able to apply different DRM (digital rights management) tools in order to protect their content. First published: 2008-Q3

2008 – EBU TECHNICAL REVIEW 61 AUDIO OVER IP

Streaming audio contributions over IP– a new EBU standard

Lars Jonsson Mathias Coinchon Swedish Radio EBU Technical Department

Audio-over-IP end units are increasingly being used in radio operations for the streaming of radio programmes over IP networks, from remote sites or local offices into main studio centres. The IP networks used can be well-managed private networks with controlled Quality of Service. However, the open Internet is increasingly being used also for various types of radio contribution, especially over longer distances. Radio correspondents will have the choice in their equipment to use either ISDN, the Internet via ADSL or other available IP networks to deliver their reports. ISDN services used in broadcasting will be closed down in some countries.

The EBU has created a standard for interoperability in a project group, N/ACIP (Audio Contribution over IP). This standard, which has been jointly developed by members of the EBU group and manufacturers, is published as EBU Tech 3326-2007. The standard has quickly been implemented by the manufacturers. A “plug test” between nine manufacturers, held in February 2008, proved that earlier incompatible units can now connect according to the new standard.

Internet Protocol (IP) is used worldwide on the Internet and also on the IP-based corporate or private networks used by Audio over IP application A Audio over IP application B broadcasters. It is independent of the underlying data transmission technology and many IP adaptations exist for the Audio stream / Packets Audio stream / Packets physical layers, e.g. Ethernet, ATM and SDH over copper, fibre or radio links. RTP RTP TCP UDP UDP TCP Applications can communicate in a IP IP standardized way over interconnected IP NC NC networks. Host computers can be reached almost instantly wherever they are IP network located. Fixed or temporary connections for audio contributions can share similar

62 EBU TECHNICAL REVIEW – 2008 AUDIO OVER IP

types of applications. The connection is established by dialling a number or an e-mail-like name. The audio stream is then sent using standardized protocols (SIP, RTP, UDP). Many types of audio coding formats can be used at various bitrates. Higher bitrates will allow stereo audio or multi-channel with linear PCM coding. The permitted maximum audio bitrate will depend on the bandwidth and if the network has a good QoS.

The open Internet, as well as closed IP networks, are constantly being improved and are moving towards higher bandwidths, which will allow for the transfer of high-definition TV pictures and high quality audio to all consumers. Typical infrastructure using SIP It will be possible to use both corporate

IP networks and the Internet for audio- GSM + Wi-Fi only contribution and distribution. The VoIP Phone IP Net cost of using networks will decrease and the Quality of Service will be improved. Studio-side NGN (Next Generation Networks) is AoIP Codec the common name for improved public Firewall/NAT networks that will offer considerably Location/ SIP higher bandwidth. Wi-Fi VoIP Redirect Proxy Phone Server Server

Mobile IP networks will be available with good population coverage in many countries. 3G/UMTS and LTE systems The location server associates VoIP names with IP addresses and/or (4G) will soon offer higher bitrates telephone numbers upstream. WiMAX and WLAN hotspots • SIP name: sip:[email protected] (picture courtesy: Steve Church, Telos) are other possible evolutions which may • Telephone number: 1 216 2417225 offer solutions for radio contribution. However, the robustness required for live contribution may not always be guaranteed in these systems because of interference and shared access. The added value of these new networks will be near- File transfer instant access for news reporters in all outside - broadcast Interview Discussion PMP urban areas. The improvements gained by using the presence functionality in SIP (Session Initiation Protocol), will be that bidirectional with bidirectional unidirectional narrowband return broadband the reporter can easily be reached almost anywhere in the world with one identity, irrespective of the platform being used, temporary permanent temporary + permanent such as a mobile phone or Audio over connection connection = temporary IP codec.

undefined / shared defined and constant undefined + defined Types of connections bandwidth bandwidth = undefined

Two types of connections can be used: Codec(s) or endpoint(s) Codec(s) and unknown endpoint(s) known l Permanent connections – which are : rare combination generally based on managed private

2008 – EBU TECHNICAL REVIEW 63 AUDIO OVER IP

networks with constant and well- sometimes has higher priority than TCP Networks known bandwidth and Quality of in routers. Service. For permanent connections, Telecom operators are offering an the audio codec types are usually Further information on the EBU N/ACIP increasing number of IP-based services. known in advance. project can be obtained at : http://www. Traditional services based on ATM, PSTN, ebu-acip.org ISDN, SDH and PDH will gradually be l Temporary connections – which may phased out or will become expensive niche be based on previously-unknown Types of equipment products. They will tend to be replaced networks with shared and unknown by all-over-IP services carried over copper, bandwidth over the Internet or over Different types of equipment can be fibre or wireless links. temporary leased private networks. identified: The codecs and endpoint may be Some EBU members have experience unknown. The audio codec type can l General contribution equipment – of real-world testing of Audio over IP be found through negotiation using meant for all types of contribution codecs over various types of IP networks. the SIP and SDP protocols. (fixed or remote); Determining the most suitable type of network from the solutions offered by EBU standard providers requires careful evaluation.

In the past, Audio over IP end units from In particular, measurements and real- different manufacturers have not been world tests with audio must be made compatible. Based on an initiative coming by the broadcaster in order to analyse from German vendors and broadcasters, the service level agreement and to verify the EBU started a project group called the performance offered by the network N/ACIP (Audio Contribution over provider. The network must be tested with IP). One of its tasks is to suggest a the applications that are to be used. Long- method for interoperability. After a term testing (months) of uninterrupted joint meeting between broadcasters and Example of an ISDN and Audio over IP audio throughput is recommended. manufacturers in September 2007, an fixed end unit interoperability recommendation was The distinction between well-managed published, which manufacturers have IP networks and the open Internet is already implemented [1]. l Portable contribution equipment – important. On the open Internet, no used mainly for monophonic speech mechanisms yet exist to achieve a good In a plug-test held at the IRT in Munich contribution at low bi-trates. QoS. The Internet is a “best effort” in February 2008, nine manufacturers network with no guaranteed Quality of demonstrated interoperability. The EBU Service at all. Over a ten-year period is working on a reference implementation the packet loss, delay and jitter over the software, which can be used to verify Internet have slowly improved, but the compatibility. network performance still poses a major problem to the developers of Audio over The requirements for interoperability are IP units. based on the use of RTP over UDP for the audio session and SIP for signalling. The Last mile access packet payload audio structure is defined by IETF RFC documents for commonly- There are many access solutions to connect used audio formats in radio contribution, the end user to the Internet or private IP such as G.722, MPEG Layer II and linear networks. Here’s an overview: PCM. By using SIP, a negotiation can be made to automatically find a common Fibre optics audio coding system on unknown units at each end. l This is the highest quality access solution, offering low error rates and The EBU standard suggests using RTP low delays. It is ideal for contribution over UDP rather than TCP. A one-way purposes but is still expensive and not RTP stream with a small header (low widespread. overhead) is more suitable for audio Example of a portable ISDN and Audio l Some cities have started to deploy transfers. Moreover, RTP over UDP over IP unit FTTH (Fibre to the Home) access.

64 EBU TECHNICAL REVIEW – 2008 AUDIO OVER IP

among users of the same radio Abbreviations cell. So, at the moment, this is not ideal for contribution despite the 3G 3rd Generation mobile RSTP Real-Time Streaming great advantage of mobility. Many communications Protocol operators also filter the traffic and ADSL Asymmetric Digital RTCP Real-Time Control block access for Voice/Audio over IP. Subscriber Line Protocol l Some operators plans to offer access to AoIP Audio over IP: broadband RTP Real-time Transport private IP networks in the future. We audio Protocol may expect that future solutions will ATM Asynchronous Transfer SAP Session Announcement have better Quality of Service. Mode Protocol HSDPA High-Speed Downlink SDH Synchronous Digital Satellite Packet Access Hierarchy HSUPA High-Speed Uplink Packet SDP Session Description l It is possible to get Internet access Access Protocol through a satellite by using a IETF Internet Engineering Task SDSL Symmetric Digital transmitter system for the return Force www.ietf.org Subscriber Line channel. It is used in remote location IP Internet Protocol SIP Session Initiation Protocol where all other access technologies ISDN Integrated Services Digital SMTP Simple Mail Transfer are not available. DVB-RCS (Return Network Protocol Channel over Satellite) is used in LTE Long Term Evolution TCP Transmission Control most cases. The access is generally (4th generation mobile Protocol shared by users and so it is difficult to networks) UDP User Datagram Protocol have a guaranteed Quality of Service. PCM Pulse Code Modulation UMTS Universal Mobile Inmarsat BGAN is also another PDH Plesiochronous Digital Telecommunication option. Hierarchy System l It may be possible in the future to have PPP Point-to-Point Protocol VoIP Voice-over-IP: access to private satellite networks PSTN Public Switched Telephone narrowband audio with enhanced Quality of Service. Network WiMAX Worldwide l The delay for the transmission over QoS Quality of Service interoperability for satellite is generally long (about 500ms RFC Request For Comments Mobile Access roundtrip delay). It is also necessary to (IETF standard) have a direct view to the satellite.

Wi-Fi Copper with xDSL (ADSL, l SDSL: Symmetrical uplink/downlink ADSL2+, VDSL, SDSL) bitrate. This type of access is preferred l Wi-Fi is not really a last mile access but for contribution because of the higher more a home network solution. It is l This type of access is now in widespread uplink bitrate for sending. however available as a last mile access use to make Internet connections. in some cities (sometimes for free). Business providers also have solutions Mobile communication: 3G/ l The frequency band is shared without for connecting to private IP networks UMTS, HSDPA, WiMAX coordination by many users and also using xDSL with a guaranteed Quality other systems (microwave ovens, of Service. This type of solution is l Mobile high bitrate accesses to DECT telephones). So it is impossible pre-ferred to straightforward Internet the Internet are starting to emerge. to have guaranteed Quality of Service access for contribution purposes. However the problem is the lack on such accesses but they may give l ADSL: Asymmetrical uplink/downlink of guaranteed Quality of Service. good results for a local access in not- bitrate. In many cases the access is shared too-crowded places.

2008 – EBU TECHNICAL REVIEW 65 AUDIO OVER IP

Synchronization Delay of delay. When using Audio over IP in combination with video contribution, Variations in the delivery time of packets Buffers at the receiving end can lip sync will be an issue. occur, mainly due to varying delays introduce a considerable amount of in routers and the sharing of available delay. The delay buffer size is a trade- Conclusions capacity with other data traffic. Buffering off between an acceptable delay and a is required to compensate for this variation reliable transmission. In addition, the The continuous development of IP (see the diagram). There is generally IP network itself has a delay, from a few networks, combined with more no clock transported so it must be tenths of milliseconds in well-managed sophisticated Audio over IP end units will reconstructed at the receiving end. Many networks up to 500 ms or more on very lead to more use of this technology in the different clock recovery algorithms exist. long distances over the open Internet. future. The EBU group N/ACIP has made a The difficulty is to estimate the slow The audio encoding itself may give proposal for interoperability. Connections clock drift correctly and separate it from delays of just a few milliseconds for over the Internet with different types of the short-time network jitter. Streaming PCM up to more than hundreds of telephony and professional units for audio with high quality is dependant on milliseconds for some bitrate-reduced broadcasting will improve telephone a guaranteed and stable clock rate at both coding formats. audio quality and worldwide access for the sending and receiving ends. Another reporters. Small handheld units and also possibility is to use an external clock The time to fill in packets must also software codecs in laptops or mobile source, such as GPS or from a common be considered: longer packets will phones will provide very efficient tools non-IP network clock. mean increased delay, especially at for reporters. SIP will provide a very low bitrates. In the case of a two-way powerful way of finding the other end, conversation, a total round-trip delay and negotiate a suitable audio coding which is less than 50 ms is generally format. Fixed Audio over IP units will preferred, otherwise a conversation begin to replace older synchronous point- becomes difficult, especially when to-point equipment for contribution of persons from the general public are stereo or multichannel audio. interviewed. Experienced reporters may be less sensitive to higher values First published: 2008-Q1

Lars Jonsson was born in 1949 and received an M.Sc. in Electronic Engineering at the Royal Institute of Technology in Stockholm in 1972. He joined the Research Team of Swedish Radio and, in the early days, worked on the development of video and RF systems. Later he worked in Operation and Development at the Local Radio Company. He rejoined the SR development team in 1992. During the last decade, Mr Jonsson has worked on digital audio quality issues, archiving and the audio computer infrastructure within Swedish Radio. He is a member of several working parties within the Audio Engineering Society and the EBU. Lars Jonsson is currently the chairman of the EBU Audio Contribution over IP working group, N/ACIP.

Mathias Coinchon was born in 1975 and graduated in 2000 in communication systems engineering from the Swiss Institute of Technology in Lausanne (EPFL), Switzerland, and the Eurecom Institute in Sophia-Antipolis (France). He developed his diploma thesis at BBC R&D in Kingswood Warren, studying and developing propagation analysis solutions for Digital Radio Mondiale (DRM) field trials. He then joined as technical project manager, a startup company called Wavecall, which is active in the development of a physical propagation prediction tool for the mobile telecommunication industry. After Wavecall, Mr Coinchon spent four years at RSR public swiss radio, first with responsibility for contribution and then in charge of a group dealing with distribution, contribution and IT networks. During this period, he was also actively involved in the technical study group of the Swiss broadcasting corporation (SRG-SSR idée suisse) for the re-launch of DAB digital radio in Switzerland. Since 2006 Mathias Coinchon has been a senior engineer in the EBU Technical Department. He is currently secretary of the N/ACIP and N/VCIP groups dealing with audio and video contribution over IP. He is also involved in digital radio matters and is vice-chairman of the WorldDMB technical committee. His other areas of work include audio, distribution, open-source software, IP-based TV studios and traffic information. In his spare time, he is involved in helping a community radio station.

66 EBU TECHNICAL REVIEW – 2008 Visit EBU Technical’s new website at : http://tech.ebu.ch

2008 – EBU TECHNICAL REVIEW 67 Find EBU Technical - Publications, including EBU Technical Review, at http://tech.ebu.ch/publications