Institutionen för datavetenskap Department of Computer and Information Science

Final thesis

Simulation of Set-top box Components on an X86 Architecture by Implementing a Hardware Abstraction Layer

by Faruk Emre Sahin Muhammad Salman Khan

LITH-IDA-EX--10/050--SE

2010-12-25

Linköpings universitet, SE-581 83 Linköping, Sweden


Supervisors: Fredrik Hallenberg and Tomas Taleus, R&D at Motorola (Linköping)

Examiner: Prof. Dr. Christoph Kessler, Dept. of Computer and Information Science, Linköpings universitet

Abstract

The KreaTV Application Development Kit (ADK) product of Motorola enables application developers to create high-level applications and browser plugins for the IPSTB system. As a result, customers reduce development time, cost and supplier dependency. The main goal of this thesis was to port this platform to a standard PC to make it easier to trace bugs and debug the code. This has been done by implementing a hardware abstraction layer (HAL) for Linux. The HAL encapsulates the hardware-dependent code and provides an abstraction of the underlying architecture to the operating system and to application software. Thus, the embedded platform can be emulated on a standard Linux PC by implementing a HAL for it. We have successfully built the basic building blocks of the HAL, with some performance degradation. We are able to start up the application platform, use graphics mixing features, and play a video by filtering the data from the transport stream and decoding it. However, much work remains to complete the HAL so that the whole platform runs as smoothly as it does on set-top box hardware.

Keywords: Simulation, Hardware Abstraction Layer

Acknowledgements

We are thankful to our supervisors, Fredrik Hallenberg and Tomas Taleus, whose concern, guidance and support from the initial to the final stage of the thesis enabled us to carry out this work successfully. We also want to thank our examiner Christoph Kessler for his help and advice throughout the thesis.

Preface

This thesis has been performed at Motorola, Kreatel Communications AB by Faruk Emre Sahin and Muhammad Salman Khan. The company provides a platform called KreaTV for fast high-level application development and integration on different STBs, offering its clients reduced development time, cost and supplier dependency. Faruk mainly worked on the graphics mixing features, while Salman worked on the demultiplexer and video decoding parts. Accordingly, chapter 1, sections 2.2, 3.1, 3.3, 4.1, 4.2, 5.1-5.4 and 6.1 have been written by Faruk and reviewed by Salman. Sections 2.3, 5.5-5.8 and 6.2 were written by Salman and reviewed by Faruk. Sections 2.1, 3.2, 4.3, 4.4 and chapter 7 were co-authored by both Faruk and Salman.

Contents

1 Introduction
   1.1 Background
   1.2 Problem
   1.3 Method of Thesis Work
       1.3.1 Philosophy
       1.3.2 Work Process
   1.4 Reading Instructions

2 Background
   2.1 Inter Process Communication
   2.2 Graphics Handling
       2.2.1 Pixel
       2.2.2 Blitting
       2.2.3 Alpha Blending
       2.2.4 Color Keying
       2.2.5 Clipping
   2.3 MPEG
       2.3.1 Transport Stream
       2.3.2 TS Packet fields
             SYNC Byte
             Transport Priority (TPR), Payload Unit Start (PUS), Error Indicator (EI)
             Packet Identifier (PID)
       2.3.3 Scrambling Control (SCR)
       2.3.4 Adaption Field (AF)
       2.3.5 Continuity Check Index (CC)
       2.3.6 Program Specific Information (PSI)
             Program Association Table (PAT)
             Program Map Table (PMT)

3 System Architecture
   3.1 General Architecture
   3.2 Hardware Abstraction Layer Overview
       3.2.1 Hardware Abstraction Interface
       3.2.2 Resources
   3.3 Inter-Process Communication

4 Problem Analysis
   4.1 Problem
       4.1.1 Debugging in Embedded Systems
   4.2 Solution
   4.3 Requirements
       4.3.1 Restrictions
       4.3.2 Hardware Abstraction Interface
       4.3.3 Graphics Memory Resource
       4.3.4 Graphics Layer Resource
       4.3.5 Graphics Blitter Resource
       4.3.6 Demuxer Resource
       4.3.7 Media Decoder Resource
       4.3.8 Video Layer Resource
   4.4 Review of Related Work

5 Design and Implementation
   5.1 Graphics Handling
       5.1.1 Library Choice
             DirectFB
             SDL
             Comparison of SDL and DirectFB
   5.2 PC-HAL Memory Resource
       5.2.1 Unix Mmap function
   5.3 PC-HAL Graphics Layer Resource
       5.3.1 Update Screen
   5.4 PC-HAL Graphics Blitter Resource
   5.5 PC-HAL Stream Input Resource
   5.6 The PC-HAL Demuxer Resource
       5.6.1 Connect Input Interface
       5.6.2 Get Format Interface
       5.6.3 Set Format Interface
       5.6.4 Create Output Interface
       5.6.5 Destroy Output Interface
       5.6.6 Open Output Pipe Interface
       5.6.7 Close Output Interface
       5.6.8 Add Packet Filter Interface
       5.6.9 Remove Packet Filter Interface
   5.7 The PC-HAL Media Decoder Resource
       5.7.1 Interface Description
             Decoder States
       5.7.2 Reset Interface
       5.7.3 Halt Interface
       5.7.4 Run Interface
       5.7.5 Set Format Interface
   5.8 Testing of PC-HAL Resources

6 Results and Evaluation
   6.1 Graphics Mixing
   6.2 PC-HAL Demuxer Performance
       6.2.1 Testing with Same Buffer Size
       6.2.2 Testing with Different Buffer Sizes

7 Conclusion and Future Work

Glossary

Acronyms

Bibliography

Chapter 1

Introduction

1.1 Background

A set-top box, often abbreviated as STB, is a small computer with a limited number of resources connected to a digital television. It differs from conventional personal computers in its limited computation power, small memory and dedicated I/O resources. The input of an STB is a signal source; the STB turns the signal into content that is finally displayed on a TV or another display device. The STB handles different types of inputs, such as signals from satellite and digital streams from an IP-based network. A typical digital set-top box contains one or more microprocessors to run the operating system, possibly Linux or Windows, and to parse the MPEG transport stream. A set-top box also includes Random Access Memory (RAM), an MPEG decoder chip, and further chips for audio decoding and processing. The contents of a set-top box depend on the Digital Television (DTV) standard used. The term set-top box basically describes the platform device with a TV set as its output and a broadband network connection as its input. The STB accepts commands from the end user, often via a remote control device, and

transmits these commands to the network operator through some sort of return path. Most set-top boxes deployed today have return path capability for two-way communication. Typical applications related to the STB include digital video channel broadcasting, video on demand, web browsing and similar common applications.

This thesis has been performed at Motorola, Kreatel Communications AB. This company provides a platform called KreaTV for fast high-level application development and integration on different STBs, offering its clients reduced development time, cost and supplier dependency.

1.2 Problem

The KreaTV platform currently runs on dedicated hardware. This hardware consists of multiple resources to accomplish the necessary tasks related to IPTV applications. These hardware resources are designed specifically to handle the business requirements of IPTV users. Therefore, they have fewer capabilities and consume less power than a standard Personal Computer (PC). From the user perspective, the IPSTB hardware is good enough to satisfy video broadcasting, web browsing and similar requirements. However, from the developer's and tester's perspective, developing and testing the system on an embedded machine can be quite tedious.

In the past, the IPSTB was restricted by low broadband connections and the high cost of the infrastructure needed to transport IPTV traffic to the consumer's home. Nowadays, however, IPTV is growing at a very fast pace, as high-speed broadband connections are available to millions of households worldwide, and it is expected to continue growing in the near future. Due to the increasing demands and expectations of the consumer market, IPSTB software and hardware are evolving to satisfy user requirements. Hybrid set-top boxes allow traditional TV broadcasts, such as those of terrestrial, satellite or cable TV providers, to be brought together with video delivered over the Internet. Therefore, vendors provide additional applications, such as browsers, with their set-top boxes to attract consumers with an all-in-one package. With the increasing number of applications on the IPSTB, such as Internet browsers and multimedia applications, the complexity of set-top boxes increases. Software quality becomes a challenge for the software engineers, as many bugs are introduced with the increasing complexity of the software. Motorola therefore wants to make it easier to find and trace bugs in its application platform. In addition, the hardware for set-top boxes is improving day by day, so a need arises to test the current software on a more powerful computer, to see the compatibility and effectiveness of the current platform on a better architecture before that architecture is available on the market.

1.3 Method of Thesis Work

This section describes the philosophy and the work process used in the thesis.

1.3.1 Philosophy

The philosophy of the thesis work can be described as follows.

• Software reuse. If a library or open source code is available for any required part of the project, that code should be used. There is no need to reinvent the wheel.

1.3.2 Work Process

The first step of the thesis was to identify and understand the problem. Then we did a literature study to see what had been done so far to simulate embedded systems. We also spent a considerable amount of time understanding the existing system. Next, we did research to find available, reusable software components to be used in different parts of the project. Finally, we carried out the implementation, obtained the results and evaluated them by comparing the performance of our implementation with that of an average set-top box hardware.

1.4 Reading Instructions

The report has been structured as follows.

• Chapter 2 describes the background that is necessary to understand the rest of the report.

• Chapter 3 explains the existing system's architecture.

• Chapter 4 looks at the problem in detail and discusses the requirements to be implemented within the limits of this thesis.

• Chapter 5 explains our design and implementation for the requirements listed in chapter 4.

• Chapter 6 presents the results of the implementation with performance comparisons.

• Chapter 7 discusses our findings and suggests future improvements.

Chapter 2

Background

This chapter explains the underlying technical concepts and terms related to the implementation.

2.1 Inter Process Communication

IPC is a set of techniques and mechanisms that provide communication between a set of processes, by exchanging data between multiple threads in one or more processes. The processes may be running on the same machine with shared memory, or on separate machines, each having its own local memory, interconnected by a network. In embedded systems, CPU and memory are the most important resources because of power consumption, the real-time constraints of applications and the complexity of multi-processing environments. The processes also have strict CPU and memory requirements. In this kind of system, IPC plays a major role in fulfilling the communication constraints of the processes, and the performance of these systems depends heavily on the selection of a particular IPC mechanism.

In multi-process systems with a large amount of data flow, the processes are usually dependent on shared data, and a high-performance IPC technique plays a major role for application response time. Various design techniques have been proposed to improve the IPC mechanism at kernel level [1]. The IPC mechanism can also be improved by the use of libraries which access the system API calls; during the library function calls, interrupts are disabled to provide atomic IPC operations, so the overhead of context switching is minimized. IPC can also be improved by the use of hardware and software features, as proposed by Hsieh et al. [2]. IPC provides a set of APIs which allow a standard set of operations for interprocess communication. Some of the available mechanisms for UNIX-based systems are as follows:

• Socket Communication: Socket communication provides two-way point-to-point communication between two processes. A socket itself is a point of communication which has one or more associated processes [3].

• Message Queue: A set of processes exchange information via access to a common system message queue. The sending process puts a message on the queue, which is read by another process [3].

• Pipe and Named Pipe: Piping is a mechanism where the output of one process is made the input of another process [3]. Named pipes are used to exchange data between unrelated processes and between processes on different computers, where the server has a well-known name and the client accesses the server via that name [4].

• Semaphore: Semaphores are a programming construct designed by E. W. Dijkstra in the late 1960s, often used to monitor and control the availability of system resources such as shared memory segments [3]. See section 5.2 for further information.

• Shared Memory: Shared memory is an efficient way of passing data between processes. One process creates a memory portion which other processes can access [3]. See section 5.2 for further information.

• Message Passing: The message passing IPC mechanism lets processes send and receive messages, and queue messages for processing in an arbitrary order [3].

• Memory Mapped File: See section 5.2.
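As a concrete illustration of the pipe mechanism listed above, the following minimal C sketch (illustrative only, not code from the KreaTV platform) sends a message from a parent process to a child through one pipe and receives it back through a second one:

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Parent sends msg to the child through one pipe; the child echoes
   it back through a second pipe. Returns the number of bytes the
   parent received, or -1 on error. */
ssize_t pipe_roundtrip(const char *msg, char *out, size_t outlen)
{
    int to_child[2], to_parent[2];
    if (pipe(to_child) == -1 || pipe(to_parent) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == 0) {                      /* child: echo what it reads */
        char buf[128];
        close(to_child[1]);
        close(to_parent[0]);
        ssize_t n = read(to_child[0], buf, sizeof buf);
        if (n > 0)
            write(to_parent[1], buf, (size_t)n);
        _exit(0);
    }
    /* parent: write the message, then read the echo */
    close(to_child[0]);
    close(to_parent[1]);
    write(to_child[1], msg, strlen(msg));
    close(to_child[1]);
    ssize_t n = read(to_parent[0], out, outlen - 1);
    if (n >= 0)
        out[n] = '\0';
    close(to_parent[0]);
    waitpid(pid, NULL, 0);
    return n;
}
```

The named-pipe and socket variants follow the same pattern, differing mainly in how the two endpoints find each other.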

2.2 Graphics Handling

Some necessary technical terms related to graphics management are ex- plained in this section.

2.2.1 Pixel

The pixel is the smallest addressable and controllable screen element. The more pixels used for an image, the better the result. The number of pixels in an image is called the resolution. For example, a 640 x 480 display has 640 pixels horizontally and 480 vertically, resulting in 640 x 480 = 307,200 pixels. A pixel can only show one color at a time. The total number of colors a pixel can take is determined by the number of bits used to represent it. For example, 8-bit color allows up to 2^8 = 256 colors to be displayed. At this color depth one may see visible color banding where multiple pixels are joined together to form an image. If the color depth is increased to 16, 24 or 32 bits, the color blending is smooth, so such artifacts are not visible.
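The arithmetic above generalizes to any resolution and color depth; the two helper functions below (an illustrative sketch, not part of the platform) compute the number of representable colors and the framebuffer size in bytes:

```c
/* Number of bytes needed for a w x h framebuffer at the given
   color depth (bits per pixel). */
unsigned long fb_bytes(unsigned w, unsigned h, unsigned bpp)
{
    return (unsigned long)w * h * bpp / 8;
}

/* Number of distinct colors representable at a given depth. */
unsigned long colors(unsigned bpp)
{
    return 1UL << bpp;
}
```

For the 640 x 480 example, `fb_bytes(640, 480, 8)` gives 307,200 bytes, and four times that at 32-bit depth.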

2.2.2 Blitting

Low-level 2D graphics involves copying blocks of memory. For example, to display a texture we must copy the texture's block of memory into the correct part of the screen's block of memory. In addition, during the copying we may want to take an alpha value into account for transparency (see section 2.2.3). In computer graphics, this process is called a Block Image Transfer (Blit).

2.2.3 Alpha Blending

Alpha blending mixes an image with a background image, producing a new image. The degree of the front image's effect ranges from completely opaque to completely transparent. When alpha blending is done at the pixel level, if the foreground color is completely transparent, the blended color is the background color; likewise, if the foreground color is completely opaque, the blended color is the foreground color. 32-bit graphics systems contain four channels: three 8-bit channels for red, green and blue, and one 8-bit alpha channel. The alpha channel is a mask specifying how a pixel's color should be merged with another pixel when the two are put on top of one another. Generally, the alpha channel is defined per object instead of on a pixel-by-pixel basis.
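Per-pixel alpha blending as described above amounts to a linear interpolation per channel. A sketch in C, assuming ARGB8888 pixels (illustrative only; real blitters often use fixed-point tricks instead of division):

```c
#include <stdint.h>

/* Blend one 8-bit color channel: alpha = 255 gives the foreground,
   alpha = 0 gives the background, values in between mix linearly. */
uint8_t blend_channel(uint8_t fg, uint8_t bg, uint8_t alpha)
{
    return (uint8_t)((fg * alpha + bg * (255 - alpha)) / 255);
}

/* Blend two ARGB8888 pixels using the foreground pixel's alpha;
   the result is made fully opaque. */
uint32_t blend_argb(uint32_t fg, uint32_t bg)
{
    uint8_t a = (uint8_t)(fg >> 24);
    uint8_t r = blend_channel((fg >> 16) & 0xFF, (bg >> 16) & 0xFF, a);
    uint8_t g = blend_channel((fg >> 8) & 0xFF, (bg >> 8) & 0xFF, a);
    uint8_t b = blend_channel(fg & 0xFF, bg & 0xFF, a);
    return 0xFF000000u | (uint32_t)r << 16 | (uint32_t)g << 8 | b;
}
```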

2.2.4 Color Keying

Color keying blends two images together, where one color in the front image is treated as transparent, revealing the image behind it. The default behavior of a blit operation is to copy a block of pixels from one image to another. However, this is not always sufficient: often a portion or portions of the source image should be invisible in the blit operation, leaving the corresponding destination pixels unchanged. This is done by setting a color key.
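A color-keyed blit can be sketched as a copy loop that simply skips source pixels matching the key. The surface layout here (a linear array of 32-bit pixels with a per-surface pitch in pixels) is an assumption for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Copy a w x h block of pixels from src to dst, skipping any source
   pixel equal to the color key so the destination shows through. */
void blit_colorkey(uint32_t *dst, size_t dst_pitch,
                   const uint32_t *src, size_t src_pitch,
                   size_t w, size_t h, uint32_t key)
{
    for (size_t y = 0; y < h; y++)
        for (size_t x = 0; x < w; x++) {
            uint32_t p = src[y * src_pitch + x];
            if (p != key)
                dst[y * dst_pitch + x] = p;
        }
}
```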

2.2.5 Clipping

Clipping is writing graphics only to a certain portion of the screen, keeping the area around it free of graphics. The region against which an object is clipped is called the clipping window.

2.3 MPEG

Recent progress in digital technology has made compressed audio and video signals widespread. Standardization of common compression methods is important so that services can interact with each other through a common platform. The Moving Picture Experts Group (MPEG) is an expert working group of ISO and IEC working on the standardization of audio and video data compression; it specifies complete standards for audio and video compression [5]. According to the MPEG-2 standard, there are two types of data streams: the Program Stream and the Transport Stream. We are only concerned with the Transport Stream in the scope of this thesis.

2.3.1 Transport Stream

The MPEG transport stream, or simply Transport Stream (TS), is a standard format for transmission and storage of audio, video, and user-defined data. The format of the transport stream is specified in MPEG-2 Part 1, Systems (ISO/IEC 13818-1). The main intention is to provide a standard way of multiplexing digital audio, video and other data streams and to synchronize the output. The transport stream also provides other useful features, such as error correction for unreliable transmission media. It is actively used in broadcasting applications such as DVB and ATSC. An MPEG-TS contains a sequence of fixed-size packets of 188 bytes; each packet consists of a 4-byte header and 184 bytes of payload. Since MPEG-TS has a fixed packet size and each packet has a sync byte, it is easy to detect the start and end of a TS packet.

The MPEG transport stream (MPEG-TS) uses a fixed-length packet size, and a packet identifier identifies each transport packet within the transport stream. A packet identifier in an MPEG system identifies the Packetized Elementary Stream (PES) of a program channel. A program (such as a television show) is usually composed of multiple PES channels (e.g. video and audio). An MPEG-TS carries multiple programs, and to identify the programs carried

Figure 2.1: MPEG Transport Stream Packet Format [6]

on an MPEG-TS, a Program Association Table and Program Map Tables are used. These tables are periodically transmitted to provide a list of the programs contained within the MPEG-TS. They list the programs and their associated Program Identifiers (PIDs), which allows the MPEG receiver and decoder to select and decode the correct packets for a specific program.

2.3.2 TS Packet fields

The transport stream packet contains multiple header fields that provide services to the MPEG elementary streams. The packet header fields are as follows.

SYNC Byte

The header starts with the sync byte, which is responsible for packet identification and synchronization. It marks the exact start of a packet in the stream. The sync byte value is 0x47.

Transport Priority (TPR), Payload Unit Start (PUS), Error Indicator (EI)

A set of flags (3 bits in total) provides information about how the packet is processed at the receiver side. The Transport Priority flag indicates high or low priority of a packet in the stream. The Payload Unit Start flag alerts the receiver to the start of a new Packetized Elementary Stream (PES) packet. The Error Indicator flag indicates that the packet may contain uncorrectable errors from an earlier stage of transmission.

Packet Identifier (PID)

The Packet Identifier (PID) is used to identify the individual streams within a transport stream. It is a 13-bit field set by the multiplexer at the sender side. Some predefined PIDs are used to carry control information for the various streams. At the receiver side, packets whose PID is unknown or not needed are silently discarded. For example, the PID value 0x1FFF identifies null packets, which the receiver silently ignores.

2.3.3 Scrambling Control (SCR)

The two scrambling control bits support conditional access to confidential TS packets: they indicate whether the payload of the TS packet is scrambled, so that only the intended receiver can access the contents.

2.3.4 Adaption Field (AF)

The adaptation field control bits in the packet header indicate whether an adaptation field is present. If present, the adaptation field comes right after the 4-byte packet header, before the start of any user payload data. The adaptation field contains a variety of data used for timing and control of the different streams.

Bits value   Action
01           No adaptation field; payload only.
10           Adaptation field only; no payload.
11           Adaptation field followed by payload.
00           Not used; reserved for future use.

Table 2.1: Adaptation Field Values

2.3.5 Continuity Check Index (CC)

The 4-bit continuity check index, or continuity counter, ensures the sequential order of packets. It is incremented only when a payload is present, i.e. when the adaptation field control value is 01 or 11.
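The header fields described in this section occupy fixed bit positions in the 4-byte TS packet header, per the MPEG-2 systems specification. A parser can be sketched as follows (an illustrative sketch, not the thesis implementation):

```c
#include <stdint.h>

struct ts_header {
    uint8_t  sync;        /* must be 0x47 */
    int      tei;         /* error indicator */
    int      pusi;        /* payload unit start */
    int      priority;    /* transport priority */
    uint16_t pid;         /* 13-bit packet identifier */
    uint8_t  scrambling;  /* 2 bits: scrambling control */
    uint8_t  af_control;  /* 2 bits: 01 payload, 10 AF, 11 both */
    uint8_t  cc;          /* 4-bit continuity counter */
};

/* Parse the 4-byte header of a 188-byte TS packet.
   Returns 0 on success, -1 if the sync byte is wrong. */
int ts_parse_header(const uint8_t p[4], struct ts_header *h)
{
    if (p[0] != 0x47)
        return -1;
    h->sync       = p[0];
    h->tei        = (p[1] >> 7) & 1;
    h->pusi       = (p[1] >> 6) & 1;
    h->priority   = (p[1] >> 5) & 1;
    h->pid        = (uint16_t)((p[1] & 0x1F) << 8 | p[2]);
    h->scrambling = (p[3] >> 6) & 3;
    h->af_control = (p[3] >> 4) & 3;
    h->cc         = p[3] & 0x0F;
    return 0;
}
```

For example, a header `47 40 00 10` parses as PID 0x0000 (a PAT packet) with the payload unit start flag set and a payload-only adaptation field control value.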

2.3.6 Program Specific Information (PSI)

A transport stream contains multiple programs. Each program is composed of various audio, video and control streams, as shown in Figure 2.2. The receiver determines the particular PID of a program and then filters out the TS packets related to that program based on the PID value. To determine the PID values corresponding to the required program, a set of tables is used. These tables are also called "signaling tables"; they are transmitted with the description of the programs contained in the TS packet stream. Signaling tables are sent separately from the PES and are not synchronized with the elementary streams. Figure 2.2 shows the composition of different elementary streams in a transport stream. These elementary streams can be identified using the Program Association Table (PAT) and the Program Map Table (PMT).

In MPEG-2 terminology, the signaling tables are called "Program Specific Information" (PSI) [7]. Each PSI table contains a sequence of PSI sections. The length of each section is variable, and each section allows a decoder to identify the next section in a packet. To verify the integrity of the table, a CRC (checksum) field is used.

Figure 2.2: Program Specific Information of Transport Streams

Program Association Table (PAT)

The PAT lists all the available programs in a transport stream. The programs are identified by a 16-bit value inside the PAT known as the program number. For further detail about each program, the PAT gives a unique PID value identifying the program's Program Map Table (PMT). The PAT is sent with the well-known PID value 0x0000.
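Once a PAT section has been located and its section header and trailing CRC stripped, the remaining body is a loop of 4-byte entries, each pairing a 16-bit program number with a 13-bit PMT PID. A sketch of extracting this loop (illustrative; locating the section itself is assumed to be done elsewhere):

```c
#include <stddef.h>
#include <stdint.h>

/* One PAT entry: a program number and the PID of its PMT.
   Program number 0 instead points at the network PID. */
struct pat_entry {
    uint16_t program_number;
    uint16_t pid;
};

/* Extract up to max entries from the 4-byte-per-program loop of a
   PAT section body; len is the length of that loop in bytes.
   Returns the number of entries written. */
size_t pat_parse_loop(const uint8_t *loop, size_t len,
                      struct pat_entry *out, size_t max)
{
    size_t n = 0;
    for (size_t i = 0; i + 4 <= len && n < max; i += 4) {
        out[n].program_number = (uint16_t)(loop[i] << 8 | loop[i + 1]);
        out[n].pid = (uint16_t)((loop[i + 2] & 0x1F) << 8 | loop[i + 3]);
        n++;
    }
    return n;
}
```

The resulting PIDs are then used to filter the PMT packets of the programs the receiver is interested in.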

Program Map Table (PMT)

The PMT lists detailed information about a program. Each program has its own PMT, identified by a unique PID value. A PMT gives a program number and the list of elementary streams that make up a single MPEG-2 program. Each stream is listed with its stream type value, such as audio or video. Some optional descriptors of MPEG-2 programs and elementary streams are also present in the PMT. There are also other tables present in transport streams, such as the Conditional Access Table (CAT) and the Network Information Table (NIT), but the MPEG-2 standard does not specify the format for them [7].

Chapter 3

System Architecture

We start by describing the general architecture of the STB and then explain the Hardware Abstraction Layer (HAL) in more detail.

3.1 General Architecture

KreaTV runs on different set-top box models, with a specific Hardware Abstraction Layer on each of them. The Hardware Abstraction Layer implements all the hardware-specific details and hides the low-level hardware-dependent code from the upper layers. The KreaTV application platform talks with the HAL through the Hardware Abstraction Interface (HAI) that the HAL exposes. The service layer provides services that are available to the application layer; a number of different services are provided. For example, one can use the naming service to look for applications to download from the operator's server to the set-top box. The application layer resides on top; one can run several independent applications simultaneously on the set-top box. Figure 3.1 shows the overall architecture.


Figure 3.1: Overall Architecture

3.2 Hardware Abstraction Layer Overview

Different hardware platforms need different implementations of the HAL but still provide the same interface (HAI). The HAL also includes the operating system kernel, drivers, and the standard C and C++ libraries. Different hardware platforms have different features and capabilities; the reasons for this include price, specific market demands, customer demands, or demands from the environment in which the hardware operates. The result is, for instance:

• Different hardware architectures (dual or single chip solutions).

• Different Central Processing Unit (CPU) architectures (x86, PowerPC, MIPS, etc.).

• Different sets of connectors.

• Different sets of supporting chips (Smart Card controller, audio, graphics and video mixing, etc.).

• Different hardware support for audio/video decoding, transport parsing, video and graphics mixing, decryption, etc.

The Hardware Abstraction Layer provides a uniform abstraction to the upper layers, implementing all the hardware-dependent code.

3.2.1 Hardware Abstraction Interface

The Hardware Abstraction Interface (HAI) separates the hardware-specific code from the hardware-independent code and gives hardware-independent code access to hardware-specific implementations in a generic way.

3.2.2 Resources

The HAL consists of a number of abstractions of hardware-related resources. Together, the resources have the following characteristics and features:

• They provide an abstract view of the underlying hardware resources.

• They allow for implementations on multiple hardware platforms.

• They provide low-level error handling.

• They provide functionality through well-defined interfaces for use by hardware-independent code.

• They contain hardware-specific code.

• They can provide information about the capabilities of the underlying hardware platform and the implementation, such as available video connectors and codecs.

There are two types of resources: traditional resources and IPSTB-specific resources. Traditional resources are accessed using standard Linux system interfaces, e.g. the network stack and the flash file system. IPSTB-specific resources are accessed with device-dependent libraries provided by the vendors of the set-top boxes. The HAL server and the KreaTV application platform residing in the application layer are independent processes, where the application layer calls the remote objects of the HAL. Figure 3.2 illustrates how client processes (the application layer) use the Hardware Abstraction Layer to access the HAL server. The HAI consists of several C header files, one for each interface; there is one interface for each resource and an additional interface to open resources. The following traditional resources are part of the HAL:

• Audio: Standard Linux Digital Signal Processing (DSP) and mixer devices.

• Ethernet: Standard Linux ethernet driver.

• Flash: Standard Linux Memory Technology Device (MTD) device.

• USB: Standard Linux USB driver (core and Human Interface Device (HID)).

The following IPSTB specific resources are part of HAL:

• DVB Receiver: Controls DVB receivers, be it terrestrial (T), cable (C), or satellite (S). The receiver type is determined by the system when opening the DVB receiver resource.

• Firmware: Handles firmware upgrading.

Figure 3.2: HAL Communication

• Front Display: Provides functionality to change contents and properties of the front display.

• Infrared (IR): Handles the input from remote controls. This is done by a kernel driver that converts IR signals to Linux input events.

• LED: Provides functionality for Light-Emitting Diode (LED) control.

• Media Decoder: Provides the common interface for the media decoding functionality of the IPSTB. It gives access to a specific underlying decoding resource such as a video decoder or an audio decoder.

• Smart Card: Provides control of the smart card hardware.

• Stream Input: Provides an efficient data channel between the client and the connectible resources in the HAL server (e.g. Demuxer and Media Decoder).

• Time Stamp: Provides functionality to synchronize events with decoded video data.

• Demuxer: Performs hardware filtering on MPEG-2 transport streams (TS). The interface provides functionality for raw stream filtering and section filtering. It is also possible to set up descrambling on a specific filter.

• Watchdog: Provides control of a hardware or software watchdog.

• VBI: Provides functions for sending data during the VBI on the TV display.

• Video Output: Provides control of the different video outputs.

• Graphics Memory: Provides the functionality to allocate and bind a portion of memory dedicated to graphics.

• Graphics Blitter: Provides extensive blitting functionality to mix different graphics surfaces.

• Graphics Layer: Provides functionality to edit the mapped memory portion and refresh the screen.

• Video Layer: Provides the functionality to display the video stream on a graphical plane.

3.3 Inter-Process Communication

Various applications and services, on the same layer as well as on different layers, communicate with each other using an internal IPC mechanism developed by Motorola. This IPC framework is similar to conventional IPC frameworks which allow a process to call functions belonging to another process. An object whose methods need to be invoked remotely has an Interface Definition Language (IDL) entry where its interface specification is written. The IDL compiler parses this IDL file and generates the code for the caller and the dispatcher.

Chapter 4

Problem Analysis

The problem is analyzed in detail in this section.

4.1 Problem

With the increasing number of applications on the IP-STB, such as Internet browsers and multimedia applications, the complexity of set-top boxes increases. Software quality becomes a challenge for the software engineers, as many bugs are introduced with the increasing complexity of the software. The main activity for finding bugs in the development phase is debugging the code.

4.1.1 Debugging in Embedded Systems

A set-top box is an embedded system designed to perform a few dedicated functions. It is resource limited and too specialized to support a fully featured debugger. Typically, debuggers for embedded systems address this resource limitation by distributing the debugger between the host and the target system. The two end points communicate over a channel such as a serial or Ethernet port. The portion of the debugger that resides on the target machine is often called the debug kernel, and the code on the host machine is usually called the debugger front end or GUI. The debugger front end and the debug kernel communicate through an interrupt routine. Most embedded systems place their code in non-volatile memory such as EPROM, but the debug kernel needs to update the code image, modify the program and set breakpoints. These systems therefore need to substitute RAM for the normal code memory, usually via some kind of ROM emulator. [8] Besides the overwhelming process of copying the code to the target machine and the long boot-up procedure, debugging in embedded systems requires a stable memory subsystem in the target and is not suitable for initial hardware/software integration. It is also not very stable, as the debugger may not have control over the system at all times. In addition, many popular and high-quality debugging tool suites, such as Valgrind, do not work well with embedded systems.

4.2 Solution

Running the KreaTV platform on a standard PC could help developers in various ways. When developers work with an STB, they need to copy the code from the host machine to the target machine and run the boot-up procedure to see the results. Instead of this overwhelming process, it would be much easier if they could run the system on the machine they are developing on. Running the platform on a PC is also faster, since the kernel is already loaded and the developer only has to restart the application rather than the whole platform. This simulation can be done by writing a hardware abstraction layer for a standard PC. Yoo, Bacivarov, Bouchhima, Paviot and Jerraya [9] proposed an adaptation layer, in our case called the 'Hardware Abstraction Layer', for fast and accurate SW simulation models. In addition, when debugging the code on a PC, it is much easier for developers to find bugs. There is a huge number of debugging tools that can be used if the platform runs on a standard PC. For example, Valgrind, an extensive debugging and profiling tool for C++, cannot be used on an embedded machine while it is fully supported on a standard PC. A limited amount of time was defined for this project, so priorities had to be set regarding which HAL resources should be implemented. The question, then, is which resources are most important for building a basic hardware abstraction layer. Just to be able to start the application in a window, the graphics memory and graphics layer resources need to be set up. In order to draw shapes and mix graphics surfaces, which a standard application usually requires, the blitter resource has to work. To play a simple video, we need the demuxer layer to filter the video stream and program-specific information from the multiple streams residing in a transport stream. This filtered stream is encoded in the form of elementary streams. So, after filtering, the video stream has to be decoded at the server side back to its original form. For this, we need the decoder layer. After a video stream is decoded, it is displayed on the screen. So, to set the graphics surface and video layer parameters for the screen, we need the video layer.

4.3 Requirements

This section describes the requirements for the selected resources.

4.3.1 Restrictions

• The platform shall run on a standard PC using the Linux operating system.

• The programming language shall be C++.

4.3.2 Hardware Abstraction Interface

• The implementation shall be consistent with the provided hardware abstraction interface.

• Input and output parameters for the interfaces shall be preserved.

4.3.3 Graphics Memory Resource

• The developer shall be able to allocate a memory portion dedicated to a graphics surface.

• The allocated memory portion must be shared between the client and server processes.

4.3.4 Graphics Layer Resource

• The developer shall be able to start the application in a window mode.

• The developer shall be able to edit a graphics surface in memory and update the screen with the modified memory.

4.3.5 Graphics Blitter Resource

• The developer should be able to perform blitting on surfaces.

• It shall support scaling.

• It shall support cropping.

• It shall support color keying.

• It shall support alpha blending.

• The blit rate should not be less than that of an average STB.

4.3.6 Demuxer Resource

The demuxer resource filters the MPEG transport stream.

• It shall separate different streams based on the requirements from the user.

• It shall provide a facility for section filtering.

• It shall manage the buffering capability to handle different types of streams.

4.3.7 Media Decoder Resource

The media decoder is a unified resource that provides decoding of video and audio transport streams.

• The decoder shall take the filtered streams from the demuxer output, decode them and send them to the video layer to display the decoded streams.

• It shall provide synchronization between the related streams in a program.

4.3.8 Video Layer Resource

The video layer resource is used to display the decoded video on screen.

• It shall set the video mode and video surface.

• At regular intervals, it shall blit the video streams onto the video surface in order to display the video.

• The resolution and video position shall be set by using this resource.

4.4 Review of Related Work

Testing and debugging embedded systems is a hard and tedious task, not just in this specific project but in general. Thus, various work has been done to make this process easier and more effective. Yoo and Jerraya [10] define the HAL as all the software that is directly dependent on the underlying HW, including boot code, context switch code and code for configuration of and access to HW resources. Furthermore, the authors touch on how a HAL can be used to ease OS porting to a new hardware architecture. Yoo, Bacivarov, Bouchhima, Paviot and Jerraya [9] argue that the HAL shall be simulated in order to build fast and accurate SW simulation models. They state that specific implementations of operating system APIs shall be determined at the HAL level and, since the original HAL cannot be run on the simulation host, a simulation model for the HAL shall be built. Seo, Sung, Choi and Kang [11] discuss the importance of a simulator for resolving the hardware dependency in embedded software testing. They use a HAL to initialize and configure the target hardware in order to automate embedded software testing on an emulated target board. ViPER [12] allows modeling and simulating distributed embedded hardware architectures. This is done by virtualizing each node of the system in a process that runs on an ad-hoc port of the real-time operating system Trampoline. The HAL of the node's embedded software communicates with ViPER through a dedicated low-level API based on POSIX IPC. Haglund [13], also at Motorola, proposed a design of a hardware abstraction layer to emulate set-top box hardware components on an x86 architecture. The author discusses the benefits of emulating set-top box components on a standard PC, both as a helper for developers and as a full product for customers. The author describes the current architecture of the platform and compares different libraries to be used in each layer of the proposed HAL.
Chunrong, Shibao and Feng [14] proposed a design for two-layer filtering of transport streams. These filters include a PID filter and a section filter.

Their design can help to extract every type of data exclusively from a TS stream. Morris and Anthony [15] proposed a detailed programming model for PID and section filtering. In their book, they use a library for implementing PID and section filtering.

Design and Implementation

In this chapter, we describe our design and implementation for the problem. The generic resource names are given a PC-HAL prefix in our design and implementation.

5.1 Graphics Handling

We have mentioned that three major resources have to be implemented to handle the user interface and graphics functionality of the platform: the graphics layer resource, the graphics memory resource and the graphics blitter resource. The usual way of accessing these resources from a developer's perspective is shown in figure 5.1. The developer initially gets a graphics memory resource to obtain memory dedicated to graphics that can be directly written into. Then, he/she can get a graphics layer resource to set up a screen by setting its video mode (height, width etc.). Then he/she can use the blitting functionality. The blitter


depends on the graphics layer resource, and the graphics layer resource depends on the graphics memory resource.

Figure 5.1: Sequence diagram to access graphics resources

5.1.1 Library Choice

The HAL supports a wide range of graphics and video mixing capabilities. Each piece of functionality is implemented with device-specific libraries provided by the producer of each set-top box model. So, we needed to find a suitable library to use in the hardware abstraction layer for a standard PC. SDL (Simple DirectMedia Layer) and DirectFB (Direct Frame Buffer) are two libraries that can handle the graphics requirements. [13]

DirectFB

DirectFB is an abbreviation for Direct Frame Buffer. It is a software library for Unix-based operating systems providing graphics acceleration, input device handling and window system management on top of the Linux framebuffer. DirectFB is free software licensed under the terms of the LGPL. DirectFB allows applications to talk directly to the video hardware to speed up and simplify graphics operations. [16] DirectFB offers a wide range of drawing and graphics mixing functionality with hardware support, including rectangle drawing and filling, blitting, alpha blending and color keying. It has no library dependencies except for libc, which is already present on the development machines.

SDL

SDL is an abbreviation of Simple DirectMedia Layer, a cross-platform multimedia library providing low-level access to graphics, audio, keyboard, mouse, joystick and 3D hardware via OpenGL. [17] SDL is a wrapper for operating system specific functionality.

Figure 5.2: SDL Architecture [18]

The source code for SDL is divided into separate modules for different operating systems. When SDL is compiled, the correct modules are chosen for the specific target, and the corresponding calls are made to the underlying system. On Linux, SDL uses Xlib to communicate with the X11 system for graphics. SDL is distributed under the GNU LGPL version 2, which allows using SDL as long as the dynamic library is linked into the program.

Comparison of SDL and DirectFB

Looking at the graphics requirements of this thesis, both SDL and DirectFB seem sufficient. Table 5.1 compares the features of SDL and DirectFB.

Feature                              DirectFB   SDL
Alpha blending                       X          X
Color keying                         X          X
Clipping                             X          X
Zooming/Resizing                     X          X
OpenGL support                       X          X
Support for hardware acceleration    X          X
Display in a window                  X          X
Audio support                        X          X

Table 5.1: Comparison of SDL and DirectFB

As we can see from table 5.1, both libraries seem to provide what is required for this thesis. However, DirectFB is mainly intended for, and mostly used in, embedded systems. So the different architecture of a standard PC can introduce problems that are not well documented by the DirectFB working group. For example, on a standard PC there can be problems with updating the pixels on the screen when resizing the main window in window mode instead of full-screen mode. These kinds of problems do not seem to arise with SDL, as SDL's main target audience is not embedded systems. So, we decided to proceed with SDL.

5.2 PC-HAL Memory Resource

Client processes need to modify the graphics memory to be able to use the graphics mixing functionality. But this memory is allocated and managed by the server process, so we need a mechanism to share it between processes. Shared memory is the fastest form of IPC available. Once the memory is mapped into the address space of the processes that are sharing the memory region, no kernel involvement (no system calls) occurs when passing data between the processes. However, some form of synchronization is needed between the processes that store and fetch information to and from the shared memory region. [19] An example client-server shared memory flow can be as follows:

1. The server obtains access to a shared memory object using a semaphore.

2. The server writes to the shared memory object, for example by reading a file's content from an input file into the shared memory.

3. When the write is complete, the server notifies the client, using a semaphore.

4. The client writes the data from the shared memory object to another file.

5.2.1 Unix Mmap function

mmap is a POSIX-compliant Unix system call that can map either a file or a POSIX shared memory object into the address space of a process. This function can be used for three purposes:

1. with a regular file, to provide memory-mapped I/O

2. with special files, to provide anonymous memory mappings

3. with shm_open, to provide POSIX shared memory between different processes

The C prototype for mmap is as follows.

#include <sys/mman.h>

void *mmap(void *addr, size_t len, int prot, int flags, int fd, off_t offset);

addr can specify the starting address within the process at which the descriptor should be mapped. Usually a null pointer is passed here, so the kernel chooses the starting address. In both cases the return value is the starting address of the mapped descriptor. If an error occurs, mmap returns MAP_FAILED. len is the number of bytes to map into the address space of the process, starting at offset bytes from the beginning of the file. Normally, offset is 0. [20] The prot argument specifies the protection of the memory-mapped region. The values that the prot argument can take are shown in table 5.2.

PROT_READ    data can be read
PROT_WRITE   data can be written
PROT_EXEC    data can be executed
PROT_NONE    data cannot be accessed

Table 5.2: Prot Argument [20]

The flags are specified by constants. Either the MAP_SHARED or the MAP_PRIVATE flag must be specified. If MAP_PRIVATE is specified, modifications to the mapped data by the calling process are visible only to that process and do not change the underlying object (either a file object or a shared memory object). If MAP_SHARED is specified, modifications to the mapped data by the calling process are visible to all processes that are sharing the object, and these changes do modify the underlying object. [21]

MAP_SHARED    changes are shared
MAP_PRIVATE   changes are private
MAP_FIXED     interpret the addr argument exactly

Table 5.3: Flags [20]

One important purpose of mmap is to provide shared memory between different processes. In this case, the actual contents of the file become the initial contents of the shared memory, and any changes made by the processes to this shared memory are then copied back to the file (providing file system persistence). This assumes that MAP_SHARED is specified, which is required to share the memory between processes. The client and server processes in Motorola's IPC framework are unrelated processes, and all modifications in shared memory should be visible to both processes. So, mmap is used to implement this functionality.

5.3 PC-HAL Graphics Layer Resource

The main responsibility of this resource is to set up the SDL environment in window mode (with the desired width and height) and to flush the changes made in memory to the screen. To implement this functionality, SDL is started when the user requests an instance of this resource.

5.3.1 Update Screen

The developer can allocate a memory region with the PC-HAL Memory resource and directly edit that memory portion to draw something on the screen. But to update the screen, the update screen method of the PC-HAL Graphics Layer resource has to be called.

PC_HAL_Result PC_HAL_Graphics_Update_Screen(PC_HAL_Viewport* viewport)

Input Parameters

• viewport: A wrapper to the graphical surface that is going to be updated.

Return Value

• PC_HAL_OK: on success.

• Error code in accordance with general PC-HAL error handling in case of an error.

5.4 PC-HAL Graphics Blitter Resource

This resource implements all the blitting functionality. The HAL supports rectangle copying and filling, scaling, clipping, alpha blending, color keying, and various blending schemes to support complex drawing and editing on the graphics surface. The chosen graphics library, SDL, supports all of this functionality.

PC_HAL_Result PC_HAL_Graphics_Blit(SDL_Surface* srcSurface, SDL_Surface* outputSurface, PC_HAL_Blitter_Operation& op)

Input Parameters

• srcSurface : The source surface that will be blitted.

• outputSurface: Destination surface that will be blitted onto.

• op: The wrapper defining all the details of the blitting operation.

Return Value

• PC_HAL_OK: on success.

• Error code in accordance with general PC-HAL error handling in case of an error.

5.5 PC-HAL Stream Input Resource

This resource provides an efficient data channel between the client and the connectable resources in the HAL server. The Stream Input resource is opened with the instance name set to the empty string. This tells the PC-HAL server to open a Stream Input resource, and the PC-HAL server creates a new name for this instance. The PC-HAL Stream Input resource can only have one output, and the resource can be connected to connectable resources such as the PC-HAL Media Decoder or the PC-HAL Demuxer resource. The interfaces provide a function called "ConnectInput" which is used to set up a connection. Figure 5.3 illustrates how media streaming inside the PC-HAL server is performed. To make the data channel efficient, the Stream Input resource has to be provided with a shared memory buffer from which to read the stream data. The stream format is set by the "SetFormat" function. The PC-HAL Stream Input resource forwards the format descriptor to the receiving PC-HAL resource. The format descriptor reaches the receiving resource synchronized with the stream data, i.e. data pushed prior to the call to this function is received before the format descriptor, and data pushed after the call is received after the format descriptor. Thus, the receiving HAL resource is constantly up-to-date with the format of the data stream it receives. If the PC-HAL Stream Input resource is connected to the PC-HAL Demuxer, the format setting has no effect. The PC-HAL Stream Input resource sends data by describing the segments in the shared memory buffer which contain the payload. This is done with the "Push" function. Before pushing new payload, the old one should be checked for removal by using the "Pop" function. To verify that the connected resource has consumed the payload, an observer should be registered first.

Figure 5.3: Simplified View of Media Streaming inside PC-HAL Server

To remove all the segments that have been pushed but not yet consumed by the PC-HAL server, the "Flush" function can be used.

5.6 The PC-HAL Demuxer Resource

The PC-HAL Demuxer resource is a crucial resource that performs software-based filtering of MPEG-2 transport streams. A demuxer, in general, separates at the receiver side the different streams that were multiplexed at the sender side. A transport stream contains multiple programs at a high data rate. This resource therefore provides functionality for filtering an MPEG-2 transport stream (TS) by filtering specific PIDs to outputs. The interface provides functionality for raw stream filtering and section filtering. The demuxer can be connected to an input resource that provides the transport stream on its output. The input can be a Stream Input resource or a DVB Receiver resource. A demuxer can have several outputs, and an output is created with the create output function. Each output is given a unique output id to identify it. To receive data from a specific output id, pipe functionality based on Unix sockets is used. This pipe gives the client a file descriptor that can be used for reading the output data. There are two layers of filtering that can be performed on an output: transport stream PID filtering and section filtering. These filters are set up using the unique PIDs of transport stream packets. Several streams, each identified by its packet identifier, can be filtered on the same output. It is also allowed to have several section filters on the same output. However, it is not possible to have both filter types on the same output. Before the output data is written to the pipe, it is normally stored in an internal buffer. If a lot of data is filtered by the client, it is likely that this buffer needs to be enlarged. When changing the block size, there is a discontinuity in the output data, but the data is always checked to be TS aligned. An output from the PC-HAL Demuxer resource can also be connected to a PC-HAL Media Decoder resource, which makes the data path more efficient.

5.6.1 Connect Input Interface

The Connect Input interface provides connectivity between the PC-HAL Demuxer resource and the other PC-HAL resources. During operation, the input of the demuxer cannot be explicitly disconnected. The only way to change the input of the demuxer resource is to reconnect it with another PC-HAL resource. In each call to the Connect Input function, the previous input is first disconnected and then the new input resource is connected. If either the demuxer resource or the resource that the input is connected to is closed, the input of the demuxer is automatically disconnected. It is also reset if the demuxer is implicitly disconnected.

PC_HAL_Result PC_HAL_Demuxer_ConnectInput(PC_HAL_Resource handle, const char* resource_name, const char* resource_instance_name, int index)

Input Parameters

• handle: Handle to the MPEG-2 transport stream demuxer resource.

• resource name: The name of the resource to connect to the demuxer.

• instance name: The name of the current instance of the resource.

• index: The index of the output of the resource, used to identify different outputs of the resource. The index value starts at zero.

Return Value

• PC_HAL_OK: on success.

• PC_HAL_INVALID_PRECONDITION: if the connected resource is an input stream instance that is not synchronized.

• Error code in accordance with general PC-HAL error handling.

5.6.2 Get Format Interface

The Get Format interface is used to get the current format of the input stream. The PC-HAL Demuxer supports transport and program streams. The transport stream format is the default when a new resource is used, unless otherwise specified.

PC_HAL_Result PC_HAL_Demuxer_GetFormat(PC_HAL_Resource handle, PC_HAL_Demuxer_Format* format)

Input Parameters

• handle: Handle to the MPEG-2 transport stream demuxer resource.

• format: Output parameter in which the current input stream format is returned.

Return Value

• PC_HAL_OK: on success.

• Error code in accordance with general PC-HAL error handling.

5.6.3 Set Format Interface

Set Format is used to set the format of the input stream to the demuxer. This format has to be set before connecting any input to the demuxer resource.

PC_HAL_Result PC_HAL_Demuxer_SetFormat(PC_HAL_Resource handle, PC_HAL_Demuxer_Format format)

Input Parameters

• handle: Handle to the MPEG-2 transport stream demuxer resource.

• format: The desired input stream format to set.

Return Value

• PC_HAL_OK: on success.

• PC_HAL_INVALID_PRECONDITION: if an input is already connected.

• PC_HAL_OPERATION_NOT_SUPPORTED: if the format is not supported, or an error code in accordance with general PC-HAL error handling.

5.6.4 Create Output Interface

It is used to add a new output to the demuxer. The demuxer can handle multiple outputs at a time.

PC_HAL_Result PC_HAL_Demuxer_CreateOutput(PC_HAL_Resource handle, uint* outputid)

Input Parameters

• handle: Handle to the MPEG-2 transport stream demuxer resource.

• outputid: The unique id of the demuxer output.

Return Value

• PC_HAL_OK: on success.

• Error code in accordance with general PC-HAL error handling.

5.6.5 Destroy Output Interface

It is used to remove an existing output from the demuxer. The demuxer can handle multiple outputs at a time.

PC_HAL_Result PC_HAL_Demuxer_DestroyOutput(PC_HAL_Resource handle, uint* outputid)

Input Parameters

• handle: Handle to the MPEG-2 transport stream demuxer resource.

• outputid: The unique id of the demuxer output.

Return Value

• PC_HAL_OK: on success.

• Error code in accordance with general PC-HAL error handling.

5.6.6 Open Output Pipe Interface

This interface provides the communication between the HAL server and the clients. It opens an output pipe between one of the demuxer outputs and the client using Unix domain datagram sockets. There are two types of output pipes, known as "Packet Pipe" and "Section Pipe". An output pipe only sends complete filtered packets or sections at a time. In the PC-HAL Demuxer, output pipes are only supported if the input format is transport stream packets. The PC-HAL client is connected via the Unix domain socket, and it is responsible for supplying the socket name to the PC-HAL server, which is ready to serve a connection from the transport stream demuxer. To avoid a deadlock between the demuxer and the PC-HAL client, the connection is established asynchronously. So, if the demuxer cannot connect to the socket, it will fail silently. However, there are some checks inside the demuxer that verify whether the socket name is valid.

PC_HAL_Result PC_HAL_Demuxer_OpenOutputPipe(PC_HAL_Resource handle, uint outputid, const char* socket_name, PC_HAL_Demuxer_OutputPipeType pipe_type)

Input Parameters

• handle: Handle to the MPEG-2 transport stream demuxer resource.

• outputid: The unique id of the demuxer output.

• socket name: The socket name with which the PC-HAL Demuxer connects to the client.

• pipe type: The type of output pipe to open, i.e. packet output pipe or section output pipe.

Return Value

• PC_HAL_OK: on success.

• Error code in accordance with general PC-HAL error handling.

5.6.7 Close Output Interface

This interface is responsible for closing the output pipe between the demuxer outputs and the client. The socket connection underlying the output pipe is also destroyed when the pipe is closed.

PC_HAL_Result PC_HAL_Demuxer_CloseOutputPipe(PC_HAL_Resource handle, uint outputid)

Input Parameters

• handle: Handle to the MPEG-2 transport stream demuxer resource.

• outputid: The unique id of the demuxer output.

Return Value

• PC_HAL_OK: on success.

• Error code in accordance with general PC-HAL error handling.

5.6.8 Add Packet Filter Interface

The packet filter contains the program ids (PIDs) that are allowed to pass through the filter. This interface adds a packet pid to the set of allowed pids that the transport stream demuxer does not discard during the filtering process. The packet filter supports transport streams and program elementary streams.

PC_HAL_Result PC_HAL_Demuxer_AddPacketFilter(PC_HAL_Resource handle, uint outputid, uint pid)

Input Parameters

• handle: Handle to the MPEG-2 transport stream demuxer resource.

• outputid: The unique id of the demuxer output. The output has to be a pid pipe or an output connected to another resource, such as the decoder.

• pid: The new pid to add to the demuxer output.

Return Value

• PC_HAL_OK: on success.

• PC_HAL_RESOURCE_BUSY: if the desired pid is already used with a section pipe, because the same pid cannot be used for packet and section filtering simultaneously.

• Error code in accordance with general PC-HAL error handling.

5.6.9 Remove Packet Filter Interface

This interface removes a packet pid from the set of pids sent to a specific output of the transport stream demuxer. A pid must have been added before it can be removed.

PC_HAL_Result PC_HAL_Demuxer_RemovePacketFilter(PC_HAL_Resource handle, uint outputid, uint pid)

Input Parameters

• handle: Handle to the MPEG-2 transport stream demuxer resource.

• outputid: The unique id of the demuxer output. The output has to be a pid pipe or an output connected to another resource, such as the decoder.

• pid: The pid to remove from the demuxer output.

Return Value

• PC_HAL_OK: on success.

• Error code in accordance with general PC-HAL error handling.

5.7 The PC-HAL Media Decoder Resource

The PC-HAL Media Decoder resource provides a generic interface to the underlying concrete decoding resources, such as audio and video decoders. By using the Media Decoder resource, it is possible to access the underlying decoder resources through a generic set of interfaces, irrespective of the actual types and implementations of the decoders. The available instance of a media decoder resource is a software-based video decoder. At creation time of a media decoder resource, the decoder type is determined by the instance name. To control a video decoder instance of the Media Decoder resource, the PC-HAL MPEG Video Decoder interface has to be used. The Media Decoder is a connectable resource, and the input connection is set up by using the Connect Input function. Two types of resources can be connected to a Media Decoder: the PC-HAL Stream Input resource and the PC-HAL Demuxer resource. The PC-HAL Demuxer can have several outputs, and the output index argument of the connect input function tells which output is to be used as input for this resource.

When a Media Decoder resource is opened, it is implicitly reset by default. The format settings and all options are set to default values, and the decoder is set to the depleted state. In order to know when the decoder changes state (see section 5.7.1 for the possible states), the user has to register as an observer. This observation is done via the PC-HAL Media Decoder Resource Observer. Through the media observer resource, decoding is stopped when the decoder has a data under-run. Similarly, the media observer resource is used to get signaled when the decoder has enough data to start decoding the stream. The media observer resource also provides an event-driven function to know when the format description of the media decoder changes. The client can get the current state and information at any time through the interface functions of the media observer resource. The Media Decoder resource provides interface functions to set the options for decoding. The available options depend on the number of opened decoder instances. The value of a decoder option can be retrieved at any time by the client in order to control the decoding process during execution. When the options are set and the decoder is in the ready-to-run state, the run function starts the decoding process. It is possible to halt the decoding process and start it again at any time during execution.

5.7.1 Interface Description

Decoder States

At any given time, the media decoder is in exactly one of three possible states. Valid transitions between states are limited to those mentioned in the description for each individual state.

• Depleted State: On open and reset, the decoder is in the depleted state. In this state, the decoder examines data on the input and buffers it. If the buffer becomes full, the decoder keeps the most recently received data. When the decoder has received enough data and a run or halt command is received, it enters the running or halted state. If data received on the input is corrupted and cannot be used for decoding, it is simply discarded.

• Running State: The decoder enters the running state as soon as it has buffered enough data to start and has received the run command. In this state, decoding runs continuously and the input data is consumed accordingly. If decoding is synchronized to a clock but there is not enough data on the input to decode at the required speed (i.e. a buffer under-run), an input empty event is emitted and the decoder makes a transition to the depleted state.

• Halted State: The decoder enters the halted state as soon as it has buffered enough data to start and has received the halt command. If the halt command is issued from the running state, decoding is halted immediately. In this state, the decoder is ready to start decoding the input stream; it does not consume any input data.
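The transitions above can be summarized as a small state machine. This is our own illustration of the documented behavior; the enum and function names are not PC-HAL identifiers.

```c
/* Hypothetical sketch of the decoder state machine described above. */
typedef enum { DEC_DEPLETED, DEC_RUNNING, DEC_HALTED } DecoderState;
typedef enum { EV_RUN, EV_HALT, EV_UNDERRUN } DecoderEvent;

/* Returns the next state. `enough_data` models "has buffered enough data
 * to start": the run and halt commands only take effect once it is set. */
DecoderState decoder_next_state(DecoderState s, DecoderEvent ev, int enough_data)
{
    switch (ev) {
    case EV_RUN:
        return enough_data ? DEC_RUNNING : s;   /* start once buffered */
    case EV_HALT:
        return enough_data ? DEC_HALTED : s;    /* immediate from running */
    case EV_UNDERRUN:
        /* Buffer under-run while running: the decoder emits an input
         * empty event and falls back to the depleted state. */
        return (s == DEC_RUNNING) ? DEC_DEPLETED : s;
    }
    return s;
}
```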

5.7.2 Reset Interface

This interface resets the decoder to the depleted state. It restores the default format settings and sets all options to their default values. The event queue is also cleared. These default values should be set at the initialization of the PC-HAL decoder resource. The input remains connected to the resource it was connected to prior to the call. The decoder is reset implicitly when the decoder resource is opened, so the client is not required to call this function before playback can commence.

PC_HAL_Result PC_HAL_MediaDecoder_Reset(PC_HAL_Resource handle)

Input Parameters

• handle: Handle to the generic media decoder resource.

Return Value

• PC_HAL_OK: on success.

• Error code in accordance with general PC-HAL error handling.

5.7.3 Halt Interface

This interface halts the decoding process. The decoder stops consuming data from the input stream once it has buffered enough data to be able to start decoding. If it is called when the decoder is in the running state, the decoder is halted immediately.

PC_HAL_Result PC_HAL_MediaDecoder_Halt(PC_HAL_Resource handle)

Input Parameters

• handle: Handle to the generic media decoder resource.

Return Value

• PC_HAL_OK: on success.

• Error code in accordance with general PC-HAL error handling.

5.7.4 Run Interface

When this interface is called, the decoding process is started as soon as the decoder has received enough data to be able to start. If it is called from the halted state, decoding starts immediately.

PC_HAL_Result PC_HAL_MediaDecoder_Run(PC_HAL_Resource handle)

Input Parameters

• handle: Handle to the generic media decoder resource.

Return Value

• PC_HAL_OK: on success.

• Error code in accordance with general PC-HAL error handling.

5.7.5 Set Format Interface

This interface sets the input stream data format of the decoder. Initially, when the decoder is opened or reset, the format is unset; it has to be set before decoding can start. The format can only be changed while the decoder is in the depleted or halted state.

PC_HAL_Result PC_HAL_MediaDecoder_SetFormat(PC_HAL_Resource handle, const void* descriptor, uint size)

Input Parameters

• handle: Handle to the generic media decoder resource.

• descriptor: A pointer to a buffer which contains the decoding format descriptor.

• size: Size of the format descriptor buffer supplied by the client to the PC-HAL decoder.

Return Value

• PC_HAL_OK: on success.

• Error code in accordance with general PC-HAL error handling.

5.8 Testing of PC-HAL Resources

The interaction between the PC-HAL server and the PC-HAL client is done through the common interface. Each resource residing in the PC-HAL server must implement its common interface so that the PC-HAL client can call the generic functions of the server resource. Figure 5.4 shows an example of the sequence of calls for opening a demuxer resource from the PC-HAL client. The remaining calls follow the same pattern. The "CloseResource" function is called at the end by a PC-HAL client to close the opened PC-HAL server resource.

Figure 5.4: Calling sequence for the function calls to PC-HAL Server
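The exclusive and shared open modes discussed in this section can be illustrated with a small bookkeeping sketch. The types and names below are ours, not PC-HAL's; the real server-side implementation may differ.

```c
/* Open-mode policy sketch: a resource held exclusively blocks every
 * other open, and an exclusive open fails while any handle is held;
 * shared opens may stack freely. */
typedef enum { MODE_EXCLUSIVE, MODE_SHARED } OpenMode;

typedef struct {
    int open_count;   /* number of users currently holding a handle */
    int exclusive;    /* nonzero if the resource is held exclusively */
} ResourceUse;

/* Returns 1 if the open is granted, 0 on conflict. */
int resource_try_open(ResourceUse *r, OpenMode mode)
{
    if (r->open_count > 0 && (r->exclusive || mode == MODE_EXCLUSIVE))
        return 0;                      /* conflict with an existing holder */
    r->open_count++;
    r->exclusive = (mode == MODE_EXCLUSIVE);
    return 1;
}

void resource_close(ResourceUse *r)
{
    if (r->open_count > 0)
        r->open_count--;
    if (r->open_count == 0)
        r->exclusive = 0;              /* last handle closed: fully free */
}
```

Under this policy a shared open succeeds while other shared handles are held, but any open fails while an exclusive handle exists.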

Each resource is opened using the common interface, which provides a handle that must be used in each call to the resource. Resources can be opened either in an exclusive mode or in a shared mode. In exclusive mode, the resource cannot be opened by any other client or process until it is closed by the client that opened it. In shared mode, the resource can be opened in shared mode by any other client or process at the same time. The PC-HAL server considers each calling thread to be a different user; therefore it is not possible to share resource handles between threads in a process. Each user must open its own handle to a resource, which means that exclusive-type resources are opened exclusively to a single thread. The local interfaces at the client and server side provide the specific features implemented at the client and server side respectively.

Chapter 6

Results and Evaluation

We have successfully built the skeleton of the HAL for a standard PC, meeting all the requirements stated in chapter 3. In this chapter, we present our results and findings and evaluate them.

6.1 Graphics Mixing

Simple DirectMedia Layer (SDL) is used as the graphical library to handle the graphical requirements of the HAL. SDL's default library was enough to do cropping, alpha blending and color keying. However, an extra library called SDL_gfx was used for scaling, as SDL's default library did not offer any functionality to fulfill this requirement. SDL is able to create surfaces from a preallocated memory region, which made it easy to use SDL's graphical functions on a memory portion shared between the client and server processes. It also lets individual pixels be modified, which allows the client process to modify the surface freely.

For starting a simple HAL and making drawings and animations, the performance of the current implementation is sufficient. Table 6.1 shows the blitting performance results compared to an average STB. In this comparison each pixel is 32 bits and the unit for the results is megapixels per second.

Blitting Operation    SDL speed    Average STB hardware speed
720x576 fill          48           141
720x576 copy          34           92
1280x720 fill         55           164
1280x720 copy         37           107
256x256 fill          54           155
256x256 copy          33           96

Table 6.1: Comparison of SDL Blitting Speeds (megapixels per second)

The current implementation is roughly three times slower than an average STB. This is expected, as we did not take advantage of hardware acceleration in SDL. The blitting speed is enough to start the application and to draw and animate some simple things. In the future, however, when every part of the HAL is implemented, it may cause delays when playing MPEG streams or browsing the Internet, which would be very irritating for the end user. Fortunately, SDL has an OpenGL back-end which allows pixels to be read, written and rendered directly with OpenGL functions [22]. The OpenGL API is an industry standard for high-performance computer graphics. So, even if SDL's hardware acceleration cannot meet the desired requirements, OpenGL can be used for a further speed-up. This could not be tested within the scope of this thesis because of time constraints.
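As a rough illustration of how figures like those in table 6.1 are obtained, the sketch below times a software fill of a 720x576 32-bit surface and reports megapixels per second. It uses a plain memset on the CPU rather than SDL blitting, so its absolute numbers are not comparable to the table; it only shows the measurement principle.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Fill a width x height 32-bit surface `iterations` times and return the
 * throughput in megapixels per second (0.0 on allocation or timer error). */
double fill_megapixels_per_second(int width, int height, int iterations)
{
    size_t pixels = (size_t)width * height;
    unsigned int *surface = malloc(pixels * sizeof *surface);
    if (!surface)
        return 0.0;

    clock_t start = clock();
    for (int i = 0; i < iterations; i++)
        memset(surface, i & 0xff, pixels * sizeof *surface);  /* 32-bit fill */
    clock_t end = clock();
    free(surface);

    double seconds = (double)(end - start) / CLOCKS_PER_SEC;
    if (seconds <= 0.0)
        return 0.0;
    return (double)pixels * iterations / seconds / 1e6;
}
```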

6.2 PC-HAL Demuxer Performance

The demuxer is a fundamental resource for the overall performance of PC-HAL. The filtering process can cause delays for high-throughput streams such as MPEG video streams. To test the functional requirements of the demuxer resource, we used several transport streams, with different bit rates for their video streams, for filtering.

6.2.1 Testing with the Same Buffer Size

To test the performance of the demuxer resource, we used files with different bit rates. The output of the PC-HAL demuxer is connected to the PC-HAL client through UNIX sockets. We started with low bit rate streams, sending the transport stream from the PC-HAL client to the PC-HAL server and checking the response on the demuxer output. For low bit rate streams such as audio streams and PSI tables, the demuxer works fine. The video stream has a higher throughput within a transport stream, so the output buffer discards filtered video packets when it overflows. We ran the demuxer tests multiple times and took an average value for each filtered-out stream. The results are shown in table 6.2.

Parameters               TS1     TS2     TS3     TS4
Overall Bit Rate         3.816   5.48    5.48    5.68
Video Stream Bit Rate    2.862   3.12    4.11    4.61
TS Packet Loss %         3.6%    4.1%    17%     23%

Table 6.2: Test results for the same buffer size

Table 6.2 shows that for low bit rate streams the packet loss is low compared to high bit rate streams. The output buffer used for the filtered TS packets is 100x188 bytes (the TS packet size is 188 bytes). Figure 6.1 shows that packets are lost rapidly with increasing data throughput.
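The discard-on-overflow behavior that produces these losses can be modeled with a bounded packet buffer. This is a sketch with our own names; the real PC-HAL buffering may differ.

```c
/* Model of the demuxer output buffer: filtered 188-byte TS packets are
 * queued in a bounded buffer, and a packet is simply dropped when the
 * buffer is full. */
#define TS_PACKET_SIZE 188

typedef struct {
    int capacity;    /* buffer size in packets, e.g. 100 for 100x188 bytes */
    int used;        /* packets currently queued */
    long delivered;  /* packets accepted into the buffer */
    long dropped;    /* packets discarded on overflow */
} OutputBuffer;

/* Producer side: the demuxer pushes one filtered packet. */
void buffer_push(OutputBuffer *b)
{
    if (b->used < b->capacity) {
        b->used++;
        b->delivered++;
    } else {
        b->dropped++;        /* overflow: the filtered packet is lost */
    }
}

/* Consumer side (e.g. the client) drains one packet; returns 1 if any. */
int buffer_pop(OutputBuffer *b)
{
    if (b->used == 0)
        return 0;
    b->used--;
    return 1;
}

/* Packet loss as reported in tables 6.2 and 6.3. */
double loss_percent(const OutputBuffer *b)
{
    long total = b->delivered + b->dropped;
    return total ? 100.0 * b->dropped / total : 0.0;
}
```

When the producer sustainably outpaces the consumer, the loss percentage grows with the throughput gap, matching the trend in figure 6.1.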

6.2.2 Testing with Different Buffer Sizes

The UNIX sockets use send and receive buffers for intermediate storage of data packets. We set different sizes for the output buffer on the client side. The basic reason for changing the buffer size is to see its impact on packet loss. We repeated the tests and

Figure 6.1: PC-HAL Demuxer throughput with Packet Loss

checked the response from the demuxer. The results with different buffer sizes are shown in table 6.3.

Parameters                              TS1     TS2     TS3     TS4
Overall Bit Rate                        3.816   5.48    5.48    5.68
Video Stream Bit Rate                   2.862   3.12    4.11    4.61
TS Packet Loss % (buffer = 100x188)     3.6%    4.1%    17%     23%
TS Packet Loss % (buffer = 1000x188)    3.4%    3.7%    14.8%   21%
TS Packet Loss % (buffer = 5000x188)    3.3%    3.55%   12.6%   20.3%

Table 6.3: Test results for different buffer sizes

Looking at table 6.3, we can conclude that the packet loss decreases as the buffer size increases. However, it is not feasible to keep increasing the buffer size for a high-throughput stream; instead, the communication mechanism between the PC-HAL server and client can be optimized to reduce the packet loss. Figure 6.2 also shows that increasing the buffer size is not a very effective strategy for overcoming packet loss. In the future, the PC-HAL demuxer resource should be optimized for higher-throughput streams.
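On a Linux PC, resizing these socket buffers comes down to the SO_SNDBUF and SO_RCVBUF socket options. The sketch below, with our own helper name, requests a receive buffer sized in 188-byte TS packets and reads back the effective size; Linux typically doubles the requested value for bookkeeping overhead and clamps large requests to the system rmem_max limit.

```c
#include <sys/socket.h>

#define TS_PACKET_SIZE 188

/* Request a receive buffer sized in TS packets and return the effective
 * size in bytes, or -1 on error. The kernel may round the value up
 * (Linux doubles it) or clamp it to the rmem_max limit. */
int set_receive_buffer(int fd, int packets)
{
    int bytes = packets * TS_PACKET_SIZE;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof bytes) < 0)
        return -1;

    socklen_t len = sizeof bytes;
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, &len) < 0)
        return -1;
    return bytes;
}
```

For example, a 100x188-byte buffer as in table 6.2 would be requested with `set_receive_buffer(fd, 100)`.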

Figure 6.2: PC-HAL Demuxer throughput with Packet Loss (Different Buffer Sizes)

Chapter 7

Conclusion and Future Work

With this thesis, we have built the skeleton of the HAL for a standard Linux PC. There is still a lot of work to do before the whole HAL runs the platform as it does on set-top box hardware. Once it is completed, developers will have the full functionality to test their applications on the same computer they develop on and to debug their code easily. Right now they can do this partially, for the parts of the code that call the already implemented interfaces of the PC-HAL server described in chapter 5.

Graphics mixing using SDL works fine for now, as not all the resources of PC-HAL are implemented, but as mentioned in section 6.1, the performance of the blitting operations is not fast enough to meet the requirements of MPEG stream playback. This should be improved to avoid the anticipated delays.

The video stream contains a large number of TS packets compared to other streams such as audio streams and PSI. The PC-HAL demuxer resource performs well on low bit rate media streams. For higher bit rate transport streams, the throughput of the demuxer becomes too high, especially for the video stream, and this high throughput is difficult to handle with the conventional methods. In this project, we used sockets and pipes for connecting the output of the demuxer resource to other resources such as the media decoder and the PC-HAL client. These methods should be analysed and improved, or other methods proposed, to fully support high bit rate transport streams.

The media decoder works with MPEG-2 video streams. In the future, it should be extended to decode audio streams as well. Other container and stream formats, such as AVI and MP4, should also be supported. As described in section 3.2.2, there are a lot of resources still to be implemented to fully support the HAL for the PC. The next step could be the audio decoding resource, after filtering the audio stream with the demuxer resource.

Glossary

ATSC ATSC (Advanced Television Systems Committee) is a group defining digital television transmission standards over terrestrial, cable and satellite networks.

DVB Digital Video Broadcasting (DVB) is an internationally accepted open standard for broadcasting digital television.

EPROM An EPROM, or erasable programmable read-only memory, is a type of memory that retains its data after its power supply is switched off.

Hybrid Set-top box A device that connects to a TV to decode and display DVB streams from external and internal sources. The input signals come from various sources such as satellite, cable and IP based communication. An HSTB also has the capability to run various Internet services and perform tasks such as web surfing.

IPC Interprocess communication is a set of mechanisms used for data exchange between a set of processes.

IPSTB An Internet Protocol Set-top box (IPSTB) is a type of set-top box which enables users to use multimedia services delivered over an IP network.


IPTV Internet Protocol television (IPTV) is a system through which multimedia services are delivered over the traditional IP network. IPTV services are customized to provide live facilities such as television, time-shifted multimedia or video on demand.

ISO International Organization for Standardization. An international standards body composed of representatives from various national standards organizations.

libc The GNU C library (libc) is the C library which defines the system calls and other basic facilities, such as open and malloc, in the GNU system and most systems with the Linux kernel.

Linux Linux is an open source family of Unix-like operating systems with its own kernel. It manages all the resources of a machine and provides system calls for applications to access these resources. Linux is available as a fully functional operating system including its kernel and third-party user-space applications.

MIPS Microprocessor without Interlocked Pipeline Stages is an instruction set architecture (ISA) developed by MIPS Computer Systems.

MPEG-2 A standard set by ISO and IEC for lossy compression of moving pictures and audio streams.

OpenGL OpenGL (Open Graphics Library) is a standard, cross-language and cross-platform API specification for writing applications producing 2D and 3D computer graphics.

PowerPC PowerPC (Performance Optimization With Enhanced RISC - Performance Computing, also abbreviated PPC) is a RISC architecture created by the Apple-IBM-Motorola alliance (AIM).

VBI The vertical blanking interval (VBI) exists in analog television, VGA, DVI and other signals; it is the time between the last line of one frame of a raster display and the beginning of the next frame's first line.

x86 x86 is a generic name that refers to the instruction set architectures of the Intel microprocessor family, originally based on the 8086 CPU.

Acronyms

Blit Block Image Transfer.

CPU Central Processing Unit.

DSP Digital Signal Processing.

DTV Digital Television.

HAI Hardware Abstraction Interface.

HAL Hardware Abstraction Layer.

HID Human Interface Device.

IDL Interface Definition Language.

IR Infrared.

LED Light-Emitting Diode.

MPEG Moving Picture Expert Group.

MTD Memory Technology Device.

PC Personal Computer.


PES Packetized Elementary Stream.

PID Program Identifier.

RAM Random Access Memory.

TS Transport Stream.

Bibliography

[1] Jochen Liedtke. Improving IPC by kernel design. In SOSP '93: Proceedings of the Fourteenth ACM Symposium on Operating Systems Principles, 1993.

[2] Wilson C. Hsieh, M. Frans Kaashoek, and William E. Weihl. The persistent relevance of IPC performance: New techniques for reducing the IPC penalty. In Proceedings of the Fourth Workshop on Workstation Operating Systems, 1993.

[3] http://www.cs.cf.ac.uk/Dave/C/, December 2010.

[4] http://msdn.microsoft.com/en-us/library/aa365574(v=vs.85).aspx, December 2010.

[5] http://www.ietf.org/rfc/rfc2250.txt, November 2010.

[6] http://www.iptvdictionary.com/, November 2010.

[7] http://knol.google.com/k/mpeg_2-transmission, November 2010.

[8] Arnold S. Berger. Embedded Systems Design: An Introduction to Processes, Tools, and Techniques. CMP Books, 2002.

[9] Sungjoo Yoo, Iuliana Bacivarov, Aimen Bouchhima, Yanick Paviot, and Ahmed A. Jerraya. Building fast and accurate SW simulation models based on hardware abstraction layer and simulation environment abstraction layer. In Proceedings of the Conference on Design, Automation and Test in Europe - Volume 1, DATE '03, pages 10550–, Washington, DC, USA, 2003. IEEE Computer Society.

[10] Sungjoo Yoo and Ahmed A. Jerraya. Introduction to hardware abstraction layers for SoC. In Proceedings of the Conference on Design, Automation and Test in Europe - Volume 1, DATE '03, pages 10336–, Washington, DC, USA, 2003. IEEE Computer Society.

[11] Jooyoung Seo, Ahyoung Sung, Byoungju Choi, and Sungbong Kang. Automating embedded software testing on an emulated target board. In Proceedings of the Second International Workshop on Automation of Software Test, AST '07, pages 9–, Washington, DC, USA, 2007. IEEE Computer Society.

[12] Jean-Luc Béchennec, Mikaël Briday, Sébastien Faucou, Florent Pavin, and Fabien Juif. Viper: a lightweight approach to the simulation of distributed and embedded software. In Proceedings of the 3rd International ICST Conference on Simulation Tools and Techniques, SIMUTools '10, pages 74:1–74:9, Brussels, Belgium, 2010. ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering).

[13] David Haglund. Emulation of set-top box hardware components on an x86 architecture. Master's thesis, Linköping University, 2004. LITH-IDA-EX--04/074--SE.

[14] Chunrong Zhang, Shibao Zheng, Feng Wang, and Chi Yuan. Design and implementation of transport stream demultiplexer in HDTV decoder SoC. IEEE Transactions on Consumer Electronics, 51(2):642–647, May 2005.

[15] Steven Morris and Anthony Smith-Chaigneau. Interactive TV Standards. Elsevier Inc., 2005.

[16] http://www.directfb.org/, November 2010.

[17] http://www.libsdl.org/, November 2010.

[18] http://en.wikipedia.org/wiki/Simple_DirectMedia_Layer, November 2010.

[19] John Shapley Gray. Interprocess Communications in Linux. Prentice Hall, 2003.

[20] W. Richard Stevens. UNIX Network Programming, Volume 2, Second Edition: Interprocess Communications. Prentice Hall, 1999.

[21] http://www.cs.cf.ac.uk/Dave/C/node27.html, November 2010.

[22] Ernest Pazera. Focus on SDL. Premier Press, 2002.

Avdelning, Institution / Division, Department: PELAB, Dept. of Computer and Information Science, 581 83 Linköping
Datum / Date: 2010-12-25

Språk / Language: Engelska/English
Rapporttyp / Report category: Examensarbete
ISRN: LITH-IDA-EX--10/050--SE
URL för elektronisk version: http://www.ep.liu.se/exjobb/ida/2010/dd-d/050/

Titel: Simulation av Set-top-box Komponenter på en X86 Arkitektur genom Implementation av en Hårdvaruabstraktionslager
Title: Simulation of Set-top box Components on an X86 Architecture by Implementing a Hardware Abstraction Layer

Författare / Author: Faruk Emre Sahin, Muhammad Salman Khan

Nyckelord / Keywords: Simulation, Hardware Abstraction Layer

Copyright


English

The publishers will keep this document online on the Internet - or its possible replacement - for a period of 25 years from the date of publication barring exceptional circumstances. The online availability of the document implies a permanent permission for anyone to read, to download, to print out single copies for your own use and to use it unchanged for any non-commercial research and educational purpose. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional on the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility. According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement. For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its WWW home page: http://www.ep.liu.se/

© Faruk Emre Sahin, Muhammad Salman Khan. Linköping, January 14, 2011.