Technische Universität Ilmenau, Fakultät für Elektrotechnik und Informationstechnik

Diplomarbeit

Further development of VoIP softphone based on 'Microsoft RTC Client API'

Submitted by: Carla García Sánchez

Submitted on: 15. 11. 2006

Date of birth:

Degree programme: Elektrotechnik und Informationstechnik

Prepared in the department: Kommunikationsnetze (Communication Networks)

Fakultät für Elektrotechnik und Informationstechnik

Responsible professor: Prof. Dr. rer. nat. habil. Jochen Seitz

Scientific supervisor: Dipl.-Ing. Yevgeniy Yeryomin

Acknowledgements

Many people have helped me in one way or another during the course of this project, and with these lines I would like to express my most sincere gratitude to them. To my professors, thank you for guiding and advising me at every moment. Every suggestion has been useful in improving this work. I appreciate all the support from the staff of the department of Communication Networks. To my family and friends, thank you for your unconditional support and for encouraging me in the hardest and most stressful moments. I appreciate that you have been there for me and trusted me. I am especially grateful to my roommates and close friends in Ilmenau, who have shared everyday life with me these last months. Finally, I would like to thank TU Ilmenau for allowing me to develop this project. Once again, thank you, everyone.

Abstract

VoIP has become a widespread technology because it enhances real-time communication, making it easier and more natural regardless of where people are located. Voice over Internet Protocol (VoIP), as its name says, is a technology that enables voice communication over the network. This project continues the development of a SIP-based VoIP softphone that was implemented as part of a PhD thesis in the department of Communication Networks. One of the aims of the project is to study the availability of this technology in a mobile environment and the adaptation of the softphone to mobile devices.

A softphone is a software application used to establish telephone calls from a computer to other softphones or conventional telephones using VoIP technology. In addition, it supports functionalities that offer the end user services that would not be possible with the current telephone network, for example locating users independently of where they are connected, or multi-party videoconference calls.

Before beginning with the development of the software application, it is essential to understand the operation and the structure of softphones based on the Session Initiation Protocol (SIP), the protocol responsible for establishing the VoIP session between users. For that purpose, the first part of this project consists of a survey of VoIP technology and the protocols related to the VoIP environment, such as the Session Initiation Protocol, the Session Description Protocol (SDP) and the Real-time Transport Protocol (RTP).

Nowadays, there are many types of softphones running on diverse operating systems and programmed in different languages. Although they all follow the same basic structure, they differ in the extra features they provide and in the platform on which they are built. In this case, the application uses the Microsoft RTC Client API, which supplies the libraries and interfaces required to implement the functionalities of the VoIP protocols mentioned above. Some of the new features that will be added to this software application are:

• Management of the contact list: It allows users to store information about their contacts and to access it easily. Furthermore, it informs users about the presence status of their buddies.

• Videoconference call: Multimedia calls with audio and video make communication between people more natural.

Although only a few functionalities are going to be developed, the capabilities of the softphone could be extended with new ones according to future user needs and communication requirements.

Contents

1 VoIP Technology based on SIP
  1.1 Introduction
  1.2 VoIP Features
  1.3 Advantages
  1.4 Types of VoIP calls
  1.5 Operation
  1.6 VoIP protocols
    1.6.1 Session Initiation Protocol (SIP)
      1.6.1.1 Introduction
      1.6.1.2 Protocol Design
      1.6.1.3 SIP Clients and Servers
      1.6.1.4 SIP Messages
    1.6.2 Session Description Protocol (SDP)
      1.6.2.1 Introduction
      1.6.2.2 Operation
    1.6.3 Real-time Transport Protocol (RTP)
      1.6.3.1 Real-time Transport Control Protocol (RTCP)
  1.7 VoIP Clients
    1.7.1 VoIP Clients running on different OS
    1.7.2 VoIP Clients for mobile devices
    1.7.3 Structure and operation of softphones
      1.7.3.1 Registration procedure
      1.7.3.2 Multimedia session establishment
    1.7.4 Softphones for Windows Mobile OS
    1.7.5 OS for mobile devices

Diplomarbeit Carla García Sánchez

2 Microsoft RTC Client API
  2.1 Introduction
  2.2 Object Model Overview
  2.3 Architecture
  2.4 .NET Platform
    2.4.1 Introduction
    2.4.2 Operation
    2.4.3 Advantages

3 Development of VoIP softphone for Windows 2000/XP
  3.1 Understanding the code source
  3.2 New functionalities
    3.2.1 Volume bar for microphone and speakers
    3.2.2 Sending DTMF signals
    3.2.3 Addition of videoconference
    3.2.4 Contact List
    3.2.5 Encryption of media
  3.3 Testing the program and results
    3.3.1 Volume bar for microphone and speakers
    3.3.2 Sending DTMF signals
    3.3.3 Addition of videoconference
    3.3.4 Contact List
    3.3.5 Encryption of media
  3.4 Software tools

4 Adaptation of the VoIP softphone for mobile devices

5 UML Structure
  5.1 Class diagram
  5.2 Use case diagram
  5.3 Sequence diagram
  5.4 State diagram
    5.4.1 Buddy state diagram
    5.4.2 Watcher state diagram
    5.4.3 Session state diagram
    5.4.4 Client state diagram

6 Getting Started
  6.1 Software requirements
  6.2 Getting an account
  6.3 Description of Graphical User Interface

A UML Diagrams
  A.1 Class diagram
  A.2 Use case diagram
  A.3 Sequence diagram
  A.4 Buddy state diagram
  A.5 Watcher state diagram
  A.6 Session state diagram
  A.7 Client state diagram

Bibliography

List of Figures

List of Tables

List of Abbreviations and Symbols

Thesis of Diplomarbeit

Erklärung (Declaration)


1 VoIP Technology based on SIP

1.1 Introduction

VoIP (Voice over Internet Protocol) is simply the transmission of voice traffic over IP-based networks. It is also called IP telephony, Internet telephony, broadband telephony or digital phone. Companies providing VoIP service are usually known as VoIP providers, and the protocols used to carry voice signals over the IP network are known as VoIP protocols. Although the Internet Protocol (IP) was originally designed for data networking, its success as a worldwide standard for that purpose has led to its use for voice networking as well. VoIP uses a broadband Internet connection for routing telephone calls, as opposed to conventional switching and fibre-optic alternatives, which lowers the cost of communication for consumers. Perhaps the most interesting point of the technology for the user is that the existing infrastructure does not need to be reconfigured. The only requirement is to combine Internet access and a conventional phone into a single service with software and hardware support.

1.2 VoIP Features

The biggest advantage of VoIP is that customers can make and receive calls from anywhere in the world where a broadband Internet connection is available without changing their phone number. This is known as mobility. One person does not need multiple numbers (office, home, mobile, and so on), because calls can be automatically routed to the VoIP phone where the user is registered. Customers can take their IP phones with them on national and international trips and still access what is essentially their domestic phone line. Softphones, on the other hand, are software applications that bring VoIP services to a desktop or laptop computer. Some even present an interface that looks like a telephone, with which the user can place VoIP calls to anybody around the world through a standard broadband connection.


Most VoIP services come with caller ID, call waiting, call transfer, repeat dialling and multi-conference call features. For additional features such as call filtering, call forwarding or sending calls directly to voicemail, the service provider may charge an additional fee. Most VoIP services also allow users to check their voicemail over the web or to have messages attached to an e-mail that is sent to their PDA or PC. The facilities and components offered by VoIP phone system suppliers and service operators may vary significantly, because not all of them support the same functionalities.

1.3 Advantages

Since calls can be placed across the Internet, using a single Internet connection for both data traffic and voice calls allows consumers to save money. The major reason to change to VoIP technology for telephone service is therefore cost reduction: for instance, the cost of a call is independent of its destination, so there is no extra charge for long distances. VoIP also provides additional features that make this technology even more attractive and that may be difficult to obtain from conventional telecommunication companies, such as:

• Incoming phone calls can be automatically routed to your VoIP phone, regardless of where you are connected to the network.

• Call center agents using VoIP phones can work from anywhere with a sufficiently fast and stable Internet connection.

• Other features: multi-conference call, call forwarding, automatic redial, caller ID, and so forth.

1.4 Types of VoIP calls

There are three ways of connecting to a VoIP network:

• Using a VoIP telephone.

• Using a conventional telephone with a VoIP adapter.

• Using a computer with speakers and a microphone.


VoIP telephone calls are routed to other VoIP devices or to normal telephones on the PSTN (Public Switched Telephone Network). Depending on the devices involved, there are four types of VoIP calls:

• PC-to-Phone call: from a VoIP device to a conventional telephone.

• PC-to-PC call: from a VoIP device to another VoIP device.

• Phone-to-PC call: from a conventional telephone to a VoIP device.

• Phone-to-Phone call: from a conventional phone to another conventional phone.

Note that a VoIP device need not be a PC.

1.5 Operation

In the most common setup, the end user establishes a high-speed broadband connection using a router and a VoIP gateway. Instead of using a standard telephone line, the router sends the telephone calls over the Internet connection. The VoIP gateway, placed close to the Internet connection, is responsible for connecting the VoIP network with the PSTN. All the transmitted data (SIP signalling, audio/video data and so on) are divided into smaller pieces called packets before being sent over the Internet. These packets are sent to their final destination, and the instructions for reassembling them into an understandable form are embedded in them. When the destination is a conventional telephone, the packets pass through a VoIP gateway, where they are converted back into the original format for the PSTN (Public Switched Telephone Network), which routes the call to the number the caller has dialled. In this way old and new technologies are blended seamlessly and without noticeable delay.

1.6 VoIP protocols

This section describes the main protocols required to implement a VoIP softphone based on SIP.


1.6.1 Session Initiation Protocol (SIP)

1.6.1.1 Introduction

Session Initiation Protocol (SIP) is an application-layer control protocol that can establish, modify, and terminate multimedia sessions (conferences) such as Internet telephony calls. These sessions can include one or more participants. New participants can be invited and media streams can be added or removed, because SIP is a flexible and transparent protocol that allows features to be added to existing sessions. The main signalling functions of the protocol are detailed below:

• Location of the end user, to guarantee communication regardless of where he or she is located.

• Determination of the availability of the end user to establish a session.

• Determination of the media capabilities, allowing media negotiation between the participants involved in the communication.

• Negotiation of the features supported by the end users.

• Modification of the parameters or features in an already established session.

SIP is not a service provider; rather, it offers signalling capabilities that can be used to implement different services. Consequently, SIP has to work in concert with other protocols in order to meet the requirements of the users. In spite of that, the functionality and operation of SIP are completely independent of the other protocols, because SIP is only involved in the signalling portion of a communication session. One obvious example is a VoIP call, where SIP is responsible for setting up and managing the session, the Real-time Transport Protocol (RTP) for delivering real-time data, and the Session Description Protocol (SDP) for describing the multimedia session.

1.6.1.2 Protocol Design

SIP is a peer-to-peer protocol. This means that SIP functionality is implemented in the communicating endpoints, not in the network. As explained previously, SIP is an application-layer protocol in the TCP/IP model. The protocol structure can be divided into four logical layers:

• Lowest layer: it is responsible for the syntax and encoding of the SIP messages.


Figure 1.1: A typical SIP network with gateways

• Second layer (Transport layer): it defines how a client sends requests and receives responses, and how a server receives requests and sends responses, over the network. There is a transport layer in every SIP element.

• Third layer (Transaction process layer): it matches responses to the requests that have been sent through the transport layer, and it also handles retransmissions and timeouts.

• Upper layer (Transaction user): all SIP elements, except the stateless proxy, are defined as transaction users. This layer can be regarded as the one that uses and completes the work of the transaction process layer.

1.6.1.3 SIP Clients and Servers

There are five SIP entities, whose behaviour is detailed as follows.

• User Agent Client (UAC): builds SIP requests and sends them to the UAS.


• User Agent Server (UAS): receives and handles SIP requests from the UAC and prepares SIP responses.

• Stateless Proxy Server: gets requests from the transport layer and routes them to the next hop using the message content, but without storing any information related to those requests. For that reason, it is unable to distinguish between an original message and a retransmission. Stateless proxies do not run any SIP timers and cannot build provisional responses like 100 Trying or 180 Ringing.

• Stateful Proxy Server: performs a deeper analysis of the received requests than the stateless proxy. It validates the request and its recipient, routes the message and stores state information. Stateful proxies use timers to determine whether a message must be retransmitted when no response has been received. Furthermore, they can demand user agent authentication.

• Registrar Server: is a server that receives and handles REGISTER requests. The user information contained in these messages is validated (user agent authentication is required) and used to record the user's location in the network. User agents send this type of request periodically in order to update their location information.

Figure 1.2: SIP clients and servers


1.6.1.4 SIP Messages

Two kinds of SIP messages are defined:

• Request: it is sent from the client to the server.

• Response: it is sent from the server to the client.

The two kinds differ in their syntax and in the fields that form the message. The SIP specification defines six main requests (also called methods):

• INVITE: Invites a user to take part in a session.

• ACK: Confirms that the client has received the final response to an INVITE request.

• BYE: Ends an existing session.

• CANCEL: Cancels a pending request.

• OPTIONS: Asks for information about a server’s capabilities.

• REGISTER: Informs about the user’s current location.

Some examples of SIP message exchanges are shown below:

[Figure: message sequence between user1, the proxy server and user2. Each message is shown twice, once on each side of the proxy server: INVITE, 180 RINGING, OK, ACK, BYE, OK.]

Figure 1.3: Example of SIP INVITE


Figure 1.4: Example of SIP REGISTER

Supplementary requests have been defined in SIP extensions, such as Session Initiation Protocol (SIP)-Specific Event Notification (RFC 3265). This document describes how UACs can subscribe to specific events, such as the presence of their contacts, and how they are notified of these events.

Figure 1.5: Example of a SIP extension: SUBSCRIBE - NOTIFY

Each response message has a status code that indicates the outcome of the request. According to the first digit of the status code, SIP responses are classified into six groups or families:

1xx : Provisional - Informs about the status of a received request.

2xx : Success - Indicates that a request has been successfully processed.

3xx : Redirection - The request cannot be completed at the address it was sent to. The client should retry the request at the alternative location given in the response.


4xx : Client Error - The request has not succeeded because of a client error. Further action needs to be taken according to the response, such as modifying the original request.

5xx : Server Error - The request has not succeeded because of a server error and the server is not able to process it.

6xx : Global Failure - The request cannot be processed. The client should not retry it.
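The classification by first digit described above can be sketched in a few lines of Python. This is only an illustration: the names `SIP_RESPONSE_CLASSES`, `response_class` and `is_final` are hypothetical and are not part of the softphone or of any SIP library.

```python
# Names of the six response classes, keyed by the first digit of the status code.
SIP_RESPONSE_CLASSES = {
    1: "Provisional",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
    6: "Global Failure",
}


def response_class(status_code: int) -> str:
    """Return the response class for a three-digit SIP status code."""
    if not 100 <= status_code <= 699:
        raise ValueError(f"not a valid SIP status code: {status_code}")
    return SIP_RESPONSE_CLASSES[status_code // 100]


def is_final(status_code: int) -> bool:
    """Provisional (1xx) responses are not final; every other class is."""
    return response_class(status_code) != "Provisional"
```

For example, `response_class(180)` gives "Provisional" for a 180 Ringing response, and `is_final(200)` is true for a 200 OK.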

The most important fields of a SIP message are:

• Request-URI: It is carried in the request line rather than in a header. It should contain the same SIP URI as the To field (except in the case of a REGISTER request, where it refers to the domain in which the registrar server is located).

• Via: It indicates the transport used for the transmission of the message and the location where the response must be sent. There can be several Via fields, one added at each hop, so that the response can be routed back along the same path. This field must also contain a branch parameter, which is a transaction identifier that has the same value at both the UAC and the server.

• To: It contains the SIP address of the request’s recipient. This address is a SIP URI.

• From: It contains the SIP address of the user who has sent the request (this value is the same as the To field only in the case of a REGISTER request).

• Call-ID: It is a unique identifier for each call that allows the server to detect delayed messages that have arrived out of order.

• CSeq: It contains a sequence number and the method name. This sequence number is incremented by one for each message request that is sent by the same user. It allows detecting lost messages and maintaining the order.

• Contact: It contains a SIP URI of the user’s current location.

• Max-Forwards: It is an integer used to limit the number of hops a request can make on the way to its destination. Its initial value is usually 70 so that the request can reach its destination, and it is decremented by one at each hop. If it reaches 0 before the request arrives at its destination, the request is rejected.


• Content-Type: This field describes the media type of the message body sent to the recipient. It is only present if the body is not empty. In this case, it will be application/sdp, indicating that the SIP message carries an SDP body with the session description.

• Content-Length: It contains the size of the message body sent to the recipient, as a decimal number of octets.
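To show how the fields listed above fit together, the following Python sketch assembles a minimal INVITE request as plain text. It is an illustration under stated assumptions, not the message produced by the softphone: the function name `build_invite`, its parameters and the addresses used in the usage note are hypothetical, and a real user agent would add further headers and handle transport details.

```python
import uuid


def build_invite(caller: str, callee: str, local_host: str,
                 sdp_body: str, cseq: int = 1) -> str:
    """Assemble a minimal SIP INVITE request as text (illustrative only)."""
    # Branch values defined by RFC 3261 start with the magic cookie "z9hG4bK".
    branch = "z9hG4bK" + uuid.uuid4().hex[:16]
    call_id = f"{uuid.uuid4().hex}@{local_host}"
    body = sdp_body.encode("utf-8")
    lines = [
        f"INVITE sip:{callee} SIP/2.0",  # request line: method, Request-URI, version
        f"Via: SIP/2.0/UDP {local_host};branch={branch}",
        "Max-Forwards: 70",
        f"To: <sip:{callee}>",
        f"From: <sip:{caller}>;tag={uuid.uuid4().hex[:8]}",
        f"Call-ID: {call_id}",
        f"CSeq: {cseq} INVITE",
        f"Contact: <sip:{caller.split('@')[0]}@{local_host}>",
        "Content-Type: application/sdp",
        f"Content-Length: {len(body)}",  # size of the body in octets
    ]
    # An empty line separates the header section from the message body.
    return "\r\n".join(lines) + "\r\n\r\n" + sdp_body
```

Calling `build_invite("alice@example.com", "bob@example.com", "192.0.2.10", sdp_text)` returns the request line, the header fields and the SDP body separated by an empty line; the Call-ID, branch and tag values are random on every call.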

1.6.2 Session Description Protocol (SDP)

1.6.2.1 Introduction

In order to establish videoconferences, VoIP calls or other types of session, it is necessary to communicate media capabilities, transport addresses and other session description information to the end users. SDP provides a standard representation of this information, so that participants can understand it and decide whether to take part in a session. SDP is simply a format for session description. It does not incorporate a transport protocol or a negotiation mechanism, and it can be used in conjunction with different transport protocols as appropriate. One example is SIP, which carries SDP in its messages. An SDP session description must include the following information: IP address, port number, media type and media encoding format. Moreover, SDP can contain extra information such as the subject of the session, start and stop times or contact information for the session. These are the most important fields in an SDP description:

• Session description

– v: It shows the version of the Session Description Protocol.

– o: It contains the originator of the session (username and user address) and a session identifier.

– s: It is the session name.

– c: It contains connection information including network type, address type and connection address.

– b: It specifies the proposed bandwidth to be used by the session or media.


• Time description

– t: It indicates the start and stop time for a session. If these values are zero, the session is considered as permanent.

• Media description (repeated for each type of media)

– m: It contains the media type, port, transport protocol and media formats.

– k: If the packet is transported over a secure and trusted channel, this field is used to convey encryption keys.

– a: It defines different media attributes. Normally, there are many lines of this kind of field.
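As an illustration of the fields above, the following Python sketch splits a small SDP description into its session-level lines and its media sections. The example description, the addresses in it and the function name `parse_sdp` are assumptions made for this sketch. It is deliberately simplified: for instance, it does not handle the "port/number of ports" form that an m line may use.

```python
# A hypothetical session description with one audio stream and two codecs.
EXAMPLE_SDP = "\r\n".join([
    "v=0",
    "o=alice 2890844526 2890844526 IN IP4 192.0.2.10",
    "s=VoIP call",
    "c=IN IP4 192.0.2.10",
    "t=0 0",
    "m=audio 49170 RTP/AVP 0 8",
    "a=rtpmap:0 PCMU/8000",
    "a=rtpmap:8 PCMA/8000",
]) + "\r\n"


def parse_sdp(text: str) -> dict:
    """Split an SDP description into session-level fields and media sections."""
    session = {"fields": [], "media": []}
    current = session["fields"]  # lines before the first m= are session level
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep or len(key) != 1:
            raise ValueError(f"malformed SDP line: {line!r}")
        if key == "m":
            media, port, proto, *formats = value.split()
            section = {"media": media, "port": int(port), "proto": proto,
                       "formats": formats, "fields": []}
            session["media"].append(section)
            current = section["fields"]  # later lines belong to this media section
        else:
            current.append((key, value))
    return session
```

With the example above, `parse_sdp(EXAMPLE_SDP)["media"][0]` describes an audio stream on port 49170 that offers payload formats 0 and 8, each with its own a line.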

1.6.2.2 Operation

This section describes how two participants negotiate the parameters needed to establish a media session using SIP. This negotiation method is known as the offer/answer model, because one participant offers a description of his/her available media streams and the other participant answers the offer. Both the offer and the answer have to be valid SDP messages, following the recommendations in RFC 4566.

The offer must contain all the media streams that the offerer wants to use, including the IP addresses and the ports on which to receive them. For each media stream, the RTP payload types and the codecs have to be specified. If the offer contains multiple formats for one media stream, it means that any of them can be used during the session, but they have to be listed in order of preference. The other participant should use the format highest in the list, if possible.

The answer must contain a corresponding media stream for each stream in the offer, indicating the IP addresses and the ports on which to receive them. Besides, it must state which media streams and codecs are supported. If there are no media formats in common for a single media stream, that stream must be rejected by setting its port to zero. If there are no media formats in common for any media stream, the whole media session must be rejected.

When the participant who sent the offer receives the answer, he/she must identify the accepted streams and formats and can start sending and receiving media. Since SIP allows the parameters of an established session to be modified, either participant can generate a new offer at any time in order to update the session with a new negotiation.
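The answering rules described above can be sketched as follows in Python. The data layout (dictionaries with "media", "port" and "formats" keys) and the names `build_answer` and `session_accepted` are assumptions chosen for this illustration; they are not taken from the softphone or from an SDP library.

```python
def build_answer(offer: list, supported: dict, local_ports: dict) -> list:
    """Build the answer to an offer, one answered stream per offered stream.

    offer:       list of {"media", "port", "formats"}, formats in preference order
    supported:   media type -> set of formats the answerer supports
    local_ports: media type -> port on which the answerer wants to receive
    """
    answer = []
    for stream in offer:
        media = stream["media"]
        # Keep the order of the offer so that the offerer's preference is respected.
        common = [f for f in stream["formats"] if f in supported.get(media, set())]
        if common:
            answer.append({"media": media, "port": local_ports[media],
                           "formats": common})
        else:
            # No format in common: the stream is rejected by setting the port to zero.
            answer.append({"media": media, "port": 0,
                           "formats": list(stream["formats"])})
    return answer


def session_accepted(answer: list) -> bool:
    """The session is rejected as a whole when every stream has been rejected."""
    return any(stream["port"] != 0 for stream in answer)
```

For an offer with an audio stream (formats 0 and 8) and a video stream (format 31), an answerer that only supports audio format 8 keeps the audio stream with format 8 and returns the video stream with port 0.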


1.6.3 Real-time Transport Protocol (RTP)

RTP is an application-layer protocol that defines a standardized packet format for delivering data with real-time characteristics, such as audio and video, over the Internet. The services provided by RTP include payload type identification, sequence numbering, timestamping and delivery monitoring. Although RTP does not guarantee quality of service or timely delivery by itself, it includes functionality for detecting some of the problems produced by transmission over an unreliable IP network, such as packet loss, variable transport delay, out-of-sequence packet arrival or asymmetric routing. As with SDP, RTP is not responsible for delivering the packets itself. Instead, it relies on transport protocols such as UDP (User Datagram Protocol) or TCP (Transmission Control Protocol) for this functionality, and RTP packets are usually encapsulated in UDP packets. The payload field of the RTP packet contains the real-time data, and the information about it, such as the source, size and format, is transported in the header fields. The complete header structure of an RTP packet is detailed below:

• Version (V): This field identifies the version of RTP.

• Padding (P): If the padding bit is set, the packet contains one or more additional padding octets at the end which are not part of the payload. Padding may be needed by some encryption algorithms. Otherwise, padding should only be applied, if it is needed, to the last packet.

• Extension (X): If the extension bit is set, the fixed header is followed by exactly one header extension. This extension mechanism allows individual implementations to experiment with new payload-format-independent functions that require additional information to be carried in the RTP data packet header. Implementations that do not understand the extension may ignore it.

• CSRC count (CC): It contains the number of CSRC (contributing source) identifiers that follow the fixed header.

• Marker (M): Its interpretation is defined by the profile in use. It can be used to mark significant events in the packet stream.

• Payload type (PT): It identifies the format of the RTP payload and determines its interpretation by the application.


• Sequence number: It is used to detect lost or out-of-sequence packets and to restore the original packet order. Its initial value is chosen randomly for the first packet of a stream, and it is then incremented by one for each RTP packet sent.

• Timestamp: It reflects the sampling instant of the first byte in the RTP payload. Several consecutive packets will have the same timestamp value if their contents were generated at the same instant, for example when they belong to the same video frame. The audio and video packets of a media session are delivered over different channels and ports. For that reason, this field is very important to allow the receiver to play out audio and video data at the right time and, furthermore, to synchronize a complete videoconference, for example.

• Synchronization Source (SSRC): The synchronization source identifier is a random number used to identify the source of the RTP stream within each RTP session. A user can receive RTP packets from several sources at the same time, but two different synchronization sources will not have the same SSRC identifier in the same session, so it is possible to determine the original source of each packet.

• CSRC list: It identifies the contributing sources for the payload contained in this packet. The maximum number of contributing sources that it allows to recognize is 15.
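The header fields listed above can be read from a received packet with a few bit operations. The Python sketch below assumes the bit layout of the fixed header given in RFC 3550 (2-bit version, padding bit, extension bit, 4-bit CSRC count, marker bit, 7-bit payload type, followed by the 16-bit sequence number and the 32-bit timestamp and SSRC). The function name `parse_rtp_header` is hypothetical, and the sketch does not interpret header extensions or padding.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the fixed RTP header and the CSRC list of a packet."""
    if len(packet) < 12:
        raise ValueError("an RTP packet has at least a 12-byte fixed header")
    # Network byte order: two single bytes, a 16-bit and two 32-bit integers.
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F                      # number of CSRC identifiers (0 to 15)
    end = 12 + 4 * cc
    if len(packet) < end:
        raise ValueError("packet too short for its CSRC list")
    csrc = list(struct.unpack(f"!{cc}I", packet[12:end]))
    return {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": cc,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "csrc": csrc,
    }
```

Because the CSRC count occupies four bits, the list can hold at most 15 identifiers, which matches the limit stated for the CSRC list.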

RTP supports encryption of the media flows but does not provide it by itself. Generally, IPsec or SRTP is used. RTP is said to consist of two distinct protocols:

• Real-time Transport Protocol (RTP): it conveys real-time data.

• Real-time Transport Control Protocol (RTCP): it contains information regarding the quality of the RTP session and the participants in the session.

1.6.3.1 Real-time Transport Control Protocol (RTCP)

RTP Control Protocol (RTCP) is a communication protocol that provides control information and quality-of-service feedback associated with a data flow for a multimedia application. It works in concert with RTP in the transport and packetization of the data, but it does not transport any media data by itself. This protocol gathers connection statistics and information about sent bytes, lost packets, jitter and so forth. It is important to notice that RTCP by itself does not offer any kind of authentication or encryption of the flow.

The information provided by this protocol is used to control the flow and the congestion in the network. For example, if the statistics show that a large number of packets are being lost, the sender can adapt its transmission by limiting the flow or by changing the format of the media stream to one that requires less bandwidth. RTCP packets are also used to detect and diagnose problems on the network. In addition, participants in a session use RTCP packets to exchange some basic identity data, such as the username and the domain that is being used. The types of RTCP packets are:

• SR: Sender report, for transmission and reception statistics from participants that are active senders.

• RR: Receiver report, for reception statistics from participants that are not active senders.

• SDES: Source description items.

• BYE: Indicates end of participation.

• APP: Application-specific functions.
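A receiver can tell these packet types apart from the first four bytes of an RTCP packet. The Python sketch below assumes the packet type codes assigned in RFC 3550 (200 for SR, 201 for RR, 202 for SDES, 203 for BYE and 204 for APP) and the common header layout that is detailed for the sender report later in this section. The names `RTCP_PACKET_TYPES` and `parse_rtcp_header` are hypothetical.

```python
import struct

# Packet type codes from RFC 3550.
RTCP_PACKET_TYPES = {200: "SR", 201: "RR", 202: "SDES", 203: "BYE", 204: "APP"}


def parse_rtcp_header(packet: bytes) -> dict:
    """Decode the four-byte header shared by all RTCP packet types."""
    if len(packet) < 4:
        raise ValueError("an RTCP packet starts with a 4-byte header")
    b0, packet_type, length = struct.unpack("!BBH", packet[:4])
    return {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),
        "count": b0 & 0x1F,  # e.g. the reception report count in SR and RR packets
        "packet_type": RTCP_PACKET_TYPES.get(packet_type, f"unknown ({packet_type})"),
        # The length field counts 32-bit words minus one, so convert it to bytes.
        "length_bytes": 4 * (length + 1),
    }
```

For instance, a sender report with one reception report block has a length field of 12, which corresponds to 52 bytes: 8 bytes of header and SSRC, 20 bytes of sender information and 24 bytes for the report block.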

The RTCP packet structure depends on the type of packet. The packet structure detailed below corresponds to a sender report (SR) packet. The only difference between the sender report (SR) and receiver report (RR) forms, besides the packet type code, is that the sender report includes a 20-byte sender information section for use by active senders. This kind of packet is more complex than the others and has a greater number of fields.

• Version: It identifies the version of RTP, which is the same in RTCP packets as in RTP data packets.

• Padding: This field has the same functionality as in RTP packets, but related to RTCP packet.

• Reception report count (RC): It defines the number of reception report blocks contained in this packet. Its value can be zero.

• Packet type (PT): In this case, it contains the constant 200 to identify this as an RTCP SR packet.


• Length: It is the length of this RTCP packet in 32-bit words minus one, including the header and any padding.

• SSRC: It is the synchronization source identifier for the originator of this SR packet. It is the same identifier that the sender uses in the SSRC field of its RTP packets.

• Network Time Protocol (NTP) timestamp: It indicates the wall-clock time (absolute date and time) when the report was sent, so that it can be used in combination with timestamps returned in reception reports from other receivers to measure round-trip propagation to those receivers. It has two subfields: most significant word (MSW) and least significant word (LSW).

• RTP timestamp: It corresponds to the same time as the NTP timestamp, but in the same units and with the same random offset as the RTP timestamps in data packets. This timestamp may not be equal to the RTP timestamp in any adjacent data packet. Rather, it must be calculated from the corresponding NTP timestamp using the relationship between the RTP timestamp counter and real time.

• Sender’s packet count: It contains the total number of RTP data packets transmitted by the sender since the transmission started until the time this SR packet was generated. This count should be reset if the sender changes its SSRC identifier.

• Sender’s octet count: It defines the total number of payload octets transmitted in RTP data packets by the sender since the transmission started until the time this SR packet was generated. This count should be reset if the sender changes its SSRC identifier.

• Source identifier (SSRC): This SSRC identifier is the same SSRC field as the RTP packet source which this RTCP packet is related to.

• Fraction lost: It reports the fraction of RTP packets from source SSRC that have been lost since the previous SR or RR packet was sent.

• Cumulative number of packets lost: It refers to the total number of RTP packets from source SSRC that have been lost since the start of transmission. This number is calculated as the number of packets expected minus the number of packets actually received.


• Extended highest sequence number received: The least significant 16 bits contain the highest sequence number received in an RTP data packet from source SSRC, and the most significant 16 bits extend that sequence number with the corresponding count of sequence number cycles (it is calculated according to the algorithm in Appendix A.1 of RFC 3550).

• Interarrival jitter: It is an estimate of the statistical variance of the RTP data packet interarrival time, measured in timestamp units and expressed as an unsigned integer.

• Last SR timestamp (LSR): It contains the middle 32 bits out of 64 in the NTP timestamp received as part of the most recent RTCP sender report (SR) packet from source SSRC. If no SR has been received yet, the field is set to zero.

• Delay since last SR (DLSR): It refers to the delay, expressed in units of 1/65536 seconds, between receiving the last SR packet from source SSRC and sending this reception report. If no SR packet has been received yet from SSRC, the DLSR field is set to zero.
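The field layout listed above can be illustrated with a short decoding routine. The following Python sketch is only an illustration under stated assumptions: the function name `parse_sender_report` and the dictionary keys are hypothetical, the packet is assumed to be a single (non-compound) SR without profile-specific extensions, and the cumulative loss is returned as the raw 24-bit value.

```python
import struct


def parse_sender_report(packet: bytes) -> dict:
    """Decode one RTCP sender report (packet type 200) into a dictionary."""
    if len(packet) < 28:
        raise ValueError("an SR packet needs at least 28 bytes")
    first, packet_type, length_words = struct.unpack("!BBH", packet[:4])
    if packet_type != 200:
        raise ValueError("not a sender report")
    report_count = first & 0x1F
    if len(packet) < 28 + 24 * report_count:
        raise ValueError("packet is shorter than its report count implies")
    report = {
        "version": first >> 6,
        "padding": bool(first & 0x20),
        "report_count": report_count,
        # The length field counts 32-bit words minus one.
        "length_bytes": (length_words + 1) * 4,
    }
    # Sender SSRC followed by the 20-byte sender information section.
    (report["ssrc"], report["ntp_msw"], report["ntp_lsw"],
     report["rtp_timestamp"], report["packet_count"],
     report["octet_count"]) = struct.unpack("!6I", packet[4:28])
    blocks = []
    for index in range(report_count):
        offset = 28 + 24 * index
        ssrc, lost_word, highest_seq, jitter, lsr, dlsr = struct.unpack(
            "!6I", packet[offset:offset + 24])
        blocks.append({
            "ssrc": ssrc,
            "fraction_lost": lost_word >> 24,         # numerator over 256
            "cumulative_lost": lost_word & 0xFFFFFF,  # raw 24-bit value
            "sequence_cycles": highest_seq >> 16,
            "highest_sequence": highest_seq & 0xFFFF,
            "jitter": jitter,
            "lsr": lsr,
            "dlsr": dlsr,
        })
    report["blocks"] = blocks
    return report
```

A real implementation would additionally have to walk compound RTCP packets, since an SR is normally followed by at least an SDES packet in the same UDP datagram.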

1.7 VoIP Clients

Nowadays, a large number of different VoIP clients exist. The following tables show some examples of freely available VoIP softphones for computers and some others for mobile devices. There is little information about VoIP clients for mobile devices because most of them are distributed under proprietary licenses.


1.7.1 VoIP Clients running on different OS

Application | Operating System | License | Language | Other
----------- | ---------------- | ------- | -------- | -----
ATLSIP | Linux, Windows | MPL, GPL, LGPL | C++ | It is written using the Active Template Library.
Ekiga | Linux, Mac OS X, BSD | GNU/GPL | C++ |
Eyeball Messenger | Linux and uClinux, Windows 2000/XP, Windows Mobile, Windows CE | Proprietary | | It is based on Eyeball Messenger SDK. It is available in PC, PDA and embedded platforms.
FreeSWITCH | Linux, Windows, Mac OS X, BSD, Solaris | Open source | C++ |
KCall | Linux | GNU/GPL | | It is a VoIP application for the KDE desktop environment.
Kphone | Linux | GNU/GPL | C++ | It uses Qt.
Linphone | Linux, Windows XP | Freeware | C | It uses eXosip (SIP user agent library based on libosip2), mediastreamer2 (powerful library to make audio/video streaming and processing) and ortp (RTP library).
Minisip | Linux, Windows XP | GNU/GPL | | It will soon be available on Pocket PC.
MjUA | | GNU/GPL | Java | It is based on MjSIP stack.
OpenWengo | Linux, Windows, Mac OS X | GNU/GPL | C++ |
OpenZoep | Windows | GNU/GPL | C++ |
PhoneGaim | Linux, Windows | GNU/GPL | |
PJSUA | Linux, Windows, Windows CE/Mobile, Mac OS X, Symbian OS | GNU/GPL | C++ | It is based on PJSIP stack.
SFLphone | Linux | GNU/GPL | C++ | It should be portable to BSD operating systems.
Shtoom | Linux, Windows, Mac OS X | GNU/GPL | Python |
SIPCommunicator | Linux, Windows, Mac OS X | GNU/GPL | Java |
sipXphone | Linux, Windows | GNU/GPL | Java |
TudoMais | Windows | GNU/GPL | Java |
Twinkle | Linux | GNU/GPL | C++ | It uses KDE libraries.
VMukti | Windows | Open source | C# | It is based on .NET 3.0.
WxCommunicator | Windows XP/2000 | GNU/GPL | C++ | It is based on sipXtapi client library and wxWidgets 2.8.4 GUI library.
XMeeting | Mac OS X | Open source | |
YATE | Linux, Windows | GNU/GPL | C++ | It supports scripting in various programming languages (such as embedded PHP and Python).
YeaPhone | Linux | GNU/GPL | | It is based on the Linphone stack.
Zap | Linux, Windows, Mac OS | Open source | JavaScript | It is based on Mozilla.

Table 1.1: VoIP Clients running on different OS


1.7.2 VoIP Clients for mobile devices

Application | Operating System | Others
----------- | ---------------- | ------
AGEphone | Windows Mobile 5.0 for Pocket PC | It is based on microSIP stack, developed in C/C++.
Articulation | Palm OS 5.0 or greater |
BeWip | Windows Mobile OS |
CiceroPhone | Windows Mobile 5.0, Windows PPC2003, Symbian OS |
ExpressTalk | Windows Pocket PC, Windows Mobile OS |
eyeP Phone | Desktop Windows Pocket PC 2003 |
iFon | Windows Mobile, Windows CE 4.X |
Microsoft Office Communicator Mobile | Windows Mobile 2003 SE for Pocket PC smartphone, Mobile 5.0 for Pocket PC and Smartphone | It is based on the user interface of Microsoft Office Communicator 2005 desktop client.
Microsoft Portrait | Windows Mobile 5.0 Pocket PC | It is a research prototype for mobile video communication.
MoviVoip | Palm OS 5.0 or greater |
OnePhone | Windows Mobile 5.0, Symbian OS, uLinux Mobile |
SJPhone | Windows Pocket PC 2003 |
Solegy Softphone | | It is based on their ServicePDQ platform and uses opensourcesip and opensipstack.
speaQ | Windows Mobile 5.0 PDA Edition, Linux/Qtopia |
The FirstHand Mobile Console | Windows Mobile 5.0 PDA Edition |
VOYP | Palm OS |
Woize | Windows Mobile 5.0, Windows Mobile 2003 for PocketPC |
X-Pro | Windows Mobile 2003 |

Table 1.2: VoIP Clients for mobile devices

1.7.3 Structure and operation of softphones

As a general definition, a softphone (a combination of the English words software and telephone) is software used to establish telephone calls from one computer to other softphones or conventional telephones over the Internet. It is therefore part of a VoIP environment and makes use of the protocols previously described: SIP, SDP and RTP. Nowadays, there are many implementations available. These softphones can differ in their license (closed proprietary software, freeware, open source, GPL/GNU), system requirements or operating system, but their structure and operation must follow the same fundamental guidelines. In order to develop a VoIP softphone, we can choose between two principal methods:

• Using a library, platform or API, like the RTC Client API from Microsoft, in which all the necessary protocols are already implemented according to their RFCs.

• Programming step by step all the features, functionalities, requirements, parameters, and so forth that are defined in the RFC of each required protocol.

In this case, if we are interested in implementing a VoIP softphone based on SIP, it is necessary to take into consideration these documents:

• RFC 3261: Session Initiation Protocol (SIP)

• RFC 4566: Session Description Protocol (SDP)


• RFC 1889/3550: A Transport Protocol for Real-Time Applications (RTP - RTCP).

Regardless of the method chosen to develop the softphone, it must contain at least the following parts:

• SIP package: it should define all the required classes and methods to provide and manage SIP services.

• SDP package: it should define all the required classes and methods to provide and manage SDP services.

• RTP package: it should define all the required classes and methods to provide and manage RTP services. This package should also provide RTCP services.

• User interface: it should help the final user to interact with the softphone and employ its functionalities, independently of how it is implemented.

These protocol packages must provide methods to build, send, receive, analyze and process packets. As explained before, none of these protocols defines its own transport over the network; instead, they are usually encapsulated in UDP packets. Consequently, the softphone must also define some classes and methods to send, receive and process IP/UDP packets, including the sockets and the ports that are needed for the communication channels. Normally, there are two channels for SIP messages (sending and receiving) and two more channels for RTP packets (sending and receiving). The best way to analyze and understand the operation of VoIP softphones based on SIP is by means of some examples.
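As a minimal sketch of the UDP layer described above, the following Python code opens datagram sockets and exchanges one message over the loopback interface. The helper names (`open_udp_channel`, `send_datagram`, `receive_datagram`) are hypothetical and only serve the illustration; a real softphone would typically bind the SIP channel to port 5060 and negotiate the RTP ports through SDP.

```python
import socket


def open_udp_channel(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Create a UDP socket bound to host:port (port 0 lets the OS choose)."""
    channel = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    channel.bind((host, port))
    channel.settimeout(2.0)  # avoid blocking forever if nothing arrives
    return channel


def send_datagram(channel: socket.socket, payload: bytes, destination) -> None:
    """Send one datagram to a (host, port) destination."""
    channel.sendto(payload, destination)


def receive_datagram(channel: socket.socket, max_size: int = 65535):
    """Return (payload, sender_address) for the next datagram."""
    return channel.recvfrom(max_size)


if __name__ == "__main__":
    signalling = open_udp_channel()  # would carry SIP messages
    peer = open_udp_channel()        # stands in for the remote party
    send_datagram(signalling, b"OPTIONS sip:peer@127.0.0.1 SIP/2.0\r\n\r\n",
                  peer.getsockname())
    data, sender = receive_datagram(peer)
    print(sender, data)
    signalling.close()
    peer.close()
```

The same three helpers could serve the RTP channels as well; only the payload (binary RTP packets instead of SIP text) and the port numbers differ.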


1.7.3.1 Registration procedure

In this example, the user carla1 wants to register in a registrar server. The following picture shows the SIP message exchange between the client and the server.

Figure 1.6: SIP registration procedure

First, the user sends a SIP REGISTER request to a registrar server. The SIP package must build a SIP request including the header fields that have already been explained in 1.6.1.4. The SIP server receives the message and uses the information to manage the request. Meanwhile, it sends a provisional response (100 TRYING) to the user in order to indicate that it is performing some action and does not yet have a definitive response.
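To illustrate what building such a request involves, the sketch below assembles a minimal REGISTER message as text. It is not the message produced by the RTC Client API: the function `build_register` and all sample values are hypothetical, and the `branch` parameter in the Via header is included because RFC 3261 requires it.

```python
def build_register(user: str, domain: str, local_addr: str, local_port: int,
                   call_id: str, cseq: int, tag: str, branch: str,
                   expires: int = 120) -> str:
    """Return a minimal SIP REGISTER request (no body) as a string."""
    address_of_record = f"sip:{user}@{domain}"
    lines = [
        f"REGISTER sip:{domain} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_addr}:{local_port};branch={branch}",
        "Max-Forwards: 70",
        f"From: <{address_of_record}>;tag={tag}",
        f"To: <{address_of_record}>",
        f"Call-ID: {call_id}",
        f"CSeq: {cseq} REGISTER",
        f"Contact: <sip:{user}@{local_addr}:{local_port}>",
        f"Expires: {expires}",
        "Content-Length: 0",
    ]
    # Header lines end with CRLF; an empty line terminates the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    print(build_register("alice", "example.com", "192.0.2.10", 5060,
                         "abc123@192.0.2.10", 1, "1928301774",
                         "z9hG4bK776asdhds"))
```

For a retry, the same function can be called again with an incremented `cseq`; additional headers such as Authorization would be inserted before Content-Length.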


The UAC processes the TRYING response and waits for a new response from the server. After that, the UAC receives the response of the server, 401 UNAUTHORIZED. This response indicates that the request requires user authentication because it does not contain the proper credentials. The UAC receives this message and processes the information. It should rebuild the SIP request, adding an Authorization header field to the message. The Authorization field value consists of credentials containing the authentication information of the UAC for the requested action, as well as parameters required in support of authentication and replay protection. The new message will have the form:

Request-Line: REGISTER sip:141.24.93.180 SIP/2.0
Message Header
    Via: SIP/2.0/UDP 141.24.172.62:15966
    Max-Forwards: 70
    From: ;tag=6a09ca1e73c846b681c288ed49dbd071;epid=e04c9989ea
    To:
    Call-ID: [email protected]
    CSeq: 2 REGISTER
    Contact: ;methods="INVITE, MESSAGE, INFO, SUBSCRIBE, OPTIONS, BYE, CANCEL, NOTIFY, ACK, REFER"
    User-Agent: RTC/1.2.4949
    Authorization: Digest username="carla1", realm="asterisk", algorithm=MD5, uri="sip:141.24.93.180", nonce="6c839ae6", response="feea44258515c993945155793cc1c8d6"
    Event: registration
    Allow-Events: presence
    Content-Length: 0

In this message, the CSeq field value has been incremented and the new Authorization field has been included. Once again, the server sends a provisional response while it is analyzing the request. Finally, the request is successfully accepted and processed by the registrar server, which answers with an OK response:

Status-Line: SIP/2.0 200 OK
Message Header
    Via: SIP/2.0/UDP 141.24.172.62:15966;received=141.24.172.62
    From: ;tag=6a09ca1e73c846b681c288ed49dbd071;epid=e04c9989ea
    To: ;tag=as45ef369c
    Call-ID: [email protected]
    CSeq: 2 REGISTER
    User-Agent: Asterisk PBX
    Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, SUBSCRIBE, NOTIFY
    Expires: 120
    Contact: ;expires=120
    Date: Thu, 31 May 2007 07:34:12 GMT
    Content-Length: 0

The UAC processes the new response and handles it. There is an expires parameter in the Contact field. It indicates, in seconds, how long the registration is valid. Before the registration expires, the UAC should send another REGISTER request in order to inform the server about its location. Although this is a complete and normal registration procedure, there are many other possibilities depending on the SIP server responses, and the UAC must be able to process and manage each of them.
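The response value carried in the Authorization header of the second REGISTER request is an MD5 digest. The following sketch shows the basic HTTP Digest calculation (the variant without the qop parameter, which matches the header shown in this example). The function names are hypothetical and the password is a placeholder, since the real credentials are not part of the captured messages.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of a string as 32 lowercase hex characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username: str, realm: str, password: str,
                    method: str, uri: str, nonce: str) -> str:
    """Compute the Digest 'response' value when no qop is negotiated."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # identity of the user
    ha2 = md5_hex(f"{method}:{uri}")                 # what is being requested
    return md5_hex(f"{ha1}:{nonce}:{ha2}")           # bound to the server nonce


if __name__ == "__main__":
    # "secret" is a placeholder password, not the one used in the capture.
    print(digest_response("carla1", "asterisk", "secret",
                          "REGISTER", "sip:141.24.93.180", "6c839ae6"))
```

Because the nonce is chosen by the server and included in the hash, a captured response cannot simply be replayed against a later challenge, which is the replay protection mentioned above.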


1.7.3.2 Multimedia session establishment

When a UAC wants to initiate a session, it originates an INVITE request. The INVITE request asks a server to establish a session. The handshake of the SIP messages is shown in the picture.

[Sequence diagram between carla1, the proxy server and carla2. Messages in order: INVITE; 407 PROXY AUTHENTICATION REQUIRED; ACK; INVITE (forwarded by the proxy server); TRYING; 180 RINGING; OK; ACK; BYE; OK.]

Figure 1.7: Multimedia SIP session establishment

In this example, the user carla1 wants to initiate a multimedia call with the user carla2. The SIP package must build a SIP INVITE request similar to the REGISTER request, following what was explained in 1.6.1.4. In the same way, the SDP package has to create an SDP body with the fields detailed in 1.6.2.1. The complete message is then sent. The proxy server receives the message and processes the request. The request does not contain any Authorization field and the server requires client authentication. For that reason, the server sends a 407 PROXY AUTHENTICATION REQUIRED response to inform the client. The UAC receives this message and processes the information. This response is very similar to 401 UNAUTHORIZED. The UAC sends an ACK message and rebuilds the SIP request, adding a Proxy-Authorization header field with the proper credentials of the client.

Request-Line: INVITE sip:[email protected] SIP/2.0
Message Header
    Via: SIP/2.0/UDP 141.24.172.97:16580
    Max-Forwards: 70
    From: "carla1" ;tag=f7231a4fdbf547408da053a9f31b0fdb;epid=99274a963c
    To:
    Call-ID: [email protected]
    CSeq: 2 INVITE
    Contact:
    User-Agent: RTC/1.2
    Proxy-Authorization: Digest username="carla1", realm="iptel.org", algorithm=md5, uri="sip:[email protected]", nonce="465e8da1a386b50d9c5608de3da1961645aa7abd", response="61ecb5b096f0de7146ec001b7f469f85"
    Content-Type: application/sdp
    Content-Length: 679
Message body
    Session Description Protocol
        Session Description Protocol Version (v): 0
        Owner/Creator, Session Id (o): - 0 0 IN IP4 141.24.172.97
        Session Name (s): session
        Connection Information (c): IN IP4 141.24.172.97
        Bandwidth Information (b): CT:1000
        Time Description, active time (t): 0 0
        Media Description, name and address (m): audio 62410 RTP/AVP 97 111 112 6 0 8 4 5 3 101
        Encryption Key (k): base64:I/EwJ93tvnk62iBdgpAUBAtqQDDacCxMqae5MDj1i4A
        Media Attribute (a): rtpmap:97 red/8000
        Media Attribute (a): rtpmap:111 SIREN/16000
        Media Attribute (a): fmtp:111 bitrate=16000
        Media Attribute (a): rtpmap:112 G7221/16000
        Media Attribute (a): fmtp:112 bitrate=24000
        Media Attribute (a): rtpmap:6 DVI4/16000
        Media Attribute (a): rtpmap:0 PCMU/8000
        Media Attribute (a): rtpmap:8 PCMA/8000
        Media Attribute (a): rtpmap:4 G723/8000
        Media Attribute (a): rtpmap:5 DVI4/8000
        Media Attribute (a): rtpmap:3 GSM/8000
        Media Attribute (a): rtpmap:101 telephone-event/8000
        Media Attribute (a): fmtp:101 0-16
        Media Attribute (a): encryption:optional
        Media Description, name and address (m): video 45406 RTP/AVP 34 31
        Encryption Key (k): base64:YQdcx0AUpMFh2+xW8A1qcZ8NTeN6qEEcjoyfejsVgmo
        Media Attribute (a): rtpmap:34 H263/90000
        Media Attribute (a): rtpmap:31 H261/90000
        Media Attribute (a): encryption:optional

This message is received by the proxy server, which routes it either directly to the destination user or through another proxy server. Normally, if the message is routed to another proxy server, that server sends a provisional response to the first proxy server and routes the message on to the destination user. The UAC processes the TRYING response and waits for a new response from the server. When the destination user receives the INVITE request, it sends a provisional RINGING response indicating that the request is being processed. The proxy server then forwards the response using the information in the Via field. The UAC processes the RINGING response and waits for a final (non-provisional) response from the server. When the destination user finally accepts the call, its UAC sends an OK response to its proxy server. That server forwards the response using the Via field, and the first proxy server forwards it again in the same way to the UAC that issued the original request.

Status-Line: SIP/2.0 200 OK
Message Header
    Via: SIP/2.0/UDP 141.24.172.97:16580;rport=1477
    From: "carla1" ;tag=f7231a4fdbf547408da053a9f31b0fdb;epid=99274a963c
    To: ;tag=ac93304b84c644cc88c52d415d14caac
    Call-ID: [email protected]
    CSeq: 2 INVITE
    Record-Route:
    Record-Route:
    Record-Route:
    Contact:
    User-Agent: RTC/1.2
    Content-Type: application/sdp
    Content-Length: 708
    P-Behind-NAT: Yes
Message body
    Session Description Protocol
        Session Description Protocol Version (v): 0
        Owner/Creator, Session Id (o): - 0 0 IN IP4 141.24.92.247
        Session Name (s): session
        Connection Information (c): IN IP4 213.192.59.66
        Bandwidth Information (b): CT:1000
        Time Description, active time (t): 0 0
        Media Description, name and address (m): audio 6260 RTP/AVP 97 111 112 6 0 8 4 5 3 101
        Encryption Key (k): base64:t7AMm5DBS7WjFacOTXt9B+vImF15vDxVPGVgL8fY5GY
        Media Attribute (a): rtpmap:97 red/8000
        Media Attribute (a): rtpmap:111 SIREN/16000
        Media Attribute (a): fmtp:111 bitrate=16000
        Media Attribute (a): rtpmap:112 G7221/16000
        Media Attribute (a): fmtp:112 bitrate=24000
        Media Attribute (a): rtpmap:6 DVI4/16000
        Media Attribute (a): rtpmap:0 PCMU/8000
        Media Attribute (a): rtpmap:8 PCMA/8000
        Media Attribute (a): rtpmap:4 G723/8000
        Media Attribute (a): rtpmap:5 DVI4/8000
        Media Attribute (a): rtpmap:3 GSM/8000
        Media Attribute (a): rtpmap:101 telephone-event/8000
        Media Attribute (a): fmtp:101 0-16
        Media Attribute (a): encryption:optional
        Media Description, name and address (m): video 42648 RTP/AVP 34 31
        Encryption Key (k): base64:rjR7lz3kmGVpahZErkninx1gTFaZMyNr+Y35W1pSHtg
        Media Attribute (a): recvonly
        Media Attribute (a): rtpmap:34 H263/90000
        Media Attribute (a): rtpmap:31 H261/90000
        Media Attribute (a): encryption:optional
        Media Attribute (a): nortpproxy:yes
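The comparison a UAC has to perform between the offered and the answered session description can be sketched as follows. The code is an illustration under assumptions: it works on raw SDP text (lines such as `m=audio 62410 RTP/AVP 0 8`), not on the decoded listing shown above, it only looks at the m=, a=rtpmap and direction attributes, and the names `parse_media` and `common_formats` are hypothetical.

```python
def parse_media(sdp_text: str) -> dict:
    """Map each media type to its port, payload types, rtpmap and direction."""
    media = {}
    current = None
    for raw_line in sdp_text.splitlines():
        line = raw_line.strip()
        if line.startswith("m="):
            kind, port, _transport, *formats = line[2:].split()
            current = {"port": int(port),
                       "formats": [int(fmt) for fmt in formats],
                       "rtpmap": {},
                       "direction": "sendrecv"}  # default when not stated
            media[kind] = current
        elif current is not None and line.startswith("a=rtpmap:"):
            payload, encoding = line[len("a=rtpmap:"):].split(None, 1)
            current["rtpmap"][int(payload)] = encoding
        elif current is not None and line in (
                "a=sendrecv", "a=sendonly", "a=recvonly", "a=inactive"):
            current["direction"] = line[2:]
    return media


def common_formats(offer: dict, answer: dict, kind: str) -> list:
    """Payload types of one media type present in both offer and answer."""
    if kind not in offer or kind not in answer:
        return []
    accepted = set(answer[kind]["formats"])
    return [fmt for fmt in offer[kind]["formats"] if fmt in accepted]


if __name__ == "__main__":
    offer = parse_media("v=0\r\nm=audio 62410 RTP/AVP 0 8 3\r\n"
                        "a=rtpmap:0 PCMU/8000\r\nm=video 45406 RTP/AVP 34 31\r\n")
    answer = parse_media("v=0\r\nm=audio 6260 RTP/AVP 0 3\r\n"
                         "m=video 42648 RTP/AVP 34\r\na=recvonly\r\n")
    print(common_formats(offer, answer, "audio"), answer["video"]["direction"])
```

In the session shown above, such a comparison would reveal that the audio formats are unchanged while the video stream of the answer carries the recvonly attribute.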

The 200 OK response indicates that the session has been accepted, but perhaps not exactly as originally offered. The UAC must process the message, verify by means of the CSeq field that it belongs to the original INVITE request, and analyze the content of the SDP body again. The INVITE request sent from carla1 to carla2 offered a multimedia session with audio and video, but the OK response indicates that the other user accepts audio and will only receive video (the video stream is marked recvonly). In order to finish the establishment of the session with the corresponding parameters and media types, the UAC must send an ACK message.

The task of the proxy servers is to help the two UACs locate and contact each other. They do not need to store any knowledge of the fact that a session is established between the users. Furthermore, once the ACK is received by the destination UAC, the two UACs can start exchanging RTP packets. It is important to realize that media is usually transmitted end-to-end and not through any proxy server.

According to the type of media and the rest of the parameters present in the SDP session description, the RTP package should provide the way to generate RTP packets and send them through the network encapsulated in IP/UDP packets. The complete header structure of an RTP packet was detailed in 1.6.3. The RTP header must be followed by the RTP payload. This is an example of a real RTP packet:

Real-Time Transport Protocol
    10.. .... = Version: RFC 1889 Version (2)
    ..0. .... = Padding: False
    ...0 .... = Extension: False
    .... 0000 = Contributing source identifiers count: 0
    0... .... = Marker: False
    Payload type: SIREN (111)
    Sequence number: 33555
    Timestamp: 3169076983
    Synchronization Source identifier: 2281082149
    Payload: 5994FCBD0BD53F49C59C69B1E9F73449C6D6DC1CC62A6294...
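The decoded fields of such a packet come from the fixed 12-byte RTP header. The following sketch shows one way to extract them in Python. The function name `parse_rtp` is hypothetical, and padding removal is deliberately left out to keep the example short.

```python
import struct


def parse_rtp(packet: bytes) -> dict:
    """Split an RTP packet into its fixed header fields and its payload."""
    if len(packet) < 12:
        raise ValueError("an RTP packet needs at least 12 header bytes")
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack(
        "!BBHII", packet[:12])
    csrc_count = byte0 & 0x0F
    extension = bool(byte0 & 0x10)
    header_length = 12 + 4 * csrc_count  # CSRC list follows the fixed header
    if extension:
        if len(packet) < header_length + 4:
            raise ValueError("truncated header extension")
        # The extension carries a 16-bit profile id and a length in 32-bit words.
        _profile, ext_words = struct.unpack(
            "!HH", packet[header_length:header_length + 4])
        header_length += 4 + 4 * ext_words
    return {
        "version": byte0 >> 6,
        "padding": bool(byte0 & 0x20),
        "extension": extension,
        "csrc_count": csrc_count,
        "marker": bool(byte1 & 0x80),
        "payload_type": byte1 & 0x7F,
        "sequence_number": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[header_length:],
    }


if __name__ == "__main__":
    header = struct.pack("!BBHII", 0x80, 111, 33555, 3169076983, 2281082149)
    print(parse_rtp(header + bytes.fromhex("5994FCBD")))
```

Building a packet is the reverse operation: the sender packs the same five values, increments the sequence number by one per packet and advances the timestamp by the number of samples carried.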

As mentioned before, there is another protocol which works in concert with RTP, the RTP Control Protocol (RTCP), which provides control information and quality-of-service information associated with a data flow of a multimedia application. This is an example of a real RTCP SR packet related to the previous RTP packet, which contains all the header fields explained in 1.6.3.1.

Real-time Transport Control Protocol (Sender Report)
    10.. .... = Version: RFC 1889 Version (2)
    ..0. .... = Padding: False
    ...0 0001 = Reception report count: 1
    Packet type: Sender Report (200)
    Length: 12
    Sender SSRC: 2281082149
    Timestamp, MSW: 3389590617 (0xca090c59)
    Timestamp, LSW: 2918510592 (0xadf4f000)
    [MSW and LSW as NTP timestamp: May 31, 2007 08:56:57,6795 UTC]
    RTP timestamp: 3169077087
    Sender's packet count: 47
    Sender's octet count: 2280
    Source 1
        Identifier: 3436236073
        SSRC contents
            Fraction lost: 0 / 256
            Cumulative number of packets lost: 16777213
        Extended highest sequence number received: 52467
            Sequence number cycles count: 0
            Highest sequence number received: 52467
        Interarrival jitter: 3
        Last SR timestamp: 200748212 (0x0bf72cb4)
        Delay since last SR timestamp: 29204
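Some values of this sender report are easier to read after a small conversion. The sketch below, with hypothetical helper names, converts the NTP timestamp to a calendar date, interprets the cumulative loss as the signed 24-bit number defined in RFC 3550 (so 16777213 corresponds to -3, which can happen when duplicate packets are received), expresses DLSR in seconds, and applies the round-trip formula of RFC 3550 (arrival time minus LSR minus DLSR).

```python
from datetime import datetime, timedelta, timezone

# NTP timestamps count seconds since 1 January 1900 (UTC).
NTP_EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)


def ntp_to_datetime(msw: int, lsw: int) -> datetime:
    """Convert the two 32-bit NTP words of an SR into a UTC datetime."""
    microseconds = round(lsw * 1_000_000 / 2**32)  # LSW is a binary fraction
    return NTP_EPOCH + timedelta(seconds=msw, microseconds=microseconds)


def signed_24(value: int) -> int:
    """Interpret a raw 24-bit field as a two's complement number."""
    return value - (1 << 24) if value & 0x800000 else value


def dlsr_seconds(dlsr: int) -> float:
    """DLSR is expressed in units of 1/65536 seconds."""
    return dlsr / 65536


def round_trip_seconds(arrival_middle32: int, lsr: int, dlsr: int) -> float:
    """Round-trip time seen by the SR sender when a reception report arrives.

    arrival_middle32 is the arrival time in the same format as LSR
    (middle 32 bits of the NTP timestamp).
    """
    return ((arrival_middle32 - lsr - dlsr) & 0xFFFFFFFF) / 65536


if __name__ == "__main__":
    print(ntp_to_datetime(3389590617, 2918510592))
    print(signed_24(16777213), dlsr_seconds(29204))
```

With the numbers used as an illustration in RFC 3550 (arrival 0xb7108000, LSR 0xb7052000, DLSR 0x00054000), `round_trip_seconds` yields 6.125 seconds.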

In a single media session, many RTP and RTCP packets are transmitted. The packets shown here are only small examples of their structure and operation. The media session finishes when one of the UACs sends a BYE request. In this session, user carla1 wants to end the call, so its UAC creates and sends a BYE message. The other UAC answers with an OK message, and the media session is finished with the reception of this message. In summary, the basic operation of a UAC consists in preparing the communication channels and building, sending, receiving and processing packets of the different protocols that are needed in a complete VoIP environment.

1.7.4 Softphones for Windows Mobile OS

The previous section explained the main protocols needed to develop a VoIP softphone and their operation. The structure of a softphone on Windows Mobile OS is the same as on other operating systems. This means that it must implement or make use of SIP, SDP and RTP packages in the same way as described in 1.7.3. The specific characteristics of softphones running on Windows Mobile OS lie in the system requirements. There is little information about them because these softphones are distributed under closed proprietary licenses, but some common specifications are listed below:

• Minimum Size Requirement: 64MB ROM, 32MB RAM

• ARM processor

• Audio codec support: G711

One example of these softphones is AGEphone, which is based on the microSIP stack. These are its system requirements:


• CPU: ARM-type CPU 200 MHz or above

• Memory: 64MB or above

• Free Disk Space: 600kb or above

• Connection: Upstream and downstream of 29.2 kbps each or above
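The source of the 29.2 kbps figure is not stated, but it coincides with the IP-level bit rate of a GSM 06.10 stream sent in one 33-byte frame every 20 ms over RTP/UDP/IPv4. The following sketch reproduces that arithmetic. It is an interpretation, not information from the AGEphone documentation, and the function name is hypothetical.

```python
# Header overhead per packet: IPv4 (20) + UDP (8) + RTP (12) bytes.
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12


def ip_bandwidth_kbps(payload_bytes: int, packet_interval_ms: float) -> float:
    """Bit rate at IP level for one direction of a constant-rate RTP stream."""
    packets_per_second = 1000 / packet_interval_ms
    bits_per_packet = (payload_bytes + IP_UDP_RTP_HEADER_BYTES) * 8
    return bits_per_packet * packets_per_second / 1000


if __name__ == "__main__":
    # GSM 06.10: one 33-byte frame per 20 ms packet.
    print(ip_bandwidth_kbps(33, 20))
    # G.711: 160 bytes of samples per 20 ms packet.
    print(ip_bandwidth_kbps(160, 20))
```

Under the same assumptions a G.711 stream needs 80 kbps in each direction, so the stated minimum would only be sufficient for the GSM codec.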

This protocol stack supports the G.711 and GSM 6.10 codecs and implements the following protocols:

• RFC3261: Session Initiation Protocol (SIP)

• RFC2327: Session Description Protocol (SDP)

• RFC1889: A Transport Protocol for Real-Time Applications (RTP)

• RFC2833: RTP Payload for DTMF Digits, Telephony Tones and Telephony Signals

• RFC3489: STUN - Simple Traversal of User Datagram Protocol (UDP) Through Network Address Translators (NATs)

• RFC3581: An Extension to the SIP for Symmetric Response Routing

• UPnP: Universal Plug and Play IGD

1.7.5 OS for mobile devices

Symbian OS is an operating system produced by Symbian Ltd. It has been designed for mobile devices, with associated libraries, user interface frameworks and reference implementations of common tools. It runs exclusively on ARM (Advanced RISC Machine) processors.

Windows Mobile is a compact operating system combined with a suite of basic applications for mobile devices, based on the Microsoft Win32 API. Devices which run Windows Mobile include Pocket PCs, Smartphones, and Portable Media Centers (portable media player devices). It is designed to be somewhat similar to desktop versions of Windows.

Nokia OS (NOS) is an informal name for the operating system in many Nokia mobile phones. There is no such product or trademark. Officially it is referred to as the ISA (Information Source Adapter) platform. It is only available for Nokia's internal use and has not been licensed to anyone else yet. No direct API is provided either, but most ISA phones can be programmed with J2ME.

Operating System Embedded (OSE) is a real-time embedded operating system created by the Swedish firm ENEA. OSE uses signaling in the form of messages passed to and from processes in the system. Messages are stored in a queue attached to each process.

BlackBerry OS runs on an Intel 80386 microprocessor, and all devices include an embedded RIM (Research In Motion) wireless modem for wireless data access. This OS supports multitasking and multithreaded applications. Developers familiar with other operating systems such as Windows and Mac OS will be at home in the BlackBerry environment. All applications interact with the underlying operating system (and other applications) through the exchange of event messages. As in most operating systems, a C language API is provided for direct access to the system.

Palm OS is a compact operating system developed and licensed by PalmSource, Inc. for personal digital assistants (PDAs) manufactured by various licensees. It is designed to be easy to use and similar to desktop operating systems. Palm OS is combined with a suite of basic applications including address book, clock, note pad, sync, memo viewer and security software. Palm OS licensees decide which applications are included on their Palm OS devices. The applications are primarily coded in C/C++, and a Java Runtime Environment is also available for the platform.

Linux is a Unix-like computer operating system family that uses the Linux kernel. A Linux system which includes system utilities and libraries from the GNU Project is sometimes referred to as GNU/Linux. The methodical design of Linux made it possible to adapt it to a wide range of computing platforms in spite of being originally developed for Intel 386 processors. Of particular interest in this context are the ARM-based architectures, as many embedded systems and mobile devices are powered by ARM processors. Linux is a prominent example of free software and of open source development. Its underlying source code is available for anyone to use, modify, and redistribute freely, and in some instances the entire operating system consists of free/open source software.

It is said that Mobile Linux and Mobile Java form a powerful combination. While Linux is evolving into a major standard for mobile device operating systems, Java is becoming a standard at the software application level. The J2ME/MIDP specifications have been adopted by all major mobile phone manufacturers. The MIDP (Mobile Information Device Profile) comprises a set of Java APIs that provides a J2ME (Java 2 Micro Edition) runtime environment for mobile information devices.

Mac OS X is a line of proprietary graphical operating systems developed, marketed, and sold by Apple Inc., the latest of which is pre-loaded on all currently shipping Macintosh computers. Mac OS X is a Unix-like operating system. A version of this operating system has been developed for the handheld device iPhone and gives access to true desktop-class applications and software, including rich HTML email, full-featured web browsing, and applications such as calendar, text messaging, notes, and address book. The iPhone is fully multitasking.


2 Microsoft RTC Client API

2.1 Introduction

The Real-time Communications (RTC) Client Application Programming Interface enables developers to build applications for integrated multimodal communications. It provides the necessary structure and interfaces to establish PC-PC, PC-phone, or phone-phone calls, Instant Messaging (IM), application sharing, and whiteboard sessions over the Internet. Furthermore, multimedia sessions can be set up on PC-PC calls, and presence information on a list of contacts is also supported. The RTC Client API can be programmed with C++ or any other programming language that can access COM components. This includes .NET languages, such as Microsoft Visual Basic .NET and Microsoft Visual C#, which can access the RTC Client API through COM interoperability. The main functionalities supported by the RTC Client API are:

• Registration and provisioning

• Publishing presence

• Contact management

• Polling presence

• Instant Messaging

• Multimedia calls

• Call control

• Session negotiation

• User search

• Authentication

• Signalling privacy

• Media privacy


2.2 Object Model Overview

The basic coding model for RTC is COM (Component Object Model). The main objects used for communication in RTC are the Client, Session, Profile, Participant, Buddy and Watcher objects, and the interfaces used to create and manage them are IRTCClient, IRTCSession, IRTCProfile, IRTCParticipant, IRTCBuddy, and IRTCWatcher, respectively.

Figure 2.1: RTC Client COM Objects

The client object is the basis of the RTC Client. It establishes the session types and the session parameters, the preferred audio and video devices and other media capabilities. This object is necessary to construct the rest of the objects.

The session object is used to manage all the tasks related to the real-time session, such as initiating, answering, or terminating sessions, adding or removing participants, adding media security or storing information about media types. There are four kinds of sessions: PC-to-PC, PC-to-phone, phone-to-phone, and instant messaging.

The profile object provides a way to get information from a user profile. This profile includes information about the client account (username, password, SIP server), supported session types and capabilities, authentication, transport protocol and so forth. After initializing RTC, the client application creates and enables a profile.

The participant object contains all the information and methods associated with users who take part in a session. Each of these users is called a "participant" and is represented by a different participant object.


The buddy object is used to get and set information about the contacts of the user. It provides data such as the name or the status of the contact. This object is created when a user adds a new contact to his contact list.

The watcher object is used to get and set information about the state of a watcher. When a user adds a new buddy, this buddy creates a watcher object for the user in order to maintain information about his presence. Together, the buddy and the watcher objects are used to manage the presence information.

2.3 Architecture

To provide its functionality, the RTC Client API uses industry standard protocols like:

• Session Initiation Protocol (SIP)

• Session Description Protocol (SDP)

• Real-time Transport Protocol (RTP)

• PSTN/Internet Interworking (PINT)

2.4 .NET Platform

2.4.1 Introduction

.NET is a software platform that connects information, systems, people and devices. The .NET platform connects a great variety of personal and business technologies, from mobile phones to corporate servers, allowing access to important information wherever and whenever it is needed. Built on XML Web services standards, .NET allows systems and applications (new or existing) to connect their data and transactions independently of the version of the operating system, the type of computer or mobile device that is used, or the programming language used to create them. Code written on the .NET Framework platform is called managed code. Regardless of which .NET language is employed, the output of the language compiler is a representation of the same logic in an intermediate language named CIL (Common Intermediate Language) or MSIL (Microsoft Intermediate Language). The programming languages that can be used in the .NET platform include C#, C++, Visual Basic .NET, J#, JScript .NET, Windows PowerShell, IronPython and F#.


2.4.2 Operation

There are three main points on which .NET platform bases its mode of operation:

• .NET languages, which were enumerated above.

• Base Class Library (BCL), which is a library of types available to all .NET languages and provides a lot of classes with a huge number of common functions, including file reading and writing, graphic rendering, database interaction and XML document manipulation.

• Common Language Runtime (CLR), which is explained below.

Figure 2.2: Overview of the Common Language Infrastructure


The most important component of the .NET Framework is the Common Language Infrastructure (CLI). The CLI is responsible for providing a language platform for application development and execution, including components for exception handling, security, interoperability, and so forth. Microsoft's implementation of the CLI is called the Common Language Runtime (CLR). The CLR is composed of four primary parts:

• Common Type System (CTS)

• Common Language Specification (CLS)

• Just-In-Time Compiler (JIT)

• Virtual Execution System (VES)

Managed code is compiled down to a combination of MSIL and metadata. These are combined into a Portable Executable (PE) file, which can then be executed on any CLR-capable machine. When this executable is run, the JIT compiler starts compiling the CIL down to native code. The result is that all .NET Framework components run as native code. Code that requires the CLR at run-time in order to execute is referred to as managed code. The purpose of the CLR is to control the execution of the code that runs on the .NET Framework.

2.4.3 Advantages

For software developers, the .NET Framework is an important change. It offers capabilities and responsibilities that had previously been provided individually by programming languages and tools from various sources. Incorporating these features into the operating system brings a number of advantages, including:

• Assuring the availability of framework features to all programs written in any of the .NET languages.

• Providing programmers with a common means of accessing framework features, regardless of programming language.

• Guaranteeing a common behaviour within the framework, regardless of programming language.

• Allowing the operating system to provide some guarantees of program behaviour that, otherwise, it could not offer.

• Reducing the complexity and limitations of program-to-program communication, even when those programs are written in different .NET languages.


3 Development of VoIP softphone for Windows 2000/XP

3.1 Understanding the source code

Before continuing the development of the softphone, it is essential to understand how it has been built. This means knowing the structure of the softphone and the functionalities that are already implemented. By analyzing the source code of the application and its behaviour during execution, it is possible to identify the following operations:

• Initializing RTC Client object: creates the client object.

• Listening on RTC Events: allows the client to determine which specific events the application needs and ignore the rest.

• Creating and enabling a profile: creates a profile with the configuration parameters of the client object in order to register a user in a server and creates the profile object.

• Handling events: identifies and controls incoming events.

• Starting a session and making a call: configures the type of session, adds a participant and creates the session object.

• Answering a call: manages an incoming call.

• Terminating a call: finishes an existing session.

• Disabling profile: deregisters a user and disables the profile.

• Shutting down the client: stops the operation of the client object and disables the rest of the existing objects.

These basic steps make up the softphone framework and support its main purpose: the transmission of voice over the Internet by means of a SIP session.
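The order in which these operations may be applied can be sketched as a small state machine. The sketch below is only an illustration written in Python, not the thesis code (which uses the RTC Client API from .NET); the state names, the operation names and the `Softphone` class are all hypothetical.

```python
from enum import Enum, auto


class State(Enum):
    CREATED = auto()       # no client object yet
    INITIALIZED = auto()   # client object created, listening on events
    REGISTERED = auto()    # profile created and enabled
    IN_CALL = auto()       # a session is active
    SHUT_DOWN = auto()     # client object stopped


# Hypothetical transition table: (current state, operation) -> next state.
TRANSITIONS = {
    (State.CREATED, "initialize"): State.INITIALIZED,
    (State.INITIALIZED, "enable_profile"): State.REGISTERED,
    (State.REGISTERED, "start_call"): State.IN_CALL,
    (State.REGISTERED, "answer_call"): State.IN_CALL,
    (State.IN_CALL, "terminate_call"): State.REGISTERED,
    (State.REGISTERED, "disable_profile"): State.INITIALIZED,
    (State.INITIALIZED, "shut_down"): State.SHUT_DOWN,
}


class Softphone:
    """Tracks which of the basic operations is allowed next."""

    def __init__(self):
        self.state = State.CREATED
        self.history = []

    def apply(self, operation):
        key = (self.state, operation)
        if key not in TRANSITIONS:
            raise ValueError(f"{operation!r} is not allowed in state {self.state.name}")
        self.state = TRANSITIONS[key]
        self.history.append(operation)
        return self.state


phone = Softphone()
for step in ("initialize", "enable_profile", "start_call",
             "terminate_call", "disable_profile", "shut_down"):
    phone.apply(step)
```

A table of transitions keeps the allowed order in one place, so an out-of-order request such as starting a call before the profile is enabled is rejected explicitly.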


3.2 New functionalities

Nowadays, a typical VoIP softphone includes some features that are not strictly related to its primary operation, but that add useful capabilities and services for the user. For that purpose, five new functionalities have been added to the softphone. Each of them is explained below.

3.2.1 Volume bar for microphone and speakers

This functionality allows the user to configure and adjust the audio settings. To increase or decrease the volume level of the microphone or the speakers, the user only needs to move the Microphone Volume or Speakers Volume trackbar, respectively. Furthermore, the Audio and Video Tuning Wizard helps the user to verify that his camera, speakers, and microphone are working properly. Before using the Wizard, it is important to perform the following:

• Close all other programs that show video or play or record sound.

• Make sure that the camera, speakers, and microphone are plugged in and turned on.

These functionalities are implemented by the client object and the methods used are included in the RTCClientClass class.

• set_volume(RTC_AUDIO_DEVICE enDevice, long lVolume), where the input parameters are the audio device (microphone or speakers) and the volume level.

• InvokeTuningWizard()

3.2.2 Sending DTMF signals

Dual-tone multi-frequency (DTMF) is a system of signal tones used in telecommunications. When the user presses a dial-pad button corresponding to a digit, two tones of specific frequencies are sent. The receiver, normally a switching centre, can decode the tones and detect which digit was pressed. The tones are divided into two groups (low and high) within the voice frequency band, and each DTMF signal uses one tone from each group. These signals are used in different applications, including voice mail, help desks and telephone banking, for instance to select configuration options or to manage remote control systems.


The following table shows the frequencies associated with each decimal digit:

Button or Digit    Low frequency (Hz)    High frequency (Hz)
1                  697                   1209
2                  697                   1336
3                  697                   1477
4                  770                   1209
5                  770                   1336
6                  770                   1477
7                  852                   1209
8                  852                   1336
9                  852                   1477
0                  941                   1336

Table 3.1: DTMF frequencies
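To make the table concrete, the following Python sketch synthesizes the two-tone signal for a digit as the sum of one low-group and one high-group sine wave. It is only an illustration and is not part of the softphone, which delegates tone generation to the RTC Client API; the function name, the 8000 Hz sample rate and the 0.5 amplitude per tone are assumptions.

```python
import math

# Frequencies from Table 3.1: digit -> (low group Hz, high group Hz).
DTMF_FREQUENCIES = {
    "1": (697, 1209), "2": (697, 1336), "3": (697, 1477),
    "4": (770, 1209), "5": (770, 1336), "6": (770, 1477),
    "7": (852, 1209), "8": (852, 1336), "9": (852, 1477),
    "0": (941, 1336),
}


def dtmf_samples(digit, duration_s=0.1, sample_rate=8000):
    """Return the samples of the DTMF signal for one digit.

    Each sample is the sum of the low and high tone, each scaled by 0.5,
    so the result always stays within [-1.0, 1.0].
    """
    low, high = DTMF_FREQUENCIES[digit]
    count = round(duration_s * sample_rate)
    samples = []
    for n in range(count):
        t = n / sample_rate
        samples.append(0.5 * math.sin(2 * math.pi * low * t)
                       + 0.5 * math.sin(2 * math.pi * high * t))
    return samples


signal_for_four = dtmf_samples("4")  # 770 Hz + 1209 Hz, 100 ms at 8 kHz
```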

This functionality is provided by the client object. The method used is SendDTMF(RTC_DTMF enDTMF), included in the RTCClientClass class. The input parameter is an enumeration that specifies which DTMF tone should be sent. This method sends the DTMF tone to the active session and plays a feedback tone on the RTC default audio device.

3.2.3 Addition of videoconference

Videoconference calls can be used in many different situations, which is one of the reasons the technology is so popular. Although many people use videoconferencing recreationally, general uses include business meetings, educational training and collaboration among health officials. In fact, videoconferencing has been used in a wide variety of fields, such as telemedicine, telecommunications, education, surveillance, security and emergency response.

Perhaps the biggest benefit of videoconferencing is the ability to meet with people in remote locations regardless of time, distance or cost. It can be used to keep in touch with the entire world without leaving home. 'A picture says a thousand words'. Videoconferencing does not replace real-life meetings, but it enhances 'face-to-face' communication, making it easier and more natural regardless of where people are located.

After the audio/video session between two participants is established, the client object processes the received and sent video data. This object obtains the incoming and outgoing video streams and shows each of them in a separate video window. The method used is get_IVideoWindow(RTC_VIDEO_DEVICE enDevice, out IVideoWindow VWindow), included in the RTCClientClass class. The input parameter is an enumeration that specifies the video device (receive/preview); the output parameter is an interface for controlling the video window properties.


3.2.4 Contact List

The contact list, also called address book, is a feature which allows users to store their friends' personal information locally. Besides, it lets users know whether their contacts are online or not. Users can call their friends with just a few clicks, which is easy, fast and convenient. All the information about the user's buddies is persisted in a file on the user's computer.

The service that makes this possible is the presence information service. It is responsible for updating the contacts' presence status and for notifying the user's own status. The calls are made through a registrar server that maintains the current location information of the contacts.

The first stage consists of registering the user on the SIP server and enabling presence. The presence service is enabled before the user's profile is registered on the server. The main steps are: create profile - enable presence - set presence status - enable profile.

Once the profile is registered and presence is enabled, adding a new contact to the address book is simple. The IRTCClientPresence interface provides methods to add a buddy, remove a buddy, enumerate watchers, set the local presence status, and so forth. If the buddy object is successfully created, the client object can use the IRTCBuddy interface to get the buddy's presentity URI, name, SIP number and status, and some other data associated with the buddy.

The contact list can be recovered by querying the client object through the IRTCClientPresence interface. From this interface, the contacts can be enumerated by calling the EnumerateBuddies method.

3.2.5 Encryption of media

It is essential to be aware of the risks of using VoIP, especially for telephony, an application that is developing rapidly. Combining telephony and computing also introduces security holes and dangers. Unsecured VoIP communications are a great opportunity for attackers, who can record calls as audio files, replay calls, make calls with a false identity, generate busy tones or manipulate call queues. Many programs for these purposes are available on the Internet. For that reason, a VoIP application must ensure security: the confidentiality, integrity and authenticity of data must be guaranteed at all times.

In cryptography, encryption is the process of transforming information to make it unreadable to anyone except those possessing special knowledge, usually referred to as a key. One might think that encrypting the media flows is sufficient to secure a VoIP communication, but this is wrong. Some media encryption protocols, like Secure Real-time Transport Protocol (SRTP), do not provide any method for key exchange or key management and rely on SIP signalling for this purpose. So, if SIP signalling is not encrypted or otherwise protected, anyone could obtain the key. In conclusion, it is necessary to encrypt both the media associated with a session and all SIP traffic to guarantee a secure VoIP communication.

SIP is not an easy protocol to secure. Encrypting the whole message would be the best means of ensuring security; however, SIP requests and responses cannot be entirely encrypted because some message fields, like Via, must be read and modified by intermediaries such as proxy servers. For that reason, it is recommended to use lower-layer security mechanisms for SIP, which work hop-by-hop. With these mechanisms servers are authenticated, so end users can be sure with whom they are communicating.

Transport or network layer security encrypts signalling traffic, guaranteeing message confidentiality, integrity and, sometimes, authentication. RFC 3261 proposes two ways of securing the transport and network layers: Internet Protocol Security (IPSec) and Transport Layer Security (TLS).

IPSec is a set of network-layer protocols for securing Internet Protocol (IP) communications. It also includes protocols for cryptographic key establishment, and it can be used with TCP and UDP. IPSec is normally used in architectures with a trust relationship between hosts. It is implemented in the operating system of a host or on a security gateway, as in a Virtual Private Network (VPN). Generally, IPSec does not need to be integrated into SIP applications, but it requires an IPSec profile defining the protocol tools that would be used to secure SIP.

TLS is a cryptographic protocol that provides secure communications over the Internet. It provides authenticity and privacy of the information and allows client-server applications to communicate in a way that prevents eavesdropping and identity falsification and ensures message integrity. TLS is normally used in architectures without a pre-existing trust relationship between hosts.
After the certificate exchange, client and server can trust each other and communicate securely. Although TLS is the most recommended mechanism to secure SIP signalling, many SIP servers do not support it yet, such as servers based on Asterisk. Furthermore, other SIP servers that support TLS, like SIP Express Router (SER) from http://iptel.org, need a special configuration and cannot be used for free.

On the other hand, a mechanism to encrypt real-time data is also necessary. RFC 1889 states that the default encryption algorithm is the Data Encryption Standard (DES) in cipher block chaining (CBC) mode. This kind of encryption uses a 56-bit key. The message is divided into smaller blocks of plaintext, and each block is combined with the previous ciphertext block by the logical operation XOR (exclusive disjunction) before being encrypted. An initialization vector is used for the first block.

DES is now considered insecure for many applications because the encryption key is too small; DES keys have been broken in less than 24 hours. For that reason, RFC 3550 recommends a newer security method called Secure Real-time Transport Protocol (SRTP). It is a profile of RTP which can provide confidentiality, message authentication, and replay protection to the RTP traffic and to the control traffic for RTP, RTCP. SRTP provides stronger encryption because it is based on the Advanced Encryption Standard (AES). It is well suited to protecting VoIP traffic because it can be used in conjunction with header compression and has no effect on IP QoS.

Now that these security protocols for VoIP applications based on SIP and RTP have been explained, it is time to explain why it has not been possible to use them. Beginning with the encryption of SIP signalling, the RTC Client API supports TLS transport, which secures the signalling channel. To enable this security protocol it is necessary to specify the TLS transport in the client's profile. However, the problem lies in the SIP servers. As mentioned before, many servers do not support TLS and many others do not offer this service for free. Unfortunately, TLS support is not very widespread yet and I have not been able to find a server where I can use this security protocol.

Regarding the encryption of the media flows, the RTC Client API does not support SRTP, but it provides a proprietary protocol to secure media sessions. The RTC Client API allows the application to set default encryption levels on the session object, so that every packet of audio/video data is encrypted. The method supplied by the RTC Client API is put_PreferredSecurityLevel, included in the IRTCSession2 interface. It has two input arguments:

• RTC_SECURITY_TYPE enSecurityType: an enumeration that specifies the media type. In this case, audio and video media streams are encrypted.

• RTC_SECURITY_LEVEL enSecurityLevel: an enumeration that specifies the level of encryption for the specified media type. In this case, only encrypted sessions are supported in order to ensure a secure communication.

This method sets the proposed security level for an outgoing session.
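The cipher block chaining mode mentioned in this section for the RFC 1889 default cipher can be illustrated with a short Python sketch. It only demonstrates the chaining itself: the block cipher below is a deliberately trivial stand-in (a byte rotation plus XOR with the key), not DES, and it offers no security. It is also unrelated to the proprietary encryption used by the RTC Client API. All function names are hypothetical, and the message length is assumed to be a multiple of the block size.

```python
BLOCK_SIZE = 8  # DES works on 64-bit (8-byte) blocks


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def toy_block_encrypt(block, key):
    # Stand-in for a real block cipher such as DES. NOT secure.
    rotated = block[1:] + block[:1]
    return xor_bytes(rotated, key)


def toy_block_decrypt(block, key):
    rotated = xor_bytes(block, key)
    return rotated[-1:] + rotated[:-1]


def cbc_encrypt(plaintext, key, iv):
    if len(plaintext) % BLOCK_SIZE != 0:
        raise ValueError("plaintext length must be a multiple of the block size")
    previous = iv
    ciphertext = b""
    for i in range(0, len(plaintext), BLOCK_SIZE):
        block = plaintext[i:i + BLOCK_SIZE]
        # XOR with the previous ciphertext block (or the IV), then encrypt.
        previous = toy_block_encrypt(xor_bytes(block, previous), key)
        ciphertext += previous
    return ciphertext


def cbc_decrypt(ciphertext, key, iv):
    previous = iv
    plaintext = b""
    for i in range(0, len(ciphertext), BLOCK_SIZE):
        block = ciphertext[i:i + BLOCK_SIZE]
        # Decrypt, then undo the XOR with the previous ciphertext block.
        plaintext += xor_bytes(toy_block_decrypt(block, key), previous)
        previous = block
    return plaintext


key = b"8bytekey"
iv = bytes(BLOCK_SIZE)
message = b"AAAAAAAA" * 2  # two identical plaintext blocks
encrypted = cbc_encrypt(message, key, iv)
```

Because each block is mixed with the previous ciphertext block, the two identical plaintext blocks in the example produce different ciphertext blocks, which is the purpose of the chaining.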

3.3 Testing the program and results

Once the new functionalities have been developed, it is necessary to test them separately and debug the program in order to correct possible mistakes and guarantee an accurate operation. For that purpose, the following software tools have been employed:

• Microsoft Visual Studio 2005 Professional Edition: It is an integrated development environment for Windows systems that assists computer programmers in developing and debugging software.

• Wireshark: It is a network packet analyzer that captures network packets and tries to display that packet data as detailed as possible.


3.3.1 Volume bar for microphone and speakers

Although the application allows the volume level of the audio devices to be changed at any time, the result is most noticeable when it is adjusted during a call. The most exact way to verify that the volume level has been correctly updated is to use the debugging mode of Microsoft Visual Studio. In this way, it is easy to execute the application step by step and check whether the client object receives the appropriate event and handles it by changing the corresponding values.

3.3.2 Sending DTMF signals

As it was explained before, this functionality requires the existence of an active session. There are two simple ways to test it:

• Listening to the feedback tone on the audio device.

• Analyzing the RTP packet that is sent when a dial-pad button is pressed.

The SIP server of the department of Communication Networks has an Interactive Voice Response (IVR) application set up. After calling the number 772 and hearing a voice message, the user can press any digit to send a DTMF signal and listen to the feedback tone. Besides, this test can be verified by using the Wireshark software. This is one of the RTP packets that were sent when the user pressed the digit 4:

Real-Time Transport Protocol
    10.. .... = Version: RFC 1889 Version (2)
    ..0. .... = Padding: False
    ...0 .... = Extension: False
    .... 0000 = Contributing source identifiers count: 0
    0... .... = Marker: False
    Payload type: telephone-event (101)
    Sequence number: 30048
    Timestamp: 1616284590
    Synchronization Source identifier: 557484360
RFC 2833 RTP Event
    Event ID: DTMF Four 4 (4)
    1... .... = End of Event: True
    .0.. .... = Reserved: False
    ..00 1010 = Volume: 10
    Event Duration: 1600

The payload type of the packet is telephone-event, and the type of event is DTMF Four.
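The RFC 2833 event shown in the capture occupies four bytes: the event code, one byte holding the end bit, the reserved bit and a 6-bit volume, and a 16-bit duration. The Python sketch below packs and unpacks this layout using the values from the captured packet. It is an illustration only; the function names are hypothetical and the softphone itself does not build these payloads by hand.

```python
import struct


def pack_telephone_event(event, end, volume, duration):
    """Build the 4-byte RFC 2833 telephone-event payload."""
    flags = (0x80 if end else 0x00) | (volume & 0x3F)  # E bit, R bit = 0, volume
    return struct.pack("!BBH", event, flags, duration)


def unpack_telephone_event(payload):
    """Decode the first four bytes of a telephone-event payload."""
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    return {
        "event": event,
        "end": bool(flags & 0x80),
        "reserved": bool(flags & 0x40),
        "volume": flags & 0x3F,
        "duration": duration,
    }


# Values taken from the captured packet: digit 4, end of event, volume 10,
# duration 1600 timestamp units.
payload = pack_telephone_event(event=4, end=True, volume=10, duration=1600)
decoded = unpack_telephone_event(payload)
```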


The following picture shows the graph analysis of the complete call, including the establishment of the session and the sending of the DTMF signals.

Figure 3.1: Sending DTMF signals


3.3.3 Addition of videoconference

The user 5002705 has been registered on the sipgate.co.uk server and has established a videoconference call with the user 5002375. Using the Wireshark software, all the packets related to this videoconference have been captured.

Figure 3.2: Videoconference

As can be seen in the graph, once the media session is established, the RTP packets are sent directly between the endpoints without involving the server. Furthermore, it is important to note that there are four different RTP streams, distinguished by the SSRC header field:

• Video data from 5002705 to 5002375: 385110643

• Video data from 5002375 to 5002705: 1040943839

• Audio data from 5002705 to 5002375: 2343623439

• Audio data from 5002375 to 5002705: 4164887957
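Separating the captured packets into these four streams only requires reading the 12-byte fixed RTP header, where the SSRC is the last 32-bit field. The following Python sketch parses that header and counts packets per SSRC. It is an illustration of the packet layout, not part of the softphone; the function names are hypothetical, and the payload types used in the example (34 and 0) are assumptions rather than values taken from the capture.

```python
import struct
from collections import Counter

# Fixed RTP header: V/P/X/CC byte, M/PT byte, sequence, timestamp, SSRC.
RTP_HEADER = struct.Struct("!BBHII")


def build_rtp_header(payload_type, sequence, timestamp, ssrc, marker=False):
    first = 2 << 6  # version 2, no padding, no extension, CSRC count 0
    second = (0x80 if marker else 0x00) | (payload_type & 0x7F)
    return RTP_HEADER.pack(first, second, sequence, timestamp, ssrc)


def parse_rtp_header(packet):
    if len(packet) < RTP_HEADER.size:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    first, second, sequence, timestamp, ssrc = RTP_HEADER.unpack(
        packet[:RTP_HEADER.size])
    return {
        "version": first >> 6,
        "padding": bool(first & 0x20),
        "extension": bool(first & 0x10),
        "csrc_count": first & 0x0F,
        "marker": bool(second & 0x80),
        "payload_type": second & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


def count_packets_per_ssrc(packets):
    """Group packets into streams by their SSRC value."""
    return Counter(parse_rtp_header(p)["ssrc"] for p in packets)


# Two of the SSRC values listed above, with assumed payload types.
video_packet = build_rtp_header(34, 1, 1000, 385110643) + b"video"
audio_packet = build_rtp_header(0, 1, 160, 2343623439) + b"audio"
streams = count_packets_per_ssrc([video_packet, video_packet, audio_packet])
```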

It is also possible to assemble all the audio/video packets and play them in order to obtain the whole message.

Note: These last two functionalities have been tested and explained without encrypting the media in order to show and verify their correct operation.


3.3.4 Contact List

The user carla1 has been registered on the http://iptel.org server and has added the user carla2 as a new buddy. All the SIP messages related to this action have been captured with Wireshark and are shown below.

[Sequence diagram between carla1, the proxy server and carla2: SUBSCRIBE, 407 PROXY AUTHENTICATION REQUIRED, SUBSCRIBE, OK, NOTIFY, OK, a further NOTIFY and OK, followed by repeated SUBSCRIBE requests]

Figure 3.3: Example of SUBSCRIBE - NOTIFY

Firstly, carla1 sends a SUBSCRIBE message to carla2. This message is used to request status or presence updates from the presence server. The proxy server analyzes the request and answers with a 407 PROXY AUTHENTICATION REQUIRED because the user has not included the proper credentials. The user carla1 processes the response and rebuilds the request, adding a Proxy-Authorization field.

Request-Line: SUBSCRIBE sip:[email protected] SIP/2.0
Message Header
    Via: SIP/2.0/UDP 141.24.172.97:9993
    Max-Forwards: 70
    From: "carla1" ;tag=21134da32c084961a9caf6becc74ba33;epid=452b456bb8
    To:
    Call-ID: [email protected]
    CSeq: 2 SUBSCRIBE
    Contact:
    User-Agent: RTC/1.2
    Proxy-Authorization: Digest username="carla1", realm="iptel.org", algorithm=md5, uri="sip:[email protected]", nonce="465ea154f015d0e0ae1291bbfe5fa1c9ec4bce2a", response="289135896fbe665dd23a5e71e4ebe22e"
    Event: presence
    Accept: application/xpidf+xml, text/xml+msrtc.pidf
    Supported: com.microsoft.autoextend
    Content-Length: 0
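The response value in the Proxy-Authorization field is an MD5 digest computed from the credentials, the request and the nonce sent by the server. Since the header above carries no qop parameter, the basic form of HTTP Digest authentication applies, which can be sketched in Python as follows. This is an illustration only: the RTC Client API computes the value internally, the function names are hypothetical, and the password and request URI in the example are invented placeholders (the real ones are not known), so the result is not expected to match the captured response.

```python
import hashlib


def md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce):
    """Digest response without qop: MD5(HA1:nonce:HA2)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # credentials
    ha2 = md5_hex(f"{method}:{uri}")                 # request line
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


response = digest_response(
    username="carla1",
    realm="iptel.org",
    password="hypothetical-password",   # placeholder, not the real password
    method="SUBSCRIBE",
    uri="sip:carla2@example.org",       # placeholder request URI
    nonce="465ea154f015d0e0ae1291bbfe5fa1c9ec4bce2a",
)
```

Because the nonce changes with each challenge, a captured response cannot simply be replayed against a later challenge.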

When carla2 receives the SUBSCRIBE message, it sends an OK message indicating that the request has been accepted. After that, carla2 sends a NOTIFY message to carla1. This message is used to deliver the information that has been requested.

Request-Line: NOTIFY sip:141.24.172.97:1617 SIP/2.0
Message Header
    Via: SIP/2.0/UDP 213.192.59.75;branch=0
    Via: SIP/2.0/UDP 213.192.59.76:5070;rport=5070;branch=z9hG4bKb136.ce6dce42.0
    Via: SIP/2.0/UDP 213.192.59.75;branch=0
    Via: SIP/2.0/UDP 141.24.92.247:9885;rport=1209
    Max-Forwards: 14
    From: ;tag=1acff860a77f4eab9437fe579595b387
    To: "carla1" ;tag=21134da32c084961a9caf6becc74ba33;epid=452b456bb8
    Call-ID: [email protected]
    CSeq: 2 NOTIFY
    Contact:
    User-Agent: RTC/1.2
    Expires: 1320
    Subscription-State: active;expires=1320
    Event: presence
    Content-Type: application/xpidf+xml
    Content-Length: 425
    P-hint: rr-enforced
    Route:
Message body
    eXtensible Markup Language


Presence information

The process is completed when carla1 receives the NOTIFY message and answers it with an OK message indicating that it has been successfully received. The buddy has been added to the contact list of carla1 and a watcher object has been created to track the presence status.

The NOTIFY message has an Expires field that indicates the duration of the subscription. Within the expiration interval, the user should refresh the subscription by sending another SUBSCRIBE message in order to keep it active. On the other hand, if there is any change in the subscribed state, carla2 should send a NOTIFY message to inform carla1.

In this example, carla2 has shut down and has sent a message to carla1 to report its current state. When carla1 receives the message, it knows that carla2 has terminated the subscription and sends an OK response. Now the buddy appears as offline.

If the user wants to continue being informed about the presence status of his buddy, he has to re-subscribe by composing a new SUBSCRIBE request, as the first time, with a new Call-ID value. There will be no response until carla2 is online again, so carla1 must resend it periodically until a response is received. This process is identical to the one already described.

Once carla2 is registered and online again, it receives the SUBSCRIBE request and answers with an OK and a NOTIFY message. Once more, carla1 sends an OK message confirming that the message has been received and displays the buddy as an online contact.
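The refresh rule described above can be sketched in a few lines of Python: the Subscription-State header of the NOTIFY message is parsed, and the next SUBSCRIBE is scheduled before the subscription expires. This is only an illustration, since the RTC Client API handles subscription refreshes itself; the function names and the choice of refreshing after 90% of the interval are assumptions.

```python
def parse_subscription_state(value):
    """Split a header value such as 'active;expires=1320'."""
    parts = [part.strip() for part in value.split(";")]
    state = parts[0]
    params = dict(part.split("=", 1) for part in parts[1:] if "=" in part)
    expires = int(params["expires"]) if "expires" in params else None
    return state, expires


def refresh_delay(expires):
    """Seconds to wait before re-sending SUBSCRIBE (assumed: 90% of expiry)."""
    return expires * 9 // 10


def next_action(header_value):
    state, expires = parse_subscription_state(header_value)
    if state == "terminated":
        # The subscription is over: a new SUBSCRIBE with a new Call-ID is needed.
        return ("resubscribe", None)
    if expires is None:
        return ("wait", None)
    return ("refresh", refresh_delay(expires))


# Value taken from the NOTIFY message shown in this section.
action = next_action("active;expires=1320")
```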


3.3.5 Encryption of media

The most suitable way to confirm that the encryption of the media is correctly implemented is to repeat the same videoconference call as in 3.3.3 and analyze the results. The user 5002705 establishes a multimedia session with the user 5002375. The following picture shows the graph analysis of the complete videoconference:

Figure 3.4: Videoconference with encryption of the media

In contrast to the previous example, the graph shows that every RTP packet has a different SSRC value and that the payload type is unknown in almost all of them. However, the two users have communicated without problems and have received the audio and video data correctly. This means that the packets are encrypted: the network analyzer, Wireshark, has captured the packets but is not able to interpret the information that they carry. For that reason it is not possible to assemble the audio/video packets and play them in order to obtain the whole message. The complete message can only be recovered when the destination user decrypts it.


3.4 Software tools

In this section there is a brief description of all the software tools that have been used during the development of this project.

• Microsoft Visual Studio 2005 Professional Edition: It is an integrated development environment for Windows systems that assists computer programmers in developing and debugging software. This software has been used to program and test the softphone.

• Microsoft Office Visio Professional 2003: It is a vector drawing software package for Microsoft Windows. It is composed of many different tools to create diagrams. This application has been used to design the UML diagrams.

• Windows Real-Time Communications Client API SDK v1.2: It is a software development kit that provides an application programming interface to develop a VoIP softphone based on SIP.

• Wireshark: It is a network packet analyzer that captures network packets and tries to display that packet data as detailed as possible. It has been used to test the new functionalities of the softphone.

• TeXnicCenter: It is an environment for writing LaTeX documents on Microsoft Windows.

• MiKTeX: It is an up-to-date implementation of TeX, a typesetting system.


4 Adaptation of the VoIP softphone for mobile devices

The main aim of adapting the VoIP softphone to the Windows Mobile OS is to bring the services and advantages of VoIP to a mobile device, including, notably, the following:

• Calling or being called on a mobile device at the same price as that of a landline independently of the location.

• Being accessible on a mobile phone or a PDA using the same number as that of home or office.

• Knowing if your contacts are available to reach them quickly.

Although VoIP technology is not yet widespread on mobile devices, it is expected to become an important and extended service in the short term. The device provided by the department of Communication Networks runs Windows Mobile 5.0, so, before starting to develop an application for this operating system, the following software tools are required:

• Visual Studio 2005 Professional Edition: It is an integrated development environment for Windows systems that assists computer programmers in developing and debugging software.

• Windows Mobile 5.0 Professional SDK: It provides the tools, header files, emulator images, and Visual Studio 2005 project types to develop Windows Mobile 5.0 applications using the .NET Compact Framework.

• .NET Compact Framework 2.0: It includes the common language runtime (CLR) and class libraries to help Windows Mobile developers.

• ActiveSync 4.0: It provides the necessary connectivity and synchronization to support a Windows Mobile emulator image or device.

The adaptation of software involves a porting or migration process. This process consists of upgrading the original sources and changing them in such a way that the application can be compiled and installed on a system different from the one for which it was initially designed.

An important step before proceeding with the porting process is to identify the software dependencies of the application, such as the libraries or APIs that have been employed. If the target platform provides them, the process can be summarized in the following steps: creating a project for the target OS (in this case Windows Mobile) using tools which support the target device, adding the old sources to it, compiling, analyzing errors, fixing them, debugging the application on the target platform and fixing runtime errors.

On the one hand, it should be considered that the GUI may change between the two platforms, which means that some controls may not be supported. In that case, they must be replaced by the ones provided on the new platform. On the other hand, some functionalities may not be available on the target platform even though it provides the corresponding API or library. Consequently, these features must be implemented in another way.

This softphone has been developed using the RTC Client API and the .NET platform provided by Microsoft, but this API is not available on Windows Mobile. Unfortunately, this means that the softphone cannot be adapted to this operating system, because the methods, classes, interfaces, and so on that have been used to program it are not offered for the time being.

However, Windows Mobile provides a large number of new APIs to develop device applications, including VoIP phone services. There is no single interface to build VoIP applications; as explained in 1.7.3, a VoIP softphone running on mobile devices needs to implement all the protocols related to this technology and manage a communication channel over the Internet.
In fact, Microsoft has two softphones already working on Windows Mobile 5.0: Microsoft Office Communicator Mobile, which is based on the user interface of the Microsoft Office Communicator 2005 desktop client, and Microsoft Portrait, which is a research prototype for mobile video communication. Both are softphones based on SIP, but the APIs or platform on which they are built are proprietary and not available for free use.

As mentioned before, there are two main possibilities to develop a softphone: using existing protocol stacks or implementing them from scratch. One example of a third-party library is the micropSIP stack (VoIP SDK). It is neither free nor open source, but it can be purchased. It includes support for the following protocols and technologies on Windows Mobile 5.0:

• RFC3261: Session Initiation Protocol

• RFC2327: Session Description Protocol

• RFC1889: A Transport Protocol for Real-Time Applications

• RFC2833: RTP Payload for DTMF Digits, Telephony Tones and Telephony Signals

• RFC3489: STUN - Simple Traversal of User Datagram Protocol (UDP) Through Network Address Translators (NATs)

• RFC3581: An Extension to the SIP for Symmetric Response Routing

• Universal Plug and Play (UPnP) Internet Gateway Device (IGD)

On the other hand, in order to implement all these protocols, Windows Mobile provides the following interfaces:

• Winsock: Windows Sockets is used to create advanced Internet applications that transmit data over the network, independently of the network protocol being used. It is used to build and process data packets.

• TCP/IP: It includes the core protocol stack for IPv4 that manages IP datagrams. Besides, it provides interfaces that assist in the network administration of the local device, such as retrieving information about the network configuration and modifying it.

• Universal Plug and Play (UPnP): It allows exchanging information and data between the devices connected to a network. Besides, it is used to solve the problems between SIP and NAT.

Developing all the required functionalities of a protocol is a long and very complicated task. For that reason, it is usual to employ existing protocol stacks or APIs.
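As a rough illustration of what building and processing data packets over a socket interface involves, the following sketch sends one UDP datagram over the loopback interface and reads it back. It uses Python and its standard socket module instead of Winsock, purely for brevity. The helper name send_and_receive and the payload are assumptions made for this example and are not part of the softphone.

```python
import socket


def send_and_receive(payload: bytes) -> bytes:
    """Send one UDP datagram to a local receiver and return what arrived.

    Hypothetical helper for illustration only.
    """
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))      # port 0: the OS picks a free port
        receiver.settimeout(2.0)             # do not block forever if nothing arrives
        sender.sendto(payload, receiver.getsockname())
        data, _address = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(send_and_receive(b"hello"))
```

Even this minimal exchange leaves out everything a VoIP protocol adds on top (message syntax, retransmission, session state), which is the effort that a ready-made protocol stack saves.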

5 UML Structure

Unified Modeling Language (UML) is currently the best-known and most widely used object-oriented modeling language for software systems. It is officially defined by the Object Management Group (OMG). UML is a graphical language to visualize, specify, construct and document a software system. It offers a standard way to describe a model of the system, including conceptual aspects, such as business processes and system functions, and concrete aspects, such as programming language expressions, database schemas and reusable software components. Furthermore, UML is independent of the programming language.

It is important to note that UML is a specification language, not a method or a process in itself. UML can be used in a great variety of ways to support a software development methodology, but it does not specify which methodology or process should be employed.

The UML 2.0 specification provides 13 types of diagrams, which show different aspects of the represented entities. Each diagram is a partial graphical representation of the system's model. UML diagrams are classified into three groups:

• Structure diagrams: A type of diagram that represents the elements of a specification that are irrespective of time. It emphasizes the elements that should exist in the modeled system.

• Behaviour diagrams: A type of diagram that represents behavioural features of a system or business process. These diagrams emphasize what should happen in the modeled system.

• Interaction diagrams: A subset of behaviour diagrams which emphasize the control and data flow between the elements of the modeled system.

Each type of diagram is described as follows:

• Structure diagrams

– Class diagram: describes the structure of a system by showing a collection of static model elements such as classes and types, their attributes, and the relationships between them. This kind of diagram lets us know the structure of the system, but it does not explain what happens to its different parts when the system starts running.

– Object diagram: shows a complete or partial view of the structure of a modeled system at a specific time. This snapshot focuses on a particular set of object instances and attributes, and the links between the instances. Object diagrams can be considered a special case of class diagrams.

– Component diagram: represents how a software system is divided into physical components and shows the dependencies among these components. Physical components could be, for example, files, headers, link libraries, modules, executables, or packages.

– Composite structure diagram: shows the internal structure of a classifier (such as a class, component or use case), including the interaction points of the classifier to other parts of the system. It is a set of interconnected elements that collaborate at runtime to achieve some purpose. Each element has a defined role in the collaboration.

– Deployment diagram: is used to model the hardware used in system implementations, the components deployed on the hardware, and the associations between those components. Each hardware element is represented as a node.

– Package diagram: represents how a system is divided into logical groupings by showing the dependencies among them. As a package is typically thought of as a directory, package diagrams provide a logical hierarchical decomposition of a system.

• Behaviour diagrams

– Use case diagram: describes a way in which the system will be used to achieve some functional goals. Use case diagrams allow us to model the functional goals of the system and relate those goals to kinds of user, called actors, so that we can see who should be able to do what using the system. Besides, these diagrams show the different functionalities of an application or system and how it relates to its environment.

– Activity diagram: models the flow of actions in a process or workflow. It could be a business process, or it could be the control flow of program code. Activity diagrams show sequences of activity states where the flow immediately moves on to the next action when one action is complete. It is a way of representing internal transitions without regard to external events.

– State diagram: shows the various states that an object may be in and how events cause transitions from one discrete state to another. An object can remain in a state for a certain time, until the change condition is fulfilled.

• Interaction diagrams

– Sequence diagram: shows the interaction of a group of objects in an application over time. It shows the different processes or objects that live simultaneously as parallel vertical lines and the messages exchanged between them as horizontal arrows.

– Communication diagram: models the interactions between objects or processes in terms of sequenced messages. Communication diagrams represent a combination of information taken from class, sequence and use case diagrams, describing both the static structure and the dynamic behaviour of a system. They typically focus on the structural organization of objects that send and receive messages.

– Timing diagram: is used to explore the behaviour of objects throughout a given period of time. A timing diagram is a special form of a sequence diagram. The differences between a timing diagram and a sequence diagram are that the axes are reversed, so that time increases from left to right, and that the lifelines are shown in separate, vertically arranged compartments.

– Interaction overview diagram: is a variant of an activity diagram which gives an overview of the control flow within a system or business process. Each node/activity within the diagram can represent another interaction diagram. These diagrams can include sequence, communication, interaction overview and timing diagrams.

The number of UML diagrams is sometimes considered excessive, because some of them serve very similar purposes. For this reason, UML allows only the necessary diagrams to be defined. The kind of diagram should be chosen according to the system to be implemented, its characteristics and the developer's requirements. In this case, UML diagrams are required to show the compatibility and interoperability of the softphone. For that purpose, the chosen diagrams are:

• Class diagram: because it presents a general and complete view of the softphone structure.

• Use case diagram: because it represents all the features of a system and their interrelation with external agents. It shows the main functionalities of an application, like the options presented in the application menu, and allows the user to visualize the alternatives to the normal flow of execution. Every use case diagram has an associated description of each use case.

• Sequence diagram: because it illustrates the interaction between the objects in the application over time. It can detail the use cases and make them clearer, because this kind of diagram shows the sequence of messages exchanged between the objects. Furthermore, it lets us know who created each object and when.

• State diagram: because it describes all the possible states in the whole life of a single object and the reason for each change.

5.1 Class diagram

The main class of this softphone is the voipUA class, where all the objects and main functionalities are implemented. The rest of the classes support the main class and define the graphical user interface. The class diagram is enclosed in Appendix A.

5.2 Use case diagram

The first step in building a use case diagram is to identify the actors involved. An actor is an external person, system or machine that interacts with the application. Every use case must be initiated by an actor. This application has two main actors: the user and the system or GUI. Besides, there is a secondary actor, the registered user, who inherits all the characteristics of the user but has some specific qualities. After identifying the actors, the requirements and the normal sequence of actions should be described step by step. Finally, the possible alternative execution paths must be enumerated. The final use case diagram is enclosed in Appendix A and the description of each use case is shown below.

USE CASE 1: Running application
Actor: User
Description: Starting the application
Preconditions: -
Normal sequence:
  1. User clicks on the executable file
Postconditions: User will have to register to use all the features of the application
Extensions: -

Table 5.1: Use case 1

USE CASE 2: Registering
Actor: User
Description: Entering personal data to register with a SIP server
Preconditions: User runs the application and must have a SIP account on a SIP server
Normal sequence:
  1. User clicks on the setting button
  2. User fills in the settings form with the username, password and SIP server
  3. User clicks on the login button
  4. System shows registration state registering (uses Receiving registration state message)
  5. System shows registration state registered (uses Receiving registration state message)
  6. System shows buddies in the contact list
Postconditions: User is registered and is able to use the different functions of the application. There is an inherited actor called registered user.
Extensions:
  2a. User enters wrong data:
      2a1. Fill in the form again
  5a. System shows a registration state other than registered:
      5a1. User should check his account information and repeat the registration

Table 5.2: Use case 2

USE CASE 3: Adding Buddy
Actor: Registered user
Description: Adding a new contact to the contact list
Preconditions: Registered user must know the buddy data
Normal sequence:
  1. Registered user clicks on the New buddy button
  2. Registered user fills in the buddy form with the username, presentity URI and SIP number
  3. Registered user clicks on the Add buddy button
  4. System adds the new buddy to the contact list (uses Updating buddy)
Postconditions: Contact list information is updated
Extensions:
  1a. The actor is the system:
      1a1. System adds the new buddy internally (without filling in the form)
      1a2. Jump to step 4
  2a. Registered user enters only the username:
      2a1. It is impossible to identify the buddy and he has to fill in the form again
  2b. User enters wrong data:
      2b1. Fill in the form again
  4a. System does not include the new buddy in the contact list:
      4a1. Registered user should repeat the action

Table 5.3: Use case 3

USE CASE 4: Removing Buddy
Actor: Registered user
Description: Removing an existing buddy from the contact list
Preconditions: Registered user must have at least one buddy in the contact list
Normal sequence:
  1. Registered user writes in the main panel the SIP URI of the buddy that is going to be removed
  2. Registered user clicks on the Remove buddy button
  3. System removes the buddy from the contact list (uses Updating buddy)
Postconditions: Contact list information is updated
Extensions:
  1a. Registered user enters a wrong SIP URI:
      1a1. System is not able to identify which buddy must be removed and does nothing
      1a2. Registered user should modify the SIP URI and repeat the action
  1b. Registered user does not enter any data:
      1b1. System is not able to identify which buddy must be removed and does nothing
      1b2. Registered user must write a valid buddy's SIP URI
  1c. Registered user wants to remove an online buddy:
      1c1. Registered user can click on the online contact and the SIP URI appears in the panel (extends Clicking on buddy list)

Table 5.4: Use case 4

USE CASE 5: Starting videoconference
Actor: Registered user
Description: Establishing an audio/video call with another user
Preconditions: Registered user must know the SIP number or have a contact, have audio/video devices correctly connected and not have another videoconference started
Normal sequence:
  1. Registered user dials the SIP number (uses Dialling number)
  2. Registered user clicks on the Make a call button
  3. System rings a tone and shows caller ID
  4. System shows session state inprogress (uses Receiving session state message)
  5. System shows session state connected (uses Receiving session state message)
  6. System plays the received audio
  7. System shows video windows (sent video and received video)
Postconditions: There is a session established
Extensions:
  1a. Registered user has an online contact:
      1a1. Registered user clicks on an online contact and the SIP number appears in the panel (extends Clicking on buddy list)
  1b. Registered user enters wrong data:
      1b1. Registered user deletes data (extends Deleting number)
  5a. Session has not been accepted:
      5a1. System informs user with a session state message (uses Receiving session state message)
  5b. Registered user can send DTMF signals:
      5b1. Registered user clicks on number buttons
  6a. One of the audio devices does not work or is not properly connected:
      6a1. System may not be able to play audio
  7a. One of the video devices does not work or is not properly connected:
      7a1. System may not be able to play video

Table 5.5: Use case 5

USE CASE 6: Ending videoconference
Actor: Registered user
Description: Finishing an audio/video call with another user
Preconditions: A videoconference must have been started
Normal sequence:
  1. Registered user clicks on the End call button
  2. System shows session state disconnected (uses Receiving session state message)
  3. System stops playing audio
  4. System closes video windows
Postconditions: There is no session established
Extensions: -

Table 5.6: Use case 6

USE CASE 7: Receiving videoconference
Actor: Registered user
Description: Another user is trying to start a videoconference with the current user
Preconditions: Registered user must have audio/video devices correctly connected and not have another videoconference started
Normal sequence:
  1. System rings a tone and shows caller ID
  2. Registered user clicks on the Answer button
  3. System shows session state incoming (uses Receiving session state message)
  4. System shows session state connected (uses Receiving session state message)
  5. System plays the received audio
  6. System shows video windows (sent video and received video)
Postconditions: There is a session established
Extensions:
  2a. Registered user rejects the call and clicks on the End call button:
      2a1. System informs user with a session state message (uses Receiving session state message)
  4a. Session has not been established:
      4a1. System informs user with a session state message (uses Receiving session state message)
  4b. Registered user can send DTMF signals:
      4b1. Registered user clicks on number buttons
  5a. One of the audio devices does not work or is not properly connected:
      5a1. System may not be able to play audio
  6a. One of the video devices does not work or is not properly connected:
      6a1. System may not be able to play video

Table 5.7: Use case 7

USE CASE 8: Sending DTMF
Actor: Registered user
Description: Registered user can interact with a switching centre sending DTMF signals
Preconditions: A session must have been established
Normal sequence:
  1. Registered user clicks on the numeric buttons
  2. System shows the pressed digits on the panel
Postconditions: DTMF signals have been sent
Extensions: -

Table 5.8: Use case 8

USE CASE 9: Enabling auto answer
Actor: User
Description: The user can enable the auto answer option to accept incoming calls automatically
Preconditions: Starting the application
Normal sequence:
  1. User clicks on the auto answer label
Postconditions: Incoming calls will be accepted automatically
Extensions: -

Table 5.9: Use case 9

USE CASE 10: Changing microphone volume
Actor: User
Description: The user can turn up or turn down the volume level of the microphone
Preconditions: Having audio device correctly connected
Normal sequence:
  1. User changes the volume by moving the microphone trackbar
Postconditions: Microphone volume has been updated
Extensions: -

Table 5.10: Use case 10

USE CASE 11: Changing headset volume
Actor: User
Description: The user can turn up or turn down the volume level of the speaker/headset
Preconditions: Having audio device correctly connected
Normal sequence:
  1. User changes the volume by moving the speaker trackbar
Postconditions: Speaker/headset volume has been updated
Extensions: -

Table 5.11: Use case 11

USE CASE 12: Configuring AV
Actor: User
Description: The user can verify that the camera, speakers, and microphone are working properly.
Preconditions: Having audio/video devices correctly connected
Normal sequence:
  1. User clicks on the AV T. Wizard button
  2. System opens AV assistant
  3. User operates the AV assistant and configures every device
Postconditions: -
Extensions: -

Table 5.12: Use case 12

USE CASE 13: Clicking logo SIP UA
Actor: User
Description: Showing the user the main information about the software (author, where and when it has been developed...)
Preconditions: Running application
Normal sequence:
  1. User clicks on the SIP UA logo
  2. System opens a new window with the information
Postconditions: -
Extensions: -

Table 5.13: Use case 13

USE CASE 14: Closing application
Actor: User
Description: Closing application
Preconditions: Running application
Normal sequence:
  1. User clicks on the X button
  2. System closes the application window
Postconditions: -
Extensions: -

Table 5.14: Use case 14

USE CASE 15: Managing audio
Actor: System
Description: System manages audio devices to play or stop the received audio
Preconditions: A session has been started or finished
Normal sequence:
  1. System starts playing audio when the session is established
  2. System stops playing audio when the session is finished
Postconditions: -
Extensions:
  1a. One of the audio devices does not work or is not properly connected:
      1a1. System may not be able to play audio

Table 5.15: Use case 15

USE CASE 16: Managing video
Actor: System
Description: System manages video devices to play or stop the received video
Preconditions: A session has been started or finished
Normal sequence:
  1. System starts playing video and opens the sent video window and the received video window when the session is established
  2. System stops playing video and closes the video windows when the session is finished
Postconditions: -
Extensions:
  1a. One of the video devices does not work or is not properly connected:
      1a1. System may not be able to play video

Table 5.16: Use case 16

USE CASE 17: Ringing
Actor: System
Description: System rings a tone to warn the user about a videoconference started or received
Preconditions: A session has been started or received
Normal sequence:
  1. System rings a tone
Postconditions: Audio device playback has been started or stopped
Extensions: -

Table 5.17: Use case 17

USE CASE 18: Showing information
Actor: System
Description: System informs the user by writing some information in the main panel of the application
Preconditions: User has performed an action that requires information, such as pressing a numeric button to send a DTMF signal, or starting/receiving a videoconference call
Normal sequence:
  1. System writes information in the panel (numeric digits, caller ID)
Postconditions: User has been informed
Extensions: -

Table 5.18: Use case 18
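The use case descriptions above all share the same fields, so they can be held in a small data structure, for example to check that every extension refers to an existing step. The following Python sketch is only an illustration: the class name UseCase and its field names are assumptions chosen to mirror the table rows and are not part of the softphone.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UseCase:
    """One use case description; field names are hypothetical."""
    number: int
    name: str
    actor: str
    description: str
    preconditions: str = ""
    steps: List[str] = field(default_factory=list)
    postconditions: str = ""
    # Extension label (e.g. "2a") -> branching actions
    extensions: Dict[str, List[str]] = field(default_factory=dict)

    def alternatives_for(self, step: int) -> List[str]:
        """Return the extension labels that branch from the given step."""
        return [label for label in self.extensions if label[:-1] == str(step)]


# Use case 2 (Registering), abbreviated from Table 5.2.
registering = UseCase(
    number=2,
    name="Registering",
    actor="User",
    description="Entering personal data to register with a SIP server",
    steps=[
        "User clicks on the setting button",
        "User fills in the settings form",
        "User clicks on the login button",
        "System shows registration state registering",
        "System shows registration state registered",
        "System shows buddies in the contact list",
    ],
    extensions={
        "2a": ["Fill in the form again"],
        "5a": ["User checks the account information and repeats the registration"],
    },
)

if __name__ == "__main__":
    print(registering.alternatives_for(2))
```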

5.3 Sequence diagram

This diagram describes the exchange of messages between the objects during a normal execution of the application. It is recommended to study the final sequence diagram enclosed in Appendix A together with the following explanation.

The user runs the application and the graphical user interface (GUI) starts a sequence of actions or calls to different methods. The first step is the registration of the user. It is necessary to create a client object and to initialize and configure its parameters. This object needs a provisioning file that defines the user account information, such as name, password and URI, and the SIP settings, such as address and transport protocol. After enabling this provisioning file, the client object creates the profile object. The client object sets its presence information and then enables the profile object, which saves all the information related to the client.

The next step is updating the contact list. The GUI requires the information about the buddies and the watchers of the client. The client imports this information from a file and creates as many buddy and watcher objects as the file contains. The objects send information, like their current presence status, to the client, which updates its information. The GUI is then ready to show the updated contact list to the user.

When the user adds a new buddy, the GUI acts again as an intermediary and asks the client to create a new buddy. The operation is the same: the client object creates a buddy object, receives its information and updates it, and the GUI can add the new buddy to the contact list.

If the user dials a number by clicking on the number buttons, the GUI handles the action and writes the number in the panel. The user can also correct the numbers and write them again. Once the user wants to start a videoconference, the GUI sends the request to the client object, which creates a session object. While the object is being configured with the necessary parameters and security, the GUI rings a tone to inform the user that his request is being processed. The session object creates the participant object with the information that the client object has sent it and establishes the videoconference. The GUI manages the session and adds the needed tools, like video windows. When the user finishes the videoconference, the GUI transmits the request and finishes the existing session. The session and the participant objects are deleted.

Another possibility is that the system receives a request for a videoconference from another user. In that case, the GUI notifies the user with a phone ring and asks whether he wants to accept it. If the user accepts the incoming call and answers it, the GUI sends the request to the client object. In this case, the client object has to get the information of the incoming session to create a new session object. This session object also gets the parameters of the session and creates the participant object with the information received. A new conference has been established.

If the user wants to finish the videoconference, he has to follow the same steps that have been detailed before. On the other hand, the session may be finished by another participant of the videoconference. As a result, the GUI receives a notification that the current session has finished, so it informs the user and deletes the session and the participant objects.

The user can also change the volume of the audio devices (microphone and speakers). The GUI handles this request and sends it to the client object. The client object updates the volume values and the GUI shows them to the user.

Another functionality of the softphone is removing buddies from the contact list. The sequence of exchanged messages is very similar whether the user adds or removes a buddy, but the actions are completely different. When the user wants to remove a buddy, the GUI asks the client to delete the buddy. The client object removes this specific buddy object and updates its information about the buddies. Then, the GUI updates the contact list to show the user that the buddy has been correctly removed.

To finish, the user can close the application. The GUI informs the client object about this action, and the client object saves the buddies and watchers information in a file. Besides, the client object has to delete all the existing objects before being disabled. After that, the GUI closes the application.
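The object lifetimes described in this section can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the RTC Client API: the class and method names (Client, Session, Participant, start_session and so on) are hypothetical, the profile and the provisioning file are reduced to a flag, and saving to a file is reduced to returning a dictionary.

```python
from typing import Dict, List, Optional


class Participant:
    def __init__(self, uri: str) -> None:
        self.uri = uri


class Session:
    def __init__(self, peer_uri: str) -> None:
        # The session object creates the participant from the data it receives.
        self.participant = Participant(peer_uri)


class Client:
    """Hypothetical stand-in for the client object of the sequence diagram."""

    def __init__(self, user_uri: str) -> None:
        self.user_uri = user_uri
        self.registered = False
        self.buddies: Dict[str, str] = {}        # SIP URI -> display name
        self.session: Optional[Session] = None
        self.log: List[str] = []                 # order of the requests from the GUI

    def register(self) -> None:
        self.registered = True                   # stands in for enabling the profile
        self.log.append("register")

    def add_buddy(self, uri: str, name: str) -> None:
        self.buddies[uri] = name
        self.log.append("add_buddy")

    def remove_buddy(self, uri: str) -> None:
        self.buddies.pop(uri, None)              # unknown URI: nothing happens
        self.log.append("remove_buddy")

    def start_session(self, peer_uri: str) -> Session:
        if self.session is not None:
            raise RuntimeError("a videoconference is already in progress")
        self.session = Session(peer_uri)
        self.log.append("start_session")
        return self.session

    def end_session(self) -> None:
        self.session = None                      # session and participant are dropped
        self.log.append("end_session")

    def shutdown(self) -> Dict[str, str]:
        saved = dict(self.buddies)               # stands in for writing the file
        self.buddies.clear()
        self.session = None
        self.log.append("shutdown")
        return saved


client = Client("sip:alice@example.com")
client.register()
client.add_buddy("sip:bob@example.com", "Bob")
call = client.start_session("sip:bob@example.com")
client.end_session()
saved = client.shutdown()

if __name__ == "__main__":
    print(client.log)
    print(saved)
```

In the sketch, as in the description, the GUI never touches buddy, session or participant objects directly; every request goes through the client object.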

5.4 State diagram

This kind of diagram is useful to understand the behaviour of the most significant objects. Although the RTC Client API is based on the six main objects present in the sequence diagram, only four of them are explained with state diagrams. The state diagrams are presented in order of increasing complexity. It is recommended to study the different state diagrams enclosed in Appendix A together with the following explanations.

5.4.1 Buddy state diagram

The buddy object may be created as a result of two different events:

• A new client object has been initialized and has updated its contacts information

• A registered user has added a new buddy

In both cases, the current state of the buddy is BUDDY ADD. This state can change due to the following events:

• The buddy changes some data information. The server updates this information and informs the client. The buddy state becomes BUDDY UPDATED.

• The buddy changes its presence status. The buddy was offline when it was added and now is online, or vice versa. The buddy state becomes BUDDY STATE CHANGE.

• The buddy presence state becomes offline. The client refreshes the buddy information periodically to be informed of possible changes. The buddy state becomes BUDDY SUBSCRIBED.

• The buddy has been removed. The client object deletes the buddy directly. The buddy state becomes BUDDY REMOVED and then it is finished.

Regardless of the current state of the buddy object, it will be finished if the client shuts down. In that case, the client object stores all the information related to the buddy objects and deletes them.
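The buddy states above can be sketched as a small state machine in Python. The event names (data_changed, presence_changed, refreshed, removed) are hypothetical labels for the events described in the list, and the sketch assumes that any event may occur in any state except after removal, which the text does not state explicitly.

```python
from enum import Enum


class BuddyState(Enum):
    ADD = "BUDDY ADD"
    UPDATED = "BUDDY UPDATED"
    STATE_CHANGE = "BUDDY STATE CHANGE"
    SUBSCRIBED = "BUDDY SUBSCRIBED"
    REMOVED = "BUDDY REMOVED"


# Hypothetical event name -> resulting buddy state
EVENTS = {
    "data_changed": BuddyState.UPDATED,
    "presence_changed": BuddyState.STATE_CHANGE,
    "refreshed": BuddyState.SUBSCRIBED,
    "removed": BuddyState.REMOVED,
}


class Buddy:
    def __init__(self, uri: str) -> None:
        self.uri = uri
        self.state = BuddyState.ADD      # every new buddy starts in BUDDY ADD

    def handle(self, event: str) -> BuddyState:
        if self.state is BuddyState.REMOVED:
            raise RuntimeError("buddy has already been removed")
        self.state = EVENTS[event]
        return self.state


if __name__ == "__main__":
    buddy = Buddy("sip:bob@example.com")
    print(buddy.handle("presence_changed").value)
```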

5.4.2 Watcher state diagram

The watcher object is created when a buddy is added. At this moment, its state is WATCHER ADD. The watcher object always refers to a buddy object. Therefore, the watcher state becomes WATCHER UPDATED in the following cases:

• Buddy object changes its data information.

• Buddy object changes its presence status.

• Watcher object changes its data information.

Regardless of the current state of the watcher object, it will be finished if the client shuts down. In that case, the client object stores all the information related to the watcher objects and deletes them.

5.4.3 Session state diagram

The session object may be created because of two possible different events:

• The user starts a videoconference call

• The user receives a videoconference call

These two possibilities are mutually exclusive, so depending on the event, the session object will go through some states or others.

• Videoconference started

The session object is configured with the type of session and the preferred security level. Its current state is IDLE. When the participant in the videoconference is added, the session state becomes INPROGRESS. If the participant does not accept the call and the time expires, the session state becomes DISCONNECTED. On the other hand, if the participant does accept the call, the session state becomes CONNECTED. It will remain in this state until any of the participants in the videoconference decides to finish it. At that moment, the session state becomes DISCONNECTED.

• Videoconference received

The session state is INCOMING. If the videoconference is not accepted, the session state becomes DISCONNECTED. If the videoconference is accepted, the session object is configured with the received parameters for the type of session and the preferred security level, and gets the information related to the participant in the videoconference. If the auto answer option is enabled, the session state becomes CONNECTED automatically. If it is not, the session state becomes ANSWERING until the user clicks on the Answer button. At that moment, the session state changes to CONNECTED. On the other hand, if the user does not answer the call and the time expires, the session state becomes DISCONNECTED. Once the session state is CONNECTED, it will remain in this state until any of the participants in the videoconference decides to finish it. At that moment, the session state becomes DISCONNECTED.

Independently of the current state of the session object, it will be finished if the client shuts down, without saving any information about the session or the participants.
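The two paths through the session states can be summarized as a transition table. The Python sketch below is an illustration under stated assumptions: the state names come from the text, while the event names (participant_added, accepted, configured, answer, timeout, rejected, hangup) and the class Session are hypothetical and do not correspond to identifiers of the RTC Client API.

```python
from enum import Enum, auto


class SessionState(Enum):
    IDLE = auto()
    INCOMING = auto()
    ANSWERING = auto()
    INPROGRESS = auto()
    CONNECTED = auto()
    DISCONNECTED = auto()


S = SessionState

# (current state, hypothetical event) -> next state
TRANSITIONS = {
    (S.IDLE, "participant_added"): S.INPROGRESS,   # videoconference started
    (S.INPROGRESS, "accepted"): S.CONNECTED,
    (S.INPROGRESS, "timeout"): S.DISCONNECTED,
    (S.INCOMING, "rejected"): S.DISCONNECTED,      # videoconference received
    (S.INCOMING, "configured"): S.ANSWERING,
    (S.ANSWERING, "answer"): S.CONNECTED,
    (S.ANSWERING, "timeout"): S.DISCONNECTED,
    (S.CONNECTED, "hangup"): S.DISCONNECTED,
}


class Session:
    def __init__(self, incoming: bool = False, auto_answer: bool = False) -> None:
        self.state = S.INCOMING if incoming else S.IDLE
        self.auto_answer = auto_answer

    def fire(self, event: str) -> SessionState:
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"{event!r} is not valid in state {self.state.name}")
        self.state = TRANSITIONS[key]
        # With auto answer enabled, the wait in ANSWERING is skipped.
        if self.state is S.ANSWERING and self.auto_answer:
            self.state = S.CONNECTED
        return self.state


if __name__ == "__main__":
    call = Session(incoming=True)
    print(call.fire("configured").name, call.fire("answer").name)
```

Keeping the transitions in one table makes it easy to compare the sketch with the state diagram in Appendix A: every arrow in the diagram should correspond to exactly one entry.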

5.4.4 Client state diagram

The client object is created when a user runs the application. This is the main object in all the softphone because, in one way or another, the rest of the objects have a dependence on it. Its operation, functionalities and different states are more complex and complicated, so they have to be carefully analyzed. It should be clearly distinguished if the object state is REGISTERED or not because, depending on it, the user will be allowed to make some operations. At first, the client object is initialized and configured. Until the client object will be REGISTERED, it could only have these possibilities:

• The user can change the volume on the audio devices (microphone and speakers): The object client receives an event an updates the value.

Diplomarbeit Carla Garc´ıaS´anchez 5 UML Structure 77

• The user can close the application: The client object is shut down. Before destroy- ing the object, it has to storage its data information, including buddy and watcher information, and delete the other existing objects.

The next step is the registration of the client with the server. Depending on the result of this action, the client object will be in one of the following states:

• REGISTERED: the user is correctly registered with the server.

• ERROR: an error occurred while registering the user (server not available, incorrect data, ...).

• LOGGED OFF: the user has registered from another device with the same URI.

• NOT REGISTERED: the server has deregistered the user.

• REGISTERING: the SIP REGISTER message has been sent.

• REJECTED: the SIP REGISTER message has been rejected with a failure response.
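The states listed above, together with the handling that the text goes on to describe (automatic retry after ERROR, manual correction by the user otherwise), could be modelled roughly as follows. This is a hedged sketch in Python rather than the softphone's actual C# code; the action names returned by follow_up are hypothetical, and treating REGISTERING as "wait" is an assumption of this example.

```python
from enum import Enum


class RegistrationState(Enum):
    # State names as listed in the text.
    REGISTERED = "REGISTERED"
    ERROR = "ERROR"
    LOGGED_OFF = "LOGGED OFF"
    NOT_REGISTERED = "NOT REGISTERED"
    REGISTERING = "REGISTERING"
    REJECTED = "REJECTED"


def follow_up(state):
    """Map a registration state to a hypothetical follow-up action name."""
    if state is RegistrationState.REGISTERED:
        return "ready"        # calls, presence and buddies become available
    if state is RegistrationState.REGISTERING:
        return "wait"         # assumption: REGISTER sent, no result yet
    if state is RegistrationState.ERROR:
        return "retry_later"  # registration is repeated automatically
    # LOGGED OFF, NOT REGISTERED and REJECTED: the user has to check
    # the registration data and start the process again.
    return "ask_user"
```

Keeping the decision in a single function makes it easy to see that only ERROR triggers an automatic retry; every other unsuccessful result is handed back to the user.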

Regardless of the current state, the registration state is updated after every change. Furthermore, the user can restart the registration process at any time; in that case, the current client object is shut down and a new one is created. Although the normal sequence is REGISTERING -> REGISTERED, the client can also end up in the ERROR state. In this case, the registration is repeated automatically after a while. If the result is any state other than REGISTERED or ERROR, the user should check the registration data and repeat the process manually. When the client state is REGISTERED, the following events can also occur:

• The user starts or finishes a videoconference. The client object receives a media event and has to handle it. There are two kinds of media event: video sent and video received. For each one, the client object processes the event, which leads to one of three outcomes:

– Failed: the user is informed.

– Started: the video (sent or received) is captured and shown in a video window.

– Stopped: the existing video window (sent or received) is closed.

• User changes his presence information. The client object receives a presence status event and the new status is updated.

• User adds a new buddy. The client object creates a buddy object.

• Another client object has added this client object as a buddy. The client object creates a watcher object.

NOTE: For external reasons, such as a temporarily unavailable server, an exception may occur, and the objects may not be processed or may be terminated automatically.
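The handling of media events in the REGISTERED state (video sent or received; Failed, Started or Stopped) can be pictured with the following minimal Python sketch. It is only an illustration under stated assumptions: the class VideoWindows, its attributes and the string values used for direction and outcome are hypothetical and do not come from the RTC Client API or from the softphone's source code.

```python
class VideoWindows:
    """Tracks which video windows are open, as a stand-in for the GUI."""

    DIRECTIONS = ("sent", "received")

    def __init__(self):
        self.open = set()    # directions that currently have a window
        self.messages = []   # notifications shown to the user

    def on_media_event(self, direction, outcome):
        if direction not in self.DIRECTIONS:
            raise ValueError(f"unknown direction: {direction}")
        if outcome == "started":
            # The video is captured and shown in its own window.
            self.open.add(direction)
        elif outcome == "stopped":
            # The existing window for this direction is closed.
            self.open.discard(direction)
        elif outcome == "failed":
            # The user is informed; open windows are left untouched.
            self.messages.append(f"{direction} video failed")
        else:
            raise ValueError(f"unknown outcome: {outcome}")
```

Handling the sent and the received video independently mirrors the description above, where each of the two media events leads to its own Failed, Started or Stopped outcome.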

6 Getting Started

6.1 Software requirements

For this softphone to work correctly under Windows XP, the following software and components must be installed:

• Microsoft Visual Studio 2005 Professional Edition. This package includes Microsoft .NET Framework 2.0.

• Windows Real-Time Communications Client API SDK v1.2

6.2 Getting an account

Before the softphone can be used to call other users or to receive incoming calls, a SIP address is required. Getting a SIP account is easy: if the user does not have his own SIP server, there are many public SIP servers that can be used free of charge. Some examples are:

• http://iptel.org

• http://www.sipgate.co.uk

• http://www.freeworlddialup.com

It is important to realise that not all servers have the same capabilities or offer the same services. The user should therefore find out which server provides what he requires. Furthermore, some public servers charge for an account or for certain services. A SIP account consists of a username, a password and a SIP server address. In some cases there is also a SIP number, which is the number assigned for incoming calls. If there is no SIP number, calls are made using the username.
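The parts of a SIP account described above can be gathered in a small data structure. The following Python sketch is illustrative only: the class SipAccount and its method address are hypothetical names, and the rule that the SIP number, when present, takes the place of the username in the address is an interpretation of the text rather than a documented server behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SipAccount:
    username: str
    password: str
    server: str                       # SIP server address, e.g. "example.org"
    sip_number: Optional[str] = None  # only provided by some servers

    def address(self) -> str:
        """Return the SIP URI under which this account can be called."""
        # Assumption: use the SIP number if there is one, else the username.
        user = self.sip_number or self.username
        return f"sip:{user}@{self.server}"
```

With the made-up values SipAccount("alice", "secret", "example.org"), the method returns sip:alice@example.org; adding sip_number="1234567" changes the result to sip:1234567@example.org.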

6.3 Description of Graphical User Interface

The primary objective of the user interface is to make the tasks offered by the softphone easy to carry out, in a way that is completely transparent to the end user. The interface acts as an intermediary between the client, the SIP server and other clients. It is composed of a main form, voipUAform, and three secondary forms: aboutprogramform, settingsform and addbuddyform. Their operation, their use and the way they interact with each other are detailed below. In the main form, we can distinguish the following controls:

• SIP UA logo: opens the aboutprogramform.

• Panel:

– Auto answer label: enables the automatic answering of every incoming call.

– Top panel side: shows information about the registration state of the user.

– Central panel: shows information about the outgoing/incoming calls and callers.

– Bottom panel side: shows the call status (in progress, connected, ...) or the exceptions that occur in the program.

• Options:

– Make a call: starts a videoconference call with the SIP URI that has previously been entered in the panel. If the video devices are correctly attached and the call state is connected, the user will see two video windows (sent video and received video).

– End call: ends an incoming or an existing videoconference.

– Answer: accepts an incoming videoconference so that it can be established.

– AV T. Wizard: opens the assistant for audio/video adjustment, which helps the user to verify that the camera, speakers and microphone are working properly.

– Setting: opens the settingsform.

– New buddy: opens the addbuddyform.

– Remove buddy: deletes the buddy whose Presentity URI or SIP Number is written in the main panel.

• Keyboard:

– Numeric buttons and at sign: enter the pressed digit or character in the panel.

– Backspace: deletes the last character that has been entered in the panel.

• Audio options:

– Microphone/Speaker volume: the trackbars allow the user to turn the microphone and speaker volume up or down.

• Contact list:

– Online contacts: shows, in alphabetical order, the usernames of all the contacts of the current user whose presence status is online. The user can click on one of them to display its SIP URI / SIP Number in the panel and make a call.

– Offline contacts: shows, in alphabetical order, the usernames of all the contacts of the current user whose presence status is offline.
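The way the contact list is split into online and offline contacts, each in alphabetical order, can be sketched as follows. This Python fragment is only an illustration: the function split_contacts and the representation of a contact as a (username, is_online) pair are assumptions of this example, as is the choice to sort without regard to upper and lower case.

```python
def split_contacts(contacts):
    """Split (username, is_online) pairs into two sorted username lists."""
    online = [name for name, is_online in contacts if is_online]
    offline = [name for name, is_online in contacts if not is_online]
    # Assumption: alphabetical order ignores upper/lower case.
    return sorted(online, key=str.lower), sorted(offline, key=str.lower)
```

In the GUI described above, the first list would fill the online contacts box and the second the offline contacts box; both have to be rebuilt whenever a presence status changes.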

The settingsform allows the user to register with the SIP server. The user must enter all the data of his SIP account correctly (Username, Password and SIP Server) and click on the login button. The aboutprogramform shows information about the softphone: its name, the date and place of its development, and the author.

Appendix

A UML Diagrams

A.1 Class diagram

A.2 Use case diagram

A.3 Sequence diagram

A.4 Buddy state diagram

A.5 Watcher state diagram

A.6 Session state diagram

A.7 Client state diagram

[Class diagram of the namespace simple_UA_in_CS. Classes shown: voipUA (main form), RTCConst (RTC event filter, media type and error status constants), aboutprogram, settingsform, addBuddyForm, regkey, parseruri and Info, together with the external class AlphaBlendTextBox.]

Figure A.1: Class diagram

[Use case diagram of the SIP UA. Actors: User and System / GUI. Use cases: Running application, Registering, Receiving registration state message, Adding buddy, Removing buddy, Updating buddy, Dialling number, Deleting number, Clicking on buddy list, Starting videoconference, Receiving videoconference, Ringing, Receiving session state message, Managing audio, Managing video windows, Ending videoconference, Configurating AV, Sending DTMF, Showing information, Enabling auto answer, Changing microphone volume, Changing headset volume, Clicking on logo SIP UA, Receiving softphone information, Closing application, Saving data, Deleting data. The use cases are linked by «uses» and «extends» relationships.]

Figure A.2: Use case diagram

[Sequence diagram with the lifelines User and System - GUI and the objects created during operation (Buddy, Session, Participant). Phases, in order: start application and register user (initialize client, configuration, create and enable provisioning, create and enable profile, define and enable presence, create buddies and watchers, update and show contact list); add new buddy; dial number (write, correct, delete); start videoconference (create session, add security, create participant, ring tone, establish videoconference); finish videoconference (finish session, delete objects); incoming videoconference (ring phone, answer phone, get session, get security, get participant, establish and finish videoconference); change device volume; remove buddy (delete, update information and contact list); close application (save buddies and watchers information, delete objects, disable, end).]

Figure A.3: Sequence diagram

[Buddy state diagram. States: BUDDY ADD, BUDDY SUBSCRIBED, BUDDY UPDATED, BUDDY STATE CHANGE, BUDDY REMOVED. Transitions: New buddy added, Buddy refreshed, Data changed, Presence changed, Buddy removed, Client shut down.]

Figure A.4: Buddy state diagram

[Watcher state diagram. States: WATCHER ADD, WATCHER UPDATED. Transitions: Buddy added, Data changed, Client shut down.]

Figure A.5: Watcher state diagram

[Session state diagram. States: IDLE, INCOMING, INPROGRESS, ANSWERING, CONNECTED, DISCONNECTED. Transitions and decisions: Videoconference started, Videoconference received, Configured, Session accepted, Session not accepted, Configuration handled, Participant handled, Participant added, Auto answer enabled (Yes / No), Answered, Not answered, Accepted, Not accepted, Videoconference finished, Client shut down.]

Figure A.6: Session state diagram

[Client state diagram. Elements: Initialized and configured, Registration tried, Registration state updated, New registered, REGISTERED, ERROR, Other state, Volume changed, Volume on audio devices updated, Media event received, Video sent/received (Failed: User informed; Started: Video captured and shown; Stopped: Video window closed), Presence status changed, Presence information updated, New buddy added, New watcher added, Shut down, Client data saved.]

Figure A.7: Client state diagram

List of Figures

1.1 A typical SIP network with gateways
1.2 SIP clients and servers
1.3 Example of SIP INVITE
1.4 Example of SIP REGISTER
1.5 Example of a SIP extension: SUBSCRIBE - NOTIFY
1.6 SIP registration procedure
1.7 Multimedia SIP session establishment

2.1 RTC Client COM Objects
2.2 Overview of the Common Language Infrastructure

3.1 Sending DTMF signals
3.2 Videoconference
3.3 Example of SUBSCRIBE - NOTIFY
3.4 Videoconference with encryption of the media

A.1 Class diagram
A.2 Use case diagram
A.3 Sequence diagram
A.4 Buddy state diagram
A.5 Watcher state diagram
A.6 Session state diagram
A.7 Client state diagram

List of Tables

1.1 VoIP Clients running on different OS
1.2 VoIP Clients for mobile devices

3.1 DTMF frequencies

5.1 Use case 1
5.2 Use case 2
5.3 Use case 3
5.4 Use case 4
5.5 Use case 5
5.6 Use case 6
5.7 Use case 7
5.8 Use case 8
5.9 Use case 9
5.10 Use case 10
5.11 Use case 11
5.12 Use case 12
5.13 Use case 13
5.14 Use case 14
5.15 Use case 15
5.16 Use case 16
5.17 Use case 17
5.18 Use case 18

List of Abbreviations and Symbols

AES: Advanced Encryption Standard
API: Application Programming Interface
ARM: Advanced RISC Machine
BCL: Base Class Library
CBC: Cipher Block Chaining
CIL: Common Intermediate Language
CLI: Common Language Infrastructure
CLR: Common Language Runtime
CLS: Common Language Specification
COM: Component Object Model
CTS: Common Type System
DES: Data Encryption Standard
DTMF: Dual Tone Multi Frequency
GNU: GNU’s Not Unix
GPL: General Public License
GUI: Graphical User Interface
HTML: Hypertext Markup Language
Hz: Hertz
ID: Identifier
IGD: Internet Gateway Device
IM: Instant Messaging
Inc.: Incorporated
IP: Internet Protocol
IPSec: Internet Protocol Security
ISA: Information Source Adapter
IVR: Interactive Voice Response
J2ME: Java 2 Micro Edition
JIT: Just-In-Time
LGPL: Lesser General Public License
MIDP: Mobile Information Device Profile
MPL: Mozilla Public License
MSIL: Microsoft Intermediate Language
NAT: Network Address Translator
NOS: Nokia Operating System
OMG: Object Management Group
OS: Operating System
OSE: Operating System Embedded
PC: Personal Computer
PDA: Personal Digital Assistant
PINT: PSTN INTernet
PSTN: Public Switched Telephone Network
QoS: Quality of Service
RAM: Random Access Memory
RFC: Request For Comments
RIM: Research In Motion
RISC: Reduced Instruction Set Computer
ROM: Read Only Memory
RR: Receiver Report
RTC: Real Time Communications
RTCP: Real-time Transport Control Protocol
RTP: Real-time Transport Protocol
SDK: Software Development Kit
SDP: Session Description Protocol
SER: SIP Express Router
SIP: Session Initiation Protocol
SR: Sender Report
SRTP: Secure Real-time Transport Protocol
SSRC: Synchronization Source
STUN: Simple Traversal of UDP through NATs
TCP: Transmission Control Protocol
TLS: Transport Layer Security
UA: User Agent
UAC: User Agent Client
UAS: User Agent Server
UDP: User Datagram Protocol
UML: Unified Modeling Language
UPnP: Universal Plug and Play
URI: Uniform Resource Identifier
VES: Virtual Execution System
VoIP: Voice over Internet Protocol
VPN: Virtual Private Network
XML: EXtensible Markup Language
XOR: eXclusive OR

Thesis of Diplomarbeit

1. VoIP is already a widespread technology in PC-to-PC transmissions and is now becoming even more popular in mobile communications because of its many advantages, especially cost reduction and mobility.

2. Softphones are software implementations that allow VoIP communications to be established in a way that is transparent to the end user. The softphone used in this project is based on SIP and works in conjunction with RTP and SDP to provide some extra features.

3. SIP is the signalling protocol responsible for the establishment, modification, management and termination of real-time multimedia sessions.

4. The further development of this softphone offers more facilities to the end user, such as adjusting audio settings during a conversation, sending DTMF to interact with interactive voice response systems such as voice mail, establishing videoconference calls, managing an address book with online and offline contacts, and encrypting the media to secure the communications.

5. The Microsoft RTC Client API has been used to develop the softphone because it provides the necessary structure and interfaces to implement VoIP communications, along with many extra capabilities. Besides the functionality listed above, it also allows other features to be implemented, such as instant messaging or application sharing.

Ilmenau, 14. 06. 2007 Carla García Sánchez

Declaration

I have prepared this thesis independently and without using any sources other than those indicated. All passages taken verbatim or in substance from published sources are marked as such. This thesis has not previously been submitted, in the same or a similar form or in part, for any other examination.

Ilmenau, 14. 06. 2007 Carla García Sánchez
