Voice-over-IP Security: Enhancing the ZRTP Protocol by Proposing an Additional Human Authentication Approach

Remon Shawki Fahim, Prof. Dr. Mohammed M. Kouta, and Prof. Dr. Ali Moharam, Member, IEEE

Abstract: Since the emergence of the Internet Protocol (IP) as a universal network infrastructure available on both local and public networks, such as the public Internet, securing VoIP networks has been a challenging task due to the publicly accessible nature of these networks. In this paper we present a structured analysis of one VoIP security protocol, ZRTP, which was created by Phil Zimmermann, the creator of the well-known Pretty Good Privacy (PGP). The paper covers the potential threats to the authentication of this protocol and proposes a solution intended to decrease the probability of these threats: a voice verification system that verifies the caller's certificate and the caller's voice print using VoiceXML technology. This human authentication system is proposed as an additional layer on top of the standard security procedures that the ZRTP protocol uses for authentication.

Keywords: AES, Diffie-Hellman exchange, Human Authentication, Private Key, Public Key, Voice Biometric, VoiceXML, ZRTP.

The first author is with the Arab Academy for Science and Maritime Transport, Egypt (phone: +20-123862677; e-mail: [email protected]). The second author is with the Arab Academy for Science and Maritime Transport, Egypt (phone: +20101684016; e-mail: [email protected]).

I. INTRODUCTION

The voice-over-Internet protocol (VoIP) is vulnerable to numerous security issues that affect IP-based data networks, such as unauthorized access by a third party who intercepts the call (eavesdropping), and fraud in which an attacker fakes an identity and pretends to be someone the victim already knows. In this respect it is like any technology that transfers data or voice packets over a compromised network.

One of the primary methods of securing voice over IP is encryption. The ZRTP protocol [1] creates a secured channel between two endpoints by encrypting the voice packets in real time; the encryption/decryption key is a shared key that the two parties establish using the Diffie-Hellman key exchange. ZRTP uses the Advanced Encryption Standard (AES) block cipher with a key length of 128 or 256 bits to encrypt the call sessions [1]. The signaling protocol used to establish, negotiate (discovery phase), and terminate calls between parties is the Session Initiation Protocol (SIP).

II. VOICE OVER IP DEFINITIONS AND PROTOCOLS

Voice over IP: The Voice over Internet Protocol is a technology for routing voice packets through the Internet or other packet-switched networks. VoIP can be considered an alternative to traditional PSTN phone lines that provides cheaper and clearer voice service.

VoIP technology uses the following protocols:

a. The Real-time Transport Protocol (RTP) is used to transmit voice and video packets between the two communicating endpoints over IP networks.

b. The RTP Control Protocol (RTCP) is used in conjunction with RTP; within an RTP session, RTCP is used to check the quality of the service provided by RTP.

c. The Secure Real-time Transport Protocol (SRTP) is used to protect the privacy of calls against eavesdroppers. SRTP provides encryption, authentication, and integrity of the voice or video packets during transmission. Security protocols such as ZRTP may be used as an extension to RTP in order to create an SRTP session.

d. SIP and H.323 are used before the packets are transmitted (call initiation). They locate the remote device and carry out the negotiation that sets up both devices to establish the communication channel; this process is referred to as the "call signaling process". Other protocols that can also be used for this process are RAS, DNS, TRIP, and ENUM.

e. H.248 and MGCP are called device control protocols. Their purpose is to manage the gateways that connect traditional PSTN telephone networks to IP-based computer networks; in the VoIP area, a gateway is a device that offers an IP interface on one side and some form of legacy PSTN switch interface on the other.


f. The International Telecommunication Union (ITU) is working on a new signaling protocol with more capabilities than SIP and H.323. Its purpose is to enable voice, video, and data communication across separate devices such as mobile phones and HDTVs.

g. The Internet is a datagram network, so to transmit voice packets from one network node to another, the real-time packets have to be encapsulated in a datagram protocol. The User Datagram Protocol (UDP) is used for this purpose.

III. VOIP PROCESS

According to the National Institute of Standards and Technology (NIST) SP 800-58 document [3], once the call is answered the voice is converted to digital form and segmented into a stream of packets for transmission. This is done in the following steps:

a. The digitized voice requires a large number of bits, so a compression algorithm is used to reduce the volume of data to be transmitted.

b. Since the data has to be transmitted in real time, RTP is used. The voice samples are inserted into RTP packets to be carried over the Internet, because RTP packets hold the information needed to reassemble the segmented packets into voice signals at the other end.

c. The RTP voice packets are carried as payload by UDP, which can be handled by ordinary data transmission on ordinary networks. The term "payload" refers to the actual data, as distinct from the header information that describes the data on telecommunication or computer networks.

The whole process is reversed at the other end (the receiver node): the packets are disassembled and rearranged into the correct order using the information held in the RTP header fields. The digitized data are then converted back to a continuous analog voice signal using a digital-to-analog converter (DAC).

IV. SESSION INITIATION PROTOCOL (SIP)

SIP, the Session Initiation Protocol, is a signaling protocol located at the application layer of the five-layer TCP/IP model. SIP is used for initiating, modifying, and terminating sessions between two or more participants; it is therefore the signaling protocol used in IP telephony to create calls between two parties or among multiple parties (conferences). SIP is popular because of the following benefits:

a. It is a lightweight and efficient protocol [4].

b. It is a text-based protocol, which makes it convenient for users, much like HTTP (Hypertext Transfer Protocol) [4].

c. It is easy to implement, which makes it suitable for small companies.

SIP was defined in RFC 2543 and later in RFC 3261, which obsoletes RFC 2543 [8]. A SIP session may look like the following figure:

Fig. 1 The SIP session format [17]

Since SIP is a text-based protocol, an INVITE may look like the following [8]:

    INVITE sip:[email protected] SIP/2.0
    Via: SIP/2.0/UDP pc33.atlanta.com;branch=z9hG4bK776asdhd
    Max-Forwards: 70
    To: Bob
    From: Alice ;tag=1928301774
    Call-ID: [email protected]
    CSeq: 314159 INVITE
    Contact:
    Content-Type: application/sdp
    Content-Length: 142

The main idea behind SIP is that the caller sends an invitation message to the receiver to establish a session and waits for an acceptance from the receiver, called the 200 OK. Once the session has been accepted and acknowledged, the multimedia session is created and transmission of the voice and video packets starts. This can be done directly, peer to peer, or through a third party such as a server that holds registration information from both the sender and the receiver. SIP also works with gateways that connect the Internet to traditional networks such as the PSTN, in order to establish a call between a classic phone and a VoIP node, which can be a VoIP telephone or a computer.

V. ZRTP PROTOCOL

ZRTP is defined in the Internet draft submitted to the IETF [1] as a "key agreement protocol which performs diffie Hellman key exchange during call setup in band in the Real Time Transport Stream which has been established by using other Signaling protocol called the SIP protocol". The purpose of the ZRTP protocol is to initiate a

secured, encrypted channel between two parties; in other words, it creates a secured session in which the data are transferred by SRTP, the Secure Real-time Transport Protocol. The key agreement used by this protocol is the Diffie-Hellman key agreement defined in RFC 2631, June 1999 [5]. The Diffie-Hellman (D-H) key exchange is a key agreement algorithm used by two parties to agree on a shared secret.

The Diffie-Hellman key exchange between user A and user B can be described as follows [6]. The calculation involves two kinds of numbers: public numbers, which are shared openly over the network, and private numbers, which are known only to the party that generates them.

a. First, a large prime number q is selected, together with a primitive root of q, an integer denoted by α. A primitive root α is an integer whose powers generate all the integers from 1 to q-1; that is, the values

$\alpha \bmod q,\ \alpha^{2} \bmod q,\ \ldots,\ \alpha^{q-1} \bmod q$ [6]

are distinct and cover every integer from 1 to q-1. The number q and the primitive root α are global public elements used by both A and B to calculate keys.

b. Next, user A selects a private random number $X_A$ and uses it to calculate

$Y_A = \alpha^{X_A} \bmod q$ [6]

where $Y_A$ is the public value that is exchanged openly with the other side and later used to compute the shared secret for the secured channel. $X_A$ is kept private, so an attacker cannot compute the shared secret even when $Y_A$ is known.

c. User B makes the same calculation with a private random number $X_B$:

$Y_B = \alpha^{X_B} \bmod q$ [6]

d. Users A and B exchange $Y_A$ and $Y_B$, and each calculates the shared secret key:

$K_A = Y_B^{X_A} \bmod q$ [6]

$K_B = Y_A^{X_B} \bmod q$ [6]

e. Since the public values $Y_A$ and $Y_B$ have been exchanged for the calculation, $K_A$ and $K_B$ are equal (the shared key): both reduce to $\alpha^{X_A X_B} \bmod q$.

The above scenario is represented in the following figure:

Fig. 2 Diffie-Hellman key exchange [6]

VI. ZRTP PROTOCOL SCENARIO

The ZRTP protocol scenario can generally be described by the following steps [9]:

1. Session Establishment: The RTP session is established using SIP signaling that is secured, for example by TLS, which is used to establish a secured channel over the Internet and to protect communication against eavesdropping and forgery.

To establish the secured channel, and before the RTP session starts, the SIP signaling protocol passes the signaling secret (sigs). The signaling secret is the hash of the concatenation of the Call-ID, the To tag, and the From tag:

sigs = H(call-id | to-tag | from-tag).

The SRTP session is then started. The SRTP secret (srtps) is also provided by the signaling protocol; it is the hash of the SRTP master key and master salt, computed after the two sides exchange keys using the Diffie-Hellman method:

srtps = H(SRTP-mkey | SRTP-msalt).

Note that the signaling secret (sigs) also depends on the Diffie-Hellman key exchange step.

2. Key Agreement: Key agreement has four stages. It starts with the discovery phase, in which the caller finds out what the other side supports, such as whether it supports ZRTP and which types of encryption it offers. The next stage is the commitment step, in which the information gathered in the previous step is used as a commitment, or agreement, between the two sides; it is sent in a message called the commit message.

The information gathered during the discovery step and used in the commitment step covers the supported ZRTP versions, hash functions, ciphers, authentication tag length, key agreement type, and SAS algorithm.

The third stage is the Diffie-Hellman exchange that is necessary for establishing the SRTP session. In this step each side generates its private and public numbers (a Diffie-Hellman key pair), and the keys are exchanged

between the two sides by the method described above in the Diffie-Hellman discussion (Section V); the keys are exchanged in order to calculate the shared secrets.

The last stage is switching to the SRTP session, which is set up using the master keys and salts generated by both sides. The cipher used to encrypt the SRTP session is the Advanced Encryption Standard (AES) with a 128-bit or 256-bit key.

In general, to perform the key agreement step, both sides have to perform the following:

a. Discovery phase: The messages exchanged in this phase are called Hello messages, to each of which the other party replies with a Hello acknowledgement, and the ZID of each side is exchanged. The ZRTP identifier (ZID) is a 96-bit random number generated once, during installation of the ZRTP instance. The ZID is used to retrieve the retained secrets rs1 and rs2 that should already be available from previous sessions.

b. Hash commitment: The commitment is sent by the initiator of the call. Based on the discovery phase, the initiator chooses the hash function, cipher, authentication tag length, and key agreement type supported by both sides; this choice is sent in what is called a commit message. Since ZRTP establishes the call session by exchanging the generated Diffie-Hellman public values (contained in the DHPart1 and DHPart2 messages) between the parties, in this step the sender, referred to as the initiator, generates a key pair consisting of a public value (pv) and a private, or secret, value (sv):

$pv_{\text{initiator}} = \alpha^{sv_{\text{initiator}}} \bmod q$ [9]

where α and q are determined by the key agreement type value [9].

The initiator can then calculate its hash value, the hash of its DHPart2 message concatenated with the receiver's Hello message:

hv (initiator) = H(initiator's DHPart2 | receiver's Hello message)

Once the receiver gets the initiator's commit message, it also generates its own pair of private and public values and gets ready to exchange values in order to create the shared secret key.

c. Diffie-Hellman key exchange: The shared secret, referred to as s0, is generated by hashing the concatenation of the Diffie-Hellman result (DHResult, written dhr) with the message hash and further secrets. DHResult is the final shared Diffie-Hellman key that each side calculates from the public value received from the other side. In short, the key exchange step computes the following:

mh = H(receiver's Hello | Commit | DHPart1 | DHPart2)
s0 = H(dhr | mh | s1 | s2 | s3 | s4 | s5)

where mh is the message hash: the hash of the concatenation of the receiver's Hello message, the Commit message sent by the sender, and the messages carrying the public values generated by the sender and the receiver.

Because the value of s0 depends on the order of the retained shared secrets, further steps are required to make sure that the receiver orders its retained secrets in the same way as the sender. This matters because the receiver and the sender may hold different secrets, or no secrets at all.

d. Switch to SRTP and confirmation: The value of s0 is used to initiate the SRTP session by calculating the SRTP master keys and salts with an HMAC (Hash-based Message Authentication Code) function, written HM below:

SRTP-mkey (initiator/sender) = HM (s0; "Initiator SRTP master key")
SRTP-msalt (initiator/sender) = HM (s0; "Initiator SRTP master salt")
SRTP-mkey (responder/receiver) = HM (s0; "Responder SRTP master key")
SRTP-msalt (responder/receiver) = HM (s0; "Responder SRTP master salt")

These HMACs generate the keys needed for SRTP. For the ZRTP keys, the sender and the receiver first calculate the following HMAC keys:

Hmackey (initiator/sender) = HM (s0; "Initiator HMAC key")
Hmackey (responder/receiver) = HM (s0; "Responder HMAC key")
rs0 = HM (s0; "retained secret")

Finally, the ZRTP keys of the sender and the receiver are calculated:

ZRTP-key (initiator/sender) = HM (s0; "Initiator ZRTP Key")
ZRTP-key (responder/receiver) = HM (s0; "Responder ZRTP Key")

In addition, the SAS (Short Authentication String) is calculated at this point.

Confirmation takes place when both sides issue the Confirm1 and Confirm2 messages. These messages are exchanged to confirm that the whole key agreement procedure and the encryption are working, which enables automatic detection of a man-in-the-middle attack. Another reason for exchanging the confirmation messages is to allow SRTP-encrypted transmission of the SAS verified flag.
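The hash commitment, the Diffie-Hellman exchange, and the s0 calculation described in steps b and c can be sketched as follows. This is a minimal illustration under stated assumptions, not the ZRTP wire format: the tiny group (q = 23, α = 5), the use of SHA-256 for H, the placeholder message bytes, and all function names are choices made for this example only, and the secrets s1 to s5 are assumed to be empty.

```python
import hashlib
import secrets

Q = 23      # toy prime modulus (assumption; real groups are thousands of bits long)
ALPHA = 5   # a primitive root of 23


def H(*parts):
    """Hash of the concatenation of the byte strings (SHA-256 is an assumption)."""
    return hashlib.sha256(b"".join(parts)).digest()


def make_key_pair():
    """Return (sv, pv) with pv = ALPHA^sv mod Q."""
    sv = secrets.randbelow(Q - 2) + 1          # secret value in 1 .. Q-2
    return sv, pow(ALPHA, sv, Q)


def toy_key_agreement():
    """Run both sides of the exchange and return (s0_initiator, s0_responder)."""
    hello_r = b"Hello from responder"          # placeholder for the responder's Hello

    sv_i, pv_i = make_key_pair()               # initiator's key pair
    sv_r, pv_r = make_key_pair()               # responder's key pair
    dhpart1 = b"DHPart1|" + str(pv_r).encode() # carries the responder's public value
    dhpart2 = b"DHPart2|" + str(pv_i).encode() # carries the initiator's public value

    # Hash commitment: the initiator commits to DHPart2 before revealing it.
    hv_i = H(dhpart2, hello_r)
    commit = b"Commit|" + hv_i

    # When DHPart2 arrives, the responder checks it against the commitment.
    if H(dhpart2, hello_r) != hv_i:
        raise ValueError("DHPart2 does not match the hash commitment")

    # Each side combines its own secret value with the peer's public value.
    dhr_i = pow(pv_r, sv_i, Q)
    dhr_r = pow(pv_i, sv_r, Q)

    mh = H(hello_r, commit, dhpart1, dhpart2)  # message hash
    others = [b""] * 5                         # s1..s5 assumed empty in this sketch
    s0_i = H(str(dhr_i).encode(), mh, *others)
    s0_r = H(str(dhr_r).encode(), mh, *others)
    return s0_i, s0_r


s0_initiator, s0_responder = toy_key_agreement()
print(s0_initiator == s0_responder)            # both sides derive the same s0
```

Neither secret value ever leaves its owner; only the public values and the commitment travel over the network, which is what lets two parties with no prior relationship arrive at the same s0.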

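The HMAC-based derivation of the SRTP and ZRTP keys from s0 in step d can be sketched as below. It is a hedged illustration rather than a reference implementation: SHA-256 as the underlying hash, the dictionary layout, and the names `HM` and `derive_session_keys` are assumptions for this example, and the stand-in s0 is not the result of a real key agreement. Each label is paired with the role it names (initiator with "Initiator ...", responder with "Responder ...").

```python
import hashlib
import hmac


def HM(s0, label):
    """HMAC keyed with s0 over an ASCII label (SHA-256 is an assumption)."""
    return hmac.new(s0, label.encode("ascii"), hashlib.sha256).digest()


def derive_session_keys(s0):
    """Derive every key listed in step d from the shared secret s0."""
    labels = {
        "srtp_mkey_initiator": "Initiator SRTP master key",
        "srtp_msalt_initiator": "Initiator SRTP master salt",
        "srtp_mkey_responder": "Responder SRTP master key",
        "srtp_msalt_responder": "Responder SRTP master salt",
        "hmackey_initiator": "Initiator HMAC key",
        "hmackey_responder": "Responder HMAC key",
        "zrtp_key_initiator": "Initiator ZRTP Key",
        "zrtp_key_responder": "Responder ZRTP Key",
        "rs0": "retained secret",
    }
    return {name: HM(s0, label) for name, label in labels.items()}


# Stand-in shared secret for demonstration only.
example_s0 = hashlib.sha256(b"example shared secret").digest()
keys = derive_session_keys(example_s0)
for name, value in keys.items():
    print(name, value.hex()[:16])
```

Because every key is derived from s0 with a distinct label, both sides obtain the same set of keys without sending any of them over the network, and each direction of the call gets its own key.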

VII. ZRTP PROTOCOL AND MAN IN THE MIDDLE ATTACK

A man-in-the-middle (MITM) attack can be defined as "a computer security breach in which a malicious user intercepts and possibly alters data traveling along a network" [10]. It is therefore considered an active attack. In VoIP systems, such as SIP-based systems, the purpose of a man-in-the-middle attack is to intercept traffic between the caller's and the called party's user agents in order to collect signaling and media; a malicious user agent can also use the attack to control the route of a call and redirect it to a wrong destination.

ZRTP allows detection of the man-in-the-middle attack by displaying a string called the Short Authentication String (SAS). This string must be the same on both sides; the sender and the receiver have to check verbally whether the SAS matches. The SAS is calculated from the shared secrets common to the sender and the receiver, so if the SAS differs between the caller and the called party, a man-in-the-middle attack is taking place.

The shared secrets between the two sides of the call are cached so that they can be used for authentication in later sessions between them. These shared secrets are retained under the ZID associated with each ZRTP instance (that is, with each side of the call).

VIII. PROBLEM DEFINITION

As mentioned above, the ZRTP shared key is calculated from the shared Diffie-Hellman secret s0. Diffie-Hellman allows shared secrets to be established between any two sides, even without prior knowledge of each other.

Although ZRTP can detect a man-in-the-middle attack by letting the users check whether the authentication string matches on both sides, there is still a chance that a man-in-the-middle attack against ZRTP will succeed.

A paper by Prateek Gupta (VMware, Inc.) and Vitaly Shmatikov (University of Texas at Austin) [11] shows a man-in-the-middle attack on ZRTP. The attacker can convince one of the communicating parties that its peer has lost the shared secrets or ZID values, and can therefore initiate a session with a party A under the guise of someone A knows and has already established sessions with. This is possible because the ZID values are not authenticated as part of the ZRTP protocol. During session establishment, when party A tries to confirm the retained values with the attacker, the attacker can choose values that fail to match. Party A then concludes that the other party has lost the previously established shared secrets and, following the protocol, starts establishing new values with the attacker.

Another issue that can increase the chances of a man-in-the-middle attack is the use of VoIP devices without displays. In this case SAS-based human authentication becomes useless, because the communicating parties cannot see the calculated SAS value and therefore cannot confirm it with each other. Stopping the communication is not a solution here, because that would simply amount to a denial of service. This is why an additional authentication by voice is needed.

Another attack found on ZRTP is a denial-of-service (DoS) attack: the attacker can exhaust memory resources by sending fake Hello messages to the endpoints. In response, the endpoints create incomplete connections, because none of these messages actually starts a session; eventually memory is exhausted and further session requests are refused, even from legitimate users [11].

Other man-in-the-middle attacks can succeed where the SAS is unable to detect them, such as the voice forgery attack, the Bill Clinton attack, the six-month attack, and the court reporter attack [9].

IX. PROPOSED HUMAN AUTHENTICATION SYSTEM

Intensive authentication may decrease the chances of a security breach, especially when the authentication system is able to authenticate the identity of the users as human beings.

From the authentication point of view, the proposed system consists of two major components:

a. Authentication using the user's voice print, which is saved on a third-party server, using VoiceXML technology.

b. Authentication using the user's certificate, issued by a trusted party such as VeriSign.

X. THE PROPOSED HUMAN AUTHENTICATION SYSTEM SCENARIO

The proposed user authentication system depends mainly on two third-party servers. The first is the certificate verification server, which authenticates the identity of the user by requesting a valid certificate from the user. The second is the VXML (VoiceXML) server, which authenticates the user by his or her voice print.

The VoiceXML server depends on the certificate verification server: the user cannot access the VoiceXML server without a valid certificate.

A user who wants to be officially recognized by the authentication system (the applicant) is required to visit the authentication system's website and go through the following registration steps.

Registration Phase 1: Register with the certificate verification server

a. The new user is required to send his or her certificate (public key), issued by a trusted party such as VeriSign. A valid certificate should contain the user's name and an email address that serves as the point of contact between the user and the authentication system. The certificate is then checked by the authentication system's administrator; a human

administrator who is authorized by the system checks whether the certificate is valid, using the certificate issuer's gatekeeper.

b. If the certificate is valid, the new applicant's certificate data are registered in the authentication system's database.

c. In response, an encrypted email message is sent to the applicant's email address given in the valid certificate. The message is encrypted with the applicant's certificate (public key) to protect it against unauthorized access.

d. The applicant must decrypt the email message using the private key of his or her certificate. This message contains the authentication ticket, which consists of the following: the ID, the PIN, and the PIN codes of the VXML server's enrollment and verification applications.

Registration Phase 2: Register with the VoiceXML server

The email message sent by the authentication system to the user contains the PIN code of the enrollment application and of the verification application. The user enters this PIN code verbally to access one of these applications. Before an application can be used, it must be hosted on a VoiceXML server; the BeVocal Café server [12] is used for this purpose.

1. Enrollment: Using the enrollment application PIN code sent to the valid user, the user logs in with his or her valid ID to the VoiceXML server's database and saves a voice sample.

2. Verification: Using the verification application PIN code, the user logs in to the VoiceXML database through the verification application with his or her valid ID in order to verify the voice print. The voice print is taken verbally by having the user repeat a number; this number changes randomly every session, to make it fairly hard for an attacker to tamper with the voice print.

Verification Phase: To achieve the functionality of this authentication system, the user's identity is verified in terms of voice print and certificate by verifying the user's verification code, or ticket. A valid ticket allows the user to obtain his or her unique code.

Each registered user has a unique code. During the call session, the called party is assumed to be already logged in to the system's database with his or her ID and PIN code, and the caller must say this code to the called party before starting the conversation. The called party takes the caller's code and matches it against the authentication system's database of trusted users' codes, in order to get a trusted answer about who exactly is calling.

The user's unique code is variable: there is a new code every day, and each code is valid for only one day. The user must therefore log in and verify his or her voice sample every day, which prevents attacks on long-term IDs.

The unique code is granted only to a valid user. A valid user must have a verification ticket, and this ticket consists of two major parts, the fixed part and the variable part.

The fixed part of the ticket is the user's ID and PIN code, which are sent to the email address given in the user's certificate. The ID and PIN code are required for every login transaction on both the certificate server and the VoiceXML server.

The variable part of the ticket is called the VXML server pass code. This pass code is used together with the fixed part of the ticket to obtain the registered user's unique code. The pass code is available on the VoiceXML server; the user must log in with the user ID and a voice sample (voice print) to get it. This is a daily process, because the pass code is also variable, which means that the user must verify his or her voice print every day.

In some applications it may not be necessary to make the pass code variable; the user can log in to the VoiceXML server only once and get a permanent pass code. On the other hand, it is very important to change the unique code every day, to make sure that the user logs in and confirms his or her identity daily.

XI. TECHNICAL DIFFICULTIES

We encountered a technical problem of insufficient security due to the unavailability of encrypted phone lines on the BeVocal server. The phone calls between the users and the enrollment or verification applications are not encrypted, which makes them vulnerable to interception attacks. Although there is only a small chance that an attacker could intercept or steal private data, it is better for the server administrators to consider using an encryption protocol such as ZRTP (especially for SIP/VoIP calls) to achieve an acceptable level of privacy.

XII. CONCLUSION

Verifying the user through a biometric, a certificate, or both (human authentication) is still a convincing method of authentication and a good way of identifying the participating parties almost accurately.

Integrating certificate-based and voice-biometric authentication components with machine security and authentication protocols such as ZRTP will increase the ability of these protocols to achieve a very good level of reliability, trustworthiness, and privacy; in other words, it helps to achieve what is expected from such protocols. Our proposed authentication protocol can therefore be used with any VoIP security protocol, for example the MIKEY protocol or any other key establishment protocol that may appear in the future.

In this paper the authentication system has been specially designed to fit the nature of the Diffie-Hellman key exchange, which exchanges the keys of the encrypted sessions between the two communicating parties without prior knowledge of each other.


The voice verification by the third-party voice biometric server has been done using VoiceXML technology. The code was introduced in Dr. Dobb's Journal [2]; we have modified the code by adding a different way of giving feedback, to fulfill our primary goal of connecting the VoiceXML server with the certificate authentication server.

The implementation of the ZRTP protocol and its SDK tools are available on Phil Zimmermann's Zfone project website [16].

Testing ZRTP in conjunction with the proposed authentication system has shown a real improvement in the authentication part, and the protocol is now reliable enough to keep calls private.

Implementation: the following tools and technologies have been used for the implementation of our proposed protocol:

a. The PHP language, used for creating the web-based authentication and verification [15].

b. An SQL database server, used for creating the user database that is used for authentication and verification.

c. The OpenSSL tool, used for creating the certificate for test purposes, and also for encryption and decryption.

d. The BeVocal VoiceXML server [12], for hosting, inserting, and verifying the user's voice biometrics.

XIII. FUTURE WORK

The proposed authentication system has been developed and tested with the ZRTP protocol's implementation as two separate components. We suggest a new hybrid implementation that combines the ZRTP protocol libraries with the authentication system; the user would have to register first with a valid certificate and add his or her voice print before using the system.

We also suggest registering machine session details. The pre-shared secrets that are stored from previous sessions and used by the ZRTP protocol to identify the machine could also be registered and stored on the certificate verification server of our proposed authentication system, which would then act as a secure third-party reference for subsequent calls. A similar solution is suggested by Riccardo Bresciani in the conclusion section of his paper [9].

XIV. REFERENCES

[1] P. Zimmermann; ZRTP: Media Path Key Agreement for Secure RTP, draft-zimmermann-avt-zrtp-06, 2008.

[2] Jonathan Erickson; Voice Biometrics & Application Security, Dr. Dobb's Journal, www.ddj.com/security/184405193, 2002.

[3] D. Richard Kuhn, Thomas J. Walsh, Steffen Fries; Security Considerations for Voice Over IP Systems, National Institute of Standards and Technology (NIST), Special Publication 800-58, 2005.

[4] Hans Nilsson; A SIP-ISDN Gateway; CS Lab by Ericsson Utvecklings AB, [email protected], 1999.

[5] E. Rescorla; Diffie-Hellman Key Agreement Method, Request for Comments 2631, IETF (draft-ietf-smime-x942), 1999.

[6] William Stallings; Cryptography and Network Security: Principles and Practice, Prentice Hall, ISBN-13: 978-0130914293, 2002.

[7] Dalgic and Fang; "Comparison of H.323 and SIP for IP Telephony Signaling", presented at the Photonics East Conference, 1999.

[8] H. Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, R. Sparks, M. Handley, E. Schooler; SIP: Session Initiation Protocol, Request for Comments 3261, Dynamicsoft, Columbia U., Ericsson, WorldCom, Neustar, ICIR, AT&T, 2002.

[9] Riccardo Bresciani; The ZRTP Protocol: Security Considerations, uni.boris-web.net/pub/ZRTP_Protocol-Security_Considerations.pdf, 2007.

[10] Definition of man-in-the-middle, webpage dated 2002-03-26, retrieved 2002-09, http://www.wordspy.com/words/maninthemiddleattack.asp.

[11] P. Gupta, V. Shmatikov; "Security Analysis of Voice-over-IP Protocols," in Proceedings of the 20th IEEE Computer Security Foundations Symposium (CSF), Venice, Italy, July 2007.

[12] Bevocal; Voice XML applications hosting server; café.bevocal.com.

[13] The OpenSSL tool; www.openssl.org.

[14] MySQL; the open source database tool, www.mysql.com.

[15] PHP: Hypertext Preprocessor, www.php.net.

[16] Phil Zimmermann; the ZRTP protocol website, www.zfoneproject.com.

[17] IEEE Explorer Image;
