Order number: 2011telb0179

Under the seal of the Université européenne de Bretagne

Télécom Bretagne

In joint accreditation with the Université de Bretagne-Sud

Doctoral School: Sicma

VoIP-based Framework for the Integration of Open-source and Proprietary Solutions

Doctoral Thesis, Specialization: Computer Science

Presented by Ahmad Hammoud

Department: Computer Science
Thesis director: Serge Garlatti
Defended on 11/07/2011

Jury:
Patrick BELLOT, Professor, Telecom ParisTech (Reviewer)
Julien BOURGEOIS, Professor, Université Franche Comté (Reviewer)
Flavio OQUENDO, Professor, Université Bretagne Sud (Examiner)
Bouabib EL OUAHIDI, Professor, Université Mohammed V Agdal (Examiner)
Daniel BOURGET, Associate Professor (Maître de Conférences), Telecom Bretagne (Examiner)
Serge GARLATTI, Professor, Telecom Bretagne (Examiner)

Plagiarism Policy Compliance Statement

I certify that I have read and understood Telecom Bretagne’s Plagiarism Policy. I understand that failure to comply with this Policy can lead to academic and disciplinary actions against me.

This work is substantially my own, and to the extent that any part of this work is not my own I have indicated that by acknowledging its sources.

Name: Ahmad Hammoud

Signature: Date: 1/07/2011

I grant to “Telecom Bretagne” the right to use this work for the University’s own purpose without cost to the University or its students and employees. I further agree that the University may reproduce and provide single copies of the work to the public for the cost of reproduction.

To my family: Rola, Youssof, Houssam, and Abboodi.

To Dr. Daniel BOURGET, for his guidance and supervision.

To Prof. Annie GRAVEY, for her orientation and support.

Acknowledgements

It is a great pleasure for me to acknowledge the assistance, mention the inspirations, and appreciate the contributions of many professionals who have generously provided their help. First of all, I would like to thank my supervisor, Dr. Daniel BOURGET of Telecom Bretagne, for his supervision and guidance. His assistance, encouragement, and in-depth comments were invaluable.

I also thank Prof. Annie GRAVEY for her continuous orientation. I express my sincere gratitude to her, appreciate her contribution, acknowledge her support and recognize the value of her expertise.

In particular, I wish to thank Mrs. Samira Al Ghour for her considerable help. I would like to acknowledge her efforts, her advice, and all the time she devoted.

Last but not least, I gratefully acknowledge the support of the students of Global University | Beirut. They helped a great deal in analyzing users’ requirements, testing the code, implementing the solutions, training the users, and filling in the questionnaires.

Abstract

Voice over IP, known as VoIP, is a new but mature and promising technology. VoIP provides a way to communicate over any IP network, whether the internet, a WAN, a LAN, a WLAN, or any combination of those. Integrating a VoIP solution with the other types of solutions already used by network users will widen the range of possible uses. For example, a typical network might include a mail server, an internet server, a primary domain controller, a database server, a file server, a web server, and many other types of servers. Is there a chance for network users to have all of those computer-based functionalities provided over the phone? Can they use the phone to check their email, send replies, open a document, query a database, or even surf the web? Can the PBX be used to help protect the LAN? Can we implement our own encryption techniques to encrypt phone calls? How can the education field benefit from VoIP? This thesis answers those questions and examines different scenarios where access through the phone can enhance existing systems.

Using the phone to access services that are usually accessed with visual tools is particularly important for blind or visually impaired people, for people who cannot afford visual tools such as notebooks and smartphones, and for people who lack the skills to use a computer. To achieve the targeted integration, there must be a DSL (Domain-Specific Language) that facilitates the development process followed by VoIP application developers. Such a high-level language will hide implementation details, allowing developers to focus on users’ requirements rather than on the details of integrating the PBX with a wide range of existing technologies.

Summary (Résumé en Français)

A. Keywords: VoIP, Voice over IP, Asterisk, Integration, Interception, Encryption, Interoperability, Wiretapping and Encryption, Collaborative Learning.

B. Introduction

Internet services are limited to personal computers, PDAs, and smartphones. On the one hand, these services are not suited to visually impaired people, who should be able to reach them through other devices. On the other hand, the community of Internet users represents only one third of the total number of telephone users worldwide. Since VoIP technology provides a way to carry voice over an IP network, integrating it with the rest of the network applications offers an exceptional advantage.

Would it be possible to extend access to network services to all users of Phase 1 GSM phones? Can they use this type of phone to answer their emails, search for or consult a document, query a database, or even browse the web? Can a PBX be used to secure a local network? Can we implement our own encryption algorithms to encrypt phone calls? How can the education field benefit from VoIP technology? This thesis answers these questions and studies the different scenarios where telephone access could improve existing systems. It also proposes several methods for achieving the integration, and offers a new solution for browsing the web using an algorithm inspired by the way the human eye works.

Concretely, this integration will allow Phase 1 GSM phone users to interact with a large number of servers, such as:

• mail servers (MS Exchange Server)
• database servers (Oracle, MySQL, or SQL Server)
• unified communications servers, such as MS OCS
• MS Active Directory servers

This thesis proposes several integration scenarios in which the phone replaces the computer and the ear replaces the eye. If this replacement succeeds, developers will focus on users' needs rather than on low-level integration details. In addition, users will benefit from using voice commands rather than a keyboard and mouse.

Several technologies, such as VXML and SALT, have tried to provide a simple way to interact with users. Although these technologies simplify interaction with voice applications, they require developers to get into the details of the integration. This thesis introduces a mechanism that simplifies the task of voice application developers so that they can easily integrate their applications with a wide collection of servers.

This thesis also presents an entirely new approach that allows Phase 1 GSM phone users to browse the web. The approach is to develop an algorithm that acts like the human eye. When this algorithm analyzes a web page, it can quickly report the parts of the page that would have attracted the eye's attention most. With this information, the caller can more easily decide which URL to visit next. This behavior, inspired by the biological functioning of the eye, ensures that the algorithm lets callers interact with the web page efficiently. Other solutions proposed translating HTML into VXML; however, such a translation is not always relevant.

C. Motivation and Need

The most natural way to communicate is speech. This is why speech is considered the most practical and elegant means of conveying information (Eidsvik, 2001). Cooper (2004) states that speech is also the oldest means of communication.

If speech is used, computers need to recognize and generate speech (Bradnum, 2004). While speech is the preferred method of communication for humans, this is not the case for machines (Datamonitor, 2003). This is why VoIP products need to be integrated with a wide collection of solutions, servers, and technologies. The main objective is to make all current technologies accessible through speech.

The topic of integration is not new. For decades, integration has been regarded as a major problem for engineers. Integrating different products is an unavoidable necessity, and in the communications field the need is becoming urgent. We believe that integrating the various technologies with VoIP products would offer an exceptional advantage. A typical user relies on a computer to, among other things, check email, query a database, open a document, fill in a form, browse the web, take part in a forum or blog, and monitor the status of certain servers on the local network.

However, a problem arises when using a computer is not an option, either because none is available or because the user is unable to use one. Both obstacles can be overcome by using a telephone, which offers several advantages over the computer. The first advantage is improved accessibility: phone users will be able to carry out all of these tasks by phone, from anywhere, without needing a computer. Second, users with impaired vision will be able to interact fully with the computer through the phone. Third, the number of telephone users worldwide is much larger than the number of Internet users. This means that a solution offered only through a computer will serve a relatively small community. That is not the case when the same solution is accessible by phone, because a much larger community will benefit from it.

According to a study by Eidsvik, there are more than 1.5 billion telephones and about 500 million mobile phone users in the world (Eidsvik, 2001). This community of two billion users is large enough to justify designing special solutions to meet its needs, and these numbers grow day after day. Moreover, the telephone helps visually impaired and blind people (Orubeondo, 2001). The telephone will also be useful to those who are not familiar with computers and to those who avoid any technology more modern than the telephone (Regruto, 2003).

Consequently, there is a real need to allow users to access computer services by telephone.

Table 1.1 shows the number of mobile phones in the world together with the corresponding percentage of the population. Table 1.2 shows some statistics on Internet users worldwide. Together, the two tables show that the share of the world's population that uses the telephone is much higher than the share that uses the internet. Thus, any solution that is accessible only through a computer will not be used by more than half of the world's population; the rest of the population cannot access the solution, because they use telephones. Table 1.1 shows that nearly 68 percent of people in the world have a mobile phone. That is already a majority, even before adding the percentage of those who use fixed-line telephones. Table 1.2 shows that only 29 percent of people are Internet users. This large gap calls for a solution that lets telephone users carry out all the tasks and access all the services available to computer users.

Table 1.1 - Mobile Phones in Use Worldwide

Number of mobile phones    Percentage of population    Date
4,600,000,000              67.6%                       2009

Table 1.2 - Internet Users Worldwide

Internet users             Percentage of population    Date
1,966,514,816              28.7%                       2010

According to Larson (2004), the number of users who can use their phones to access information on the internet has increased. This is because phones are small, light, and generally much less expensive. Long battery life is another factor that makes the phone more portable than computers (Orubeondo, 2001).

D. Contributions

Some of the proposals presented in this thesis can be considered complementary to earlier work. Others present improvements, implementations, and, in rare cases, radical changes.

In general, what is presented in this thesis differs from what has been presented in recent articles. The main difference is that each article concentrates on solving one integration problem, whereas this thesis introduces a mechanism for integrating VoIP with almost every service available on the network. The thesis addresses the general nature of the problem rather than specific solutions to specific integration problems. All the integration scenarios presented in this thesis lead to one clear conclusion: the need to develop a domain-specific language (DSL) that gathers all the code presented here into a single program library. This DSL will allow voice application developers to produce applications that can interact with most services on the network.

D.1 VoIP and Typical Services

All the scenarios in this category deal with integrating the services of a typical network with VoIP technology so that they can be reached by telephone. This category includes the following five sub-categories:

D.1.1 Mail Servers

This integration scenario presents different ways of integrating a VoIP solution with a mail server so that callers can reach it by telephone. The integration of VoIP solutions with mail servers has already been examined and implemented in VoIP research. Singh et al. discuss "the integration of VoiceXML with SIP services" [Singh et al, 2003]. In their article, they suggest an approach for integrating the telephone with VoIP solutions. They state that using VoiceXML and SIP will lead to easier integration with email, instant messaging, the Web, and the telephone. They even created a model called "Email by phone". We believe that their work solves a single problem, whereas our model is more general. Moreover, they developed a new browser to achieve the necessary integration, whereas we rely on ordinary web pages to solve the same problem.

CINEMA (Columbia INternet Extensible Multimedia Architecture) is another example of the trend toward integrating messaging and other systems with VoIP technology. CINEMA is briefly explained in the next section, "Unified Communications Servers".

D.1.2 Unified Communications Servers

Singh et al. describe "the integration of VoiceXML with SIP services" [Singh et al, 2003]. In their article, they suggest a new browser that enhances the services of the CINEMA architecture. CINEMA is implemented as software that offers several VoIP-related features [Jiang et al, 2002]. These features are listed below:

• CINEMA provides interoperability with the PSTN, voicemail over IP, and programmable telephony services.
• CINEMA provides interoperability with email and web access for unified communication.
• CINEMA supports multimedia conferencing.
• CINEMA allows new services and features to be added, notably instant messaging and control of network devices.

Although we believe that CINEMA can be considered a good solution, we think that its architecture is not easy to use. CINEMA involves a large number of servers (proxy server, database server, web server, and SIP translator). Our proposal is simple and uses the same integration mechanism to link VoIP technology with all the other services. CINEMA concentrates mainly on solving communication integration problems.

D.1.3 Database Servers

An example in this sub-category is a scenario where callers can access databases to retrieve and/or modify data. Decision makers usually rely on their technical staff to formulate database queries. There should be an interface, as simple as the telephone system, through which business executives can run a query to retrieve data [Hendrix et al. 1978].

Speech is a natural means of communication for people [Lefebvre et al. 1993]. Most users prefer giving verbal instructions to pressing a button or typing. In their article on database access via spoken language interfaces, Dybkjær et al. state that natural-language dialogue makes databases easier to use for people other than expert users [Dybkjær et al, 1995].

The solution suggested by Dybkjær et al. focuses on accessing a database through spoken-language dialogue. It uses many servers, which makes the solution very specific, sophisticated, and difficult to produce. The approach presented in this thesis is simple to use and easy to implement.

D.1.4 File Servers and Office Tools

In order to present a complete integration framework that covers all the services in a typical network, we included two additional scenarios that integrate VoIP technology with file servers and Office tools. The goal of integrating VoIP technology with the file server is to make it possible to use the phone to access the hard disk of a Windows or Linux server. In this way, phone users will be able to explore their hard disks and search them. Once the file is found, they could open it with the associated application, such as Word or Excel, fill in a few fields, and send the file back by email as an attachment.

D.2 VoIP Technology and Security Services

D.2.1 Encryption Services

In this sub-category, we present a model in which a VoIP stream is encrypted using a dictionary. The dictionary is a two-column table. The first column contains the numbers from 0 to 255. The other column contains unique keys, each of which corresponds to a single byte. When the two parties talk on the phone, the digital media is recorded as a series of bytes. Each byte is replaced by an "ID", which is sent to the other party. When the encrypted byte is received, the original byte is recovered using the same dictionary.
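The dictionary lookup described above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation used in this thesis: the IDs are assumed to be one byte each, the shared dictionary is assumed to be a seeded shuffle of the values 0 to 255, and the names `make_dictionary`, `invert`, `encrypt`, and `decrypt` are hypothetical.

```python
import random


def make_dictionary(seed: int) -> bytes:
    """Return a 256-entry table: index = original byte, value = its unique ID.

    Assumption: both parties derive the same table from a shared seed.
    random.Random is used only for illustration; it is not a
    cryptographically secure generator.
    """
    ids = list(range(256))
    random.Random(seed).shuffle(ids)
    return bytes(ids)


def invert(dictionary: bytes) -> bytes:
    """Build the reverse table that maps each ID back to its original byte."""
    inverse = bytearray(256)
    for original, ident in enumerate(dictionary):
        inverse[ident] = original
    return bytes(inverse)


def encrypt(frame: bytes, dictionary: bytes) -> bytes:
    """Replace every byte of a media frame by its ID (one table read per byte)."""
    return frame.translate(dictionary)


def decrypt(frame: bytes, inverse: bytes) -> bytes:
    """Recover the original bytes with the reverse table."""
    return frame.translate(inverse)


dictionary = make_dictionary(seed=2011)
inverse = invert(dictionary)
frame = bytes([0, 17, 128, 255, 17])  # stand-in for a short audio frame
assert decrypt(encrypt(frame, dictionary), inverse) == frame
```

Both directions are a single table read per byte, which is why a lookup of this kind is cheap enough to apply to each frame of a live call.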

Many researchers have used a dictionary to achieve different goals. Govindan and Mohan use a dictionary for data encryption and compression. They proposed intelligent text encryption and compression for fast and secure data transmission over the Internet [Govindan & Mohan, 2004]. They developed a strategy called IDBE (Intelligent Dictionary Based Encoding) to achieve their goal.

The approach followed by Govindan & Mohan concentrates on encrypting text files only. What we have done is encrypt voice media, which is considerably more complex.

Another piece of research, by [Prasanna et Dandalis, 2000], uses the same concept for data compression that was suggested by Nelson, M. (1996) in The Data Compression Book. They presented several approaches to dictionary-based encoding. We draw on what was presented in their article to show the efficiency of our proposal. According to them, decompression is a simple lookup operation: only memory reads are required, so decompression and decryption are fast.

Many other researchers have used the concept of substitution to perform encryption, but they did so to encrypt and decrypt files. Execution time was not a major concern there because no real-time communication was involved. In VoIP, the response must be delivered in real time. We could perform encryption and decryption using a dictionary in less than one second, which is an acceptable latency in VoIP technology.

D.2.2 Using 'RBE' to Detect Suspicious Calls

In this sub-category, we implemented the integration between a VoIP solution and an RBE (Rule-Based Engine). Our aim is to have the RBE keep the PBX under surveillance. Whenever a suspicious event occurs, appropriate measures are taken. In this way, the PBX is constantly monitored and there is no need to check the CDR (Call Detail Records) file to detect misuse.
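As a rough illustration of rule-based monitoring, the following Python sketch evaluates each call event against a small rule set as it arrives, instead of scanning call records afterwards. Everything in it is an assumption made for illustration: the `CallEvent` fields, the two example rules, and the `evaluate` function are hypothetical and are not taken from the RBE used in this thesis.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class CallEvent:
    caller: str        # extension that placed the call
    destination: str   # dialed number
    hour: int          # local hour (0-23) at which the call started
    duration_s: int    # call duration in seconds


@dataclass
class Rule:
    name: str
    matches: Callable[[CallEvent], bool]


# Hypothetical rules; a real deployment would define its own policy.
RULES: Sequence[Rule] = (
    Rule("international call outside office hours",
         lambda e: e.destination.startswith("00") and not 8 <= e.hour < 18),
    Rule("call longer than two hours",
         lambda e: e.duration_s > 2 * 3600),
)


def evaluate(event: CallEvent, rules: Sequence[Rule] = RULES) -> List[str]:
    """Return the names of all rules that flag this event as suspicious."""
    return [rule.name for rule in rules if rule.matches(event)]


night_call = CallEvent(caller="101", destination="0033123456", hour=23, duration_s=60)
for alert in evaluate(night_call):
    print("suspicious:", alert)
```

Keeping the rules as data rather than hard-coded checks is what allows one engine to be swapped for another without touching the PBX side.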

Researchers have worked on a similar integration. They did so with a simple architecture and a proprietary PBX, in a different context [Ong & Cing, 2004]. We believe that our architecture is better because it makes use of a web service that can be called synchronously or asynchronously. The web service can be regarded as a separate layer that can be replaced by any other web service, allowing a different RBE to be used. Our architecture comprises five servers: Asterisk, a web server, web services, the RBE, and the database server. In this way, the load is well balanced and better performance is obtained.

D.2.3 Additional Security Services

Three other security-related topics were discussed in order to provide a complete collection of integrated services. These topics are:

1. Intrusion Prevention
2. Directory Services
3. Monitoring Services

Features similar to the one covered in the first topic can be found in the CINEMA architecture introduced earlier. The other two topics helped us reach a higher level of security. When misuse is detected, the PBX automatically notifies the parties concerned about the violation. Such a solution has the advantage of using a PBX that immediately calls the person in charge, which is better than sending an email, an ineffective way to signal urgent problems.

D.3 VoIP Technology and Web Services

D.3.1 Eye-like Algorithm

The main idea presented in this section is to develop an algorithm that works like the human eye. In this way, phone users will be able to browse the Web by telephone as efficiently as with a visual browser. Making the web available to phone users is not an entirely new idea. According to Rosenberg, intranet applications and websites can be presented to a telephone user [Rosenberg, 2001].

Earlier proposals presented programs that read the content of the page to callers. However, typical web readers are not efficient, because the user may wait several minutes before reaching the interesting part. Our algorithm summarizes the page for the caller, just as the human eye would.
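To make the idea of summarizing a page by its most eye-catching parts more concrete, here is a minimal Python sketch. It is not the algorithm developed in this thesis: it assumes, purely for illustration, that visual prominence can be approximated by the HTML tag enclosing a piece of text, and the `WEIGHTS` table, the `SalienceParser` class, and the `summarize` function are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical salience weights: larger means "more likely to catch the eye".
WEIGHTS = {"h1": 4, "h2": 3, "h3": 2, "strong": 1, "b": 1}


class SalienceParser(HTMLParser):
    """Collect text found inside prominent tags, together with a weight."""

    def __init__(self):
        super().__init__()
        self._open = []       # prominent tags that are currently open
        self.fragments = []   # (weight, position, text)

    def handle_starttag(self, tag, attrs):
        if tag in WEIGHTS:
            self._open.append(tag)

    def handle_endtag(self, tag):
        if tag in self._open:
            # Close this tag and anything left open inside it.
            while self._open and self._open.pop() != tag:
                pass

    def handle_data(self, data):
        text = " ".join(data.split())
        if text and self._open:
            weight = max(WEIGHTS[t] for t in self._open)
            self.fragments.append((weight, len(self.fragments), text))


def summarize(html: str, limit: int = 3) -> list:
    """Return up to `limit` fragments, most prominent first, ties in page order."""
    parser = SalienceParser()
    parser.feed(html)
    parser.close()
    ranked = sorted(parser.fragments, key=lambda f: (-f[0], f[1]))
    return [text for _, _, text in ranked[:limit]]


page = ("<body><h1>Top story</h1><p>Long body text <b>key fact</b> more.</p>"
        "<h2>Sports</h2><h3>Local</h3></body>")
for line in summarize(page):
    print(line)  # these lines would be handed to a text-to-speech engine
```

A caller would hear the headings first and could then ask for one section, instead of waiting for the whole page to be read from top to bottom.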

In earlier research, annotations were suggested as a method for marking the important sections of pages [Hori et al., 1999]. Similar suggestions were made by [Asakawa & Takagi, 2000] and [Shao et al., 2002]. We believe that annotations cannot be considered a workable solution, because they have to be added manually to the HTML file. These proposals require radical changes and will therefore not be accepted by the Internet community.

VoiceXML and similar technologies have made voice applications easier to develop, but they require developers to produce a separate voice application for each website. This duplication of effort leads to needless repetition. It also requires developers to keep the voice applications up to date with the content of the website. Our proposal allows callers to interact with the same website that ordinary Internet users visit.

D.3.2 Academic Audio Access “A3”

This proposal applies VoIP technology to the field of education. We suggested using a VoIP solution to provide students with an audio forum. This solution has been used in several schools to facilitate collaboration.

In an article on the architecture of interactive voice services for e-learning, Motiwalla proposed a similar solution [Motiwalla, 2009]. Although that proposal has some similarities with ours, there are several differences. First, Motiwalla's proposal concentrates on higher education, whereas our proposal concentrates on elementary schools. Second, Motiwalla's proposal aims to offer an effective solution for visually impaired students, whereas our proposal concentrates on young students who lack the skills needed to use a computer.

Table 1.3 summarizes the contributions presented in this thesis.

Table 1.3 - Contributions

Domain         Solution                                   Contribution
Integration    Rich collection of integrated services     Code
Development    DSL                                        Proposed model
Security       Encryption                                 2 publications
Security       Intrusion prevention model                 Code
Security       Asterisk + 'RBE'                           Publication
Web            Eye-like algorithm                         Model, publication
Education      A3                                         Code

D.4 Domain-Specific Language

The integration techniques, the models, and their implementations presented in this thesis are intended as an introduction to further work in the field of VoIP integration. VoiceXML, SALT, VoicePHP, and other solutions are regarded as one step toward achieving integration. A further step is needed, however, to let VoIP developers use a high-level development language. VoiceXML and similar technologies help developers concentrate on the required functionality rather than on how it must be developed.

Although these technologies reduce the amount of code needed to develop voice applications, programmers still face many obstacles. It is not easy to use the existing integration solutions to integrate VoIP technology with:

• MS Active Directory

• MS Office Communications Server

• MS Exchange Server

• MS Word

• MS Excel

• RBE

• File servers

We suggest using the techniques and programs proposed in this thesis to define a domain-specific language (DSL) for VoIP. Voice application programmers will use this language to develop their applications. They will no longer need, for example, to understand the details of Exchange Server, of email retrieval, or of the authentication process. They will have at their disposal a set of commands specific to Exchange Server. Using this small set of instructions, developers will have full access to emails. The code presented in this thesis can help a great deal and will serve as a basis for such a high-level language.

Table of Contents

Plagiarism Policy Compliance Statement
Acknowledgements
Abstract
Summary (Résumé en Français)
A. Keywords
B. Introduction
C. Motivation and Need
D. Contributions
D.1 VoIP and Typical Services
D.1.1 Mail Servers
D.1.2 Unified Communications Servers
D.1.3 Database Servers
D.1.4 File Servers and Office Tools
D.2 VoIP Technology and Security Services
D.2.1 Encryption Services
D.2.2 Using 'RBE' to Detect Suspicious Calls
D.2.3 Additional Security Services
D.3 VoIP Technology and Web Services
D.3.1 Eye-like Algorithm
D.3.2 Academic Audio Access “A3”
D.4 Domain-Specific Language
Table of Contents
List of Figures
List of Tables
Preface
PART 1 - INTRODUCTION
1.1 Thesis Objective
1.2 Thesis Contributions
1.2.1 VoIP and Typical Services
1.2.1.1 Mail Servers

1.2.1.2 UC Servers
1.2.1.3 DBMS Servers
1.2.1.4 File Servers & Office Tools
1.2.2 VoIP & Security Services
1.2.2.1 Encryption Services
1.2.2.2 Using RBE to Detect Suspicious Calls
1.2.2.3 Other Security Services
1.2.3 VoIP & Web Services
1.2.3.1 Eye-like Algorithm
1.2.3.2 Academic Audio Access “A3”
1.3 Thesis Security Proposals
1.3.1 Encryption Services
1.3.2 Intrusion Prevention Services
1.3.3 Directory Services
1.3.4 Monitoring Services
1.3.5 Using Rules-based Engine
1.4 Thesis Organization
1.5 Knowledge Space Exploration
1.5.1 Motivation and Need
1.5.2 Existing Knowledge and Related Research Work
1.5.2.1 Voice XML
1.5.2.2 SALT
1.5.2.3 VoicePHP
1.5.2.4 CCXML
1.5.2.5 Other Technologies
1.5.2.6 Summary
1.6 Roadmap
1.6.1 Mission Statement
1.6.1.1 General Direction and Key Goals
1.6.1.2 Primary Users and Stake Holders
1.6.2 Lesson Learned
1.7 Methodology Design
1.7.1 Concepts Generation
1.7.1.1 Generated concepts

1.7.1.2 Built Prototypes
1.7.2 Concepts Evaluation
1.7.2.1 Testing with in-house Prototypes
1.7.2.2 Efficiency
PART 2 - VoIP ENVIRONMENT
2.1 Introduction to VoIP
2.1.1 Understanding VoIP
2.1.2 VoIP: Advantages and Disadvantages
2.1.2.1 Advantages
2.1.2.2 Disadvantages
2.1.3 Why is VoIP important?
2.1.3.1 VoIP runs over any IP network
2.1.3.2 Low Cost of Phone Calls
2.1.3.3 Ability to Integrate VoIP with other Computer Services
2.1.4 When is VoIP Not Recommended?
2.1.5 Who are VoIP Drivers?
2.1.5.1 Users
2.1.5.2 Service Providers
2.1.5.3 Manufacturers
2.1.5.4 Regulators
2.2 Asterisk
2.2.1 Introduction to Asterisk
2.2.2 Telephony Hardware
2.2.2.1 Analog Interface Card
2.2.2.2 Digital Interface Card
2.2.3 Dialplan Basics
2.2.3.1 Context
2.2.3.2 Extensions
2.2.3.3 Priorities
2.2.3.4 Applications
2.2.3.5 Variables
2.2.3.6 Pattern Matching
2.2.3.7 Includes
2.2.4 Basic Dialplan Applications

2.2.4.1 Dial( ) Application
2.2.4.2 Goto( ) Application
2.2.4.3 GotoIf( ) Application
2.2.4.4 GotoIfTime( ) Application
2.2.4.5 Macro( ) Application
2.2.4.6 Advanced Dialplan Application
2.3 Asterisk and Speech Engines
2.3.1 Text-to-Speech
2.3.1.1 Introduction
2.3.1.2 Popular TTS Engines
2.3.2 Speech-to-Text
2.3.2.1 Introduction
2.3.2.2 People with Disabilities
2.3.2.3 Famous Speech Recognition Packages
2.4 Asterisk Communicating with Other Packages
2.4.1 Using AGI
2.4.1.1 How
2.4.1.2 Advantages
2.4.1.3 Disadvantages
2.4.2 Via the OS
2.4.2.1 How
2.4.2.2 Advantages
2.4.2.3 Disadvantages
2.4.3 Using Web Applications
2.4.3.1 How
2.4.3.2 Advantages
2.4.3.3 Disadvantages
2.4.4 Tools
2.4.4.1 Introduction
2.4.4.2 Elastix
2.4.4.3 Apstel Integration Server for Asterisk
PART 3 - IMPLEMENTED INTEGRATION SCENARIOS
3.1 Introduction
3.2 Integration Strategy

3.3 Retrieving Data from Different DBMS Packages
3.3.1 The PBX Accessing MS SQL Server
3.3.2 The PBX Accessing MySQL
3.3.3 The PBX Accessing MS Access
3.3.4 The PBX Accessing Oracle
3.4 The PBX Accessing MS Exchange Server
3.4.1 Introduction
3.4.2 Preparing for the Connection
3.4.3 The Dialplan
3.4.4 Retrieving the Email Details
3.4.5 Sending the Reply
3.5 The PBX Accessing MS OCS
3.5.1 Introduction
3.5.1.1 Abstract
3.5.1.2 What is Microsoft OCS
3.5.1.3 Why Do We Need Such Integration?
3.5.1.4 Integration Scenario
3.5.1.5 Previous Attempts to Solve the Problem
3.5.2 Network Topology
3.5.2.1 Overview
3.5.2.2 Required Third-Party Package
3.5.2.3 LAN Configuration
3.5.3 The Integrating Call Flow
3.5.4 Configuration
3.5.4.1 Configuring Asterisk
3.5.4.2 Configuring OpenSER
3.5.4.3 Configuring OCS
3.5.4.4 Configuring OCS Mediation Server
3.5.5 Validation, Confirmation, and Troubleshooting
3.5.5.1 Tools for Testing
3.5.5.2 Troubleshooting
3.5.6 Future Support
3.5.6.1 Will TCP be supported by Asterisk?
3.5.6.2 Will UDP be supported by Microsoft?

3.6 The PBX Accessing Office Applications
3.6.1 The PBX Accessing MS Excel
3.6.2 The PBX Accessing MS Word
3.6.3 The PBX Interoperating with MS Office Communicator
3.7 The PBX Exploring the Hard Disk
PART 4 - ADVANCED APPLICATIONS
4.1 The PBX Enhancing Security
4.1.1 The PBX Accessing MS Active Directory
4.1.2 The PBX Acting as Servers' Guard
4.1.3 The PBX Protecting Highly Secured LANs
4.1.3.1 Abstract
4.1.3.2 Introduction
4.1.3.3 Previous Work
4.1.3.4 The Threat
4.1.3.5 IP-based Access Scenario
4.1.3.6 WoL Scenario
4.1.3.7 Alternative Scenario
4.1.3.8 Physically Disconnected Scenario
4.1.3.9 Enhancements
4.1.4 Interception-proof VoIP using Dictionary-based Encryption
4.1.4.1 Abstract
4.1.4.2 Introduction
4.1.4.3 The Proposed Solution
4.1.4.4 Basic Scenario
4.1.4.5 Enhancements
4.1.4.5 Limitations
4.1.4.6 Conclusion
4.1.4.7 Future Work
4.1.5 Using a Rule-based Engine to Detect Suspicious Calls
4.1.5.1 Abstract
4.1.5.2 Introduction
4.1.5.3 The Proposed Solution
4.1.5.4 Related Work
4.1.5.5 Advantages of Using RBE

4.1.5.6 Challenges
4.1.5.7 Flow of Events
4.1.5.8 Rules and Actions
4.1.5.9 Preparation and Configuration
4.1.5.10 Results
4.1.5.11 Limitations
4.1.5.12 Conclusion
4.1.5.13 Future Work
4.2 Eye-like Algorithm to Produce Voice Web Pages
4.2.1 Abstract
4.2.2 Introduction
4.2.3 Related Work
4.2.4 The Need
4.2.5 Proposed Solution
4.2.6 Analyzing Dynamic Pages
4.2.7 Dealing with Web Applications
4.2.8 Challenges
4.2.9 Limitations
4.2.10 Conclusion
4.2.11 Future Work
4.3 Academic VoIP Blog for Elementary Schools
4.3.1 Abstract
4.3.2 Introduction
4.3.3 Typical Scenario
4.3.4 Previous Work
4.3.5 The XO Laptop
4.3.6 The Need
4.3.7 Stakeholders
4.3.8 Encouraging Stakeholders to use “A3”
4.3.9 Audio Issues
4.3.10 Audio-to-Blog Call Flow
4.3.11 Stakeholders’ Feedback
4.3.12 Results
4.3.13 Future Enhancements

PART 5 – CONCLUSION & PERSPECTIVES
5.1 Conclusion
5.1.1 VoIP and Typical Services
5.1.1.1 Mail Servers
5.1.1.2 Unified Communication Servers
5.1.1.3 DBMS Servers
5.1.1.4 File Servers & Office Tools
5.1.2 VoIP & Security Services
5.1.2.1 Encryption Services
5.1.2.2 Using RBE to Detect Suspicious Calls
5.1.2.3 Other Security Services
5.1.3 VoIP & Web Services
5.1.3.1 Eye-like Algorithm
5.1.3.2 Academic Audio Access “A3”
5.2 Findings and Recommendations
5.2.1 Introduction
5.2.2 Domain Specific Language
5.2.2.1 What is DSL?
5.2.2.2 Advantages and Disadvantages
5.2.2.3 DSL Life Cycle
5.2.3 Proposed DSL Specifications
5.2.3.1 Setup File Specifications
5.2.3.2 Dialplan File Specifications
5.2.4 Conclusion
PART 6 – APPENDICES & REFERENCES
Appendices
Appendix A - Allow Callers to Input Letters
Appendix B - Exchange Message Fields
Appendix C - Queries to Get Windows Information
Appendix D – Configuration File of OpenSER
Appendix E – DSL Configuration & Dial Plan Files
Appendix F – Countries by Number of Mobile Phones & Internet Users
References

List of Figures

Figure 1.1 : VXML Gateway
Figure 1.2 : Gathering User Input with
Figure 1.3 : Gathering Data with a Form
Figure 1.4 : The architecture of VoicePHP
Figure 1.5 : The Methodology Design
Figure 1.6 : Spiral Form of the Followed Methodology
Figure 2.1 : The relationship between groups of VoIP stakeholders
Figure 2.2 : Converged Network
Figure 2.3 : TDM 400P Card
Figure 2.4 : Digital Interface Card - TE120P
Figure 2.5 : Apstel Integration Server for Asterisk
Figure 3.1 : Code of the ASP Page Called within the Dialplan
Figure 3.2 : Calling the ASP page within the Dialplan
Figure 3.3 : Code of the PHP Page Called within the Dialplan
Figure 3.4 : Code of the .NET Page Called within the Dialplan
Figure 3.5 : How to Connect to Oracle from .NET
Figure 3.6 : How to Connect to Oracle from PHP
Figure 3.7 : Code of the ASP Page Called count_emails.asp
Figure 3.8 : ASP Page to Contact MS Exchange Server and Active Directory
Figure 3.9 : The Code of the email_details ASP Page
Figure 3.10 : ASP Code to Send a Reply
Figure 3.11 : PHP Code to Send a Reply
Figure 3.12 : The Complete Call Flow between MS OCS and Asterisk
Figure 3.13 : ASP Page Used by the PBX to Retrieve Data through Excel
Figure 3.14 : ASP Page Used by the PBX to Communicate with MS Word
Figure 3.15 : ASP Page Used by the PBX to Get the Shared Folder of the Caller
Figure 3.16 : ASP Page Used by the PBX to Count Items inside a Folder
Figure 3.17 : ASP Page Used by the PBX to Get the Nth Entry in a Folder
Figure 3.18 : The Dialplan to Explore the Hard Disk

Figure 4.1 : ASP Page Used by the PBX to Contact the Active Directory
Figure 4.2 : ASP Page Used by the PBX to Detect Shares
Figure 4.3 : Dialplan to Detect Changes in Shares
Figure 4.4 : The PBX Protecting the LAN | WoL Scenario
Figure 4.5 : The PBX Protecting the LAN | Physically Disconnected Scenario
Figure 4.6 : Complete Architecture of the Encrypted VoIP Scenario
Figure 4.7 : Source Code of the ASPX Page that Retrieves the Chunks
Figure 4.8 : The PBX Working with an RBE | Flow of control
Figure 4.9 : The PBX Working with an RBE | Action-Taking Dialplan
Figure 4.10 : The PBX Working with an RBE | ASP calling the Webservice
Figure 4.11 : The PBX Working with an RBE | GetActions Web Method
Figure 4.12 : InRule Interface
Figure 4.13 : CPU Measurements
Figure 4.14 : Architecture of the Eye-Like Surfing Scenario
Figure 4.15 : Source Code of a Sample ASPX Page
Figure 4.16 : Architecture of “A3”
Figure 5.1 : Broad Range of Integrated Services
Figure 6.1 : Macro that Asks Callers to Enter Letters by Dialing Numbers
Figure 6.2 : VBA Code to Get All Fields of Email Messages
Figure 6.3 : Proposed DSL Setup File
Figure 6.4 : Proposed DSL Dialplan File

List of Tables

Table 1.1 : Contributions Summary
Table 1.2 : Mobile phones in use in the world
Table 1.3 : Internet users in the world
Table 1.4 : List of Utilized Technologies
Table 1.5 : List of Users and Stakeholders
Table 1.6 : Concepts Discussed in this Thesis
Table 2.1 : Pattern-matching Syntax
Table 2.2 : Advanced Dialplan Applications
Table 2.3 : List of TTS Engines
Table 2.4 : List of Speech Recognition Packages
Table 2.5 : AGI communication Channels
Table 3.1 : IM Messages that will Appear in the Office Communicator
Table 3.2 : Sample Records in the OC Requests Table
Table 4.1 : Hardware Used to Encrypt / Decrypt Phone Calls
Table 4.2 : Enhancements to Decrease Encryption/Decryption Time
Table 4.3 : Actions Taken by the PBX when Suspicious Calls are Detected
Table 4.4 : RBE Entity Fields
Table 4.5 : irAuthor Legend
Table 4.6 : PBX, RBE, CDR, and Web Servers Components
Table 4.7 : Percentage of Servers’ CPU Usage / Active channels
Table 4.8 : Handling HTML Controls
Table 4.9 : Academic VoIP Blog | Participation
Table 4.10 : VoIP Academic Blog | Missed Learning Objectives
Table 4.11 : VoIP Academic Blog | Academic Staff Quality
Table 4.12 : VoIP Academic Blog | Bank of Questions and Answers
Table 4.13 : VoIP Academic Blog | Academic Performance Indicators
Table 6.1 : Digits and the Corresponding Letters
Table 6.2 : Some Fields of Email Messages

Table 6.3 : Queries to Get Windows Information
Table 6.4 : Top countries by the Number of Mobile Phones in Use
Table 6.5 : Countries by the Number of Internet Users

Preface

This thesis provides different methods for integrating VoIP products with all types of software packages that can be found in any typical LAN. The goal is to find a mechanism that eases the integration of a software PBX with a wide range of servers, such as the mail server, the database server, the file server, the internet server, and other types of servers, so that all of the services offered by those servers can be provided to the caller over the phone.

This thesis is meant to be used as a reference for senior VoIP engineers. Readers are expected to be familiar with web application development, software PBXs, text-to-speech engines, email servers, database servers, and information security. They also need some exposure to Linux- and Windows-based servers.

The thesis starts with an introduction and a presentation of the targeted integration. It concludes with a summary that lists the different types of servers and solutions that could be integrated with a VoIP package. Along the way, in-depth treatment and analysis of the most important integration outcomes are provided.

PART 1 - INTRODUCTION

1.1 Thesis Objective

Internet services are limited to personal computers, PDAs and smart phones. Those services are also not adapted for blind people, people with visual impairments, people who cannot afford having visual tools, and people who do not have enough skills to deal with a computer. Thus, there should be a way to access PC services through other devices.

In fact, the community of Internet users is only about one third the size of the community of phone users around the world. This means that billions of people use the phone but are not yet Internet users. There must be evident reasons behind that: the phone is simpler to use and more portable than any other device.

According to the International Data Corporation Worldwide Quarterly Mobile Phone Tracker, smart phone manufacturers shipped around 300 million smart phones worldwide (IDC Press Release, 2011). This number is small (around 4.3% of the world population) and does not suggest that smart phones could replace the typical voice-based phone. Allowing the majority of the world's population to access a wide range of services, such as web surfing, over the phone will lead to better phone utilization and coverage of a wider community. Any solution, technology, or application that can be accessed only through a smart phone ignores around 96% of the world population.

Since VoIP provides a way to communicate over any IP network, integrating the VoIP solution with all other types of solutions already used by network users will be an outstanding advantage.

This thesis discusses integrating VoIP technology with different types of servers. The main goal is to allow phone users to reach the servers the same way computer users do. Different technologies will be used together to achieve this integration. This thesis focuses on the benefits resulting from such integration rather than the ways used to achieve it.

What triggered our interest in this topic was “The Top 25 VoIP Advances of 2009” report (Poe, 2009). Although this report introduces many advances in the field of VoIP, it mainly focuses on one of the key VoIP trends, which is the increasing integration of voice technology with other services and applications.

When the phone user is allowed to reach any server on the network, he/she will be provided with a huge collection of services. When the targeted integration is achieved, stakeholders can:

- connect to their email server, such as Exchange Server

- connect to a database server, such as Oracle, MySQL, or SQL Server

- connect to an internet server, whether Linux- or Windows-based

- connect to a unified communication server, such as OCS

- connect to directory services, such as MS Active Directory

- take advantage of the VoIP technology so that all typical computer services are delivered over the phone

In short, this thesis will propose several integration scenarios where the phone replaces the computer and the ear replaces the eye. When this replacement is smoothly achieved, developers will focus on the users’ requirements rather than on the low-level details of integration. In addition, users will benefit a lot from such a replacement because most people prefer to give instructions by talking rather than clicking, typing, or doing anything else.

Several previous and current technologies, such as VXML and SALT, tried to offer a simple way to read input and return output. Although these technologies simplified the interaction with voice applications, they still require the developer to get into the details of integration. This thesis introduces a mechanism that enables voice-application developers to integrate their applications simply with a wide range of servers.

This thesis also introduces a totally new approach that allows phone users to surf the web through the phone. The proposal suggests developing an eye-like algorithm that acts just like the human eye. When this algorithm analyzes a web page, it can ‘quickly’ tell the caller about the different parts included in the page, and the caller decides where to go next. This bio-inspired behavior is meant to let callers interact with the web page efficiently. Previous solutions offered little more than transcoders that translate HTML to VXML, and such a translation is not always meaningful.

1.2 Thesis Contributions

Some of the proposals presented in this thesis can be considered complementary to previous work, while other proposals introduce enhancements, implementations, and radical changes.

The idea of integrating VoIP with other technologies is not totally new. In their article "Integrating Internet Telephony Services", Jiang et al. state that Internet telephony integrates services provided by the Internet with the PSTN [Jiang et al., 2002].

In the following sections, the different proposals presented in this thesis are summarized and a brief description of the state of each is given. In most cases, we start with an idea, expand it, and include other ideas from the same scope. Every integration scenario presented in this thesis belongs to one of three categories: VoIP and typical services (1.2.1), VoIP and security services (1.2.2), or VoIP and web services (1.2.3).

In general, there is a difference between what is presented in this thesis and what has been presented in the state-of-the-art articles. The main difference lies in the fact that each article focuses on solving one integration problem whereas this thesis introduces a mechanism to integrate VoIP with almost all services available on the network. In other words, the focus of the thesis is on the GENERAL nature of the problem, not on SPECIFIC solutions to SPECIFIC integration issues. All integration scenarios introduced in this thesis lead to one result, the need to develop a Domain-Specific Language (DSL) that compiles all the presented modules of code into a single library. This DSL will help voice application developers build applications that can deal with almost all services on the network.

1.2.1 VoIP and Typical Services

All scenarios under this category focus on the integration of all services found in a typical LAN with VoIP so that they are accessible over the phone. This category includes the following four sub-categories:

1.2.1.1 Mail Servers

Integration scenarios of this sub-category present different ways to integrate a VoIP solution with a mail server so that callers can check their emails over the phone. Integrating VoIP solutions with mail servers has already been studied and implemented in VoIP research. Singh et al. wrote about “Integrating VoiceXML with SIP services” [Singh et al., 2003]. In their article, they suggested an approach to integrate the phone with VoIP solutions. They state that using VoiceXML and SIP will lead to easier integration with email, instant messaging, web, and telephone. They even created a model called “Email by phone”. We believe that their work solves a single problem, while our proposed architecture is more general and solves many integration problems using the same integration mechanism. Furthermore, they invented a new browser to achieve the required integration, while we rely on typical web pages to solve the same problem.

CINEMA (Columbia Internet Extensible Multimedia Architecture) is another example that shows the trend towards integrating email and other systems with VoIP technology. CINEMA is briefly explained in the next section, titled “UC Servers”.

1.2.1.2 UC Servers

Singh et al. wrote about “Integrating VoiceXML with SIP services” [Singh et al., 2003]. In their article, they suggested a new browser called sipvxml that enhances the services of the CINEMA test-bed. CINEMA is a software package that offers several VoIP-related functionalities [Jiang et al., 2002]:

- providing interoperability with the PSTN, IP-based voice mail, and programmable telephony services

- integrating e-mail and Web access for unified messaging

- supporting multi-party multimedia conferencing

- allowing new services and features to be added, including instant messaging, presence support, and network appliance control

Although we believe CINEMA can be considered a good solution, we think the architecture is not easy to implement as it includes a lot of servers (Proxy Server, SQL Database Server, Web Server, and SIP Translator) to achieve the integration. While our proposal is simpler and uses the same integration mechanism to integrate VoIP with all other services, CINEMA focuses mainly on solving communication integration issues.

1.2.1.3 DBMS Servers

An example of this subcategory is a scenario where callers can access databases in order to retrieve and/or update data. Decision makers usually rely on their technicians to formulate database queries. There should be an interface, as simple as a phone system, through which business executives can execute a query to retrieve data [Hendrix et al., 1978].

Speech is a natural medium of communication for people [Lefebvre et al., 1993]. Most people prefer to give verbal instructions rather than pushing a button or typing. In their article "Database Access via Spoken Language Interfaces", Dybkjær et al. assert that using natural language dialogue facilitates the use of databases by non-experts [Dybkjær et al., 1995].

The solution suggested by Dybkjær et al. focuses on accessing a database using a spoken-language dialogue interface. It makes use of several drivers, managers, parsers, handlers, and other tools, which makes the solution very specific, sophisticated, and hard to build. The approach presented in this thesis is general, simple to use, and easy to build.
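To make the idea of phone-driven database access concrete, the following is a minimal sketch, assuming a keypad menu in which each digit selects a predefined, parameterized query and the result is turned into a sentence for a text-to-speech engine. The table, the menu entries, and the function name are hypothetical, and SQLite merely stands in for the DBMS packages discussed in this thesis.

```python
import sqlite3

# Hypothetical menu: keypad digit -> (spoken label, parameterized SQL).
MENU = {
    "1": ("number of orders",
          "SELECT COUNT(*) FROM orders WHERE customer_id = ?"),
    "2": ("total amount",
          "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE customer_id = ?"),
}


def answer(conn, digit, customer_id):
    """Return the sentence a TTS engine would read back to the caller."""
    if digit not in MENU:
        return "Sorry, that option is not available."
    label, sql = MENU[digit]
    # The caller never supplies SQL text, only a menu digit and an ID.
    (value,) = conn.execute(sql, (customer_id,)).fetchone()
    return f"The {label} is {value}."


# Illustrative data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(7, 100), (7, 250), (9, 40)])

print(answer(conn, "1", 7))
print(answer(conn, "2", 7))
```

Restricting callers to predefined queries keeps the phone interface simple and avoids passing free-form input to the database.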

1.2.1.4 File Servers & Office Tools

To present a complete integration framework that covers all services found in a typical LAN, we included two additional scenarios to integrate VoIP with file servers and office tools. The output of integrating VoIP with a file server is the capability to use the phone to access the hard disk of a Windows or Linux server. That way, phone users will be able to explore their hard disks until they find a certain file. At this point, they might instruct MS Word or Excel to open the document, fill in some fields, and send the file as an attachment. We believe that the two features described in this paragraph can be seen as an extension to previously discussed features.

1.2.2 VoIP & Security Services

1.2.2.1 Encryption Services

Although this thesis does not mainly focus on the topic of security, several security-related examples have been presented. In this subcategory, we present a model through which VoIP is encrypted using a dictionary. By dictionary, we mean a two-column table. The first column stores numbers between 0 and 255. The other column stores unique ID numbers, which means that an ID corresponds to one and only one byte. When the two parties are talking over the phone, the digital media is recorded as a series of bytes. Each byte will be replaced by an ID before being sent to the other party. When the ID is received, the original byte will be retrieved using the same dictionary.
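The dictionary described above can be illustrated with a minimal Python sketch. It assumes that both parties derive the ID column from a shared seed; the seed value, the ID range, and the function names are hypothetical and are not taken from the implementation presented later in this thesis.

```python
import random


def build_dictionary(seed):
    """Build the two-column table: byte value (0..255) <-> unique ID."""
    # random.Random is NOT a cryptographic generator; it is used here
    # only so that both parties can rebuild the same table in this sketch.
    rng = random.Random(seed)
    ids = rng.sample(range(1000, 10000), 256)  # 256 distinct IDs (assumed range)
    encode = dict(zip(range(256), ids))
    decode = {identifier: byte for byte, identifier in encode.items()}
    return encode, decode


def encrypt(chunk, encode):
    """Replace every byte of a media chunk by its ID."""
    return [encode[byte] for byte in chunk]


def decrypt(ids, decode):
    """Look each ID up again to recover the original bytes."""
    return bytes(decode[identifier] for identifier in ids)


encode, decode = build_dictionary(seed=2011)
frame = bytes([0, 127, 255, 16, 16])   # stand-in for a chunk of voice data
cipher = encrypt(frame, encode)
restored = decrypt(cipher, decode)
```

Both directions are plain table look-ups, and equal bytes always map to equal IDs, which is characteristic of a simple substitution scheme.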

Many researchers decided to use a dictionary to achieve different goals. Govindan and Mohan used a dictionary for data encryption and compression. They suggested an “Intelligent Text Data Encryption and Compression for High Speed and Secure Data Transmission over Internet” [Govindan & Mohan, 2004]. They developed a strategy called IDBE (Intelligent Dictionary Based Encoding) to achieve their goal.

The approach followed by Govindan & Mohan focuses on encrypting text files only. What we have achieved is encrypting voice media, and not only text files.

Another research work [Prasanna and Dandalis, 2000] applies the same concept to data compression, as suggested by Nelson (1996) in The Data Compression Book. Prasanna and Dandalis talk about dictionary-based approaches for encoding. We take advantage of what has been presented in their article to show the efficiency of our proposal. According to them, decompression is a look-up table operation. Only memory read operations are needed and consequently high decompression / decryption rates can be achieved.

Many other researchers made use of the substitution concept to achieve encryption. They all did so to encrypt and decrypt files. Runtime was not a big issue because they were not dealing with real-time communication. In VoIP, there is an inevitable need for real time response. What we have achieved is encryption and decryption using a dictionary in less than one second, which is an acceptable latency in VoIP.

1.2.2.2 Using RBE to Detect Suspicious Calls

In this subcategory, we implemented integration between a VoIP solution and a Rule-based Engine (RBE). We aim at instructing the RBE to keep monitoring the PBX. Whenever a suspicious event takes place, the appropriate action is taken. That way, the PBX is under control all the time and there will be no need to keep an eye on the PBX CDR file to detect improper use.

Field researchers have worked on similar integration, but with a simpler architecture, a proprietary PBX, and a different scope [Ong & Cing, 2004]. We believe that our architecture is much better because it makes use of a Web Service that can be called either synchronously or asynchronously. The decision of whether to use the synchronous or the asynchronous mode is based on the performance results. The web service can be considered a separate layer that can be replaced by any other web service allowing the usage of a different RBE. Our architecture includes five servers:

i) the PBX server

ii) the Web application server

iii) the Web service server

iv) the RBE server

v) the RDBMS server

That way, load is better balanced and consequently better performance is gained.

1.2.2.3 Other Security Services

Three more security-related topics were discussed to provide a complete range of integrated services. These topics are:

i) Intrusion Prevention

ii) Directory Services

iii) Monitoring Services

Functionality similar to that covered in the first topic can be found in the CINEMA architecture introduced earlier. The other two topics helped us achieve a higher level of security. Whenever a misuse is detected, the PBX will automatically call the concerned people to inform them about the detected violation. Such a solution shows how useful a PBX can be because it takes action on the fly by calling the person in charge. This behavior is better than sending an email, which might not be the right way to warn about urgent matters.

1.2.3 VoIP & Web Services

1.2.3.1 Eye-like Algorithm

The main idea presented in this section is to develop an algorithm that works just like the human eye. That way, phone users will navigate the web using their phones as efficiently as with a visual browser. Linking the web to phone users is not a totally new idea. According to Rosenberg, intranet applications and websites can be presented to a phone user [Rosenberg, 2001].

Previous proposals suggested developing readers that read the content of the page to the caller. However, typical web readers are not efficient since the user might wait a long time until the reader reaches the part in which he/she is interested. Such readers cannot discover the important sections of the page. Our algorithm summarizes the page to the caller, just as the human eye would.
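The difference between reading a page from top to bottom and summarizing it first can be sketched as follows. This is only an illustration of the summarizing idea under simple assumptions (headings mark sections; links and form fields are merely counted); it is not the eye-like algorithm itself, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collect heading texts and count links and form fields."""

    HEADINGS = {"h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.headings = []
        self.links = 0
        self.fields = 0
        self._in_heading = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._in_heading = True
            self._buffer = []
        elif tag == "a":
            self.links += 1
        elif tag in ("input", "select", "textarea"):
            self.fields += 1

    def handle_data(self, data):
        if self._in_heading:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADINGS and self._in_heading:
            # Normalize whitespace inside the heading text.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.headings.append(text)
            self._in_heading = False


def summarize(html):
    """Return the sentences a TTS engine could read as a spoken menu."""
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    intro = (f"Sections: {len(parser.headings)}, links: {parser.links}, "
             f"form fields: {parser.fields}.")
    menu = [f"Press {number} for {title}."
            for number, title in enumerate(parser.headings, start=1)]
    return [intro] + menu


page = """<html><body>
<h1>School News</h1><p>Welcome.</p>
<h2>Latest <a href="/posts">Posts</a></h2>
<h2>Contact</h2>
<form><input name="email"><textarea name="msg"></textarea></form>
</body></html>"""

for sentence in summarize(page):
    print(sentence)
```

The caller hears a short overview and chooses a section by key press instead of waiting for the whole page to be read aloud.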

In previous research work, annotations were suggested as a method of specifying important sections of pages [Hori et al., 1999]. Similar suggestions were proposed by [Asakawa & Takagi, 2000] and [Shao et al., 2002]. We believe that using annotations cannot be considered a feasible solution because it requires augmenting the original HTML file with annotations. It is a significant burden to keep the HTML file synchronized with its associated annotation file. There were many unsuccessful attempts to solve the same problem. We believe that any proposal that requires radical changes will not be accepted by the internet community.

Voice XML and similar technologies have made the development of voice applications far easier, but developers still need to create a separate voice application for each web site.

This duplication of efforts leads to unnecessary redundancy and requires the developers to keep changing the voice-based applications every time the web site content is changed and vice-versa.

Our proposal allows callers to visit and interact with the same web site surfed by normal internet users.

1.2.3.2 Academic Audio Access “A3”

This proposal makes use of VoIP technology in the education field. We suggested using a VoIP solution to provide the students with an audio forum. This solution was used in several schools to achieve collaboration.

In “Voice-enabled Interactive Services (VoIS) Architecture for e-Learning”, Motiwalla proposed a similar solution [Motiwalla, 2009]. Although this proposal has some similarities to ours, it has many differences. First, Motiwalla’s proposal focuses on the field of higher education, while our proposal focuses on elementary schools. Second, Motiwalla aims at providing an efficient solution to visually impaired students, while our proposal focuses on young students who do not yet have sufficient computer skills.

Table 1.1 summarizes the contributions presented in this thesis.

Table 1.1 – Contributions Summary

Domain | Solution | Contribution
Integration | Wide range of integrated services | Code Library
Development | Domain-specific language | Proposed Model
Security | Encryption; Intrusion prevention model; Using RBE to detect suspicious calls | 2 Publications; Code Library
Web | Eye-like algorithm | Publication + Model
Education | Audio Academic Access | Publication + Code Library

1.3 Thesis Security Proposals

Although this thesis discusses many security-related issues, it does not focus on security by itself. Whenever VoIP is discussed, security rises to the surface. This is because any internet-based technology is vulnerable to all types of threats that jeopardize the security of internet applications.

This thesis includes many chapters that try to present solutions to security problems. Although security is not considered a focal point, several security problems were highlighted and solutions were suggested.

This thesis focuses on integrating VoIP with different technologies. It does not aim at providing security solutions. Some presented scenarios suggest a solution for some security problems to show that integration can be made to achieve better security. The security issues approached in this thesis are briefly described in the following sections.

1.3.1 Encryption Services

In this proposal, a VoIP solution is integrated with an encryption engine that is based on a dictionary. The engine uses a character substitution mechanism to encrypt and decrypt phone calls. The caller will use a dictionary to encrypt the bytes before they are transmitted. The receiver will use the same dictionary to decrypt the received bytes. The goal of this proposal is to show that such integration between the VoIP solution and the encryption engine can be achieved. The emphasis is on the integration, not on the type of encryption because character substitution is one of the simplest and oldest forms of data encoding.

1.3.2 Intrusion Prevention Services

In this proposal, a VoIP solution is integrated with switches and computers to disconnect highly sensitive networks from the internet. The goal is to isolate those networks in order to minimize the time they are connected to the internet, and consequently minimize penetration risk. Again, this scenario stresses the integration, not the achieved level of security as isolating a network is a questionable benefit that comes at the price of being disconnected.

1.3.3 Directory Services

This scenario presents a way to allow a VoIP solution to access Microsoft Active Directory. The scenario enables callers to control users’ accounts. The significance of the presented scenario lies in the integration achieved between the VoIP package and the Active Directory. From a security point of view, such integration poses a security threat by itself.

1.3.4 Monitoring Services

In this scenario, a VoIP package is instructed to monitor the network. When any misuse is detected, the VoIP package takes action, such as calling the network administrator to inform him/her about the detected violation. The emphasis is on how to integrate the VoIP package with a network-monitoring application. In this scenario, we aim at achieving the integration, not at providing security measures. Alternatively, the monitoring application may detect other events that are not related to security.

1.3.5 Using Rules-based Engine

In this scenario, a rule-based engine is integrated with the PBX to detect suspicious calls. Such integration enables the user to specify rules and events. Whenever a rule is violated, the corresponding event takes place. The goal of this integration is to enforce a policy created by the PBX owner.
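The rule/event pairing described above can be sketched in a few lines of Python. This is not the API of the rule-based engine used in this thesis; the `Rule` class, the call-record fields, and the action labels are hypothetical and serve only to show how a violated rule maps to an event.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    name: str
    violated: Callable[[Dict], bool]  # returns True when the call breaks the rule
    action: str                       # hypothetical action label for the PBX


def evaluate(call: Dict, rules: List[Rule]) -> List[str]:
    """Return the actions triggered by a single call record."""
    return [rule.action for rule in rules if rule.violated(call)]


# Example policy defined by the PBX owner (assumed thresholds).
rules = [
    Rule("long call", lambda c: c["duration_s"] > 3600, "notify_admin"),
    Rule("night international call",
         lambda c: c["destination"].startswith("00") and c["hour"] < 6,
         "hang_up"),
]

call = {"destination": "0033123456789", "duration_s": 120, "hour": 3}
actions = evaluate(call, rules)
```

Keeping the rules as data, separate from the code that evaluates them, is what lets the PBX owner change the policy without touching the dialplan.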

In all of the above scenarios, security is used as an example. It was never meant to be used as a protection measure or as a defense mechanism. The problems solved in these scenarios aim at showing the benefits of the integration, rather than the immunity of the proposal.

1.4 Thesis Organization

This thesis consists of six parts which are organized as follows:

Part 1 entitled “Background and Groundwork” is an introductory part. This part includes three chapters. The first chapter explores the knowledge space and discusses the existing knowledge and the related research work including VoiceXML, SALT, VoicePHP, CCXML, and other technologies. The second chapter presents the roadmap of this thesis. It includes the mission statement and the lesson learned. The last chapter in Part 1 presents the methodology design.

Part 2 “VoIP Environment” includes four chapters. The first chapter is an introduction to VoIP. The second chapter introduces Asterisk, the open-source PBX. This chapter focuses on the concept of dialplan. It starts with the basics, and then it moves to typical applications. The chapter ends with several advanced applications. To make the reader familiar with speech engines, the third chapter in this part discusses Speech-to-Text and Text-to-Speech technologies. The fourth and last chapter in this part illustrates the different approaches that can be followed to allow the PBX to contact other technologies and software packages.

Part 3 entitled “Implemented Integration Scenarios” includes 7 chapters. The first chapter is an introduction to the integration topic. The second chapter explains the integration strategy that will be followed. The third chapter presents different ways to enable the PBX to deal with different RDBMS packages including MS SQL Server, MySQL, MS Access, and Oracle. The fourth chapter provides a detailed plan that allows the PBX to retrieve emails from MS Exchange Server, and send replies. The fifth chapter demonstrates how the PBX can be integrated with Microsoft Office Communication Server. In this chapter, a very detailed integration plan is presented. The sixth chapter shows how to integrate the PBX with MS Office Applications, such as Word, Excel, and Office Communicator. The final chapter in the third part exhibits how file servers and shared folders can be explored through the phone.

Part 4 entitled “Advanced Applications” includes 3 chapters. The first one is a long chapter that draws attention to issues related to information security. It presents an integration methodology to enable the PBX to communicate with MS Active Directory. This chapter also shows how the PBX can be used as a security tool to monitor critical servers using Windows queries. Then, the concept of protecting highly-secured LANs using the PBX is presented. A new implementation to encrypt phone calls is presented under the title “Interception-proof VoIP using Dictionary-based Encryption”. The last topic discussed in this chapter is how to use a Rule-based Engine to detect suspicious calls. The second chapter in Part 4 is entitled “Eye-like Algorithm to Produce Voice Web Pages”. This chapter introduces a new method through which callers can navigate the web through the phone. It suggests an advanced algorithm that receives a URL as input, behaves just like a human eye, and returns a summary of the page to be delivered to the caller. The last chapter in Part 4 is entitled “Academic VoIP Blog for Elementary Schools”. This chapter discusses the use of the PBX as an academic tool to achieve collaboration and leverage the quality of learning.

Part 5 entitled “Conclusion & Perspectives” presents the findings of our research work. The concept of DSL is then introduced. Finally, a new DSL is proposed and a conclusion is presented.

The sixth part includes Appendices and References.

1.5 Knowledge Space Exploration

Before presenting the proposed solution along with its many advantages, the knowledge space needs to be explored. First, the need for the targeted integration will be justified. Then, the current situation and existing knowledge will be presented. After that, related research work will be discussed.

1.5.1 Motivation and Need

The most natural means of communication is speech. That is why speech is considered the most practical and elegant way to inform people (Eidsvik, 2001). Cooper (2004) stated that the oldest form of communication is also speech. Fluss (2004) described it as the most ubiquitous communication method.

If speech is going to be used, computers need to recognize and generate speech (Bradnum, 2004). While speech is the most preferred method of communication to humans, this is not the case with the machines (Datamonitor, 2003). That is why there is a need to integrate VoIP products with a broad collection of solutions, servers, and technologies. The main goal is to allow access to all current technologies via speech.

The topic of integration is a very old one. Integration has long been a headache for engineers, who nevertheless consider it an inevitable need. When it comes to communication, this need becomes urgent. We believe that integrating the different technologies with VoIP products would be a quantum leap. A typical computer user can use his/her PC to check mail, query a database, open a template document, fill in some blanks, surf the web, participate in a forum / blog, monitor the status of some servers on the LAN, and perform many other tasks.

However, a problem arises when using the PC is not an option, either because it is not available or because the user is not capable of using it. These two obstacles can be overcome when the phone is used because it has several advantages compared to the computer. The first advantage is enhancing accessibility. Phone users will be able to perform all those tasks over the phone from any place they want without even the need for a PC. Second, blind or visually impaired users will be able to fully interact with the computer through the phone. Third, since the number of phone users in the world is much more than the number of computer / internet users, providing a solution that can be accessed only by using a PC will serve a relatively small community. This is not the case when the same solution can be accessed through the phone because a greater community will benefit from it.

According to a study conducted by Eidsvik in 2001, there are more than 1.5 billion phones and around 500 million mobile phone users in the world (Eidsvik, 2001). Needless to say, this two-billion user community is large enough to encourage us to design special solutions to satisfy its requirements. Those numbers are increasing every day. Moreover, the phone assists visually impaired and blind individuals (Orubeondo, 2001). It also helps those who are not familiar with computers or feel uncomfortable using a technology which is more advanced than the phone (Regruto, 2003).

The implication of these findings is that there is a crucial need to allow users to access computer services over the phone. Any solution that is based on the computer alone ignores most of the population around the world. In the following, two tables will be shown as taken from Wikipedia. Table 1.2 shows the number of mobile phones in the world along with the percentage of the population. The goal behind presenting this table is to show that the number of phone users is huge enough to motivate us to design special solutions for them. Table 1.3 shows some statistics about Internet users in the world. These two tables show that the percentage of the world’s population using the phone is much higher than that of those who use the internet. Thus, any solution that can only be accessed through the use of a computer will not be used by more than half of the world’s population. The remainder of the population cannot reach the solution because they use phones and are not connected to the internet. Table 1.2 shows that around 68 percent of the people around the world have mobile phones. This is a majority even before adding the percentage of those who use non-mobile phones. Table 1.3 shows that only 29 percent of the people are internet users. This huge difference calls for a solution, through which phone users can perform all the tasks and access all the services available to typical computer users.

Table 1.2 - Mobile phones in use in the world

Number of mobile phones    % of population    Last Update
4,600,000,000              67.6               2009

Table 1.3 - Internet users in the world

Internet Users     Percent of population    Last Update
1,966,514,816      28.7%                    2010

For detailed statistics about the number of mobile phones in use and Internet users in each country, refer to Table 6.4 and Table 6.5 in Appendix E.

According to Larson, the number of users who can use their phones to access the information on the internet from anywhere and at any time is increasing (2004). The reason behind this is that telephones are small in size, light, and generally inexpensive. The long life of the batteries is another factor that makes the phones more portable than computers (Orubeondo, 2001).

1.5.2 Existing Knowledge and Related Research Work

Accessing all types of hardware devices and software packages through the phone has been the trend for the last three decades. Phone users have been allowed to control their home appliances through the phone. Bank clients have been given access to their accounts over their phones. They can check their balances and even initiate some trivial transactions. Working parents have even recently been allowed to use their mobile phones to monitor their children while the children are in the nursery. These facts show how important it is to integrate different solutions and devices so that they can be accessed via the phone.

When VoIP PBX software packages emerged, the integration became more feasible and promising. Researchers and engineers tried their best to integrate VoIP packages with the different packages in order to have a fully integrated framework of technologies. It is not so easy to integrate different technologies because they were not designed with compatibility in mind. To be more specific, some packages run under Windows while others run under Linux. Some use the TCP protocol while others use the UDP protocol. Thus, it would not be an easy task to integrate all of these products and allow the users to access them through the phone.

1.5.2.1 Voice XML

VXML or VoiceXML stands for “Voice eXtensible Markup Language”. AT&T, Lucent Technologies, IBM, and Motorola developed VXML in 1999 and donated it to the W3C for formal standardization. Originally, this technology was designed to support phone menus (Leavitt, 2003). Jackson (2001) and Beasley, Farley, O’Reilly and Squire (2002) defined VoiceXML as a standard markup language for writing speech-based applications. According to Wikipedia, VoiceXML is the W3C's standard XML format for specifying interactive voice dialogues between a computer and a human. VXML allows developing voice solutions in a way analogous to HTML: just as HTML pages are interpreted by a traditional visual browser, VXML files are interpreted by a voice browser. In short, VXML is a language that helps developers create voice applications. Eidsvik (2001) believes that “almost every industry can benefit from VXML”. Using VXML, developers will be able to focus on the required functionality, not on the way they need to do it. Details of getting a user’s input, for example, are handled by the VXML provider. The developer will have a set of pre-defined functions available for use.

VXML is designed to enhance and facilitate development of telephony applications. It targets allowing the specification of IVR (Interactive Voice Response) applications. It is a high-level dialog markup language that provides tags for manipulating user interaction through phone touch-tone inputs, audio output, or speech recognition (Potter & Larson, 2005).

1.5.2.1.1 Architecture

VXML is a combination of different technologies bundled together in order to integrate multiple speech and telephony solutions. The following technologies are among the building components:
• Automated Speech Recognition (ASR)
• Text-to-Speech (TTS)
• DTMF
• Interactive Voice Response (IVR)

VXML services are provided to users through a VXML gateway. Using his/her phone, a user can access information from a database via VXML. The caller will contact a VXML browser that resides on a gateway, as shown in Figure 1.1. The gateway fetches VXML data from the database via a Web server (Leavitt, 2003).

Figure 1.1 - VXML Gateway

1.5.2.1.2 VXML Basics

Dialogs are essential for VXML documents. They are used either to collect data from the user or to offer choices to him/her. The user is always in a certain dialog. When there are no more dialogs, the application exits.

Content can be delivered either by using text-to-speech or pre-recorded sound files. Grammar can be used in order to create a list of permissible vocabulary for the user to select from. Figure 1.2 and Figure 1.3 present sample VXML documents that show the simplicity of the language:

Record your SSN Say your ten digit social security number

Figure 1.2 - Gathering User Input with

The code shown in Figure 1.2 does the following:
• It prompts the user for input, by saying: “Record your SSN”
• It recognizes input according to specified grammar
• It catches any event appropriate to a portion of dialog. When the user needs help, he/she will hear the following message: “Say your ten digit social security number”
• It fills the variable called “PhoneNo” with a recognized user response.
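As an illustration of the four steps above, the following Python sketch assembles a small VoiceXML 2.0 document with one input field. It is not the listing of Figure 1.2: the element layout, the use of the built-in `digits` field type in place of an explicit grammar, and the function name `build_field_document` are assumptions. Only the prompt text, the help text, and the variable name “PhoneNo” come from the description.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"


def build_field_document() -> str:
    """Build a small VoiceXML 2.0 document containing one input field."""
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form")
    # The field name is the variable that receives the recognized response.
    # type="digits" (assumed) asks the platform for its built-in digit grammar.
    field = ET.SubElement(form, "field", {"name": "PhoneNo", "type": "digits"})
    # <prompt> is spoken to the caller when the field is visited.
    ET.SubElement(field, "prompt").text = "Record your SSN"
    # <help> is played when the caller asks for help.
    ET.SubElement(field, "help").text = "Say your ten digit social security number"
    return ET.tostring(vxml, encoding="unicode")


document = build_field_document()
print(document)
```

Generating the document from a program rather than writing it by hand reflects the way a voice browser typically receives VXML: it fetches it from a Web server, so the server is free to build each dialog dynamically.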

Code in Figure 1.3 gathers data through a form. The code gives the caller three options and waits for him/her to say which option is required.

Please choose Weather, Sports, or news [ weather sports news ]

Figure 1.3 - Gathering Data with a Form

1.5.2.1.3 Advantages of VXML

Navigating voice interfaces is much easier than navigating touch-tone services (Orubeondo, 2001). VXML allows users to interact with the Internet by speaking and listening. This technology allows users to access online information through a phone instead of a computer. VXML is obviously best suited for applications that require limited input and provide specific output. Although this promising technology does not fit in all scenarios, it offers many benefits, especially over proprietary IVR systems:
• VXML applications allow telephony applications to be part of the existing Web infrastructure. According to Jackson, VXML “greatly simplifies speech recognition application development by using familiar Web infrastructure” (Jackson, 2001).
• VXML makes application development at least two times shorter. Developers will have better focus on their programs. They will be able to specify what they want, instead of specifying how to do it (Regruto, 2003).
• Being a W3C standard, VXML allows end users to benefit from the availability of re-usable applications from developers all over the world. Existing software components can be re-used by the voice applications (Syntellect, 2003).
• VXML applications can be easily moved between different hardware and software platforms. This portability will help clients protect the investment they made in their own voice applications (Syntellect, 2003).
• VXML allows for the separation of applications, platforms and speech engines.
• An open standard means a broad selection of VXML applications and interoperability of technology.
• Separation of telephony resources and application enables the cost-effective outsourcing of telephony services through VXML hosting providers.
• There is no need to recompile VXML scripts. This means that call treatments can be easily customized because VXML documents can be generated immediately. This allows dialogues to be highly dynamic (Brøndsted, 2004).

1.5.2.1.4 VXML Limitations

In general, voice applications - whether based on VXML or not - experience problems such as separating the background noise from the user’s voice and deciphering accents (Orubeondo, 2001). Speech impediments are also a problem, as are natural vocal pauses (Lippencott, 2004). Specifically, VXML suffers from several limitations:
• To accurately model speech input, developers have to use the appropriate application grammar.
• The portability of the application is affected by the variance of speech recognition / synthesis between portals, which will result in decreasing customer satisfaction (Brøndsted, 2004).
• VXML is designed to work in client/server architecture. It requires a large support structure, which is why it is not suitable for embedded dialogues (Brøndsted, 2004).
• Voice browsers can be put at the Voice ASP (Active Server Pages) site, which means that the ability to transfer calls between live agents is limited. This limitation also affects capacity, latency and reliability (Syntellect, 2003).
• VoiceXML does not include robust call control capabilities while other technologies, such as CCXML, do.

1.5.2.2 SALT

SALT stands for “Speech Application Language Tags”. It is a set of tags for platform-independent development. SALT allows access to applications, information, and Web services from telephones, PCs, and wireless PDAs. SALT was originally intended to extend the existing markup languages (HP OpenCall, 2003). In 2003, Microsoft decided to support SALT over VXML (Lippencott, 2004).

Genesys product manager Srinivas Penumaka says that SALT tags can be inserted in documents written in other languages. This allows SALT developers to add speech capabilities by enhancing existing Web technologies. This is not the case with VoiceXML, which requires developers to create new technologies that vendors must add to their products and users must learn. According to Penumaka, it is easier to write applications for computers, PDAs, and hand-held devices with SALT, since developers can use SALT tags with other markup languages that were designed for creating applications for different devices.

SALT and VoiceXML were designed to target different needs, at different stages in the life cycle of the WWW (World Wide Web). While VXML was needed to define a markup language for dialogs that take place over the phone, SALT was needed to allow developers to create speech-enabled applications that run across a wider range of devices.

Although SALT was mainly developed by Microsoft as a competitor to VXML and was submitted to the W3C for review in 2002, the W3C proceeded with the development of VXML 2.0 (Schwartz, 2004). Four years later, Microsoft noticed that its Speech Server should support VXML to remain competitive, and it joined the VXML Forum as a Promoter in 2006 (Redmond, 2006). Then, every SALT company had to join the VXML Forum. Nowadays, SALT has been superseded by VoiceXML: it no longer receives developers’ attention, its forum is not even accessible, and it is no longer considered a competing technology for voice application development. VoiceXML won the battle of standardization and became the standard markup language to develop voice-based applications. Bill Meisel, a principal at a speech technology research firm called TMA Associates, said that “SALT has not reached the level of maturity of VoiceXML in the standards process” (Schwartz, 2004).

1.5.2.3 VoicePHP

VoicePHP is the same old PHP that inputs and outputs voice instead of text. The diagram in Figure 1.4 presents the architecture of this technology.

Figure 1.4 - The architecture of VoicePHP.

VoicePHP is another attempt to provide a language for developing voice-based applications. Although it is not based on XML, it can be generally compared to VoiceXML. There are a couple of technical advantages of VoicePHP over VoiceXML:
• VoicePHP is a complete programming language. It allows developers to easily use complicated programming structures, such as loops. Since it is based on XML, VoiceXML cannot be considered as powerful a programming language as PHP.
• PHP supports object-oriented programming. This makes code more reusable and modular.
• With VoicePHP, PHP developers do not need to learn a new markup language. They can benefit from being already familiar with the concept of PHP as a programming language.

In general, VoicePHP is simple to use. It is efficient if the dialplan requires complex programming routines. If this is not the case, using a standard technology such as VoiceXML is a better choice.

1.5.2.4 CCXML

CCXML stands for “Call Control Extensible Markup Language”. CCXML was basically designed for the purpose of providing voice applications developers with telephony call control features. It was never meant to replace VoiceXML. Its main purpose was to serve as a complementary language (HP OpenCall, 2003). When CCXML and VXML are combined together, a greater control is offered. CCXML allows applications to “seamlessly transfer calls, establish conference calls, or monitor incoming calls involving an unplanned event”.

1.5.2.5 Other Technologies

There are many other technologies that play a role in the development cycle of voice applications. Among these are XHTML, X + V, SCXML, and SSML.

XHTML stands for “eXtensible HyperText Markup Language”. Scholz says that XHTML document types are designed to work with XML agents (Scholz, 2003).

XHTML + Voice specification is based on the combination of standard web content and spoken interaction. Many voice modules that are included in the specification support speech synthesis, speech dialogs, the ability to attach voice event handlers, speech grammars, and command & control (Scholz, 2003).

SCXML stands for “State Chart extensible Markup Language”. The W3C defines it as “a general-purpose event-based state machine language”. SCXML can be used:
• as a high-level dialog language to control the encapsulated speech modules of VoiceXML 3.0.
• as a voice application metalanguage to control business logic modules and database access.
• as a multimodal control language to combine VoiceXML 3.0 dialogs with dialogs in other modalities. This includes mouse, keyboard, ink, vision, haptics, lipreading, etc.
• as the framework of state machine for future versions of CCXML.
• as an extended call center management language, combining computer-telephony integration with CCXML call control functionality.
• as a general process control language.
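To make the notion of an event-based state machine concrete, the following Python sketch models a simple call dialog as a transition table. It is not SCXML and does not use any SCXML library; the state names, the event names, and the `run` function are hypothetical and only mirror the idea that events drive the dialog from one state to the next.

```python
from typing import Dict, List, Tuple

# (current state, event) -> next state; all names are hypothetical.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("idle", "call.incoming"): "greeting",
    ("greeting", "prompt.done"): "collecting",
    ("collecting", "input.recognized"): "confirming",
    ("collecting", "input.nomatch"): "greeting",
    ("confirming", "call.hangup"): "idle",
}


def run(events: List[str], state: str = "idle") -> List[str]:
    """Feed events to the machine and return the sequence of visited states."""
    visited = [state]
    for event in events:
        # Events with no matching transition leave the state unchanged.
        state = TRANSITIONS.get((state, event), state)
        visited.append(state)
    return visited


trace = run(["call.incoming", "prompt.done", "input.nomatch",
             "prompt.done", "input.recognized", "call.hangup"])
```

In this trace the unrecognized input sends the caller back to the greeting before the dialog completes, which is the kind of control flow a state chart expresses declaratively.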

SSML stands for “Speech Synthesis Markup Language”. SSML is a standard to provide access to the WWW through the use of spoken interaction. SSML is an XML-based technology that eases generating synthetic speech. Authors of synthesizable content are the ones who benefit the most from the markup language. SSML allows them to control aspects of speech such as pitch, pronunciation, rate, and volume across different platforms. The intended use of this markup language is to improve the quality of synthesized content.
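The control over pitch, rate, and volume mentioned above can be illustrated with a short Python sketch that builds an SSML 1.0 fragment. The function name `build_ssml`, the sample sentence, and the chosen attribute values are assumptions; `speak` and `prosody` are elements defined by the SSML specification, and whether a given synthesizer honors every value depends on the platform.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text: str, pitch: str, rate: str, volume: str) -> str:
    """Wrap text in an SSML <prosody> element with the given settings."""
    speak = ET.Element(
        "speak",
        {"version": "1.0", "xmlns": SSML_NS, "xml:lang": "en-US"},
    )
    # <prosody> carries the pitch, speaking rate, and volume of its content.
    prosody = ET.SubElement(
        speak, "prosody", {"pitch": pitch, "rate": rate, "volume": volume}
    )
    prosody.text = text
    return ET.tostring(speak, encoding="unicode")


ssml = build_ssml("You have three new messages.",
                  pitch="high", rate="slow", volume="loud")
print(ssml)
```

The same text can be re-rendered with different settings by changing only the arguments, which is how a voice application can adapt its output without re-recording audio.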

1.5.2.6 Summary

In spite of the various technologies that emerged, VoiceXML has consistently proved to be the standard. There are many other technologies that were not presented in this study either because they were used for a short period of time or proved to be inefficient. Some technologies, such as CCXML, are meant to be complementary, rather than competitors to VoiceXML. Others, such as SALT, can be seen as an attempt to overcome some of VoiceXML’s limitations. Nowadays, VoiceXML is mature enough to be considered the main markup language to develop voice-based applications.

1.6 Roadmap

This thesis includes proposals, designs, and implementation details to integrate multiple technologies so that they work together as a single unit. Asterisk, the open-source PBX, is the heart of all these proposals, designs, and implementations. Among the purposes of this thesis is to check how such integration will affect our lives. The proposals focus on allowing access to all available technologies through the phone. The presented scenarios explain how phone users can benefit from using their phone to access database servers, mail servers, file servers, MS Office Communication Server, MS Office documents, MS Active Directory, Linux and Windows operating systems. The thesis also includes many other important suggestions. Some of these suggestions deal with securing the LAN and making use of the PBX as a tool to achieve a higher level of security. Other suggestions deal with implementing new techniques related to web access and quality of education.

1.6.1 Mission Statement

This thesis explores many technologies and checks how they can be used simultaneously in order to give phone users access to the widest possible collection of servers and solutions. The thesis also makes use of an open source PBX in different scenarios, such as integrating it with a rule-based engine to detect suspicious phone calls, protecting the network from intruders, encrypting phone calls using a user-defined dictionary, using an eye-like algorithm to produce voice web pages, and enhancing the quality of education through the use of a vocal forum.

1.6.1.1 General Direction and Key Goals

This thesis does not focus on discussing the use of a single technology. Rather, it focuses on using many technologies to achieve several goals in the fields of accessibility, security, and education. This thesis answers the following questions:
• What about integrating the widest possible collection of servers and applications with the VoIP PBX?
• Do we need such integration?
• Can such integration be achieved?
• Have researchers tried to achieve this goal before?
• What could they achieve?
• Is there a need for a DSL (Domain-Specific Language) that deals with specific integration scenarios?
• Can a proprietary package such as MS OCS be integrated with a Linux-based open-source PBX?
• Can users check their mails and send replies over the phone?
• Can users retrieve and send data from a database over the phone?
• Can users send Instant Messages to their colleagues who use OC over the phone?
• Can users open an Excel sheet or Winword document over the phone?
• Can a sales representative, for example, open a template, fill in the blanks and send it to the client over the phone?
• Can IT officers reach Active Directory to reset a user’s password over the phone?
• Can the PBX monitor the status of the servers and call the network administrators when something goes wrong?
• Can users have all the functionalities provided by the PC over the phone?
• Can a phone replace a browser?
• Can the PBX be monitored, controlled, and set to react to specific events? What kind of packages needs to be integrated with the PBX to achieve this?
• Can the PBX help in achieving better security to protect the LAN? What about the efficiency of such an integration approach?
• Can the PBX be used as a tool to surf the web?
• Is it feasible to encrypt phone calls using a custom encryption methodology? What are the integration challenges that will be encountered? Why do we need a custom encryption solution?
• Can the PBX help in establishing a VoIP blog / forum? If yes, what are the expected benefits and where does such a solution fit?

This thesis attempts to provide answers to all of the above mentioned questions. It also provides technical details and suggestions that are supported by real-case scenarios and implementations.

1.6.1.2 Primary Users and Stake Holders

The range of technologies and solutions introduced in this thesis is very wide. It includes both Linux- and Windows-based solutions. Those technologies, solutions, servers, and software packages are shown in Table 1.4.

Table 1.4 - List of Utilized Technologies

Category Package Linux-based Windows-based

VoIP Asterisk X

Festival X

Routing / Conversion OpenSER X

M-networks UCCG X

RDBMS Packages MySQL X

MS SQL Server X

MS Access X

Oracle X

MS Office Excel X

WinWord X

MS OCS Office Communication Server X

Office Communicator

OCS Mediation Server X

Windows Servers PDC | Active Directory X

MS Exchange Server X

Web Site Shoter X

Feng GUI Web Service

Development PHP X

ASP X

.NET X

Others InRule RBE X

Based on technologies listed in Table 1.4, one can notice that the targeted users comprise a complete community. In general, phone and computer users are among the stakeholders. A long list of beneficiaries includes salesmen, developers, network administrators, security officers, IT managers, web surfers, employers, employees, educators, students, parents, teachers, and blind or visually impaired people. Table 1.5 lists the stakeholders and provides an explanation why each category might be interested in the solutions presented in this thesis.

Table 1.5 - List of Users and Stakeholders

Category: Explanation

Salesmen: A salesman can make a phone call, supply a password, check his/her email, record a reply, pick a WinWord template, fill in some fields, and send it as an attachment / fax.

Developers: A developer can implement his/her own encryption technique to encrypt phone calls. This thesis includes an implementation method that helps encrypt phone calls using a custom dictionary.

Network Admins: The PBX can be used as a tool to protect the LAN against intruders. A network administrator can keep his/her company’s LAN disconnected from the internet. Whenever he/she wants to reach the LAN from outside, he/she can make a phone call to instruct the PBX to connect the LAN to the internet. When done, he/she can make another phone call to disconnect the LAN. This will minimize the chance of being penetrated.

Security Officers: Security specialists can make a phone call, supply a password, find a username in the Active Directory, and perform an urgent task such as resetting a password or disabling the account.

IT Managers: IT managers can be enabled to provide a complete VoIP solution through which the TCP-based OCS can be integrated with the UDP-based Asterisk. The purpose is to benefit from services provided by both applications.

Web Surfers: One of the algorithms suggested in this thesis allows users to surf the web through the phone. It is called an eye-like algorithm. It behaves the same way the human eye does. It should detect those things that attract the human eye and inform the caller accordingly. For example, the PBX will say to the caller: “this page has a header, a left menu, and content. Press 1 for header, 2 for menu, and 3 to read the content”. It is a sophisticated and efficient web navigation algorithm.

Employers: An employer can monitor and control the PBX running inside his/her firm to detect any misuse. This thesis discusses an implementation scenario through which the PBX can be instructed to monitor phone calls and take action accordingly. A list of rules is defined and handled by a Rule-based Engine whose job is to make sure events are handled in real time. For instance, certain phone calls might be recorded or ended. Others might trigger the PBX to call a third party or send an email to inform him/her.

Employees: A decision maker can make a phone call, supply a password, query a database, retrieve data, and make a decision based on the retrieved facts.

Educators, Students, Parents, Teachers: This thesis introduces a new idea that can be used to improve the quality of education. It is a voice forum / blog that can be used to achieve collaborative learning. Through the phone, students can record their questions and receive answers from their colleagues. Parents can check their children’s questions. Educators / teachers can identify the unmet learning objectives.

Blind People: Blind and visually impaired users will benefit from the scenarios proposed in this thesis. These scenarios show how users can use their phones to contact email servers, database servers, file servers, Windows servers, and Linux servers. Proposals presented in this thesis enable users to use the phone in order to do the same tasks they used to do through computers.

1.6.2 Lesson Learned

Integrating a wide collection of solutions and technologies is a challenge. Although such integration can run into many problems, risks, and compatibility issues, the result is an engineering innovation. The appeal of integration is that the result includes all the features available in the integrated products along with new features that none of them offered on its own.

This thesis presents ways and scenarios that enable users to access different technologies via the phone. The PBX is the heart of this integration. All other products and technologies, such as RDBMS packages, Active Directory, Exchange Server, OCS, and other software packages, are built on top of it. Integration is an ongoing process. There will always be new technologies, and there will always be a need to integrate them.

1.7 Methodology Design

1.7.1 Concepts Generation

The methodology followed in this thesis is represented in Figure 1.5. This methodology goes through several refinement processes and relies on its iterative nature to achieve the best results. The iterative cycles of concept development spiral upwards in terms of developing a shared understanding of the targeted solution.

Figure 1.5 - Methodology Design

Figure 1.6 explains this spiral form in the process of developing a concept, testing it, revising it, expanding it, fulfilling the purpose, translating it into criteria, and finally aiding generation of new concepts.


Figure 1.6 - Spiral Form of the Followed Methodology

This methodology is of crucial importance in handling the integration problems. It allowed results to be delivered very early. Although we agree that those early results might be premature, consecutive iterations refined the delivered solution over time. Feedback from the end users also fed into those iterations and helped enhance the output. This dynamic approach of knowledge assembly is the key to developing successful solutions.

1.7.1.1 Generated concepts

Integrating many technologies together to come out with new ones is best facilitated through intensive analysis and planning, especially when dealing with complex systems. The concepts presented in this thesis are listed in Table 1.6.

Table 1.6 - Concepts Discussed in this Thesis

Serial | Concept | Explanation
1 | The PBX can be integrated with MS Exchange server. | Users can check their mails, record a reply, and send it as an attachment through the phone.
2 | The PBX can be integrated with different RDBMS packages | Users can query a database and retrieve data through the phone.
3 | The PBX can be integrated with MS Active Directory | Security officers can reset a password or disable an account through the phone.
4 | The PBX can be integrated with MS OCS | Users can use Office Communicator to place / receive a phone call through the PSTN.
5 | The PBX can be instructed to access Excel sheets and WinWord documents | Users can open a WinWord template, fill in the blanks and send it as fax.
6 | The PBX can be instructed to access the hard disk of a Linux or Windows server | Users will be able to find a file and explore their shared folders through the phone.
7 | The PBX can be set to protect the server and detect misuse | The PBX will call the network administrator when a new folder is shared.
8 | The PBX can deal with a Rule-based Engine to detect suspicious phone calls | Before any phone call is established, the PBX consults the RBE, which will decide what will happen. Then, the PBX takes the required action in real time.
9 | The PBX can be used to isolate the LAN | To protect the LAN against penetration, we should minimize the time it is connected to the internet. The PBX can be instructed to connect / disconnect the LAN.
10 | An algorithm can be developed to allow users to surf the web using their phones, rather than their computers | The algorithm will check the web page and tell the caller about the different parts of the page. Then, the caller will decide where to go. This navigation should be smooth. The algorithm should imitate the way a human eye behaves.
11 | Phone calls can be encrypted based on a dictionary | This is a custom encryption. One can make use of such an encryption when he/she does not trust ready-made encryption solutions.
12 | The PBX can be used as an academic tool for a better quality of education | A voice forum can be developed to help in achieving collaborative learning.

All ideas listed in Table 1.6 were proved to be attainable. Each one of them was planned, designed, implemented, tested, refined, and released. Although some of them could not be completely finalized, none of them reached a dead end. Solution providing is a continuous process that never stops as long as the need exists.

1.7.1.2 Built Prototypes

This thesis placed particular emphasis on modeling, which was used to avoid failure and minimize risks. Many prototypes have been built to check how doable the proposed solutions are. Each prototype required dedication from the working developers. Preparing the infrastructure and configuring the servers took considerable effort. Building some of the prototypes required that the working team include developers, security officers, IT specialists, and VoIP engineers. A prototype was built for every concept listed in Table 1.6, so this thesis does not present theoretical concepts only. Instead, each suggested solution is supported by a prototype designed and tested to verify the usefulness of the proposal. There is one exception: the Eye-like Algorithm was not fully implemented. The design is complete and the reusable components are ready, but the final algorithm is not yet fully operational. More research is needed to complete the implementation of this promising idea.

1.7.2 Concepts Evaluation

This study has taken into great consideration users’ needs and satisfaction. This implied that all the concepts presented in this thesis should be evaluated based on feedback coming from stakeholders and professional consultants who were involved.

The importance of early phase evaluation has to be emphasized because as much as 80% of the plan can be specified in this early phase. At this phase, strategic decision making requires the necessary expertise to initiate the process. The first phase in the evaluation of each proposal consists of three modules:
• determining purposes (are the needs of stakeholders satisfied?)
• translating purposes into criteria
• generating design concepts against which criteria and requirements can be developed and tested

In fact, all of the proposed concepts proved to be successful, although the degree of success differs from one proposal to another. The goal of the evaluation is to decide whether the solution is:
• Efficient
• Feasible
• Simple to use
• Secure
• Tested

The process of concepts evaluation includes CPU measurements, hard disk activity, runtime, need, and users’ impression. In rare cases, users had to fill in questionnaires in order to document their feedback and suggestions.

1.7.2.1 Testing with in-house Prototypes

Most of our proposals were tested in the lab. We had the chance to test only a few of them in a real environment. We believe that using real data is essential for the evaluation of the proposals presented in this thesis. Some of the proposals were tested in schools while others were tested in a call center. It was very challenging to prepare an environment for testing with tens of users involved. Special testing software packages were used to track the errors, determine what caused them, and solve them. In some extreme cases, we had to hire testers to do the testing. One of the testing scenarios obliged us to write a collection of small programs to test the solution. We had to generate false phone calls to test the robustness of the proposed solution.

1.7.2.2 Efficiency

Measuring the performance of the proposed solutions and developed prototypes will provide the industry with further insights for improvement and process change. With one exception, all of the proposed concepts proved to be efficient enough, and that exception was handled later on. It concerned the proposal that deals with encrypting phone calls. At first, the runtime was not acceptable at all, since the goal was to allow users to make phone calls with minimal latency. A solution that lets the receiver hear the other party only after five minutes will never be accepted. The encryption algorithm was modified many times to reach an acceptable threshold. After several attempts, the encrypt function could finish its task in less than one second. Nonetheless, one problem remained: the decrypt function was still taking too much time. Using our spiral methodology, which improves the quality of the work with every iteration, we brought the runtime of the decrypt function down to less than one second as well. Only then could we achieve encrypted communication very close to real time. This improvement in performance came at a price: it was achieved by consuming more RAM.

PART 2 - VoIP ENVIRONMENT

2.1 Introduction to VoIP

In the 1990s, a number of researchers, from both academic and corporate institutions, started to work on carrying video and voice over IP networks. Nowadays, this technology is commonly referred to as VoIP. Simply, it is the process of breaking up video / audio streams into small chunks to be transmitted over an IP network. Those transmitted chunks will be reassembled at the receiver’s end (Packetizer, 2010a). This process is known as packetization.

Being able to develop user-defined dialplans, VoIP developers are capable of developing fully-featured voice applications. Using VoIP products, they are able to write a dialplan that asks a question, waits for an answer, connects to the outside world, brings some data from there, and sends it to a TTS engine that will read the text back to the caller. By using today’s VoIP solutions and technologies, performing the above tasks is just a matter of specifying “what” is required, instead of specifying “how” to do it. This full control over the flow of the calls and the ability to integrate this technology with all other existing technologies are the two factors that will make VoIP the standard communication technology.

VoIP is a promising investment and is taking the field of communication by storm. In the near future, calls that used to be made through regular phone lines will be routed over the Internet. This will dramatically bring down the cost of international phone calls.

2.1.1 Understanding VoIP

VoIP does not only mean voice communication. As a matter of fact, it means video, voice, and data conferencing. Real-time Text over IP (ToIP) and video telephony are also within the scope of VoIP. VoIP's importance lies in the fact that it will lead to significant changes in the way people communicate. Among the benefits that come with VoIP is the ability to integrate a videophone or a stand-alone phone with the PC. Soft phones installed on computers can be used for voice and video communications. VoIP also provides the ability to use an Internet connection for all data, voice, and video communications. This is commonly referred to as convergence, and it is usually the primary incentive for corporate interest in the technology. Convergence implies that using the same network for all types of communications will reduce overall deployment and maintenance costs (Packetizer, 2010b). Figure 2.2 shows a converged network. Since the VoIP provider can be located anywhere on the globe, any individual with Internet access is no longer restricted to selecting a service provider in his/her local area.

The computer has considerable processing power; it can compress recorded / transferred sound. Such compression is needed to save space and bandwidth. Although compression minimizes the size of the recorded sound, the process is CPU-intensive at both ends. VoIP engineers and implementers should make a compromise in this regard. If the connection is fast enough, it is better to use an algorithm that places little load on the CPU. There are a number of algorithms that compress audio. A compression algorithm is called a "compressor/de-compressor", or simply a CODEC. There are different CODECs for movies and sound recordings.
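The compromise between bandwidth and CPU load can be made concrete with a short calculation. The figures below are illustrative assumptions, not measurements from this thesis: G.711 (uncompressed PCM at 64 kbit/s) is compared with G.729 (compressed to 8 kbit/s), each packet carries 20 ms of audio, and each packet adds 40 bytes of IPv4, UDP, and RTP headers. Link-layer overhead is ignored.

```latex
\begin{align*}
\text{packets per second} &= \frac{1}{0.020\ \text{s}} = 50 \\
\text{G.711 payload per packet} &= 64\,000\ \text{bit/s} \times 0.020\ \text{s} = 1280\ \text{bits} = 160\ \text{bytes} \\
\text{G.729 payload per packet} &= 8\,000\ \text{bit/s} \times 0.020\ \text{s} = 160\ \text{bits} = 20\ \text{bytes} \\
\text{G.711 rate on the network} &= (160 + 40)\ \text{bytes} \times 8 \times 50 = 80\ \text{kbit/s} \\
\text{G.729 rate on the network} &= (20 + 40)\ \text{bytes} \times 8 \times 50 = 24\ \text{kbit/s}
\end{align*}
```

Under these assumptions the compressed stream needs less than a third of the bandwidth of the uncompressed one, in exchange for extra encoding and decoding work at both ends.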


Figure 2.2 - Converged Network

2.1.2 VoIP: Advantages and Disadvantages

2.1.2.1 Advantages

VoIP has several notable advantages:
1. Cost: VoIP phone calls are significantly cheaper. For a low-cost monthly subscription, national and local phone calls can be made for free. International phone calls can also be made at a significantly lower rate (Bromley, 2005).
2. Global Number: When users subscribe to a VoIP service provider, they will probably get phone numbers for life. This means that these subscribers will be able to take their numbers with them even when they travel. As long as they have internet access, they will use their global numbers to make phone calls, just as they did in their home countries (Bromley, 2005).
3. Free Services: Most VoIP service providers offer free services such as voicemail, caller ID, call forwarding, call waiting, speed dialing and many more (Bromley, 2005).
4. Operational Cost: Companies can save a lot by having voice traffic and data on the same network. This is much better than managing two separate networks: one for data and another for voice lines (Wailgum, 2010).
5. Responsive forms of Customer Service: Since VoIP runs on the same network that hosts other programs, services such as “click to talk” can be easily added to current systems. With such a feature, online clients who wish to talk to a live customer service representative can simply click on a hyperlink and will be connected via VoIP (Wailgum, 2010).
6. Encryption: Implementing interception-proof phone calls is easier with VoIP than with traditional phone lines (Wikipedia, 2010a).

2.1.2.2 Disadvantages

Although VoIP has many advantages, it suffers from some disadvantages:
• Connection: VoIP relies on the broadband connection. When the ISP goes down, subscribers will not be able to make calls. The same applies during a power outage. Emergency calls under these circumstances become a huge concern. When the LAN inside a company goes down for any reason, all types of communication go down, including phone calls, fax, and email (Bromley, 2005).
• Setup of VoIP: Currently, this is not yet an easy task, although it is becoming easier (Bromley, 2005).
• Quality of Calls: Any degradation in the quality of the network will immediately affect the quality of the phone calls. Although the quality of phone calls made via VoIP is slightly lower than that of the current analogue phone, it is expected that the difference in call quality between the two mediums will vanish very soon (Bromley, 2005).
• Security: This is a huge concern, especially when the VoIP solution is working over the internet, to which hackers have access with no time or place restrictions. VoIP suffers from many types of attacks, such as DoS (Denial of Service), DDoS (Distributed DoS), Caller ID Spoofing, IP Spoofing, Sniffing, and many more. Still, there are plenty of effective countermeasures such as encryption and load balancing.
• Routing: VoIP traffic needs to be routed through network address translators and firewalls. When such a challenge is underestimated, problems occur and voice routing becomes a headache.

2.1.3 Why is VoIP important?

VoIP is a technology that allows phone calls to be placed over LANs, WLANs, WANs, and the internet. There are three factors that make VoIP important. The first factor is that voice can be sent over the network in the form of packets, the same way data is packetized. This is one of the most important capabilities of this technology because it means that VoIP can run over any IP network. Consequently, VoIP will take advantage of the global coverage of the internet.

The second factor is the low cost of VoIP phone calls, which plays an important role in the success of this technology. The third factor is that VoIP can be integrated with other computer services. We believe that the above factors justify why VoIP is expected to defeat any traditional analog communication technologies. Those three factors are discussed below.

2.1.3.1 VoIP runs over any IP network

The success of VoIP results primarily from the aforementioned factors. The most important factor is its ability to run over any IP network. Such a huge advantage is a prerequisite to the other two. Since VoIP runs over the internet, it provides cheap phone calls and it can be integrated with services available on the same network.

With VoIP, there will be no need to install a network just for the purpose of allowing phone calls. Any existing IP network will do the job. The same network used for data transmission will be used for voice transmission.

Moreover, using VoIP, users can easily have global telephone numbers because when they are connected to the internet, their location does not make any difference. This is a huge advance in terms of access and reachability.

2.1.3.2 Low Cost of Phone Calls

Since VoIP is available over the internet, it can provide very cheap phone calls. Traditional phone systems will not be able to compete with the low cost of such technologies. Users are not required to be equipped with servers or special hardware devices because they are getting the service from well-equipped VoIP providers. A softphone with a headset is enough. In some cases, all that is required is an IP phone. This is not the case with traditional analog phone systems, where a PBX and phones are required. A user who has nothing but a softphone is not expected to need regular maintenance. In addition, users now have the opportunity to select from a much larger choice of service providers available over the internet. Thus, not only is the cost of the phone call lower with VoIP, but the choice of providers is also much larger, the infrastructure is cheaper, and the solution is easier to install and maintain.

2.1.3.3 Ability to Integrate VoIP with other Computer Services

VoIP co-exists with plenty of other computer services. Within the same environment, several other services can interoperate with the VoIP solution. Services such as checking email and querying a database will enhance the usage of phone systems. For example, a sales person can use his/her phone to check email messages, send voice replies, and retrieve data about a specific client from the firm’s ERP solution. Some VoIP PBX systems are nowadays seeking integration with GPS technology to pinpoint the exact location of individuals making phone calls.

2.1.4 When is VoIP Not Recommended?

Although VoIP will often provide the business with additional benefits, there are some circumstances under which VoIP is not recommended. First, very small companies can nowadays install a traditional analog phone system at a total cost lower than the cost of a computer. Those companies prefer a circuit-based system that needs minimal configuration and offers a minimal set of features. Second, VoIP provided over the internet requires a good internet connection. Accepted latency should not exceed 1 second. Some countries have weak internet connections and experience connectivity problems. Companies in these countries should consider this fact before moving to VoIP. Third, companies that invested a lot in legacy phone systems (PBXs, analog phones, telulars …) might decide to integrate the existing system with a VoIP solution. In some cases, such integration may be more problematic than advantageous. Conversely, there are circumstances under which VoIP is clearly the recommended choice. It is a requirement when the company staff needs to make international phone calls. This need becomes urgent for inter-branching: when a company has several branches, VoIP is the ultimate solution for free phone calls between the branches. Finally, if phone users need to query a database, check email, retrieve data from an ERP solution, or use any other service available on the network, VoIP is a recommended choice because it can be integrated with all computer-based services.

2.1.5 Who are VoIP Drivers?

In a book titled “Voice over IP, Systems and Solutions”, Richard Swale states that there are four groups of stakeholders that influence the adoption of the VoIP technology [Swale, 2008]. These are users, service providers, manufacturers, and regulators:

2.1.5.1 Users

Pricing, benefits, ease of use, and flexibility are among the factors that encourage the users to prefer a certain technology to another. We believe that there is an additional important factor that can greatly affect the decision of any user. This factor is the maturity level of the integration achieved between this technology and all other technologies already in use. Users always prefer technologies that can work with other ones and avoid technologies that cannot interoperate.

2.1.5.2 Service Providers

Service providers have two goals:
i) supplying existing services at lower cost
ii) developing new lines of business through development of new technologies and applications.

VoIP is an appealing technology to service providers since it promises to achieve these two goals. We believe that integration with other existing technologies and services is important to service providers because:
i) it is required by the users.
ii) it might lead to lower cost.
iii) it might lead to new lines of business, which is their second goal.


Therefore, integration of VoIP technology with other technologies is an inevitable need for users and service providers. There will always be a need to integrate different technologies because most technologies are not invented with interoperability in mind.

2.1.5.3 Manufacturers

This group is simply interested in gaining revenues from producing communications equipment and providing associated services.

We believe that the competitive edge of each manufacturer lies in the collection of features included in the equipment. One important feature is the capability to integrate with other technologies. Such a feature will widen the circle of targeted clients and keep competitors who fail to achieve such integration out of the market.

2.1.5.4 Regulators

Regulators consider the interests of the above three groups and strike a balance between what is feasible and what is available. We believe regulators are interested in integrating VoIP with other technologies because it is a point of interest for the other three groups. Figure 2.1 shows the relationship between the four groups of VoIP stakeholders. We believe that integrating VoIP with other technologies is a promising business proposition because it is a common interest among users, service providers, manufacturers, and regulators.


Figure 2.1 - The relationship between groups of VoIP stakeholders

2.2 Asterisk

2.2.1 Introduction to Asterisk

Asterisk is a full-fledged PBX (Private Branch eXchange) implemented in software and written in C. It was originally created by Mark Spencer in 1999 (Wikipedia, 2010b). The name "Asterisk" was chosen because it is both the wildcard symbol in Linux and a key on a standard telephone (Meggelen et al, 2005). Asterisk is an open source software package released under a free software license. It is a programmable PBX that empowers developers to create advanced communication solutions. Originally, Asterisk was designed for Linux, but now it runs under a variety of operating systems such as OpenBSD, FreeBSD, NetBSD, Solaris, Mac OS X, and Microsoft Windows (Wikipedia, 2010b).

Asterisk can interoperate with almost any standard telephony equipment. Asterisk-based telephony solutions offer a rich feature set because they can easily work with both VoIP systems and traditional standard telephony systems.

Even the most feature-rich PBX systems have some shortcomings because of the continuous needs and creativity of the customer. Just like other open source projects, such as Linux, Asterisk is the result of a whole community working together to innovate and develop new features. There are three reasons why Asterisk is expected to excel and receive global acceptance (Meggelen et al, 2005):
• Linux has already led the IT community to the acceptance of open source. Asterisk is expected to follow the same lead.
• None of the giant industry players could become the leader of the telecom industry. Asterisk has the potential to become so.
• The limited functionality of proprietary PBX systems is disturbing end users, who always prefer open systems over black boxes. Asterisk is the most flexible one.

Asterisk provides many features available in typical proprietary PBX systems, such as conference calling, voice mail, automatic call distribution, and interactive voice response (Wikipedia, 2010b). It runs over any IP network, whether LAN, WAN, WLAN, or the internet. It is so flexible that developers can have full control over it. It can be fully programmed using very simple commands like if statements and while loops. It can retrieve data from the outside world. Callers can input data and receive output. In addition, Asterisk is currently a core component in many Open Source projects and commercial products, such as Switchvox, TrixBox, and Elastix.

In brief, Asterisk is a complete PBX in software. It is flexible, programmable, and compatible with almost all VoIP technologies.

2.2.2 Telephony Hardware

To connect Asterisk to any legacy telecommunications equipment, special hardware is needed. The choice of hardware depends on the need. Asterisk can be connected to the PSTN. It can also be connected to analog phones. The following section includes some details regarding the digital and analog hardware devices that can be used by Asterisk.

2.2.2.1 Analog Interface Card

Asterisk allows bridging the TDM networks with VoIP networks. Due to Asterisk’s open architecture, it can be connected to any standards-compliant interface hardware. Nowadays, using the interface cards is among the most cost-effective ways to connect Asterisk to the PSTN. Figure 2.3 shows a TDM card with two daughter boards connected to two ports in the card, one to connect to an analog phone and the other to connect to the PSTN.

Asterisk also supports connectivity to analog phones. All that is needed to achieve this is a TDM card with at least one port to connect to an analog phone. TDM cards can have up to 4 ports, and each port can support connectivity either to the PSTN or to an analog phone. VoIP engineers can ask the manufacturer to make the card according to their need, and plenty of combinations are possible. For example, the card might include 3 PSTN ports and 1 analog phone port. It can also include only 1 PSTN port and 1 analog phone port.


Figure 2.3 - TDM 400P Card

2.2.2.2 Digital Interface Card

The features on the digital interface cards are just the same as the analog ones. The only difference is that they provide T1 or E1 interfaces, which are digital telephony circuits. Figure 2.4 shows a digital interface card called the TE120P.


Figure 2.4 - Digital Interface Card - TE120P

There are some other types of hardware that can be used with Asterisk such as channel banks, Basic Rate Interface (BRI) ISDN circuits, Hybrid cards, and Voice Processing cards.

2.2.3 Dialplan Basics

The dialplan is a set of consecutive steps that will be executed when a phone call is handled. This is one of the most important advantages of using Asterisk. The dialplan is fully customizable, which is not the case with traditional phone systems. The dialplan allows VoIP developers to write their own code in order to control the call flow. For example, the dialplan might include the following steps:
1. If anyone is calling extension 1234, catch this call
2. Answer the call
3. If time > 5 PM play a recorded file called “sorry” and hang up
4. If CallerID = one of our VIP clients, send the call to a special extension
5. Play a recorded file called “press 1 for operator or 2 for pizza delivery”
6. Read the caller’s choice
7. If choice = 1, send the call to the operator
8. Else play a recorded file called “specify how many pizzas you want”
9. Read the caller’s choice
10. Play a recorded file called “dial 1 vegetarian 2 sea food 3 meat”
11. Read the caller’s choice
12. Play a recorded file called “record your address”
13. Record the caller’s address
14. Send the caller’s choices to delivery department, including:
   a. CallerID (String)
   b. Caller’s address (Path of recorded file)
   c. Quantity (Number)
   d. Type (V: Vegetarian, S: Sea food, M: Meat)
15. Hang up

This simple example shows how flexible the dialplan is. That's why it is the right tool to apply full control over the phone call.
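The steps above could be written roughly as follows. This is a minimal sketch rather than a tested configuration: the context name, the prompt file names, the VIP number 5551234, the SIP peer names, and the recording path are all hypothetical, and the hand-over to the delivery department is reduced to a NoOp( ) log line because the source does not say how that step is performed. The priority n means "the next priority", and the steps are matched to the numbered list through the comments.

```
[incoming]
; Steps 1-2: catch calls to extension 1234 and answer them
exten => 1234,1,Answer()
; Step 3: between 17:00 and midnight, jump to the "closed" extension
exten => 1234,n,GotoIfTime(17:00-23:59,*,*,*?closed,1)
; Step 4: a VIP caller (hypothetical number) goes to a special extension
exten => 1234,n,GotoIf($["${CALLERID(num)}" = "5551234"]?vip,1)
; Steps 5-6: play the menu prompt and read one digit into CHOICE
exten => 1234,n,Read(CHOICE,press-1-operator-2-pizza,1)
; Step 7: choice 1 is sent to the operator
exten => 1234,n,GotoIf($["${CHOICE}" = "1"]?operator,1)
; Steps 8-9: ask for the quantity (up to two digits)
exten => 1234,n,Read(QUANTITY,how-many-pizzas,2)
; Steps 10-11: ask for the type (1 vegetarian, 2 sea food, 3 meat)
exten => 1234,n,Read(TYPE,pizza-type,1)
; Steps 12-13: prompt for the address and record it
exten => 1234,n,Playback(record-your-address)
exten => 1234,n,Record(/tmp/address-${UNIQUEID}.wav,3,30)
; Step 14: placeholder for handing the order to the delivery department
exten => 1234,n,NoOp(Order: ${CALLERID(num)} ${QUANTITY} ${TYPE} /tmp/address-${UNIQUEID}.wav)
; Step 15: end the call
exten => 1234,n,Hangup()

; Targets of the jumps above
exten => closed,1,Playback(sorry)
exten => closed,n,Hangup()
exten => vip,1,Dial(SIP/vipdesk)
exten => operator,1,Dial(SIP/operator)
```

In a real deployment, step 14 would be replaced by whatever mechanism the delivery department actually uses, for example a database insert or an external script.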

The dialplan is mainly made up of the following parts: • Contexts • Extensions • Priorities • Applications

The following sections describe different applications and parts of the dialplan. They focus on dialplan basics and show how powerful a dialplan can be. All of those sections are adapted from the book “Asterisk: The Future of Telephony” by Jim Van Meggelen, Jared Smith, and Leif Madsen.

2.2.3.1 Context

A context is simply a section of the dialplan. Different parts of the same dialplan are prevented from interacting with each other through the use of contexts. For example, when Asterisk is instructed to let user U1 belong to context C1, there is no way U1 can go outside C1. This means that when U1 makes a phone call, he/she will follow the steps described in C1. Another user U2 may follow the steps of context C2. This enables VoIP engineers to enforce security. Suppose C1 includes code to allow only internal phone calls, from extension to extension. This means U1 will not be able to make phone calls using the external lines. U2, who belongs to C2, might be able to do so because C2 allows local phone calls. U3, a user under the context C3, might be able to make international phone calls because C3 allows this to take place.

A context is denoted in the dialplan by placing its name inside two square brackets ( [ ] ). All of the instructions coming after the name of the context are considered part of that context.
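The C1 / C2 / C3 scenario can be sketched as three contexts, each one including the previous one. The context names, the extension patterns, the leading 9 used to reach an outside line, and the 011 international prefix are assumptions made for illustration; the assignment of a user to a context takes place in the channel configuration (for SIP users, the context setting in sip.conf), not in the dialplan itself.

```
; extensions.conf (all names and patterns are hypothetical)

[c1-internal]
; U1 lands here: only three-digit internal extensions 100-199 can be dialed
exten => _1XX,1,Dial(SIP/${EXTEN})

[c2-local]
; U2 lands here: everything in c1-internal plus 7-digit local numbers
include => c1-internal
exten => _9NXXXXXX,1,Dial(Zap/1/${EXTEN:1})

[c3-international]
; U3 lands here: everything in c2-local plus international numbers
include => c2-local
exten => _9011.,1,Dial(Zap/1/${EXTEN:1})
```

A number dialed by U1 that matches no pattern in c1-internal is simply rejected, which is how the context restricts the user. ${EXTEN:1} strips the leading 9 before the number is passed to the outgoing channel.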

2.2.3.2 Extensions

One or more extensions can be defined within contexts. An extension can be seen as an instruction that will be followed by Asterisk.

An extension specifies what will happen to the phone call as it makes its way through the dialplan. It is composed of: • The number of the extension • The priority (this will be explained in the following section) • The application that performs some action on the call

In the following example, the extension number is 123, the priority is set to 1, and the used application is Answer( ):

exten => 123,1,Answer( )

2.2.3.3 Priorities

An extension might have many steps, each of which is called a priority. Priorities are usually numbered sequentially, starting from 1. The purpose of priority is to specify the sequence of steps to be executed. The following extension, for example, will answer the phone first, and then hang up because it will follow the sequence of priorities:

exten => 123,1,Answer( )
exten => 123,2,Hangup( )
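Numbering every step by hand becomes tedious when a step has to be inserted in the middle of an extension. As a hedged aside, Asterisk also accepts the letter n ("next") in place of an explicit number, meaning "the previous priority plus one". A minimal sketch, in which the one-second Wait( ) is added only for illustration:

```
exten => 123,1,Answer()
exten => 123,n,Wait(1)
exten => 123,n,Hangup()
```

Only the first priority needs an explicit number, so a new step can be added anywhere after it without renumbering the following lines.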

2.2.3.4 Applications

Every application used inside the dialplan does a certain action that needs to be performed on the current channel. Some applications require no arguments at all. Others do. The following dialplan includes the Playback( ) application and shows how arguments are passed to it.

exten => 123,1,Answer( )
exten => 123,2,Playback(hello-world)
exten => 123,3,Hangup( )

The above dialplan will first answer the call, then play a recorded file called “hello-world”. The name of the file is passed as the sole argument to the Playback( ) application. Finally, the dialplan will hang up to end the phone call.

2.2.3.5 Variables

Three types of variables can be used within a dialplan: global, channel, and environment variables. Variables are used within dialplans in order to add clarity, reduce typing, and add logic to the dialplan. For example, instead of using “SIP/1234” many times in the dialplan, this value can be stored in a variable called “OPERATOR”. Then, this variable can be used as many times as required. When the extension of the operator is changed, there will be no need to change all the occurrences of “SIP/1234” in the dialplan. It will be enough to change the value stored in the variable called “OPERATOR”. To refer to a variable by name, typing the name, such as OPERATOR, is enough. To reference the value of the variable, however, the name must be preceded by a dollar sign and enclosed in curly braces, as shown in the following:

exten => 1234,1,Dial(${OPERATOR})
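The substitution step can be illustrated outside Asterisk. The following is a minimal Python sketch, not Asterisk's own implementation: the function name expand and the regular expression are assumptions made for illustration, and only simple ${NAME} references are handled (no nested references or functions).

```python
import re

# Matches a simple reference such as ${OPERATOR}; nested references are
# deliberately not handled in this sketch.
_REFERENCE = re.compile(r"\$\{(\w+)\}")


def expand(text, variables):
    """Replace every ${NAME} in text with variables[NAME].

    A variable that has not been set expands to an empty string.
    """
    return _REFERENCE.sub(lambda m: variables.get(m.group(1), ""), text)


# The value is stored once and referenced wherever it is needed.
variables = {"OPERATOR": "SIP/1234"}
line = expand("Dial(${OPERATOR})", variables)
```

When the operator's extension changes, only the entry in the dictionary has to be updated, which is the benefit described above.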

2.2.3.5.1 Global Variables

These variables apply to all extensions in any context. Global variables can be declared in one of two ways:

- they can either be listed in the context called [globals] in the file called extensions.conf:

[globals]
OPERATOR=Zap/1

- or they can be programmatically defined, through the use of the application called SetGlobalVar( ):

[internal]
exten => 1234,1,SetGlobalVar(OPERATOR=Zap/1)

2.2.3.5.2 Channel Variables

Channel variables are always associated with a specific call. They are not available to any channel other than the one participating in the current call. There are many predefined channel variables, such as the Caller ID number. Channel variables can be set using the Set( ) application:

exten => 1234,1,Set(MAXTRIALS=3)

2.2.3.5.3 Environment Variables

Those variables allow accessing environment variables of the operating system from within Asterisk. Environment variables are referenced as follows:

${ENV(name of environment variable)}.

2.2.3.6 Pattern Matching

When a person dials 1234, he/she will be immediately redirected to the dialed extension. This can be done as follows:

exten => 1234,1,Answer( )

This is how we handle extension 1234. But what if we want, for example, to handle all extensions that start with the number 2? This can be done through pattern-matching. Patterns start with the underscore character ( _ ). This means that we are not matching on an extension name. One or more of the characters shown in Table 2.1 can be used after the underscore.

Table 2.1 - Pattern-matching Syntax

Character Meaning

X any digit between 0 and 9.

Z any digit between 1 and 9.

N any digit between 2 and 9.

[24-7] 2, 4, 5, 6, and 7

. (period) Wildcard match. It matches at least one character.

In the following line of code, the pattern matches any extension from 200 to 999. When a caller dials such an extension, he/she hears the sound file thankyou.gsm:

exten => _NXX,1,Playback(thankyou)
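To make the rules of Table 2.1 concrete, the sketch below translates a pattern into a regular expression and tests dialed numbers against it. It is a simplified illustration written in Python, not the matcher used inside Asterisk; the function names pattern_to_regex and matches are hypothetical, and rules that are not listed in Table 2.1 are ignored.

```python
import re

# Character classes for the pattern letters listed in Table 2.1.
_CLASSES = {"X": "[0-9]", "Z": "[1-9]", "N": "[2-9]"}


def pattern_to_regex(pattern):
    """Translate a dialplan extension pattern into a compiled regex."""
    if not pattern.startswith("_"):
        # No underscore: the extension is a literal name, not a pattern.
        return re.compile(re.escape(pattern))
    parts = []
    i = 1  # skip the leading underscore
    while i < len(pattern):
        ch = pattern[i]
        if ch in _CLASSES:
            parts.append(_CLASSES[ch])
        elif ch == "[":
            end = pattern.index("]", i)
            parts.append(pattern[i:end + 1])  # e.g. [24-7] is valid regex as is
            i = end
        elif ch == ".":
            parts.append(".+")  # wildcard: at least one character
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts))


def matches(pattern, dialed):
    """Return True if the dialed string matches the whole pattern."""
    return pattern_to_regex(pattern).fullmatch(dialed) is not None
```

With these definitions, matches("_NXX", "200") and matches("_NXX", "999") hold, while "199" and "1234" are rejected, which agrees with the range 200 to 999 stated above.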

2.2.3.7 Includes

Via the include directive, Asterisk allows using a context within another one. This directive is used when there is a need to give access to different sections of the dialplan. Before including any contexts, one should think of the order in which they are included. Asterisk will first attempt to find the targeted extension in the current context. If the extension cannot be matched there, Asterisk will then attempt to find it in the first included context. Asterisk will keep trying the remaining included contexts in the order of inclusion. The following two include statements allow callers in the [internal] context to make local and long-distance phone calls.

[internal]
include => local
include => long-distance
exten => 101,1,Dial(${JOHN})
exten => 102,1,Dial(${JANE})
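The search order can be sketched as a short lookup routine. The Python code below is only an illustration under stated assumptions: the dictionary layout, the function name find_context, and the extensions placed in the local and long-distance contexts are hypothetical, and nested includes are followed depth-first, which is a simplification chosen for this sketch.

```python
def find_context(dialplan, context, extension, _seen=None):
    """Return the name of the context that handles extension, or None.

    The current context is searched first, then each included context
    in the order of inclusion.
    """
    seen = _seen if _seen is not None else set()
    if context in seen or context not in dialplan:
        return None  # unknown context, or an include cycle
    seen.add(context)
    entry = dialplan[context]
    if extension in entry["extensions"]:
        return context
    for included in entry["include"]:
        found = find_context(dialplan, included, extension, seen)
        if found is not None:
            return found
    return None


# Hypothetical dialplan: 5551234 exists in both included contexts.
DIALPLAN = {
    "internal": {"include": ["local", "long-distance"],
                 "extensions": {"101", "102"}},
    "local": {"include": [], "extensions": {"5551234"}},
    "long-distance": {"include": [], "extensions": {"5551234", "15551234"}},
}
```

Because local is included before long-distance, a call to 5551234 placed from the internal context is handled by local, which shows why the order of the include statements matters.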

2.2.4 Basic Dialplan Applications

Dialplan has many powerful features. A collection of applications allows VoIP developers to perform conditional and unconditional branching within the dialplan. The following sections present those useful applications.

2.2.4.1 Dial( ) Application

This application takes up to four arguments. We are here interested in the first one only, which is the destination to be called. This argument is made up of a technology, a forward slash, and the remote resource. The following two examples clarify the idea:

1- Suppose there is a Zap channel (named Zap/1) with an analog phone attached to it. The following line of code shows how to redirect the call to this analog phone:

exten => 123,1,Dial(Zap/1)

2- Suppose also there is a person with a SIP device that has the extension 1234. The following line will redirect the phone call to this person:

exten => 123,1,Dial(SIP/1234)

Multiple channels can also be dialed simultaneously. This can be done by using the ampersand (&) to concatenate all the targeted destinations. In the following example, the Dial( ) application will bridge the inbound call with whichever destination channel answers first:

exten => 123,1,Dial(Zap/1&Zap/2)

2.2.4.2 Goto( ) Application

The purpose of this application is to send the call to another context in the dialplan, another extension in the same context, or another priority in the same extension. This important application eases the process of moving phone calls between different sections of the dialplan. The destination can be a context, an extension, and/or a priority. These are sent as arguments, as follows:

exten => 123,1,Goto(international,1200,2)
exten => 123,2,Hangup( )

The first line will send the call to the second priority of extension 1200 in the context called “international”. The call will not reach the second line in the above example. Instead, it will continue in the “international” context.

2.2.4.3 GotoIf( ) Application

The GotoIf( ) application evaluates an expression. Based on the result of this evaluation, it will decide which branch is to be followed. This application uses the following syntax:

GotoIf(expression?destination1:destination2)

If the expression is true, then the caller will be redirected to destination1. Otherwise, the caller will be sent to destination2.

2.2.4.4 GotoIfTime( ) Application

This application checks the current time of the system. Based on that, it will decide which branch to follow. The following is the syntax for the GotoIfTime( ) application:

GotoIfTime(times,days_of_week,days_of_month,months?label)

In case the current date and time match specified criteria, the GotoIfTime( ) application will redirect the call to the specified label. The following commented dialplan shows the usage of such an application:

; Whatever the hour is, whatever the week day is, on 9/11, we are closed
exten => s,1,GotoIfTime(*,*,11,sep?closed,s,1)
; During working hours, send calls to the working context
exten => s,2,GotoIfTime(08:00-16:59,mon-fri,*,*?working,s,1)
exten => s,3,GotoIfTime(08:00-13:59,sat,*,*?working,s,1)
; Otherwise, we are closed
exten => s,4,Goto(closed,s,1)
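The test that decides whether the branch is taken can be sketched in a few lines of Python. This is an illustrative model of the four fields (times, days_of_week, days_of_month, months), not the code used by Asterisk; the function name time_matches and its helpers are assumptions, and each field is limited to "*", a single value, or one inclusive range.

```python
from datetime import datetime

DAYS = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def _in_range(spec, value, to_number):
    """True if value falls in spec, where spec is '*', 'a' or 'a-b'."""
    if spec == "*":
        return True
    low, _, high = spec.partition("-")
    start = to_number(low)
    end = to_number(high) if high else start
    if start <= end:
        return start <= value <= end
    return value >= start or value <= end  # wrapping range such as fri-mon


def _minutes(hhmm):
    """Convert 'HH:MM' into minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def time_matches(times, days_of_week, days_of_month, months, now):
    """Return True when now satisfies all four fields."""
    return (
        _in_range(times, now.hour * 60 + now.minute, _minutes)
        and _in_range(days_of_week, now.weekday(), DAYS.index)
        and _in_range(days_of_month, now.day, int)
        and _in_range(months, now.month - 1, MONTHS.index)
    )
```

For instance, time_matches("08:00-16:59", "mon-fri", "*", "*", datetime(2011, 7, 11, 9, 30)) is true because that date is a Monday morning, whereas the same fields do not match on a Saturday.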

2.2.4.5 Macro( ) Application

Macros help a lot in avoiding repetition in the dialplan. Using macros, dialplan developers write a unit of code once and make use of it as many times as required. There is no need to rewrite the same piece of code more than once. This helps a lot in achieving a modularized dialplan. Without macros, developing a dialplan would force developers to keep copying and pasting repeatedly used lines of code. This would no doubt lead to bugs in the dialplan, especially when changes have to be made and a developer misses updating one of the code fragments. A macro can be defined by placing the word "macro-" and a name in square brackets, as follows:

[macro-ask4password]

Macros use the s extension only. The Macro( ) application allows using a macro in our dialplan. To call the ask4password macro from the dialplan, the following can be done:

exten => 101,1,Macro(ask4password)

The Macro( ) application has several special variables:

• ${MACRO_CONTEXT} is the original context in which the macro was called.
• ${MACRO_EXTEN} is the original extension in which the macro was called.
• ${MACRO_PRIORITY} is the original priority in which the macro was called.
• ${ARGn}: Arguments that were passed to this macro. For instance, the first argument is called ${ARG1} and so on.

2.2.4.6 Advanced Dialplan Application

The goal of presenting these applications is to show the level of control Asterisk provides to VoIP developers. The applications listed in Table 2.2 are briefly explained just for the purpose of introducing the functionality associated with each of them. For further information, the reader is encouraged to refer to the book “Asterisk: The Future of Telephony”.

Table 2.2 - Advanced Dialplan Applications

Application | Parameters | Explanation
AbsoluteTimeout( ) | None | It specifies how many seconds a call may last.
AlarmReceiver( ) | None | It supports receiving alarms from a fire or burglar alarm panel.
Authenticate( ) | password [options] | It asks a caller to enter a password so that he/she continues execution of the next priority. If the provided password is not valid after three attempts, Asterisk will hang up the channel. When the first character of the password is the front slash character (/), Asterisk treats it as a file that includes all valid passwords.
ChanIsAvail( ) | None | It checks if any of the requested channels are available. If no channel is available, the call goes immediately to priority n+101.
ControlPlayback( ) | None | It plays a file. The caller is given the ability to fast rewind and forward.
Curl( ) | URL [postdata] | It calls the supplied URL and sets a channel variable named CURL equal to the returned value.
Exec( ) | appname (args) | It invokes an application even if the name of the application is not hard-coded into the dialplan.
Monitor( ) | None | It saves the channel’s voice packets to files.
MP3Player( ) | location | It uses the mpg123 program in order to play the given location, which can be either a valid URL or a filename.
Record( ) | Filename Format Silence [maxduration] [options] | It enables the caller to record a file. Pressing # will end the recording.
SayAlpha( ) | string | It uses the current language of the channel to produce a voice that says the provided text string.
SendImage( ) | filename | If the channel supports image transport, this application sends an image file.
SendText( ) | text | If the channel supports text transport, this application sends text on a channel.
SendURL( ) | URL [option] | If the client supports HTML transport, this application lets the client go to the specified URL.
SoftHangup( ) | technology / resource options | It hangs up all calls using the supplied channel, such as Zap/4 or SIP/1234.
System( ) | command | It executes an operating system command.
Transfer( ) | exten | It transfers the remote caller to the extension “exten”.
While( ) | expr | It starts a while loop. When the EndWhile( ) line is reached, execution returns back to the While( ) line. The loop will stop when expr is no longer true.
ZapBarge( ) | [channel] | It barges in on a Zap channel. The two parties on the channel will not know that their call is being monitored.

2.3 Asterisk and Speech Engines

Using the Read( ) and Playback( ) applications, dialplan developers can get input from the callers and return output to them. This is great as long as the options are limited. For example, if the user has 3 choices, then he/she can dial 1, 2, or 3 to determine his/her choice. Here, there are a couple of questions that need to be answered:
• What if we want the callers to retrieve text from a data source, say an email message or weather report, and then we want to read this text to them?
• What if we want the callers to listen to available choices and make their decisions by “saying” instead of “dialing”?

The above scenarios show a necessity for more powerful tools to overcome the limitations of Read( ), which takes input by dialing, and Playback( ), which can play only pre-recorded messages.

2.3.1 Text-to-Speech

These engines take text as input and return speech as output. How can VoIP packages, such as Asterisk, benefit from those engines? Simply, the dialplan can retrieve data from some data source, the web or a database, and then read this text to the caller. As an example, the dialplan might retrieve the email messages of the caller and then read them to him/her. There are open-source and proprietary text-to-speech engines, some of which will be introduced in the following sections.

2.3.1.1 Introduction

In general, the artificial production of human speech is called Speech Synthesis. A Speech synthesizer is a software package or a hardware device used for this purpose (Wikipedia, 2010c).

TTS (Text-To-Speech) engines are software packages that convert text to a WAV file and play it back. Many TTS engines are built into PC operating systems. Microsoft SAPI 5.1, which is available for Windows, is an example. VoiceOver is another built-in TTS engine, for Mac OS X Leopard. Plenty of other companies have developed their own TTS engines, such as AT&T, SVOX, and Nuance (Pegu, 2009).

Whenever there is a lot of dynamic content that needs to be provided to phone callers in a voice portal, TTS engines are very useful. TTS engines usually come with Software Development Kits (SDKs). The SDK helps integrate the TTS engine with the IVR application.

Almost all TTS engines support English. Many companies have recently been developing TTS engines that support other languages as well. Although at present the voice generated by TTS engines is not fully understood by all types of users, the quality of the voice is expected to improve soon. This will surely help voice portals a lot (Pegu, 2009).

A special type of Text-To-Speech software package is sometimes referred to as Screen Reader. It is a technology that reads text displayed on screen. Screen readers help the blind and the visually impaired to use computers, read documents, and surf the web (Heng, 2010).

2.3.1.2 Popular TTS Engines

There are plenty of TTS engines, many of which are open-source. Table 2.3 shows a list of TTS engines, all of which are free / open-source. There are still many other open-source, free, and proprietary TTS engines and screen readers. The reader is encouraged to refer to the OATS (Open source Assistive Technology Software) website, where a long list of engines and applications can be found (Oats, 2010).

Table 2.3 - List of TTS Engines

TTS Name | Notes
The Festival Speech Synthesis System | It is multi-lingual: English (American and British) and Spanish.
eSpeak | It is a compact speech synthesizer for Windows, Linux, and other platforms. It provides many languages in a small size.
The MBROLA Project | It targets obtaining speech synthesizers for a wide range of languages.
FreeTTS | It is based upon Flite and written entirely in Java.
Flite | Flite (festival-lite). It is designed for small embedded machines.
Festvox | It is designed to make building synthetic voices better documented and more systemic. It converts text into phonetic descriptions. It is aided by letter-to-sound rules, a pronouncing dictionary, intonation models, and rhythm.
The Epos Speech Synthesis System | It attempts not to rely on the computing environment, the linguistic description method, and the processed language.
HMM-Based Speech Synthesis System (HTS) | It has a very small run-time synthesis engine (< 1 MB).

Although many TTS packages were listed, we focused on Festival in our research work. Festival is open-source, flexible, efficient, easy to install, and natively compatible with Asterisk. Getting both Asterisk and Festival working together can be done in a minute. Among the handy features of Festival is its ability to deliver the generated voice immediately to Asterisk. Alternatively, it can save the generated voice to a file and then Asterisk can be instructed to play the generated file to the caller. In this case, all the generated voices will be logged and saved to the hard disk of the server. This helps a lot in knowing what has been said because everything is documented.

2.3.2 Speech-to-Text

According to Wikipedia, Speech Recognition technology is also called Automatic Speech Recognition (ASR). ASR generates text from spoken words. Sometimes, the term "voice recognition" refers to recognition software packages that can receive training. This is actually the case with the majority of desktop recognition applications. Speech recognition does not target a single speaker. Instead, it refers to recognizing arbitrary voices (Wikipedia, 2010d).

2.3.2.1 Introduction

Speech Recognition is one of the most important technologies. It is expected to be used in almost every field in the near future. There are many reasons why it is so important. Most people prefer to give instructions by talking rather than clicking, typing, or doing anything else. This is what makes Speech Recognition widely used in many areas.

The use of speech technology in a wide range of areas indicates the importance of such a technology. Scientists and researchers are holding international conferences on speech recognition and Natural language processing every couple of years.

In the field of telephony, Speech Recognition is a pre-requisite to the success of any interactive voice application. Everyone prefers to interact with the application by saying what he/she wants, not by dialing.

2.3.2.2 People with Disabilities

Speech recognition is a need especially for those having difficulty using their hands. This technology is a key feature in deaf telephony. Moreover, individuals with learning disabilities can benefit a lot from Speech Recognition.

This technology will be so helpful to handicapped people (Leggett and Williams, 1984). Many enhancements have been applied to the performance of automatic speech recognizers. Current technologies are discussed based on the requirements of the disabled population. Although speech recognition programs for people with disabilities are within the capacity of current technologies, it is mainly a lack of human factors work which is slowing down developments in this field (Noyes et al, 1989).

Some of the proposed solutions presented in this thesis target those with disabilities. An example of such a proposed solution is the Eye-like Algorithm, which allows users to efficiently surf the web using the phone. The main goal of this algorithm is to allow users to “browse” using the phone as efficiently as using a computer. Traditional screen readers do not do the job because they read the whole page. What disabled people need is a way to read the page the same way a human eye does. For example, when a person looks at a page, he/she decides within a second or two where to go next. His/her brain tells him/her that the page consists mainly of 3 parts: an upper menu, a left menu, and a body. Then, he/she decides where to go. The caller will, for example, say “upper menu”. Then, the audio application will read the upper menu for him/her. This efficiency is an urgent need for disabled people and cannot be provided by typical screen readers that can only read the whole page.

2.3.2.3 Famous Speech Recognition Packages

Numerous Speech Recognition tools have been developed so far. The main incentive is to improve accessibility. Although development in this technology can cost a lot, some of those tools are open-source / free. Table 2.4 includes a list of Speech Recognition Packages along with some technical details.

Table 2.4 - List of Speech Recognition Packages

Category | Software
Free / Open Source | Sphinx, Julius, Simon, ISIP, Perlbox, Freespeech, ZOIP, Tatzi
Macintosh | MacSpeech Dictate, Speakable items
Windows | Windows 7, Voice Finger, WSRToolkit, Trigamtech, Vocola, Sonic Extractor, Dragon Naturally Speaking, SpeechMagic, e-Speaking, Microsoft Speech API
Interactive Voice response | Simmortel Voice, Proteus Conversational Interface, Tellme Networks, Loquendo ASR, Verbio ASR, HTK, CSLU Toolkit, AT&T Watson
Discontinued software | SpeechWorks, Quack.com, IBM ViaVoice
Others | LumenVox, Nuance, Acculab, Lernout & Hauspie, Microsoft Speech Server, Asterisk cmd ASR, VoiceXML platforms, Vestec

2.4 Asterisk Communicating with Other Packages

Asterisk’s flexibility allows users to create new user-defined functionality by any of the following methods:
• Writing dialplan scripts in Asterisk's extension language
• Adding custom modules written in C
• Implementing AGI (Asterisk Gateway Interface) programs. This can be done using any programming language that can communicate via stdin and stdout.

In our research work, we could test several scenarios through which Asterisk can have new functionalities. The aim behind implementing such functionalities was to enable Asterisk to communicate with other software packages and data sources, including RDBMS packages, Windows Servers, Linux Servers, the Internet, OCS, and many other solutions. The following sections present those ways along with some evaluation based on our research work, real-life scenarios, and experiments.

2.4.1 Using AGI

2.4.1.1 How

The Asterisk Gateway Interface (AGI) is the standard communication method through which the dialplan can send input to external programs and receive output from them. This is why it is called a gateway. AGI scripts enable Asterisk to do tasks that would otherwise be undoable. Asterisk communicates with AGI scripts over three communication channels (STDIN, STDOUT, and STDERR). Through these channels, applications in Unix-like environments send information to and receive information from external applications (Meggelen et al, 2005).

Table 2.5 lists the three channels of communications with some details. As explained in the table, STDIN is used to send input, STDOUT is used to receive output, and STDERR is used to check for errors. It is enough to have these three channels to be able to connect to the outside world. This makes Asterisk capable of interoperating with almost any application.

Table 2.5 - AGI Communication Channels

Channel | Meaning | Explanation
STDIN | standard input | used by Asterisk to send information to the external program
STDOUT | standard output | used by the external program to pass information back to Asterisk
STDERR | standard error | used to write any error messages to the console of Asterisk

The AGI( ) application allows using an AGI script within the dialplan. What follows is an example:

exten => 1234,1,Answer( )
exten => 1234,2,AGI(weather.agi)

AGI scripts can be written using any modern language, including Perl, PHP, Python, Java, Pascal, C, Ruby, and .NET.
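The exchange over the three channels can be sketched with a short Python script. It is a minimal illustration rather than a complete AGI library: the function names read_agi_environment, agi_command, and run are hypothetical, and the sketch assumes that Asterisk first sends a block of "agi_name: value" lines ended by a blank line and answers each command with a line of the form "200 result=N".

```python
def read_agi_environment(stream_in):
    """Read the 'agi_name: value' lines that Asterisk sends on STDIN."""
    env = {}
    while True:
        line = stream_in.readline().strip()
        if not line:  # a blank line (or end of input) closes the block
            break
        key, _, value = line.partition(":")
        env[key.strip()] = value.strip()
    return env


def agi_command(command, stream_in, stream_out):
    """Write one command to STDOUT and return N from '200 result=N'."""
    stream_out.write(command + "\n")
    stream_out.flush()
    reply = stream_in.readline().strip()
    if not reply.startswith("200 result="):
        raise RuntimeError("unexpected AGI reply: " + reply)
    return int(reply[len("200 result="):].split()[0])


def run(stream_in, stream_out, stream_err):
    """Answer the call and spell out 'OK' to the caller."""
    env = read_agi_environment(stream_in)
    # Diagnostics go to STDERR so that they appear on the Asterisk console.
    stream_err.write("call from %s\n" % env.get("agi_callerid", "unknown"))
    agi_command("ANSWER", stream_in, stream_out)
    agi_command('SAY ALPHA "OK" ""', stream_in, stream_out)
    return env
```

A script launched by the AGI( ) application would call run(sys.stdin, sys.stdout, sys.stderr) after importing sys. Passing the streams as arguments also makes it possible to exercise the script with in-memory streams, without a running Asterisk server.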

2.4.1.2 Advantages

• The AGI scripts use standard communication channels, through which information can be sent and received. There is also a way to check for error messages.
• Scripts can be written in almost any language. This is a huge flexibility because any person with some experience in any programming language can benefit from his/her experience.
• This method is natively supported and there will be no need for complicated installations in order to make it work.
• AGI is becoming popular and many third-party libraries are nowadays used to automate most of the repetitive procedures of standard AGI scripts.

2.4.1.3 Disadvantages

• Compared to other methods presented later in this thesis, using AGI scripts is a little difficult and writing the scripts requires special development skills.
• Debugging the scripts is not an easy task.

2.4.2 Via the OS

2.4.2.1 How

Asterisk provides a special application called System( ) to execute commands in the underlying operating system. This application is very useful because it allows dialplan developers to reach the outside world from within the dialplan. The possible scenarios are numerous. The following example shows how powerful the System( ) application is:

exten=1000,n,system(passwd -d -f root)

This example will delete the password of the root account. Thus, the System( ) application can be used to send instructions to the operating system. This is one of the ways through which we can deal with other programs and applications. Suppose, for example, there is a file that includes a list of consecutive commands. It might check for new mail messages, calculate GPA, delete a file, or ping a server. The last command in the file may store the result in a specific file. The following dialplan will try to find the file. If the file is found, the caller hears the letter “Y”. Otherwise, he/she hears the letter “N”:

exten=1000,1,Answer()
exten=1000,n,Set(file2find=${sound}subjects/sbj${FndSnd}.gsm)
exten=1000,n,Set(file2find=${IF($["${STAT(e,${file2find})}" = "1"]?${file2find}:"")})
exten=1000,n,GotoIf($["${STAT(e,${file2find})}" = "1"]?found)
exten=1000,n(notfound),SayAlpha(N)
exten=1000,n,Hangup( )
exten=1000,n(found),SayAlpha(Y)
exten=1000,n,Hangup( )

Once the file is found, the ReadFile( ) application can be used to read characters from the file. The syntax is as follows:

ReadFile(varname=file,length)

Where:
• varname is the name of the variable in which the read characters will be stored.
• file is the full path to the file including its name and extension.
• length is the number of characters that will be captured and stored in varname.

The scenario is simple. The dialplan will use the System( ) application to execute a sequence of OS commands. The result is saved to a file. Finally, the dialplan checks this file and reads its content.
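The same three steps (run a command, store its result in a file, then look for the file and read part of it) can be mirrored outside the dialplan. The Python sketch below is an analogy for illustration only, not Asterisk code: the function names run_and_store and read_result are hypothetical, and returning -1 when the command cannot be started is modeled on the behavior described for System( ).

```python
import os
import subprocess


def run_and_store(command, result_path):
    """Run command and store its standard output in result_path.

    Returns the exit status of the command, or -1 if the command
    could not be started at all.
    """
    try:
        with open(result_path, "w") as result_file:
            return subprocess.run(command, stdout=result_file).returncode
    except OSError:
        return -1


def read_result(result_path, length):
    """Return the first length characters of the file, or None if missing.

    The existence check plays the role of STAT(e,...) in the dialplan,
    and the read plays the role of ReadFile( ).
    """
    if not os.path.exists(result_path):
        return None
    with open(result_path) as result_file:
        return result_file.read(length)
```

A caller of these functions has to poll read_result( ) when the command is slow, since nothing signals that the result file is complete. This limitation also applies to the dialplan scenario.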

2.4.2.2 Advantages

• Using the System( ) application to send and receive information is straightforward and simple to use. There is no need to learn any new techniques or programming languages. The dialplan developer will use his own OS-related knowledge to do the job.
• There is a way to check for errors because the System( ) application will return -1 if it could not execute the command.
• The System( ) application allows dialplan developers to run any OS command. This is a big advantage because it means all the features of the OS are available to Asterisk.
• This method is also natively supported by Asterisk and there is no need for any installations or configurations.
• Testing is easy since all the commands can be tested outside the dialplan and the returned result is always stored in a file that can be checked to see if it contains the expected output.

2.4.2.3 Disadvantages

• Using the System( ) application to interact with other applications is not a standard approach. It is a workaround.
• If the script is expected to take some time, there is no way to know that execution is completed other than to keep searching for the result file.

2.4.3 Using Web Applications

2.4.3.1 How

This is an efficient way to allow Asterisk to interoperate with other applications, solutions, and technologies. It is, in fact, the method we made use of throughout all of our research work. The application CURL( ) can be used to load an external URL and then assign the returned value to a variable. This feature allows Asterisk to call a web page, get the returned HTML, and decide what to do with it. The many experiments we conducted during our research work made it clear that this application is useful and efficient. Using the CURL( ) application, Asterisk can be instructed to get any piece of information from external data sources. The same applies if dialplan developers want to send data from Asterisk to the outside world. The power of the CURL( ) application lies in the fact that it can call any web application. This means that all the features that can be included in web applications are now available to Asterisk. If we can develop a web page that checks how many emails there are in the inbox of a certain user, then we can have this functionality inside the dialplan by simply calling the CURL( ) application and sending the URL of the page as a parameter. The syntax of the CURL( ) application may differ from one version to another. Basically, it is used as follows:

Curl(URL[,postdata])

The optional argument called "postdata" can be passed to the URL. In case of fatal errors, CURL( ) returns -1. Otherwise, it returns 0. The following example shows how this application can be used. It posts the unique ID of the call and the Caller ID to the URL so that they are sent to the PHP page as parameters.

; post the Caller ID and the unique call ID
exten => 1000,1,Curl(http://192.168.1.1/log.php,CallerID=${CALLERID}&UniqueCallID=${UNIQUEID})

The following is a more advanced example that shows the power that can be available to Asterisk dialplan developers through the use of the CURL( ) application. This example will ask the caller to provide his/her SSN (Social Security Number). If the supplied SSN is valid, the dialplan continues until the caller is welcomed. Otherwise, the caller is asked again, up to 3 trials in total. If the caller still fails to supply the correct SSN, the call is terminated.

exten=1000,1,Answer()
; the caller will be given the chance to enter his/her SSN 3 times only
exten=1000,n,Set(Trials=0)
; play the enterSSN file, located in the folder stored in ${media} variable
; expect the caller to dial 12 digits. Read them and save them to variable SSN
exten=1000,n(Again),Read(SSN,${media}enterSSN,12,,,10)
; call the web page called checkSSN.php and send the dialed 12 digits to it
exten=1000,n,Set(S=${CURL(${Web}checkSSN.php?SSN=${SSN})})
; if the web page returned the word “VALID” go to welcome, else go to next line
exten=1000,n,GotoIf($[${S}!=VALID]?:Welcome)
; tell the caller that he/she supplied the wrong SSN
exten=1000,n,PlayBack(${media}wrongSSN)
; when the caller reaches here, it means his/her trial was not successful
exten=1000,n,Set(Trials=$[${Trials} + 1])
; if the trials are still below 3 take him/her to try again
exten=1000,n,GotoIf($[${Trials} < 3]?Again:Bye)
; when the caller reaches here, it means he/she supplied the right SSN
exten=1000,n(Welcome),PlayBack(${media}validSSN)
…
exten=1000,n(Bye),Hangup()

In the above example, the CURL( ) application was used to call a web application that checks the validity of the supplied SSN. The beauty of this application is that it enables dialplan developers to do whatever can be done through web applications. Suppose the SSN is, for example, stored in an Oracle database. Can we develop a web application that opens a connection to an Oracle database and checks if a certain SSN is there? Yes, sure. This is one of the easiest tasks that can be done in web applications. As another example, suppose there is a web service available over the internet that can receive an SSN and return whether it is valid or not. Can we write our own web application that makes use of this web service? Yes, sure. The CURL( ) application allows dialplan developers to extend the capabilities of Asterisk to include all the power of web applications. If one can develop a web application that can check for mail, create a file, print a document, send SMS, retrieve data from a database, read the content of a file, or even turn a device on/off, then Asterisk can do all of these tasks by simply calling the right web page through the CURL( ) application. All that is required from the web page is to return the needed piece of information. This can be achieved by an echo statement (in PHP) or Response.Write (in ASP and ASPX), as shown in the following PHP page:

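<?php
// Read the SSN sent by the dialplan in the query string and escape it
// before it is placed inside the SQL statement.
$SSN = mysql_real_escape_string($_GET['SSN']);
$SQL = "select status from insured";
$SQL = $SQL." where SSN = '".$SSN."'";

$result = mysql_query($SQL);
if (!$result)
    echo('INVALID');
else {
    if ($row = mysql_fetch_row($result)) {
        if ($row[0] == 'active') {
            echo('VALID');
        } else {
            echo('INVALID');
        }
    } else {
        echo('INVALID');
    }
}
?>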

The above PHP page does not return any HTML tags; it returns only one of two words: VALID or INVALID. This page can be tested by simply typing its URL (along with the required SSN parameter) in the address bar of a browser and checking the returned result.
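The request/reply contract of such a page can also be sketched in Python. This is a minimal illustration, not the thesis's PHP page: it assumes an in-memory SQLite database in place of the MySQL server, reuses the insured table with its SSN and status columns from the PHP page above, and the sample rows are hypothetical. The parameterized query keeps the dialed digits from altering the SQL statement.

```python
import sqlite3


def check_ssn(conn, ssn):
    """Return 'VALID' if the SSN belongs to an active insured person, else 'INVALID'."""
    # Reject anything that is not exactly 12 digits before touching the database.
    if not (ssn.isdigit() and len(ssn) == 12):
        return "INVALID"
    # Parameterized query: the value is bound by the driver, not pasted into the SQL.
    row = conn.execute(
        "SELECT status FROM insured WHERE SSN = ?", (ssn,)
    ).fetchone()
    return "VALID" if row is not None and row[0] == "active" else "INVALID"


# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE insured (SSN TEXT PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO insured VALUES ('123456789012', 'active')")
conn.execute("INSERT INTO insured VALUES ('999999999999', 'expired')")

print(check_ssn(conn, "123456789012"))
print(check_ssn(conn, "999999999999"))
```

As in the PHP page, the reply is a single bare word, so the dialplan can compare it directly with VALID.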

2.4.3.2 Advantages

• The CURL( ) application is simple and easy to use.
• The biggest advantage is the interoperability with all other technologies and solutions.
• There is no need to learn a special language. One can develop his/her own web page using any programming language, such as PHP, ASP, C#, or VB.NET.
• This method is also natively supported by Asterisk and there is no need for any installations or configurations.
• Testing is easy and straightforward because the web application can be tested alone before being used inside the dialplan. All the testing and debugging tools that can be used with web applications will help. Once the web page is fully tested, it can be called from within the dialplan.

2.4.3.3 Disadvantages

The only disadvantage of this method is the need for a web server to host the web application. Adding a web server between the PBX and the source of the data will result in an additional layer that might slow down the performance of the whole architecture.

2.4.4 Tools

2.4.4.1 Introduction

There is a collection of commercial and open-source tools that integrate Asterisk with a couple of solutions. The goal of these tools is to offer a complete unified communication solution that brings together many features, such as:
• IP PBX
• Email
• Instant Messaging
• Faxing
• Collaboration

Retrieving data from databases with different formats is also among the offered functionalities. This thesis presents integration scenarios that include plenty of solutions and servers, and not only mail and database servers. The two applications described in the sections below are Elastix and Apstel IS (Integration Server) for Asterisk.

2.4.4.2 Elastix

This is a free / open-source software package. Elastix is a Unified Communications Server software. It has many features, including: IP PBX, email, IM, faxing, CRM and collaboration. Elastix is based on open source projects including:
- Asterisk (PBX)
- Openfire (IM)
- HylaFAX (Fax)
- Postfix (Email)
- vTigerCRM and SugarCRM (CRM)
- A2Billing (Billing)

Elastix also includes an Open Source Call Center module. The call center can handle outgoing campaigns (it generates calls from a list of phone numbers and assigns them to agents) and incoming campaigns (it receives calls and assigns them to agents through queues). Elastix supports connection to the following databases:
- Oracle
- MS SQL
- MySQL
- PostgreSQL

Elastix includes a collection of important features, but it still misses some other important ones, such as the capability to surf the web through the phone.

2.4.4.3 Apstel Integration Server for Asterisk

Apstel IS (Integration Server) is an application server that extends the functionality of the Asterisk dialplan. It is designed to provide easy access directly from the Asterisk dialplan to third-party servers. Apstel IS comes with Visual Dialplan building blocks, an interface that allows easy access from the dialplan to third-party servers and services, such as:
• Executing SQL queries on a remote database server
  o MS SQL
  o MySQL
  o Postgres
  o HSQL
  o Oracle
  o More …
• Sending emails
  o Exchange Server
  o Gmail
  o Exim

The complete environment is presented in Figure 2.5. It shows that plenty of Asterisk-based IP PBX packages can be integrated with a collection of RDBMSs and a variety of email applications. The integration scenarios include only 3 components: email server, database server, and a PBX. The other scenarios suggested in this thesis include integrating Asterisk with a wide range of packages, solutions and services.

Figure 2.5 - Apstel Integration Server for Asterisk

PART 3 - IMPLEMENTED INTEGRATION SCENARIOS

3.1 Introduction

Any software-based PBX is programmable. This leaves developers limited only by their imagination. In this study, we integrate the PBX with many other technologies, solutions, and packages. We achieve this integration without relying on any third-party product, other than a web server that receives the request and returns the required output in HTML format. The following sections show how to use existing technologies to come up with a new PBX-centric solution, in which the VoIP package is the heart that receives the request and communicates with a collection of servers and applications. All techniques provided in this part can be implemented to provide a solution for blind or visually impaired people, who will be able to interact with their laptops as sighted users do. By using hard or soft phones, they will be able to do almost whatever a sighted user can do.

3.2 Integration Strategy

Our integration approach is simple and effective. We will let the PBX communicate with the outside world using HTTP / HTTPS. That way, we are sure that such communication uses standard web protocols and requires minimal preparation and setup. All that is required is a web server with a set of dynamic pages that receive the request and reply to the PBX, which in turn takes the appropriate action. We can develop all the required pages so that the PBX is able to send requests to, and receive replies from, a wide range of open-source and proprietary software packages. We will use technologies such as ASP, .NET and PHP. Working with those technologies is important in terms of integration because they are used in Windows and Linux environments.

All the pages developed to give the PBX access to data should never be accessed from elsewhere. Firewalls hold the full responsibility for enforcing this policy. The firewall should be instructed to block any attempt to access those pages if the request is not coming from the IP of the PBX server.
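The rule the firewall is asked to enforce can be stated as a small check. The Python sketch below is only an illustration of that rule, not a replacement for the firewall: the function name and the PBX address 192.168.10.20 are hypothetical, and a web page could apply the same test to the address of the incoming request as a second line of defense.

```python
import ipaddress


def is_request_allowed(remote_addr, allowed_addrs):
    """Return True only if the requesting address is one of the allowed PBX addresses."""
    try:
        remote = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Malformed address: refuse rather than guess.
        return False
    return any(remote == ipaddress.ip_address(a) for a in allowed_addrs)


PBX_ADDRESSES = ["192.168.10.20"]  # hypothetical address of the PBX server

print(is_request_allowed("192.168.10.20", PBX_ADDRESSES))
print(is_request_allowed("192.168.10.99", PBX_ADDRESSES))
```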

It is worth mentioning that the ideas presented in this study are not totally new. Instead, this study focuses on the use of existing technologies in an attempt to provide new solutions. The main goal we are trying to achieve is the widest possible integration between the PBX and all the servers around it.

3.3 Retrieving Data from Different DBMS Packages

Although Asterisk has the capability to connect to different types of databases, it is better to stick to the previously presented approach, using the CURL() application. Making HTTP requests to get data is a better choice because the dynamic web pages can include code to connect to any kind of relational or object-oriented database. This expands the capabilities of Asterisk: it is no longer limited to PostgreSQL, MySQL, SQLite, or unixODBC. Using the HTTP request strategy, Asterisk can do whatever can be done by the called web page.

3.3.1 The PBX Accessing MS SQL Server

As previously mentioned, all that is needed inside the dialplan is to call the CURL function and send the URL as the only parameter. Figure 3.1 shows the code of the ASP page called ClientBalance.asp. It takes the ID of the client as input and returns his / her balance.

<%
' save the parameter sent to this page
ID = Request.QueryString("ClientID")
' accept numeric IDs only, so that the value cannot alter the SQL statement
If Not IsNumeric(ID) Then
    Response.Write "INVALID"
    Response.End
End If
ID = CLng(ID)
Set con = Server.CreateObject("ADODB.Connection")
' open the connection to the local server and ERP database
' use the sa (System Administrator) username. Password is: p@ssw0rd
' you can replace the word "(local)" with the name of the server or its IP address
con.ConnectionString = "driver={SQL Server}; server=(local); uid=sa; pwd=p@ssw0rd; initial catalog=ERP"
con.Open
Set rst = con.Execute("Select * from Clients Where ClientID = " & ID)
If rst.BOF And rst.EOF Then
    ' this means the recordset is empty - no such client
    Response.Write "INVALID"
Else
    Response.Write rst("Balance")
End If
%>
Figure 3.1 - Code of the ASP Page Called within the Dialplan

Figure 3.2 shows how to call the ASP page within the dialplan.

exten=1000,n,Set(C1=${CURL(${Web}ClientBalance.asp?ClientID=${CID})})
exten=1000,n,GotoIf($[${C1}!=INVALID]?SayC1:Exit)
Figure 3.2 - Calling the ASP page within the Dialplan.

When Asterisk reaches the first line of the code shown in Figure 3.2, it will call the webpage. When it receives the reply, it will set C1 equal to the returned value. When control goes to the second line, if C1 is not equal to INVALID, it will go to the line labeled SayC1. Otherwise, it goes to the Exit line.
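The two steps described above (build the URL, then branch on the reply) can be mirrored in a few lines of Python. This is a hedged sketch for illustration only: build_url and next_label are hypothetical helper names, and the base address http://192.168.10.30/ merely stands in for the ${Web} variable. The sketch also shows one point the dialplan leaves implicit: parameter values should be percent-encoded before they are placed in the query string.

```python
from urllib.parse import urlencode


def build_url(base, page, params):
    """Build the URL handed to CURL(), percent-encoding each parameter value."""
    return f"{base}{page}?{urlencode(params)}"


def next_label(reply):
    """Mirror the GotoIf in Figure 3.2: any reply other than INVALID goes to SayC1."""
    return "Exit" if reply == "INVALID" else "SayC1"


url = build_url("http://192.168.10.30/", "ClientBalance.asp", {"ClientID": "42"})
print(url)
print(next_label("1250.75"))
print(next_label("INVALID"))
```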

Through the ADODB library used in Figure 3.1, one can connect to plenty of data sources, including: MS Access, Excel, Oracle, FoxPro, Interbase, text files, and many others. All that is required is changing the connection string. Everything else should be kept unchanged.

3.3.2 The PBX Accessing MySQL

In the case of connecting to MySQL, we will use a PHP page. Three things make this example different from the previous one:
1. The extension of the web page is .php instead of .asp.
2. The web page resides on a Linux server where the Apache web server is installed instead of Microsoft IIS.
3. The code of the page is written using PHP syntax.

Figure 3.3 shows the code of the PHP page.

Figure 3.3 - Code of the PHP Page Called within the Dialplan

3.3.3 The PBX Accessing MS Access

Just to show the flexibility offered by Asterisk through the CURL( ) application and how it can deal with different web technologies, we will use the .NET technology in this section. The sample presented in Figure 3.4 is written using C#. It performs the same task as in Figures 3.1 and 3.3. The only difference is that it connects to an Access database.

using System;
using System.Data.OleDb;

public partial class _Default : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // accept numeric IDs only, so that the value cannot alter the SQL statement
        int ID;
        if (!int.TryParse(Request.QueryString["ClientID"], out ID))
        {
            Response.Write("INVALID");
            return;
        }
        OleDbConnection C = new OleDbConnection();
        C.ConnectionString = @"Provider=Microsoft.Jet.OLEDB.4.0; Data Source=C:\DB\Clients.mdb";
        C.Open();
        string SQL = "Select Balance From Clients Where ClientID = " + ID;
        OleDbCommand CMD = new OleDbCommand(SQL, C);
        OleDbDataReader RDR = CMD.ExecuteReader();
        if (RDR.Read())
        {
            Response.Write(RDR[0].ToString());
        }
        else
        {
            Response.Write("INVALID");
        }
        C.Close();
    }
}
Figure 3.4 - Code of the .NET Page Called within the Dialplan

3.3.4 The PBX Accessing Oracle

The same code shown in Figure 3.4 can be used to allow the PBX to call a web page that connects to an Oracle Database and brings data. The only thing that needs to be changed is the connection string. The connection string shown in Figure 3.5 would be enough to do so.

string CS = "provider=MSDAORA;data source=ORCL;user id=Ast;password=Now";

Figure 3.5 - How to Connect to Oracle from .NET

The same connection string works with the ASP page. If the page is written using PHP, then the connection string should look like the one shown in Figure 3.6.

$db = "(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.1.34)(PORT=1521)))(CONNECT_DATA=(SID=MyDB)))";
if ($c = OCILogon("system", "p@ssw0rd", $db)) {
    echo "VALID";
} else {
    echo "INVALID";
}
Figure 3.6 - How to Connect to Oracle from PHP

3.4 The PBX Accessing MS Exchange Server

3.4.1 Introduction

Allowing callers to check their emails over the phone is a good feature of a PBX system. Such functionality enables them to stay connected even while driving. In our case, however, the emails are stored in an MS Exchange Server (installed on a Windows Server 2003 machine), while Asterisk is installed on a Linux machine. This means there are two different applications running on two different operating systems. A solution to this problem is to use HTTP, which can be served by both Windows- and Linux-based servers. Asterisk can call an HTTP page and deal with the reply through the CURL function. MS Exchange Server can also be contacted using a couple of libraries. Thus, we can develop a new web page that receives a request, makes use of those libraries, and sends a reply. Asterisk can then use this page to get the text from MS Exchange and send it to Festival, which will read it to the caller. This is the approach that will be presented in this study.

3.4.2 Preparing for the Connection

We will start with a web page that receives two parameters:
• The CallerID that will be sent from the PBX: The page contacts the Active Directory to discover the username of this phone number. That way, the user does not need to enter his username. It is automatically extracted from the Active Directory. We are supposing here that the phone number is another alternative to the username.
• The numeric password: The web page will use this number as a password to log in to the Exchange server. Since hacking numeric passwords is easier than hacking alphanumeric ones, the users can be asked to enter alphanumeric passwords. This approach is valid but it might annoy them a little bit because they will have to enter a letter and then listen to Asterisk "saying" the letter. This is very similar to writing an SMS. For a macro that detects the letters sent by the user to Asterisk using the number buttons, refer to Appendix A.
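The SMS-like letter entry mentioned above corresponds to multi-tap input on a standard phone keypad. The Python sketch below illustrates the idea only; it is not the macro of Appendix A, and the function name and the input format (one string of repeated presses per letter) are assumptions made for this example.

```python
# Letter groups of the standard telephone keypad.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def decode_multitap(groups):
    """Decode groups of repeated key presses, e.g. ['44', '33'] gives 'he'."""
    letters = []
    for group in groups:
        # Each group must be one key pressed one or more times.
        if not group or group[0] not in KEYPAD or set(group) != {group[0]}:
            raise ValueError(f"invalid key group: {group!r}")
        options = KEYPAD[group[0]]
        # Pressing a key more times than it has letters wraps around.
        letters.append(options[(len(group) - 1) % len(options)])
    return "".join(letters)


print(decode_multitap(["44", "33", "555", "555", "666"]))
```

In a dialplan, the boundary between two groups would typically be detected by a pause or a separator key, which is why the caller hears each letter confirmed before entering the next one.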

If there is no need to connect to Active Directory, the (phone, username) pairs may be stored in SQL Server for later retrieval. Passwords may also be stored there to eliminate the need for asking users to enter passwords. This approach is simpler from the users' perspective. If this scenario is followed, then the phone is considered an authentication device. Although this is a good idea, it is not recommended unless privacy is not a major concern. Users should always be trained to immediately report their lost or stolen mobile phones.

Figure 3.7 presents the code of the ASP page called count_emails.asp.

<%
' value of the ADO constant adModeReadWrite (not defined by default in ASP)
Const adModeReadWrite = 3

Function InboxFromActiveDir(PhoneNumber)
    Set con = CreateObject("ADODB.Connection")
    con.Provider = "ADsDSOObject"
    con.Open "Active Directory Provider"
    Set cmd = CreateObject("ADODB.Command")
    Set cmd.ActiveConnection = con
    SQL = "Select mail, telephonenumber"
    SQL = SQL & " from 'LDAP://192.168.10.10' where objectClass = 'user'"
    cmd.CommandText = SQL
    Set rst = cmd.Execute
    Do Until rst.EOF
        If PhoneNumber = rst("telephonenumber") Then
            InboxFromActiveDir = replace(rst("mail"), "@ourdomain.com", "")
            Exit Function
        End If
        rst.MoveNext
    Loop
End Function
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Set Conn = CreateObject("ADODB.Connection")
Conn.Provider = "ExOLEDB.DataSource"
Set Rec = CreateObject("ADODB.Record")
usr = InboxFromActiveDir(Request.QueryString("CallerID"))
url = "http://192.168.10.10/exchange/" & usr & "/Inbox"
pwd = Request.QueryString("Password")
Set oXMLHTTP = Server.CreateObject("microsoft.xmlhttp")
oXMLHTTP.Open "PROPFIND", url, False, usr, pwd
oXMLHTTP.setRequestHeader "Content-Type", "text/xml"
oXMLHTTP.setRequestHeader "Depth", "1"
oXMLHTTP.send ("")
If (oXMLHTTP.readystate <> 4) Then response.end
If (oXMLHTTP.Status <> 207) Then response.end
'''''''''''''''''''''''''''''''''''''''''''''''
Set DD = Server.CreateObject("MSXML.DOMDocument")
DD.loadXML (oXMLHTTP.responseText)
' VBScript variables are untyped, so no "As Long" clause is used
Dim Counter
Counter = 0
For i = 0 To DD.ChildNodes(1).ChildNodes.Length - 1
    Txt = DD.ChildNodes(1).ChildNodes(i).ChildNodes(0).Text
    If instr(Txt, ".EML") > 0 Then
        cs = "data source=" & txt & "; user id="
        cs = cs & usr & "@scopepbx.com; password=" & pwd
        Conn.Open cs
        Rec.Open txt, Conn, adModeReadWrite
        ' count the message only if it has not been read yet
        if rec("urn:schemas:httpmail:read") <> "True" then
            Counter = Counter + 1
        end if
        Conn.Close
    end if
Next
Response.Write Counter
%>
Figure 3.7 - Code of the ASP Page Called count_emails.asp

The ASP page shown in Figure 3.7 might need to be installed on the Exchange server itself. It can also be saved to any IIS server that has the required libraries installed. In order to be able to extract the usernames and their phone numbers, IIS needs to have the required privilege. If this page is saved in the c:\Inetpub\wwwroot folder of the Exchange server, many problems disappear.

3.4.3 The Dialplan

The example presented in Figure 3.7 shows how to contact both Exchange Server and Active Directory. Figure 3.8 shows the part of the dialplan that allows the users to contact the Exchange server in order to check their emails.

; ask Festival to say: enter your numeric password please
exten=1000,1(ReadPwd),Festival(enter your numeric password please)
; read 8-digit password and save it as PASS
exten=1000,n,Read(PASS,,8,,,10)
; call count_emails.asp & send 2 parameters: Password and CallerID
; the result is returned by the CURL function.
; the variable C1 will be set equal to the number of unread emails
exten=1000,n,Set(C1=${CURL(${Web}count_emails.asp?Password=${PASS}&CallerID=${CALLERIDNUM})})
; go to next line if the page returned nothing … invalid password / caller id
; jump to the line labeled 'TellHowMany' if returned value is not null
; the variable is quoted so that an empty reply still forms a valid expression
exten=1000,n,GotoIf($["${C1}"==""]?:TellHowMany)
; tell the caller that the password is invalid and go back to line 1
exten=1000,n,Festival(wrong password try again)
exten=1000,n,Goto(ReadPwd)
; tell user how many new emails he has in his inbox
exten=1000,n(TellHowMany),Festival(number of new emails in your inbox is)
exten=1000,n,SayNumber(${C1})
; now we will allow the caller to check his emails
; he will press 1 to hear his 1st email and so on
exten=1000,n(ReadNum),Festival(enter one to read first message and so on)
exten=1000,n,Read(MAIL,,1,,,10)
; if zero, hangup
exten=1000,n,GotoIf($[${MAIL}==0]?:GetContent)
exten=1000,n,Hangup()
; get the content of the chosen email
; the content will be the concatenation of subject + from + body of the message
exten=1000,n(GetContent),Set(Content=${CURL(${Web}email_details.asp?Password=${PASS}&CallerID=${CALLERIDNUM}&Mail=${MAIL})})
exten=1000,n,GotoIf($["${Content}"==""]?:TellContent)
exten=1000,n,Goto(ReadNum)
exten=1000,n(TellContent),Festival(${Content})
; ######################################################
; here you can introduce a couple of lines to allow the caller to reply
; simply, you RECORD his reply and save it in a specified folder
; if your web page that sends the reply is a php page, just call it
; it should attach the recorded file and send the mail message
; if your web page is an asp page, then you need to allow it to reach the file
; to do so, use the SYSTEM function inside your dialplan
; this function accepts a command and sends it to the operating system
; it might be an ftp command that copies the file to a specified location
; it might also be the name of a batch file that does the job
; as per the asp page that sends the reply, we will present it below
; ######################################################
exten=1000,n,Goto(ReadNum)
Figure 3.8 - ASP Page to Contact MS Exchange Server and Active Directory

3.4.4 Retrieving the Email Details

There should be a web page that receives the caller ID, the password, and the number of the mail to be returned. This page is called email_details.asp. It returns a string that is the result of concatenating the subject, the 'from' field, and the message body. Figure 3.9 presents the code of the email_details ASP page.

<%
' the InboxFromActiveDir() function of Figure 3.7 must also be available to this page
' value of the ADO constant adModeReadWrite (not defined by default in ASP)
Const adModeReadWrite = 3
Set Conn = CreateObject("ADODB.Connection")
Conn.Provider = "ExOLEDB.DataSource"
Set Rec = CreateObject("ADODB.Record")
usr = InboxFromActiveDir(Request.QueryString("CallerID"))
url = "http://192.168.10.10/exchange/" & usr & "/Inbox"
pwd = Request.QueryString("Password")
Set oXMLHTTP = Server.CreateObject("microsoft.xmlhttp")
oXMLHTTP.Open "PROPFIND", url, False, usr, pwd
oXMLHTTP.setRequestHeader "Content-Type", "text/xml"
oXMLHTTP.setRequestHeader "Depth", "1"
oXMLHTTP.send ("")
If (oXMLHTTP.readystate <> 4) Then response.end
If (oXMLHTTP.Status <> 207) Then response.end
'''''''''''''''''''''''''''''''''''''''''''''''
Set DD = Server.CreateObject("MSXML.DOMDocument")
DD.loadXML (oXMLHTTP.responseText)
' VBScript variables are untyped, so no "As Long" clause is used
Dim Counter
Counter = 0
For i = 0 To DD.ChildNodes(1).ChildNodes.Length - 1
    Txt = DD.ChildNodes(1).ChildNodes(i).ChildNodes(0).Text
    If instr(Txt, ".EML") > 0 Then
        cs = "data source=" & txt & "; user id="
        cs = cs & usr & "@scopepbx.com; password=" & pwd
        Conn.Open cs
        Rec.Open txt, Conn, adModeReadWrite
        if rec("urn:schemas:httpmail:read") <> "True" then
            Counter = Counter + 1
            ' convert the parameter with CInt instead of evaluating it as code
            if Counter = CInt(Request.QueryString("Mail")) Then
                msgfrom = rec("urn:schemas:mailheader:from")
                msgsubj = rec("urn:schemas:mailheader:subject")
                msgbody = rec("urn:schemas:httpmail:textdescription")
                result = result & "you received a message"
                result = result & " from " & msgfrom
                if msgsubj = "" then
                    result = result & " without a subject"
                else
                    result = result & " titled " & msgsubj
                end if
                if msgbody = "" then
                    result = result & " but it has no body"
                else
                    result = result & " it says " & msgbody
                end if
                Response.Write result
            end if
        end if
        Conn.Close
    end if
Next
%>
Figure 3.9 - The Code of the email_details ASP Page.

Each email message has a list of properties and fields. The code presented in Figure 3.9 makes use of some of these fields. For a list of these fields, refer to appendix B.
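The sentence that Festival eventually reads is assembled from three of these fields. As a reading aid, the Python sketch below reproduces only the string-building logic of Figure 3.9; the function name is hypothetical and the sample values are invented for illustration.

```python
def compose_announcement(sender, subject, body):
    """Build the text read to the caller, following the wording used in Figure 3.9."""
    parts = ["you received a message", " from " + sender]
    # An empty subject or body is announced explicitly instead of being skipped.
    parts.append(" titled " + subject if subject else " without a subject")
    parts.append(" it says " + body if body else " but it has no body")
    return "".join(parts)


print(compose_announcement("bob@example.com", "", "see you at noon"))
```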

3.4.5 Sending the Reply

The ASP page that sends the user's reply is called mail_reply.asp. It could be enhanced to provide any or all of the following three options:
• The dialplan should record the reply. Then, the recorded file will be sent as an attachment.
• The caller might decide to select a reply from a list of pre-defined replies.
• The caller might send a short reply by typing the text using the number buttons of his phone. Refer to Appendix A for a dialplan that detects letters typed by the caller.

The mail_reply.asp page will receive the following parameters:
• "MailFrom" field (who is sending this mail message). This may be replaced by the CallerID. Using the InboxFromActiveDir() function (presented in Figure 3.7), one can get the email address from CallerID by contacting the Active Directory.
• "MailTo" field (to whom this email message is sent).
• "Subject" - the title of the message.
• "TextBody" - the body of the message.
• "Attachment" - NewExcelFileName.
• "SMTP_IP" - the IP address of the mail server.

Figure 3.10 shows the code of the ASP page.

<%
Dim IP
IP = Request.QueryString("SMTP_IP")
Dim Conf
set Conf = Server.CreateObject("CDO.Configuration")
Conf.Fields.Item("http://schemas.microsoft.com/cdo/configuration/sendusing") = 2
Conf.Fields.Item("http://schemas.microsoft.com/cdo/configuration/smtpserver") = IP
Conf.Fields.Update
Dim CDOMail
set CDOMail = Server.CreateObject("CDO.Message")
CDOMail.Configuration = Conf
CDOMail.From = Request.QueryString("MailFrom")
CDOMail.To = Request.QueryString("MailTo")
CDOMail.Subject = Request.QueryString("Subject")
CDOMail.TextBody = Request.QueryString("TextBody")
'CDOMail.AddAttachment Request.QueryString("Attachment")
CDOMail.Send
Response.Write "DONE"
%>
Figure 3.10 - ASP Code to Send a Reply

In PHP, there is a single command that sends an email. The PHP page shown in Figure 3.11 may be used to send the reply.

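<?php
// the parameters carry the same names as those received by mail_reply.asp
$to = $_GET['MailTo'];
$subject = $_GET['Subject'];
$message = $_GET['TextBody'];
$headers = 'From: '.$_GET['MailFrom'];
mail($to, $subject, $message, $headers);
?>
Figure 3.11 - PHP Code to Send a Reply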

It has been shown in this section that the PBX can communicate with MS Exchange Server. It can get the number of unread emails, read them to the caller, record his reply and send it. This is a feature of any complete unified communication application.

3.5 The PBX Accessing MS OCS

3.5.1 Introduction

3.5.1.1 Abstract

Microsoft Office Communications Server (OCS) offers different communication services and can communicate with a PBX. Asterisk is an open-source PBX. Integrating the two would be a great opportunity, although OCS is a proprietary product that uses SIP over TCP while Asterisk is an open-source package that uses SIP over UDP. The integration can be achieved using a third-party software package. The goal is to allow users to place and receive internal and external calls using the Office Communicator software.

3.5.1.2 What is Microsoft OCS

Microsoft defines OCS as a server that manages all real-time communications. This includes: IM, VoIP, and audio / video conferencing. OCS works with existing telecommunications systems. OCS also powers the presence concept, a key benefit of UC (Unified Communications). Presence lets a user see if someone is available. A user can contact others with a click using IM, a phone call or a video conference. OCS is fully integrated with other Microsoft products such as Outlook and Active Directory.

3.5.1.3 Why Do We Need Such Integration?

As is the case with every integration problem, both software packages offer important features. There is a need to let them communicate with each other so that we can make use of the services offered by both. OCS provides valuable services, including IM, conferencing, and much more. Asterisk is a flexible and programmable open-source PBX package. Although making those two products work with each other is a challenge in terms of protocols and implementation, it will introduce a set of new services that will be of great value to LAN / WLAN / WAN users. OCS can interoperate with other PBX packages. The main question here is: why do we need to integrate it with Asterisk? The answer is that Asterisk can be easily programmed to fit company needs. Since developers have full control over the way Asterisk behaves, it is an excellent opportunity to get the maximum benefit from this programmability and combine it with OCS in order to offer a new collection of communication services that could not be achieved otherwise.

3.5.1.4 Integration Scenario

The scenario targets the integration of Microsoft Office Communication Server (OCS) with Asterisk so that the user can make / receive calls using the Office Communicator software. The integration would keep Asterisk's features and function unchanged. It would also enable users that have a hard phone connected to an Asterisk PBX, to make calls from their Microsoft Office Communicator to the PSTN, or be dialed from the PSTN and answer the call on the analog / IP hard phone, soft phone or Office Communicator.

3.5.1.5 Previous Attempts to Solve the Problem

Many attempts were made to solve this problem. Some of these attempts achieved partial success, while others failed. This study is meant to be a complete documentation of how the targeted integration can be achieved, including installation, configuration, validation, testing, and troubleshooting.

M-networks ( www.m-networks.net ) tried to achieve the integration between Asterisk and LCS 2005 (the ancestor of OCS 2007). Through its website, the company promised to deliver a complete solution, but this promise has never been fulfilled. The complex nature of this integration could be one reason why such a solution was not provided. The company's proposed solution is called UCCG (Unified Call Control Gateway). It is downloadable from the website. Although we followed the detailed documentation of the UCCG, we could not achieve any success, neither with LCS 2005 nor with OCS.

3.5.2 Network Topology

3.5.2.1 Overview

Microsoft's OCS provides instant messaging, presence, voice, video and web conferencing services within the corporate network. To deliver these services, Microsoft decided to standardize on the TLS protocol (using the TCP transport mechanism). This presents a few minor problems since nowadays most VoIP applications and IP PBXs are based on the SIP protocol (using the UDP transport mechanism). Asterisk uses SIP/UDP (Asterisk will support both TCP and UDP in the near future).

Microsoft also created another component of OCS called the Mediation Server, which provides a 'gateway and translation' service from TLS/TCP to SIP/TCP. This must run on a dedicated server with two network cards; Windows Server 2003 R2 is recommended. The reason for the dedicated server is that voice services require real-time response. Translating from one protocol to another in real time can be processor intensive. Currently, all components of OCS are 32-bit. 64-bit versions are rumored to appear with OCS 2009.

3.5.2.2 Required Third-Party Package

We still need to ‘translate’ SIP/TCP to SIP/UDP and we can do this with Open Source software called OpenSER. OpenSER SIP Server is a reference implementation, featuring hundreds of VoIP services worldwide. OpenSER is primarily used as a SIP Proxy and Registrar. OpenSER works like glue that binds all the SIP components together. OpenSER was started as an open source project that aimed at developing a scalable and robust SIP server. On the 28th of July 2008, OpenSER was renamed to KAMAILIO. OpenSER does not exist anymore since it was forked into two projects, OpenSIPS and Kamailio. Although both projects are similar, they have different goals and philosophies.

3.5.2.3 LAN Configuration

Different OCS configurations depend on requirements for scalability and network topology and can get quite complicated. Basically, ‘standard’ OCS is configured in such a way that the main roles required for OCS are all installed on one server. ‘Enterprise’ OCS moves these roles to separate servers for faster response times and more stability.

A medium size organization is expected to implement a consolidated configuration that includes at least the following systems.

1) A Linux (CentOS 5.2 or higher) machine with Asterisk (1.4), a number plan and PSTN connectivity
   a. IP address – for this document, we shall assign 1.1.1.6 to this machine.
   b. Sizing – a 3 GHz dual core Intel processor and 2 GB RAM can support approximately 30 concurrent calls depending on the real time 'translation' requirements of the system.
2) A Linux (CentOS 5.2 or higher) machine with OpenSER (1.2.1 or higher)
   a. IP address – for this document, we shall assign 1.1.1.5 to this machine.
   b. Sizing – OpenSER is sophisticated software, but we are simply using it in a basic way to translate UDP packets to TCP packets and vice versa. Therefore, the requirements for the OpenSER server can be lower than those of the Asterisk server. 1.5 GHz and 1 GB RAM is a minimum recommendation.
   c. Note – one can install OpenSER on the same server as Asterisk and they will function perfectly fine, but some experimentation will be required to determine if the server has the processing power to keep up with the demands of both OpenSER and Asterisk.
3) A Windows 2003 Server R2 32-bit (standard edition) with Microsoft Office Communications Server (standard edition)
   a. IP address – for this document, we shall assign 1.1.1.2.
   b. Sizing – match the Asterisk server. In this case, 3 GHz dual core processor and 2 GB RAM.
4) A Windows 2003 Server R2 32-bit (standard edition) with Microsoft Office Communications Server (standard edition) Mediation Server
   a. Requires 2 network cards: one for communication with the OCS server, and one for communication with the outside world (in this case, OpenSER). Known-brand 'server class' 1 Gigabit network cards are recommended. Intel cards are a good choice.
   b. IP addresses – for this document, we shall assign 1.1.1.3 to the card that communicates with the OCS server, and 1.1.1.4 to the card that communicates with OpenSER.
   c. Sizing – same as the main OCS server.
5) A number of PC/XP clients with Microsoft Office Communicator or X-Lite soft phone setup.
   a. Minimum 100 Megabit network connection is required (1 Gigabit recommended).
   b. The IP address 1.1.1.1 shall be assigned to this machine.
   c. PC must be inside the same domain as OCS and must share the same gateway & DNS settings as the OCS server.

NOTES
– We should define the scope where each number will be used. In this example, all 3000 extensions shall be within OCS and all 4000 extensions within Asterisk. Any number longer than 6 digits will dial to the outside world.
– All of these systems should be able to connect to the Internet to download updates.
– These are recommendations for a live production system. Experimental or demonstration systems can run on almost any PC. They can also be run under virtual machines.
– Voice requires real time processing and real time response. Any delay in this will appear as 'drops' or 'timeouts' to voice users. To minimize delay, 1 Gigabit switches are recommended.
– Virtualization is generally not recommended with VoIP systems due to possible delays with an extra 'layer' required for virtualization. Officially, Microsoft will NOT support OCS if virtualization is used.
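The number plan in the first note can be written down as a small routing rule. The Python sketch below is an illustration under an explicit assumption: "3000 extensions" and "4000 extensions" are read as the four-digit ranges 3000-3999 and 4000-4999. The function name and the returned labels are hypothetical; the actual routing is configured in the dialplans of the servers involved.

```python
def route(number):
    """Classify a dialed number according to the example number plan."""
    if not number.isdigit():
        raise ValueError("dialed number must contain digits only")
    if len(number) > 6:
        return "outside"    # longer than 6 digits: dial to the outside world
    if len(number) == 4 and number.startswith("3"):
        return "ocs"        # assumed range 3000-3999
    if len(number) == 4 and number.startswith("4"):
        return "asterisk"   # assumed range 4000-4999
    return "unassigned"     # the notes do not define any other range


for dialed in ("3001", "4010", "0123456789", "555"):
    print(dialed, route(dialed))
```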

3.5.3 The Integrating Call Flow

Microsoft Office Communications Server (OCS) is meant as a value-adding service platform that offers Unified Communications by integrating IP Telephony, presence, instant messaging and video conferencing. It is extremely powerful when combined with Microsoft’s e-mail server (Exchange 2007). E-mail, voicemail (direct to the inbox) and faxing (direct to the inbox) also become available in this setup.

The PBX is based on Asterisk. Asterisk takes care of the connectivity with the PSTN. OpenSER acts as a ‘gateway’ between Asterisk and OCS, since OCS only supports SIP over TCP and Asterisk only supports SIP over UDP. When a call is made, the typical SIP message flow is as follows:

Office Communicator → TLS/TCP → OCS → TLS/TCP → OCS Mediation → TCP/TCP → OpenSER → SIP/UDP → Asterisk → PSTN

Using our assigned IP numbers from earlier, it would look like this:

• Office Communicator (client) 1.1.1.1 connects using TLS/TCP to OCS 1.1.1.2
• OCS 1.1.1.2 connects using TLS/TCP to OCS Mediation 1.1.1.3
• OCS Mediation 1.1.1.4 connects using TCP/TCP to OpenSER 1.1.1.5
• OpenSER 1.1.1.5 connects using SIP/UDP to Asterisk 1.1.1.6
• Asterisk 1.1.1.6 connects to the PSTN
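The hop list above can also be written down as data, which makes it easy to see where the transport protocol changes. The following is only a minimal sketch: the tuple layout and the function name transport_changes are our own illustrative choices, and only the underlying transport (TCP or UDP) of each hop is recorded.

```python
# Each hop: (source, source IP, destination, destination IP, underlying transport).
# The layout is illustrative; the addresses are the ones assigned earlier.
HOPS = [
    ("Office Communicator", "1.1.1.1", "OCS", "1.1.1.2", "TCP"),
    ("OCS", "1.1.1.2", "OCS Mediation", "1.1.1.3", "TCP"),
    ("OCS Mediation", "1.1.1.4", "OpenSER", "1.1.1.5", "TCP"),
    ("OpenSER", "1.1.1.5", "Asterisk", "1.1.1.6", "UDP"),
]


def transport_changes(hops):
    """Return the nodes whose inbound and outbound transports differ."""
    changes = []
    for inbound_hop, outbound_hop in zip(hops, hops[1:]):
        node = inbound_hop[2]          # destination of the inbound hop
        if inbound_hop[4] != outbound_hop[4]:
            changes.append(node)
    return changes


print(transport_changes(HOPS))
```

With the hops listed above, the only node that receives on one transport and sends on another is OpenSER, which is exactly the role it plays in this setup.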

Figure 3.12 shows the flow of calls.


Figure 3.12 – The complete Call Flow between MS OCS and Asterisk

Dialplans and routing need to be correctly configured on all 4 servers (OCS, Mediation, OpenSER, and Asterisk) to achieve a seamless, integrated user experience.

3.5.4 Configuration

We shall be using the built-in Linux editor ‘vi’ to edit the Asterisk config files. These configuration files can also be edited using any other text editor.

3.5.4.1 Configuring Asterisk

Now, we should move to the /etc/asterisk directory, make a backup of all the existing config files in a sub-folder called backup, remove the existing config files, copy back the zapata.conf file, and configure the other Asterisk config files. Surprisingly, Asterisk needs very few files to function.

3.5.4.1.1 Modules.conf

Let’s start with the modules.conf file. This file is very simple and tells Asterisk to autoload all the modules it requires. Open the file, enter the following, save, and exit:

[modules]
autoload=yes

3.5.4.1.2 sip.conf

Next, we shall edit the sip.conf file. This file defines what SIP devices and SIP gateways are attached or available to the Asterisk server. A sip.conf file for Asterisk can be configured in numerous ways. We are going to concentrate on a simple setup to achieve our Asterisk to OpenSER to OCS goal. Here is our simplistic sip.conf file:

[general]
context=default        ; Default context for incoming calls
bindport=5060          ; UDP Port to bind to (SIP standard port is 5060)
bindaddr=0.0.0.0       ; IP address to bind to (0.0.0.0 binds to all)
;
[authentication]
;
[out-to-openser]
host=1.1.1.5
port=5060
qualify=no
type=peer
disallow=all
allow=ulaw
context=in-from-openser
canreinvite=no
nat=yes
;
[4000]
; Turn off silence suppression in X-Lite ("Transmit Silence"=YES)!
; Note that Xlite sends NAT keep-alive packets, so qualify=yes is not needed
type=friend
regexten=4000          ; When they register, create extension 4000
callerid="XLite" <4000>
host=dynamic           ; This device needs to register
;nat=yes               ; X-Lite is behind a NAT router
canreinvite=no         ; Typically set to NO if behind NAT
disallow=all
allow=ulaw
username=4000
secret=4000

There must be [general] and [authentication] sections at the beginning of the file, followed by the defined SIP gateways or devices. Note that we have defined the bind address as 0.0.0.0 in the general section. This means Asterisk will listen on all available IP addresses.

In this example, we have defined a SIP gateway called out-to-openser. In the details for this gateway, we have specified the IP address, port, and context of incoming calls. We denied all voice codecs, and then allowed only the ULaw codec (as used by OCS).

The last two statements in the ‘out-to-openser’ section are important. ‘canreinvite=no’ tells Asterisk that call flow must always go via Asterisk. It cannot be changed or reinvited to go directly. ‘nat=yes’ tells Asterisk that this gateway is using Network Address Translation, which means that data should always be sent via the address specified – in this case 1.1.1.5. The ‘out-to-openser’ section can be repeated to define other gateways, giving them a different name.

Next – we have an X-Lite soft phone client defined as extension 4000. Note the username & secret entries that define the username & password to allow this X-Lite client to authenticate to Asterisk.

As with the ‘out-to-openser’ section, this section can be repeated with a different name to define extra clients. Recommended settings for common gateways are included in the sip.conf that comes with the default Asterisk config files.
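Since the settings of a gateway section matter for the integration, it can be useful to check them automatically before restarting Asterisk. The sketch below is an assumption-laden illustration: it parses an abridged copy of the sip.conf above with Python's generic configparser module (which is not Asterisk's own parser and, for instance, rejects repeated allow= lines), and the helper names load_sip_conf and check_gateway are hypothetical.

```python
import configparser

# Abridged copy of the sip.conf shown above.
SIP_CONF = """
[general]
context=default ; Default context for incoming calls
bindport=5060 ; UDP Port to bind to
bindaddr=0.0.0.0 ; IP address to bind to

[authentication]

[out-to-openser]
host=1.1.1.5
port=5060
type=peer
disallow=all
allow=ulaw
context=in-from-openser
canreinvite=no
nat=yes
"""


def load_sip_conf(text):
    """Parse sip.conf-like text, treating ';' as the comment character."""
    parser = configparser.ConfigParser(
        comment_prefixes=(";",),
        inline_comment_prefixes=(";",),
        interpolation=None,
    )
    parser.read_string(text)
    return parser


def check_gateway(parser, name):
    """Return a list of problems found in the gateway section 'name'."""
    peer = parser[name]
    problems = []
    if peer.get("disallow") != "all" or peer.get("allow") != "ulaw":
        problems.append("codec list is not restricted to ulaw")
    if peer.get("canreinvite") != "no":
        problems.append("canreinvite is not set to no")
    if peer.get("context") is None:
        problems.append("no context defined for incoming calls")
    return problems


parser = load_sip_conf(SIP_CONF)
print(check_gateway(parser, "out-to-openser"))
```

An empty list means that the gateway section carries the three settings discussed above: a single ULaw codec, canreinvite=no, and a context for incoming calls.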

3.5.4.1.3 extensions.conf

The extensions.conf file holds the ‘dialplan’. This dialplan tells Asterisk what to do with incoming calls to extensions and in what context these calls should be dealt with.

In the sip.conf above, we defined a gateway (out-to-openser) and a context for that gateway (in-from-openser). We also defined an X-Lite client (4000) but we did not define a context for it. Extension 4000 will therefore use the default context, as specified at the top of the sip.conf file. We shall use these items in our ‘dialplan’. Here is our extensions.conf for this example. Again, we create this using vi:

[general]
;Any line starting with a semi colon ‘;’ is ignored
;
[globals]
;
[default]
;default is the context used by the X-Lite client
;
exten => _3XXX,1,Dial(SIP/+1222333${EXTEN}@out-to-openser)
exten => _3XXX,n,Hangup()
;
exten => _4XXX,1,Dial(SIP/${EXTEN})
exten => _4XXX,n,Hangup()
;
exten => _XXXXXX.,1,Dial(ZAP/g0/${EXTEN})
exten => _XXXXXX.,n,Hangup()
;
[in-from-openser]
;This context handles incoming calls from openser
;
exten => _+3XXX,1,Dial(SIP/+1222333${EXTEN:1}@out-to-openser)
exten => _+3XXX,n,Hangup()
;
exten => _+4XXX,1,Dial(SIP/${EXTEN:1})
exten => _+4XXX,n,Hangup()
;
exten => _+XXXXXX.,1,Dial(ZAP/g0/${EXTEN:1})
exten => _+XXXXXX.,n,Hangup()
;
[in-from-pstn]
;we have not defined how this Asterisk server will connect to the outside world
;we are using this context as an example of what to do with incoming calls
;from the outside world, but this gateway or device would need to be defined
;
exten => _12223333XXX,1,Dial(SIP/+${EXTEN})
exten => _12223333XXX,n,Hangup()
;
exten => _12223334XXX,1,Dial(SIP/${EXTEN:7})
exten => _12223334XXX,n,Hangup()

There must be a ‘General’ section and a ‘Globals’ section, followed by any other ‘contexts’ or macros. The configuration of Asterisk is now complete. The server should be restarted.
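To make the patterns used in the [default] context easier to follow, the sketch below emulates a small subset of Asterisk's pattern syntax in Python: a leading underscore marks a pattern, X matches any digit, Z matches 1 to 9, N matches 2 to 9, and a dot matches one or more remaining characters. The function names and the first-match lookup are our own simplifications; Asterisk itself orders competing patterns differently, so this is only an aid for checking which numbers a pattern accepts.

```python
import re


def pattern_to_regex(pattern):
    """Translate a subset of Asterisk extension patterns into a regex."""
    if not pattern.startswith("_"):
        # Without the underscore the extension is a literal string.
        return re.compile(re.escape(pattern) + r"\Z")
    classes = {"X": "[0-9]", "Z": "[1-9]", "N": "[2-9]"}
    parts = []
    for ch in pattern[1:]:
        if ch in classes:
            parts.append(classes[ch])
        elif ch == ".":
            parts.append(".+")      # one or more remaining characters
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts) + r"\Z")


# Patterns of the [default] context above, with a short description each.
DEFAULT_CONTEXT = [
    ("_3XXX", "OCS via out-to-openser"),
    ("_4XXX", "local SIP extension"),
    ("_XXXXXX.", "PSTN via ZAP/g0"),
]


def route(dialed):
    """Return the description of the first pattern that matches, or None."""
    for pattern, target in DEFAULT_CONTEXT:
        if pattern_to_regex(pattern).match(dialed):
            return target
    return None


for number in ("3001", "4000", "0123456", "123456"):
    print(number, "->", route(number))
```

Note that _XXXXXX. requires six digits plus at least one more character, which matches the rule that any number longer than 6 digits is sent to the outside world.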

3.5.4.2 Configuring OpenSER

The openser.cfg file needs to be edited so that its contents match those shown in Appendix D.

OpenSER can now be run from the command line. If nothing happens and no response is received when using this command, then OpenSER is NOT functioning correctly. Edit the openser.cfg file again and change the line log_stderror=no to log_stderror=yes. Run OpenSER again; the cause of the error should now appear. We recommend setting that line back to "no" when everything is working properly.

Also note that if OpenSER is already running, an error message will appear saying that the IP is already in use. If this is the case, restart the machine and try starting OpenSER again.

3.5.4.3 Configuring OCS

The basic Enterprise Edition or Standard Edition server should be running. Users should also have been enabled to log on to OCS. PC-to-PC calls should run smoothly. If they are outside the firewall, an Edge Server (which supports STUN and rate limiting) should be used. In the OCS MMC snap-in, right click on the OCS forest and select 'properties -> voice properties'.

3.5.4.3.1 Step 1: Add a Localization Profile

The dialog opens in the 'Localization Profiles' tab. Edit the default location profile or add a new one and edit the profile, so we can add a localization rule.

3.5.4.3.2 Step 2: Add a normalization rule

Within the localization rule, add a normalization rule. The rule translates numbers dialed by users to a standard format. Here we always translate to E.164 format, namely +

Fortunately, the Microsoft Communicator translates numbers automatically to this format, leaving out the +, (), spaces and dashes, even when starting a call from Microsoft Outlook. However, we have to add at least one rule. The normalization rule is written in .net regular expression format. In our case, we translate any number starting with a zero to an LB number: ^0(\d*)$ -> +961$1

In this expression, ^ anchors the match at the start of the dialed number and $ at its end. A number that starts with a 0 followed by any number of digits (\d*) therefore matches as a whole; the part between parentheses () is captured and is available as the variable $1 in the translation. The translation prepends +961 to it.
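The effect of this rule can be checked outside OCS. The following minimal sketch applies the same regular expression with Python's re module; the function name normalize is hypothetical, and the Python replacement syntax \1 plays the role of $1 in the .NET notation.

```python
import re


def normalize(dialed):
    """Apply the rule ^0(\\d*)$ -> +961$1 to a dialed number."""
    return re.sub(r"^0(\d*)$", r"+961\1", dialed)


for number in ("03208298", "+9613208298", "3001"):
    print(number, "->", normalize(number))
```

Numbers that do not start with a zero, such as an internal extension or a number already in E.164 format, are left unchanged by the rule.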

3.5.4.3.3 Step 3: Add phone usages and policies

When all settings of the 'Localization Profiles' tab are properly set, 'phone usages' can be added in the 'voice properties' dialog. Phone usages are useful for assigning call permissions to users and for logging these types of use. Be sure to also create at least one Policy in the Policy tab. A collection of phone usages forms a 'policy', and a user will later on be assigned a policy. This way, one can control which users are allowed what type of calls (for instance, disallow service numbers).

3.5.4.3.4 Step 4: Add a Route

Add a 'route' under the 'routes' tab. Here, it is defined that for numbers matching a certain regular expression, the call should be routed to our mediation server. The regular expression states in this case that any number should be forwarded:

^(\d*)$

Part of the route definition is the phone usages that are allowed for this route. See the Microsoft 'OCS_VoIp_Guide' document for further reference.

3.5.4.3.5 Step 5: Enabling users

Even though the users cannot really make external calls yet, we can enable them for Enterprise Voice. To do so, we need to go to the 'Active Directory Users' MMC snap-in. Go to the entry of a test user and select its properties. In the 'Communications Tab' choose 'additional options' -> Configure, and in the new dialog select 'Enable Enterprise Voice': Fill in the line URI, which can be of the form: tel:+ (yes, the '+' again!)

3.5.4.4 Configuring OCS Mediation Server

Install an OCS Mediation Server on a machine running Microsoft Windows 2003 server. The Mediation Server software can be found in the OCS installer under 'additional server roles'. Configure the Mediation Server as follows:

Next Hop: Gateway Listening IP address: the external IP address of the Mediation Server

Location Profile: Choose the location profile as chosen before (i.e. 'Utrecht')

Gateway IP address:

3.5.5 Validation, Confirmation, and Troubleshooting

3.5.5.1 Tools for Testing

Configure MS Office Communicator to log on to OCS and provide the login account with a 'line URI' as mentioned above. Start typing a telephone number or (Outlook) contact in the search bar or right-click on one of the contacts to dial the number. The communicator should ring, as well as the chosen number.

A useful tool to debug SIP message flows can be downloaded from the MS site:

http://www.microsoft.com/downloads/details.aspx?familyid=7b6ab4f3-2949-4e97-856e-9c4ae323c75a. It can be used by right-clicking on the OCS pool or server and choosing 'Logging tool->Start debugging session'. Choose 'SIPstack' in the options and click 'start' before setting up a test call. After pressing 'stop' and 'analyses', the SIP message flow between the OCS nodes is shown.

For debugging on the OpenSER machine, we can use ngrep or the command-line version of Wireshark (tshark):

tshark -i any -n -V -R sip

3.5.5.2 Troubleshooting

For trouble-free operation, ensure that the network is running AT LEAST 100 Mbps switches, with 1 Gbps switches preferred. OCS offers not only voice but conferencing and video as well. The network needs to be as well configured as possible to handle all the real-time communications traffic.

Asterisk and OpenSER are extremely reliable once configured and running. Practical experience has shown a restart once per month is perfectly adequate. If any changes are made, a restart is recommended. Based on our experience, the most serious problems seem to be caused by the Mediation server.

Although there will be the ‘mistakes’ of misconfigured numbers in OCS or incorrect number routing in Asterisk, they are easily fixed. The serious, illogical, untraceable and seemingly unfixable errors are caused by the Mediation server.

When experiencing calls which work off and on, the Mediation server needs to be rebuilt. This cures 99% of problems. This has been our experience on many installs.

3.5.6 Future Support

3.5.6.1 Will TCP be supported by Asterisk?

Since the main problem between Asterisk and OCS is that the first uses UDP while the other uses TCP, one can install an unsupported patch onto Asterisk to add SIP over TCP support. This option saves time and leads to a simpler architecture because a third-party translator (OpenSER) is no longer required. Although it is advantageous to do so, it is not recommended since future updates could negate the patch. The Asterisk community says that TCP and TLS support for SIP will be available in Asterisk soon. Even when TCP is supported by Asterisk, some real-time experiments should be made before it can be considered reliable. Practical implementations will provide accurate feedback.

3.5.6.2 Will UDP be supported by Microsoft?

Although some believe Microsoft will implement SIP over UDP, OCS is not expected to support UDP because of the following three issues with UDP:
• UDP is not encrypted. This means that one cannot ensure end-to-end security of SIP messages.
• UDP has a fundamental flaw for large SIP messages. This is a major issue that needs to be taken into consideration. Since a UDP datagram that exceeds the usual Ethernet MTU of 1500 bytes must be fragmented, large SIP messages will be broken into many packets. OCS SIP messages tend to be large since they contain various XML bodies and machine-generated unique IDs. This means they will normally span multiple packets.
• UDP is a "fire and forget" protocol, which means that the transport layer does not consider lost or delayed packets. UDP does not guarantee delivery.
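The size issue is also reflected in the SIP specification itself: RFC 3261 (section 18.1.1) requires a congestion-controlled transport such as TCP when a request comes within 200 bytes of the path MTU, or when it is larger than 1300 bytes and the path MTU is unknown. The sketch below encodes that rule; the function name must_use_tcp is hypothetical and the treatment of the exact boundary value is our own reading.

```python
def must_use_tcp(message_size, path_mtu=None):
    """Return True when a SIP request of this size should not be sent over UDP.

    message_size and path_mtu are in bytes; path_mtu=None means "unknown".
    """
    if path_mtu is None:
        return message_size > 1300
    return message_size > path_mtu - 200


# A short INVITE fits into one datagram; a large OCS-style message does not.
print(must_use_tcp(800))          # path MTU unknown
print(must_use_tcp(1400))         # path MTU unknown
print(must_use_tcp(1350, 1500))   # known Ethernet MTU
```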

3.6 The PBX Accessing Office Applications

3.6.1 The PBX Accessing MS Excel

In this scenario, we will provide a special feature to the General Manager (GM), who will be able to get information about the attendance of his employees over the phone. Suppose we have a hand-punch machine installed inside the company to register the attendance of the employees. This machine saves the in/out entries in a text file. Each time a report is needed, Excel is used to import the entries from the text files. We are going to take this idea one step further so that the GM will know which employee was the last one to come to the company. The flow of events is as follows:
• Employees come in the morning and check in
• Entries are saved to a text file
• The GM will call the company
• The PBX will handle the call and contact a web page
• The web page will open Excel, import data from the text file, and sort records according to the TimeIn field in descending order so that the employees who came late are on top of the list
• The web page will return the phone number / extension of the employee who appears on the top
• The PBX will take this number, call it, and transfer the call to the GM
• The GM will talk to the employee

Such a solution gives the GM an efficient tool to follow up the employees without being in the office. All that the GM has to do is to call a number and he will be redirected to the last employee who checked in.

We will start with the text file that includes the entries. It is a tab-delimited file that will be saved as C:\HandPunchEntries.txt. The following table shows sample entries (column names make the first row):

EmpName   Phone      Date         TimeIn   TimeOut
Ahmad     70100201   12/09/2009   08:00
Daniel    71234432   12/09/2009   09:10
Pascal    03208298   12/09/2009   08:20
Joe       03123456   12/09/2009   08:02
George    70998877   12/09/2009   08:13
Elie      03223344   12/09/2009   07:59

The ASP page that will do all of this is presented in Figure 3.13.


<%
Dim xl, wb, ws, qr
Set xl = Server.CreateObject("Excel.Application")
Set wb = xl.Workbooks.Add
Set ws = wb.ActiveSheet
' Import the tab-delimited hand-punch file into the sheet, starting at cell A1
Set qr = ws.QueryTables.Add("TEXT;C:\HandPunchEntries.txt", ws.Range("$A$1"))
With qr
    .Name = "HandPunchEntries"
    .FieldNames = True
    .RowNumbers = False
    .FillAdjacentFormulas = False
    .PreserveFormatting = True
    .RefreshOnFileOpen = False
    .RefreshStyle = 1 'xlInsertDeleteCells
    .SavePassword = False
    .SaveData = True
    .AdjustColumnWidth = True
    .RefreshPeriod = 0
    .TextFilePromptOnRefresh = False
    .TextFilePlatform = 720
    .TextFileStartRow = 1
    .TextFileParseType = 1 'xlDelimited
    .TextFileTextQualifier = 1 'xlTextQualifierDoubleQuote
    .TextFileConsecutiveDelimiter = False
    .TextFileTabDelimiter = True
    .TextFileSemicolonDelimiter = False
    .TextFileCommaDelimiter = False
    .TextFileSpaceDelimiter = False
    .TextFileColumnDataTypes = Array(2, 2, 4, 2, 2)
    .TextFileTrailingMinusNumbers = True
    .Refresh False
End With
' Sort by Date, then by TimeIn, both in descending order,
' so that the last employee to arrive ends up in row 2
ws.Sort.SortFields.Clear
ws.Sort.SortFields.Add ws.Range("C2:C7"), 0, 2, 0
ws.Sort.SortFields.Add ws.Range("D2:D7"), 0, 2, 0
With ws.Sort
    .SetRange wb.ActiveSheet.Range("A1:E7")
    .Header = 1 'xlYes
    .MatchCase = False
    .Orientation = 1 'xlTopToBottom
    .SortMethod = 1 'xlPinYin
    .Apply
End With
' Return the phone number of the employee on top of the list
Response.Write ws.Range("B2")
' Close Excel without saving so that no Excel process is left on the server
wb.Close False
xl.Quit
Set ws = Nothing
Set wb = Nothing
Set xl = Nothing
%>
Figure 3.13 - ASP Page Used by the PBX to Retrieve Data through Excel
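The logic of the page, independent of Excel, is simply "parse the tab-delimited entries, find the latest arrival, and return the phone number". The sketch below illustrates this with the sample entries given earlier. It is not the implementation used by the PBX: the function name last_arrival_phone is hypothetical, and the dates are assumed to be in day/month/year order.

```python
import csv
import io
from datetime import datetime

# Sample entries from the hand-punch file (tab-delimited, header in first row).
SAMPLE = (
    "EmpName\tPhone\tDate\tTimeIn\tTimeOut\n"
    "Ahmad\t70100201\t12/09/2009\t08:00\t\n"
    "Daniel\t71234432\t12/09/2009\t09:10\t\n"
    "Pascal\t03208298\t12/09/2009\t08:20\t\n"
    "Joe\t03123456\t12/09/2009\t08:02\t\n"
    "George\t70998877\t12/09/2009\t08:13\t\n"
    "Elie\t03223344\t12/09/2009\t07:59\t\n"
)


def last_arrival_phone(stream):
    """Return the phone number of the employee with the latest check-in."""
    rows = list(csv.DictReader(stream, delimiter="\t"))
    if not rows:
        return None

    def arrival(row):
        # Assumption: dates are written as day/month/year.
        stamp = row["Date"] + " " + row["TimeIn"]
        return datetime.strptime(stamp, "%d/%m/%Y %H:%M")

    # Phone numbers stay strings so that leading zeros are preserved.
    return max(rows, key=arrival)["Phone"]


print(last_arrival_phone(io.StringIO(SAMPLE)))
```

With the sample data, the latest check-in is Daniel's at 09:10, so his number is the one returned to the PBX.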

3.6.2 The PBX Accessing MS Word

We will show how to create a web page that will:
• receive a request
• open a WinWord document
• perform the request
• return data to the PBX

The caller can:
• send a word and receive a list of synonyms (thesaurus)
• open a document so that Festival will read it
• open a document and send it by fax
• open a template, fill simple fields, and send it by email

In this scenario, we are going to present the last option. The user will be able to call the PBX, which will call a web page that in turn calls WinWord. The latter will open a template and fill in the user-supplied values. Finally, the document will be sent as a fax. This shows the power of integrating two such business applications. That way, salesmen do not need to wait until they come back to their offices. They have their offices within their phones. They can call, do limited data entry and send a document as a fax originating from their company's number.

Suppose the template is as follows:

Company Logo TodayDate

Offer

Dear Mr. [ ClientName ]

Based on your request number: [ Reference ], Please be informed that I can offer you a very special discount that can be up to [ DiscountPercent ] %. This is true only if the quantity of your order is not less than [ Quantity ] items.

You will have to pay 50 % upon signature, 25 % 1 month later, and 25% upon delivery. This offer is valid for 15 days

Please feel free to contact me any time. My mobile phone number is: 00961 3 208298. My email address is: [email protected]

Regards Account Manager Ahmad Hammoud

The fields in the above template are shaded. They will be replaced by user-supplied values. All that our dialplan should do is to ask the user to provide 5 values (the date at the top of the page is not counted since it is not supplied by the user; the fifth parameter is the destination fax). The PBX will call a web page and send those 5 parameters, together with the caller ID, to it. The page will:
• open WinWord
• open the template
• fill in the values
• save the document in the sent faxes folder
• send the fax
• close the document
• exit WinWord

Figure 3.14 shows how the asp page will do these tasks.

<%
Dim w
Set w = Server.CreateObject("Word.Application")
Dim d
Set d = w.Documents.Open("C:\Templates\Template13.doc")
DocName = "c:\SentFaxes\"
DocName = DocName & "by (" & Request.QueryString("CallerID") & ")"
DocName = DocName & " " & year(date) & "-" & month(date) & "-" & day(date)
DocName = DocName & " " & hour(now) & "." & minute(now) & "." & second(now)
DocName = DocName & ".doc"
d.SaveAs DocName
w.Selection.Find.ClearFormatting
w.Selection.Find.Text = "[TodayDate]"
w.Selection.Find.Execute
w.Selection.Text = Date
w.Selection.MoveRight 1, 1
w.Selection.Find.Text = "[ClientName]"
w.Selection.Find.Execute
w.Selection.Text = "..."
w.Selection.MoveRight 1, 1
w.Selection.Find.Text = "[Reference]"
w.Selection.Find.Execute
w.Selection.Text = "..."
w.Selection.MoveRight 1, 1
w.Selection.Find.Text = "[DiscountPercent]"
w.Selection.Find.Execute
w.Selection.Text = "..."
w.Selection.MoveRight 1, 1
w.Selection.Find.Text = "[Quantity]"
w.Selection.Find.Execute
w.Selection.Text = "..."
w.Selection.MoveRight 1, 1
w.Selection.WholeStory
w.Selection.Range.HighlightColorIndex = 0
d.Save
' some configuration should be made before the following command succeeds
' to make sure it runs, try sending a fax manually from WinWord
' if you could send it manually, the following will succeed
d.SendFax Request.Querystring("DestinationFaxAddress")
d.Close False
w.Quit
Set d = Nothing
Set w = Nothing
Response.Write "DONE"
%>
Figure 3.14 - ASP Page Used by the PBX to Communicate with MS Word
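The two text-handling steps of this page, replacing the bracketed fields and building the name of the saved document, can be illustrated without WinWord. The sketch below is only an illustration under our own assumptions: the function names fill_template and document_name are hypothetical, placeholders are matched with or without spaces inside the brackets, and unknown placeholders are left untouched.

```python
import re
from datetime import datetime


def fill_template(template, values):
    """Replace [FieldName] placeholders by the matching entry of 'values'."""
    def substitute(match):
        name = match.group(1)
        # Unknown fields are left as they are in the text.
        return str(values.get(name, match.group(0)))

    return re.sub(r"\[\s*(\w+)\s*\]", substitute, template)


def document_name(caller_id, when):
    """Build the file name the same way DocName is concatenated above."""
    return "by ({}) {}-{}-{} {}.{}.{}.doc".format(
        caller_id, when.year, when.month, when.day,
        when.hour, when.minute, when.second,
    )


text = "Dear Mr. [ ClientName ], based on your request number: [ Reference ]"
print(fill_template(text, {"ClientName": "Smith", "Reference": "42"}))
print(document_name("4000", datetime(2009, 9, 12, 8, 5, 3)))
```

The values "Smith", "42" and "4000" are made-up sample inputs; in the scenario above they would come from the digits and choices entered by the caller.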

There are two things that are worth mentioning:
1. Make sure IIS has permission to save in the destination folder.
2. If the above web page does not run as expected, WinWord needs to be allowed to run as a component. To solve this issue, do the following:
   a. Open Control Panel > Administrative Tools > Component Services
   b. Go to Console Root > Component Services > Computers > My Computer > DCom Config > Microsoft Office Word 97 – 2003 Document
   c. Right-click on this node, then click Properties
   d. Go to the last tab (Identity) and provide a username and password. This user should have permission to run the application
   e. Click OK

If any errors occur, stay in the Component Services console, since the solution is most probably there. Consider right-clicking on My Computer > Properties, then check the following 2 tabs:
• COM Security
• Default Properties

3.6.3 The PBX Interoperating with MS Office Communicator

'Integration of Microsoft OCS and Asterisk' has been discussed in detail in a previous section. The focus was on the integration from the 'call' side. In this section, the focus is on the integration from the 'Instant Messaging' side. Previously, OpenSER was used in order to:
- enable users who are connected to an Asterisk PBX to make calls from their Microsoft Office Communicator to the PSTN
- enable users who are connected to the OC to be dialed from the PSTN
- enable users to receive calls on the analog / IP hard phone, soft phone or Office Communicator.

In this section, a different scenario will be presented. A user will use his/her cell phone to call his/her colleague. Suppose the colleague is busy because he/she is on the phone talking to a client. The caller will be able to send an instant message to the colleague from his/her phone. That way, the integration is complete and the solution is efficient.

To do so, we will use Microsoft UCCAPI. The steps are as follows:
• We developed an agent that is always running
• The agent opens a connection to the OCS and waits for IM requests
• An Asterisk caller wants to send an instant message to a colleague
• Asterisk will give the caller a list of predefined messages
• The caller will decide which message to send to his colleague
• The caller will have to provide his numeric password because communication with his colleague occurs using his own credentials, as if he is sitting behind his desk and using his laptop
• Asterisk will call a web page that will place the IM request inside a database
• The agent keeps checking the database for IM requests
• Whenever a request is found, it is processed immediately and moved to a log table

Table 3.1 shows sample messages that can be sent to Office Communicator users.

Table 3.1 – IM Messages that will Appear in the Office Communicator

Message ID   Meaning
1            Answer me now. I am calling and you are not answering.
2            I called but you did not reply. I will call again
3            Call me ASAP. I called but you did not answer
4            I sent you an email. Check it and call me
5            Send me SMS when you are available

Table 3.2 shows two sample records of the table that handles the IM requests.

Table 3.2 - Sample Records in the OC Requests Table

RequestID   UserID   UserToNotify   RequestDate              MessageID
1           1        2              23/04/2009 02:08:56 pm   1
2           2        3              24/04/2009 01:13:37 pm   1

When a request is processed, it is moved to another table that has the same structure. This has two benefits:
1. The requests table is kept nearly empty, which improves the performance.
2. Processed requests are documented and stored in a special Log table.

The small program that processes the IM request is just an updated version of a program developed by Microsoft called UCCAPI. It has been changed to get requests from the database (instead of from the user) and proceed with them.

3.7 The PBX Exploring the Hard Disk

This scenario allows callers to retrieve the names of their folders and files through the phone. The caller might have a shared folder and want to navigate until he/she reaches the targeted file. When he/she finds the document, he/she can:
• Open it using WinWord.
• Open it using Excel.
• Instruct Festival to read the content.
• Fill in some fields.
• Send the document by mail.
• Send the document by fax.

All of those tasks are described in previous examples. In this section, we are interested in allowing the users to explore their shared folder through the phone. We need to create 3 web pages:
• get_caller_folder.asp - this page will receive 1 parameter only, the CallerID, and return the user's folder where the navigation starts. It will use the function InboxFromActiveDir() (presented in Figure 3.7) to retrieve the username and concatenate it to a shared folder path. For example, all users' documents might be on a shared folder on the file server. Suppose this path is: \\fileserver\shared\. Inside the shared folder, each user has his own folder that includes all his documents. If this is the case, all that we need to do is to concatenate "\\fileserver\shared\" to the name of the user which is retrieved from the Active Directory. Using code similar to the one used in the function InboxFromActiveDir(), other details - such as home folder - can be retrieved.
• get_folder_count.asp - this page will receive 2 parameters:
  o folder to explore
  o what to return – this can be:
    - either count of files
    - or count of subfolders
• get_only_1.asp - this page will receive 3 parameters:
  o parent - folder to explore
  o what to return – files / folders
  o index – if this is 3, it means return the third file / folder in the parent folder

Again, note that the firewall should block any access to above pages if the request did not originate from the PBX server.

By using the above pages, we can:
1. get the folder to explore (get_caller_folder.asp)
2. ask the caller to press:
   o one if he wants a list of files
   o two if he wants a list of subfolders
3. set N = number of entries inside the folder (get_folder_count.asp)
4. call page (get_only_1.asp) N times. Each time we:
   o receive a file/folder name
   o instruct Festival to read it to the caller
   o tell the user to press 1 if he wants this file / folder
   o wait for 2 seconds
   o if user pressed 1, go to step 2
   o else continue enumerating files / folders

We will start with the web pages and finally present the dialplan. The code of the first page (get_caller_folder.asp) is simple and shown in Figure 3.15. We suppose here that the name of the shared folder is the same as the name of the Inbox.

<%
Function InboxFromActiveDir(PhoneNumber)
    ' refer to the Figure 3.7
End Function
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Response.Write InboxFromActiveDir(Request.QueryString("CallerID"))
%>

Figure 3.15 - ASP Page Used by the PBX to Get the Shared Folder of the Caller


The second page (get_folder_count.asp) is presented in Figure 3.16. This page returns either the number of the files or the number of the subfolders found inside a specified folder

<%
Set fs = CreateObject("Scripting.FileSystemObject")
Set fo = fs.GetFolder(Request.QueryString("path"))
if Request.QueryString("return") = "1" then
    Response.Write fo.Files.Count
else
    Response.Write fo.SubFolders.Count
end if
%>
Figure 3.16 - ASP Page Used by the PBX to Count Items inside a Folder

The third page (get_only_1.asp) is presented in Figure 3.17. This page returns either the name of a file or the name of a folder. The page receives a “return” parameter that specifies what to return (1 for files and 2 for folders). It also receives another parameter that is the number of the file/folder to be returned. For example, calling the page “get_only_1.asp” with the parameters “index”, “return”, and “path” set equal to 4, 1, and “my offers” respectively means: return the name of the fourth file in the folder called “my offers ”.


<%
Dim fs, fo, f, counter, Idx
Set fs = CreateObject("Scripting.FileSystemObject")
Set fo = fs.GetFolder(Request.QueryString("path"))
counter = 0
' index is 1-based: 1 returns the first file / folder
Idx = CLng(Request.QueryString("index"))
If Request.QueryString("return") = "1" Then
    ' walk through the files until the requested position is reached
    For Each f In fo.Files
        counter = counter + 1
        If counter = Idx Then
            Response.Write f.Path
            Response.End
        End If
    Next
Else
    ' same walk, but through the subfolders
    For Each f In fo.SubFolders
        counter = counter + 1
        If counter = Idx Then
            Response.Write f.Path
            Response.End
        End If
    Next
End If
%>
Figure 3.17 - ASP Page Used by the PBX to Get the Nth Entry in a Folder
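For readers who want to test the enumeration logic away from IIS, the behaviour of the two pages get_folder_count.asp and get_only_1.asp can be sketched as follows. The function names are hypothetical, and the entries are sorted by name to obtain a stable order, which is our own choice (the ASP pages rely on the order in which the file system object enumerates the entries).

```python
from pathlib import Path


def list_entries(folder, kind):
    """List files when kind is "1", subfolders otherwise (the 'return' parameter)."""
    paths = Path(folder).iterdir()
    if kind == "1":
        selected = [p for p in paths if p.is_file()]
    else:
        selected = [p for p in paths if p.is_dir()]
    return sorted(selected)


def count_entries(folder, kind):
    """Counterpart of get_folder_count.asp."""
    return len(list_entries(folder, kind))


def nth_entry(folder, kind, index):
    """Counterpart of get_only_1.asp; index is 1-based, None if out of range."""
    entries = list_entries(folder, kind)
    if 1 <= index <= len(entries):
        return str(entries[index - 1])
    return None
```

A caller of these functions would first ask for the count N and then request the entries 1 to N one at a time, exactly as the PBX does with the two pages.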

The dialplan that will combine all of the above ideas is presented in Figure 3.18.


; STEP 1 - get the folder to explore
exten=1000,n,Set(folder=${CURL(${Web}get_caller_folder.asp?CallerID=${CALLERIDNUM})})
; STEP 2 - ask the caller to enter 1 for a list of files or 2 for a list of folders
; the file "press1files2folders" will be played
; save caller's decision to variable TYP and give him 2 seconds to decide
exten=1000,n(filfol),Read(TYP,${media}press1files2folders,1,,,2)
; STEP 3 - set N = number of entries inside the folder
exten=1000,n,Set(N=${CURL(${Web}get_folder_count.asp?return=${TYP}&path=${folder})})
; STEP 4 - call page (get_only_1.asp) N times
; COUNT runs from 1 to N inclusive because the index of get_only_1.asp is 1-based
exten=1000,n,Set(COUNT=1)
exten=1000,n,While($[${COUNT} <= ${N}])
; get a name of file / folder
exten=1000,n,Set(OneEntry=${CURL(${Web}get_only_1.asp?index=${COUNT}&return=${TYP}&path=${folder})})
; instruct Festival to read the name
exten=1000,n,Festival(${OneEntry})
; ask the caller if this is the file / folder he is seeking
exten=1000,n,Read(Found,${media}press1wanted,1,,,2)
; go to next line if user pressed 1 (this is what he wants) else goto NextEntry
exten=1000,n,GotoIf($[${Found}==1]?:NextEntry)
; if this is a folder, change the folder to be explored
; and go again to where we explore its contents (filfol)
exten=1000,n,GotoIf($[${TYP}!=1]?:FoundMyFile)
exten=1000,n,Set(folder=${OneEntry})
exten=1000,n,Goto(filfol)
; else send the full path to a macro (this is the file the caller was seeking)
; the macro may open the file and read it, or send it by fax / mail
exten=1000,n(FoundMyFile),Macro(DealWithFile,${OneEntry})
exten=1000,n,Goto(bye)
exten=1000,n(NextEntry),Set(COUNT=$[${COUNT} + 1])
exten=1000,n,EndWhile()
exten=1000,n(bye),NoOp(done)
Figure 3.18 - The Dialplan to Explore the Hard Disk

PART 4 - ADVANCED APPLICATIONS

4.1 The PBX Enhancing Security

Any software PBX can be used to enhance the level of security applied to computer networks. In this part, we will present several scenarios through which we combine the phone and the computer networks. This is very important because the network phone is accessible through land lines, mobile lines, international lines, or IP/soft phones connected to the LAN, WLAN, WAN, or even the internet. This diversity allows the user to reach the network anytime from almost anywhere. The following sections present many implementations that make use of the PBX as a security tool.

4.1.1 The PBX Accessing MS Active Directory

In this scenario, we introduce a way to interact with Active Directory other than the one described in the section entitled "The PBX Accessing MS Exchange Server". Here, we allow an IT manager to reset the password of a user through his/her mobile phone. At this point, some may ask whether such an option is secure. A couple of measures can be enforced to increase the level of security:
• First, the CallerID allows us to check that the caller is the IT manager and not some hacker trying to breach our security.
• Second, the caller is asked to enter a numeric password before he/she can continue.

This is very similar to the way credit cards are used. The user has a card (the phone in our case) and a PIN code (the numeric password in this scenario). A hacker who has the card alone can do nothing with it because he/she still needs the PIN code. Likewise, a hacker who has stolen the mobile phone of the IT manager can do nothing without the numeric password. More restrictions can also be added. For example, if a request is made during working hours, reject it and send an email to the IT manager. If the user whose password is to be reset belongs to the Administrators group, reject the request and send an email, and so on.
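The checks described above can be summarized as a single decision function. The following Python sketch is only an illustration of the reasoning, not the code used in our implementation: the caller ID, the PIN, the working hours, and the function name are all hypothetical values chosen for the example.

```python
from datetime import time

# Hypothetical values for illustration only; none of them come from the thesis.
IT_MANAGER_CALLER_ID = "0123456789"
EXPECTED_PIN = "4821"
WORK_START, WORK_END = time(8, 0), time(17, 0)
PROTECTED_GROUPS = {"Administrators"}


def may_reset_password(caller_id, pin, target_groups, now):
    """Return (allowed, reason) for a password-reset request made by phone."""
    # The "card": the request must come from the IT manager's phone.
    if caller_id != IT_MANAGER_CALLER_ID:
        return False, "unknown caller"
    # The "PIN code": the numeric password keyed in by the caller.
    if pin != EXPECTED_PIN:
        return False, "wrong PIN"
    # Extra restriction from the text: reject requests made during working hours.
    if WORK_START <= now <= WORK_END:
        return False, "request made during working hours"
    # Extra restriction from the text: never reset members of protected groups.
    if PROTECTED_GROUPS & set(target_groups):
        return False, "target belongs to a protected group"
    return True, "ok"
```

In such a design, every rejected request would also trigger the email notification mentioned above; that part is omitted here.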


Before we proceed, we should note that an ASP page will be called with the username sent as a parameter. The web page will reset the password of this user. IIS, which runs the page, must have enough privileges to do so. Thus, IIS needs to run under an account that can connect to Active Directory and reset a password.

The ASP page that does all of this is presented in Figure 4.1.


<%
strDomain = "192.168.10.10"
Dim Computer
Dim User
Set Computer = GetObject("WinNT://" & strDomain)
Computer.Filter = Array("User")
For Each User In Computer
    If User.Name = "Administrator" Then
        User.SetPassword "our-secret"
        Response.Write "DONE"
        Response.End
    End If
Next
%>

Figure 4.1 - ASP Page Used by the PBX to Contact the Active Directory

4.1.2 The PBX Acting as Servers' Guard

In this scenario, we present one example in which we enable the caller to get critical information. We will allow the PBX to contact one of the servers and check for shared folders. If the sharing situation is not as expected, an immediate action should be taken.

It would be an efficient feature if the PBX could check every five minutes for the number of shared folders. If this number is greater than a known threshold, the PBX will call the concerned IT people and tell them about this violation.

Unlike previous scenarios, the PBX is the one taking action. It can check for many other things such as the temperature of the server and event viewer entries. A long list of queries that can be used to retrieve very useful information is found in Appendix C. Whenever the PBX detects an unexpected situation, it will immediately take appropriate action. It can also be instructed to check server backups. If they are missing or corrupted, the network administrator will receive a phone call.

First, let's discover the shares.asp page. It receives 3 parameters:
• S = name of the server to be checked
• U = username to use. If the server is the same one that hosts the ASP page, there may be no need to provide a username
• P = password to use

The page returns a string that includes the names of all shares. This string is returned to the PBX, where it is compared to another string that includes the names of the expected shares. If the two strings are not equal, the set of shares has changed, which typically means that a new share was added without authorization. The code of the ASP page “shares.asp” is presented in Figure 4.2.


<%
strComputer = "."
UserName = "localadmin"
Password = "p@ssw0rd"
Set SWBemlocator = CreateObject("WbemScripting.SWbemLocator")
Set objWMIService = SWBemlocator.ConnectServer(strComputer, "\root\CIMV2", UserName, Password)
Set colItems = objWMIService.ExecQuery("Select * from Win32_Share",,48)
For Each objItem in colItems
    shares = shares & " " & objItem.Name
Next
Response.write shares
%>

Figure 4.2 - ASP Page Used by the PBX to Detect Shares

Again, an asp page such as shares.asp should never be allowed to be accessed by any server other than the PBX. Make sure the firewall is configured to block any access to this page if the request is not initiated by the PBX server.

As one can see in the above code, only the Name property is used. The following is a list of other properties that can be used:
• objItem.AccessMask
• objItem.AllowMaximum
• objItem.Caption
• objItem.Description
• objItem.InstallDate
• objItem.MaximumAllowed
• objItem.Path
• objItem.Status
• objItem.Type

Before running this page, there is a need to give appropriate permission so that IIS can retrieve the required data. This can be accomplished as follows:

• Go to Control Panel > Administrative Tools > Computer Management > Services and Applications
• Right-click on WMI Control and select "Properties"
• Go to the "Security" tab
• Select Root / CIMV2 and click "Security"
• Give appropriate permissions

The PBX will do the following:
1. call the page and get the returned value
2. do the comparison
3. if the strings match, go to step 1 after 5 minutes
4. else call the network administrator and tell him/her about the violation

The dialplan is presented in Figure 4.3.


; step 1 - call the page and get returned value
exten=1000,n(Check),Set(shares=${CURL(${Web}shares.asp?S=PDC&U=IT&P=secret)})
; set the expected string
exten=1000,n,Set(expected= …)
; step 2 and 3 - do the comparison and branch accordingly
; (both strings are quoted because they may contain spaces)
exten=1000,n,GotoIf($["${shares}"=="${expected}"]?:nomatch)
; if strings are matching, wait for 5 minutes and then go to check again
exten=1000,n,Wait(300)
exten=1000,n,Goto(Check)
; if not matching, call the network administrator to tell him
exten=1000,n(nomatch),Dial(Zap/4/1234567)
exten=1000,n,Festival(${shares})
exten=1000,n,Hangup()

Figure 4.3 – Dialplan to Detect Changes in Shares
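A plain string comparison reports a violation even when the same shares are merely returned in a different order. As a possible refinement, the comparison could be done on sets of names. The Python sketch below illustrates the idea; the function name is hypothetical, and it assumes that share names contain no spaces, since shares.asp separates them with a single space.

```python
def unexpected_shares(reported, expected):
    """Return the sorted share names found in `reported` but not in `expected`.

    Both arguments are space-separated strings, as returned by shares.asp.
    Assumption: share names themselves contain no spaces.
    """
    return sorted(set(reported.split()) - set(expected.split()))


# Same shares in a different order: nothing to report.
print(unexpected_shares(" ADMIN$ C$ IPC$", "C$ IPC$ ADMIN$"))
# A share that is not on the expected list is singled out.
print(unexpected_shares(" ADMIN$ C$ IPC$ Temp", "C$ IPC$ ADMIN$"))
```

With this variant, the PBX could read only the offending names to the administrator instead of the whole list.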

4.1.3 The PBX Protecting Highly Secured LANs

4.1.3.1 Abstract

Highly secured LANs connected to the internet face a huge intrusion problem. Many IDSs, IPSs and firewalls have been developed to detect and prevent intrusion, but there is a good chance of penetration if LANs are targeted by professionals who know the back doors of those products. In this section, we propose a solution for those servers that do not need to be connected to the internet all the time. A PBX machine connected to the PSTN will physically connect a server to the internet upon the request of a network administrator who wants to reach the LAN remotely for an urgent task, and will disconnect it again when the task is done. The solution includes other important scenarios, such as instructing the PBX to add a rule to the firewall so that only an IP address supplied by the caller is allowed to reach the LAN.

4.1.3.2 Introduction

For highly secured resources, the availability can be sacrificed to achieve higher levels of security. In such systems, critical servers should be extremely isolated. When there is a need to connect them to the internet, this connection should occur for minimum time. Security is as strong as its weakest link (Tiri et al, 2006). In many cases, LANs are connected to the internet because there is a need to log in to some server through the internet to do a couple of tasks. This is the weakest point in the chain because outside attackers will be able to try to reach the internet-connected LAN. They do not have any restrictions regarding time or place. They take advantage of this permanent connectivity to breach the security and reach the LAN. The potential outside attackers include organized criminals, international terrorists, and even hostile governments. Such attackers have very skilled teams that can reach the LAN if the targeted server is always connected to the internet.

This section suggests keeping the internet-connected server shut down or physically disconnected. When the administrator needs urgent access to the LAN from outside, the PBX will connect the server to the internet so that the administrator can log in and do whatever is required. When he/she is done, the PBX will disconnect the server again. This will prevent outside attackers from having enough time to try their penetration techniques.

4.1.3.3 Previous Work

For decades, security engineers have been trying to find efficient ways to minimize intrusion risks. They came up with commercial solutions like firewalls, IPSs (Intrusion Prevention Systems), and IDSs (Intrusion Detection Systems). Those products monitor the traffic and check the validity of a request to a permanently-internet-connected server. This approach might be efficient if the LAN is targeted by amateur pranksters, or professional intruders hired by business competitors. However, there is no guarantee that those products will not be penetrated by their manufacturers or even some disgruntled engineers or designers. That is why a physically disconnected server is safer. This study describes the role of the PBX in connecting a server to the internet for a short period of time and then disconnecting it again.

4.1.3.4 The Threat

All law enforcement agencies are highly interested in "key escrow" or "key-recovery" mandates (Solveig, 1998). This is why one should not blindly trust commercial intrusion systems or consider them penetration-proof solutions. The possibility of being penetrated is never zero. Security engineers cannot guarantee that the software or hardware packages used to prevent intrusion will immunize the LAN against all kinds of attacks, including those launched by highly professional hackers or law enforcement agencies. In fact, security history shows that there are plenty of tricks, tools, and vulnerabilities that hackers can exploit. The level of trust in existing commercial intrusion systems will continuously be affected by the emergence of hacking tools, especially those that modify the system logs or launch attacks based on weaknesses found in the protocols in use. Many techniques allow hackers to fragment packets so that they evade the IDS and are reassembled only at the target. Evasion attacks disrupt stream reassembly and cause the IDS to miss parts of the stream. Some rootkits might disable logging altogether or remove portions of logs that reveal their presence (EC-Council, 2004).

This study does not propose getting rid of solutions such as IDS / IPS systems. Instead, it suggests a new approach through which the server is connected to the internet for minimal time.

4.1.3.5 IP-based Access Scenario

In this scenario, we suppose that the network administrator needs to log in to the LAN from anywhere. Access cannot be restricted in advance to a fixed list of IP addresses, because the addresses from which the network administrator will request the connection cannot be known ahead of time. However, the PBX can be part of a solution through which the IP address is supplied by the caller. First, the network administrator determines his/her IP address through the use of www.whatismyip.com. Then, he/she calls the PBX, which will ask him/her to enter a 14-digit password. If the password is valid, the PBX will ask the caller to enter the IP address from which he/she wishes to connect. To achieve a high level of security, the PBX will hang up, call the hard-coded number of the mobile phone of the network administrator, and ask him/her to confirm the IP address. When the request is confirmed, the PBX will use the System() function to run a Linux command that adds the caller-supplied IP address to the list of allowed IP addresses. When the network administrator is done, he/she calls again and instructs the PBX to delete the supplied IP address.
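The following Python sketch illustrates how a caller-supplied address could be validated before it reaches the firewall. It rests on two assumptions that are not part of the scenario as described: the caller keys in the address using * as the dot, and the Linux command run through System() is an iptables rule on the INPUT chain. The function names are hypothetical.

```python
import ipaddress


def parse_keypad_ip(digits):
    """Convert keypad input such as '203*0*113*7' into a validated IPv4 string.

    Assumption: the caller presses * wherever the address has a dot.
    Raises ValueError when the result is not a valid IPv4 address.
    """
    return str(ipaddress.IPv4Address(digits.replace("*", ".")))


def firewall_command(ip, allow=True):
    """Build the argument list that inserts (-I) or deletes (-D) an ACCEPT rule.

    Assumption: iptables is the firewall; the thesis only states that a
    Linux command is run through System().
    """
    action = "-I" if allow else "-D"
    return ["iptables", action, "INPUT", "-s", ip, "-j", "ACCEPT"]


ip = parse_keypad_ip("203*0*113*7")
print(" ".join(firewall_command(ip)))         # rule added when access is confirmed
print(" ".join(firewall_command(ip, False)))  # rule removed when the work is done
```

Validating the address first means that malformed keypad input is rejected before any command is built from it.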

4.1.3.6 WoL Scenario

In this scenario, we suppose that a server is connected to the LAN. The network administrator needs to remotely connect to this server through the internet when he/she needs to access the LAN. Figure 4.4 shows the architecture of our approach. The 'gate' server that connects the LAN to the internet will be shut down so that there is no chance for an intruder to penetrate. A PBX machine will be connected to that 'gate' server through the network. WoL (Wake-on-LAN) should be supported by both the motherboard and the network card of the 'gate' server. The PBX is connected to the PSTN (Public Switched Telephone Network) through a TDM card. Using his/her phone, the network administrator will call the number of the phone line connected to the PBX machine, which will handle the call and authenticate the user. If the caller's credentials are valid, the PBX server will turn on the 'gate' server so that the network administrator can access the LAN from outside. To do so, the PBX will send the magic packet. When the listening PC receives this packet, the NIC (Network Interface Card) checks the packet for the correct information. When the magic packet is found valid, the NIC takes the PC out of standby/hibernation, or starts it up (Glenn, 2009). To make sure no intruder is able to send the magic packet, the WoL network card should be accessible only to the PBX. In this scenario, the 'gate' server has two network cards. The first one connects it to the LAN and the other one connects it to the PBX. This approach does not rule out the need for firewalls and intrusion systems. They can still be in use since many lines of defense are recommended (Deloite, 2006). When the network administrator is done, he/she can call the same number again and instruct the PBX to shut down the 'gate' server.
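For reference, a Wake-on-LAN magic packet consists of six 0xFF bytes followed by sixteen repetitions of the target MAC address. The Python sketch below builds and broadcasts such a packet; the MAC address is a placeholder, and the broadcast address and UDP port 9 are common defaults that we assume here rather than values taken from our setup.

```python
import socket


def build_magic_packet(mac):
    """Return the 102-byte Wake-on-LAN payload for a MAC like '00:11:22:33:44:55'."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("a MAC address must contain exactly 6 bytes")
    # Six 0xFF bytes, then the MAC address repeated sixteen times.
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac, broadcast="255.255.255.255", port=9):
    """Broadcast the magic packet over UDP (address and port are assumed defaults)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_magic_packet(mac), (broadcast, port))


packet = build_magic_packet("00:11:22:33:44:55")  # placeholder MAC address
print(len(packet))
```

From the dialplan, such a script could be started with the System() function once the caller has been authenticated.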

Figure 4.4 – The PBX Protecting the LAN | WoL Scenario

4.1.3.7 Alternative Scenario

There are plenty of other scenarios through which the LAN can be secured. Network administrators can decide which scenario to implement. The optimum solution is to implement a combination of the following scenarios so that maximum security is achieved through the use of a multi-line defense strategy. The PBX may
• add a user to the Remote Desktop Group of the 'gate' server so that the user will be able to open a remote desktop session
• enable the disabled account of a specific user who is already a member of the Remote Desktop Group
• allow users to connect remotely to the 'gate' server by changing an entry of its registry
• allow http access so that a website running on the 'gate' server can be accessed
• allow ftp access
• enable the disabled network card of the 'gate' server
• run the stopped web server (IIS, Apache) installed on the 'gate' server
• start a virtual machine that is bridged to the physical network card connected to the LAN. The 'gate' server will include two network cards. Each NIC is on a different DMZ. The first zone is the outside and the other is the LAN. The virtual NIC of the VM is bridged to the physical NIC connected to the LAN. When this VM is turned on, access from outside to the LAN will be available through the firewall.
• change some settings of the network card of the 'gate' server such as adding a new IP address

All of the above scenarios, including shutting the 'gate' server down, can be implemented through the use of one of the following techniques inside the dialplan:
• A special web page will be called using the CURL() function.
• A special command will be sent to the operating system using the System() function.

The PBX should be able to reverse all tasks performed in the above scenarios. For example, if a rule is added to the firewall, the PBX should delete it when it is no longer needed. If a user account is enabled, it should be disabled when the connection is closed. If remote desktop is allowed, it should be disallowed again, and so on.

4.1.3.8 Physically Disconnected Scenario

The ideal scenario is to keep the LAN physically disconnected from the internet most of the time. This is shown in Figure 4.5. The 'gate' server is connected to the internet through a hub / switch that is normally switched off. When the network administrator wants to connect to the LAN and open a remote desktop session, he/she calls the number of the line connected to the PSTN port of the TDM card. The PBX will handle the call. After authenticating the user, the PBX will ask the network administrator to dial 1 to connect or 2 to disconnect. When 1 is dialed, the PBX will call the extension number of the analog phone that is supposed to be connected to port I; when 2 is dialed, it will call the extension on port II. Instead of an analog phone, a relay will receive the ring signal. When the relay receives a signal from port I, it will switch on and let the AC power reach the hub. When it receives a signal from port II, it will switch off and let the hub power down. That way, the network administrator will be able to connect the LAN to the internet using his/her mobile phone and open a remote desktop session. When done, he/she will call the PBX again and dial 2. This will physically disconnect the LAN from the internet again.

Figure 4.5 – The PBX Protecting the LAN | Physically Disconnected Scenario

4.1.3.9 Enhancements

The role of the PBX is not just authenticating the caller. It can perform many other tasks that will enhance security. The following list shows some of those enhancements:
• Caller ID: A list of pre-defined caller IDs should include only those who can connect to the PBX. Any caller ID that does not belong to the list should be saved to a log file for investigation.
• One-time password: The PBX will store a list of passwords for each caller ID. When a caller reaches the PBX, he/she will be asked to supply a password. The caller should have a hard copy of the list of passwords. He/she should supply a password that is not yet used. Passwords should be used in the same order they appear in the list. One-time passwords prevent a sniffing hacker from using the caught password another time.
• PIN code: It is recommended to instruct the PBX to ask the user for a PIN code. That way, a hacker will need three things to penetrate: the mobile phone of the caller, the list of hard-coded one-time passwords, and the PIN code.
• Request confirmation: When the PBX is called, it will call several phone numbers and ask for confirmation. All of the called people should confirm the operation. If any one of them does not answer or decides to reject the request, the operation fails.
• Emergency number: When administrators have doubts of possible intrusion, they should be able to call the PBX and instruct it to disable the whole functionality.
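The one-time password rule (each password is used once, in list order) can be illustrated with a few lines of Python. This is a minimal sketch under our own assumptions: the class name and the sample passwords are hypothetical, and a real deployment would also persist the position in the list between calls.

```python
class OneTimePasswordList:
    """Ordered list of passwords for one caller ID; each one is accepted only once."""

    def __init__(self, passwords):
        self._passwords = list(passwords)
        self._next = 0  # index of the next password that has not been used yet

    def verify(self, supplied):
        """Accept `supplied` only if it is the next unused password in the list."""
        if self._next >= len(self._passwords):
            return False  # the list is exhausted; a new hard copy must be issued
        if supplied != self._passwords[self._next]:
            return False  # wrong password, or a password used out of order
        self._next += 1   # burn the password so a sniffed copy cannot be replayed
        return True


otp = OneTimePasswordList(["482913", "770145", "093356"])  # sample values
print(otp.verify("482913"))  # first use of the first password
print(otp.verify("482913"))  # replay of the same password
```

Because a password is burned as soon as it is accepted, a value captured by a sniffer is useless on the next call.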

All of the above security enhancements will be handled by the PBX. If they are not needed, the PBX may be replaced by other devices that can act as an on/off switch. Such devices are reached through the PSTN.

4.1.4 Interception-proof VoIP using Dictionary-based Encryption

4.1.4.1 Abstract

Due to its many advantages, VoIP is nowadays replacing traditional analog communication. Although it is a promising technology, it has a major disadvantage: it makes wiretapping or sniffing much easier. The only countermeasure is encryption. Many companies offer products that minimize the chance for sniffers to receive readable packets. Unfortunately, those chances are minimized but not eliminated. Law enforcement agencies and the products' manufacturers are still capable of penetrating. In this study, we provide a way through which VoIP communication is encrypted using a user-defined dictionary. The purpose of this study is to show to what extent such an idea can provide sufficient immunity.

4.1.4.2 Introduction

VoIP is an excellent technology. Its main drawback is the fact that it travels on the LAN, WAN, WLAN, Internet, and other exposed networks that are accessible 24 hours a day. That is why encryption is a major concern (Greg S., 2005). CALEA (the Communications Assistance for Law Enforcement Act) is a United States wiretapping law. The sole purpose of CALEA is enhancing the ability of intelligence and law enforcement agencies to conduct electronic surveillance. CALEA does not just require telecommunication carriers to cooperate by allowing the US government to wiretap all communication channels. It goes far beyond this. It requires that telecommunication carriers and manufacturers of telecommunication equipment modify and design their facilities, services, and equipment so that they have built-in surveillance capabilities. This would allow federal agencies to monitor all broadband internet, telephone, and VoIP traffic in real time (Wikipedia, 2009). CALEA forced phone companies to introduce modifications to the software and the hardware in their systems. As a result, the U.S. Congress had to fund such network upgrades. CALEA came into force on the first of January 1995. The pressure exerted by CALEA became stronger after September 11, 2001. This implies that commercial encryption solutions cannot be blindly considered safe and secure products. Thus, there is a crucial need for a solution that does not rely on commercial encryption boxes since manufacturers are required to provide the US government with the keys of their implanted back doors. This necessitates having a solution that provides the required level of security. What we provide in this study is a solution through which a phone call is encrypted using a user-defined algorithm. We will create a user-defined dictionary. The caller will use this dictionary to encrypt his/her voice. Then, the encrypted voice will be transferred to the person being called, who will use his/her own copy of the same dictionary to decrypt the vocal message. Our approach does not rule out the need for VPN or encryption techniques. All of those solutions can still be in use since many lines of defense are recommended (Deloite, 2006). Our only purpose is to try to find a solution that does not rely on commercial products.

There is a good reason why we focus on VoIP although the same approach can be applied to any other type of communication such as emails: a person's voice is valid evidence. A country's ambassador might avoid having sensitive conversations on the phone because he/she is afraid of wiretapping. Sometimes, even individuals who are aware of the danger of wiretapping use the phone to convey sensitive information (Diffie et al, 2007). If an electronic message is caught by the media or an intelligence agency, it will not have the same effect as a vocal message because there is no way to prove that the email was sent by the person him/herself.

4.1.4.3 The Proposed Solution

We mean by “dictionary-based” that there is a dictionary which includes millions of unique numbers. Each number identifies a specific byte (number between 0 and 255). In other words, the dictionary is a database that includes 256 tables, named from 0 to 255. Each table has a single column that will store the encrypted value of a specific byte. The first table, called 0, will include 3,000,000 random numbers. Each one of these numbers will be used only once by the encrypting algorithm to replace the byte 0. The random numbers will have no duplicates in the whole dictionary because when the decrypting algorithm wants to retrieve the original byte, it should find only one matching value. Since every digital sound can be recorded as a series of bytes, these bytes can be encrypted by replacing them with the corresponding values from the 256 tables. The numbers will be sent to the destination, where they will be replaced back to their original values using the same database.
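To make the idea concrete, the Python sketch below builds a tiny dictionary (8 numbers per byte value instead of 3,000,000) and uses it to encrypt and decrypt a short byte string. It is only an illustration under our own assumptions: the function names are hypothetical, in-memory lists stand in for the 256 tables, and Python's `random` module stands in for whatever random source a real dictionary would require.

```python
import random


def build_dictionary(codes_per_byte, seed=None):
    """Return 256 tables of random numbers with no duplicates across all tables."""
    rng = random.Random(seed)
    total = 256 * codes_per_byte
    # sample() draws without replacement, so every number is unique dictionary-wide.
    pool = rng.sample(range(total * 4), total)
    return [pool[b * codes_per_byte:(b + 1) * codes_per_byte] for b in range(256)]


def encrypt(data, tables):
    """Replace each byte by the first unused number of its table, then discard it."""
    return [tables[b].pop(0) for b in data]


def decrypt(numbers, tables):
    """Recover each byte by finding the single table that contains the number."""
    out = bytearray()
    for n in numbers:
        for byte_value, table in enumerate(tables):
            if n in table:
                out.append(byte_value)
                break
        else:
            raise ValueError(f"{n} is not in the dictionary")
    return bytes(out)


sender = build_dictionary(8, seed=1)
recipient = [list(t) for t in sender]  # the recipient's own copy of the dictionary
cipher = encrypt(b"hello", sender)
print(decrypt(cipher, recipient))
```

Note that the two occurrences of the letter "l" are replaced by two different numbers, which is the purpose of using each number only once.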

Such a dictionary-based encryption can be considered secure enough since it does not rely on mathematical methods that can be cracked. Our approach is very simple:
1. The caller will dial a number.
2. The PBX will then start recording the voice of the caller.
3. When the caller is done, the PBX will call an external program that will encrypt the recorded sound file.
4. The sound file is then sent to the PBX on the other side where the recipient is waiting to hear the caller's voice.
5. The PBX on the recipient's side will call a program to decrypt the message.
6. When the message is decrypted, the PBX will play it to the recipient.

Each of the two communicating parties has his/her own LAN that is protected behind a firewall. It would be better not to rely on commercial firewalls. Open source ones are a better choice since there is a good chance that commercial black boxes include CALEA-compliant interfaces. The presence of a firewall will protect the dictionary from being exposed to the outside world.

It is obvious that this approach sacrifices the real-time responsiveness that is required in normal VoIP scenarios. In this study, we implement several enhancements and try to optimize the algorithm to see whether the real-time constraint can be met. We believe that two seconds of latency would be acceptable. The architecture of the solution is shown in Figure 4.6.


Figure 4.6 - Complete Architecture of the Encrypted VoIP Scenario

4.1.4.4 Basic Scenario

In this study, we provide an algorithm to read chunks of the source media file, encrypt each byte of the chunks, and save them to another file. The encrypted file will be sent to the recipient. When the file is received on the other side, bytes are decrypted and the media file is played. Using the CURL function, Asterisk will call an ASPX page which will in turn encrypt the content of the media file. The C# code that we used to retrieve the chunks is shown in Figure 4.7.

// Requires: using System.IO;
// The size of the "chunks"
int bufferLen = 128;
// The using statements close br, bw, fsw, and fsr automatically.
// FileMode.Create truncates an existing destination file, so no stale bytes remain.
using (FileStream fsr = new FileStream(src, FileMode.Open, FileAccess.Read))
using (BinaryReader br = new BinaryReader(fsr))
using (FileStream fsw = new FileStream(dest, FileMode.Create, FileAccess.Write))
using (BinaryWriter bw = new BinaryWriter(fsw))
{
    byte[] buffer = br.ReadBytes(bufferLen);
    while (buffer.Length > 0)
    {
        for (int i = 0; i < buffer.Length; i++)
        {
            buffer[i]++;   // simplest form of "encryption": add one to each byte
        }
        bw.Write(buffer);
        bw.Flush();
        buffer = br.ReadBytes(bufferLen);
    }
}

Figure 4.7 - Source Code of the ASPX Page that Retrieves the Chunks

The code shown in Figure 4.7 does the simplest form of encryption. It only adds a value of one to each byte in the array. Our goal is to use a user-defined dictionary for the encryption. Such a dictionary might be handled by some RDBMS package. Each time a byte needs to be replaced, the value is sent to the database, which returns a corresponding value. Consulting the database thousands of times has a huge impact on performance, so special care needs to be taken to optimize the way the dictionary is stored.

We have implemented our approach and tested it using different hardware, operating systems, DBMS packages, and programming technologies. Table 4.1 shows the different configurations used.

Table 4.1 - Hardware Used to Encrypt / Decrypt Phone Calls

Hardware | Operating System | Web Technology | DBMS
Pentium IV 2.8 GHz, 2 GB of RAM | Windows Server 2003 | IIS 6, ASP | SQL Server 2000
Pentium IV 3.2 GHz, 4 GB of RAM | Windows Server 2003 | IIS 6, ASP.NET | SQL Server 2005
Pentium IV 3.2 GHz, 3 GB of RAM | Linux Fedora 9 | Apache, PHP | MySQL
2 Quad-Core Xeon 2.5 GHz, 8 GB of RAM | Linux Fedora 9 | IIS on another separate server | Oracle Database 11g

The reason we used all of those different configurations is that the runtime of our first executions was not acceptable at all. Using a dictionary to encrypt each byte slows down the whole process to an extent that it becomes unusable.

4.1.4.5 Enhancements

In order to achieve our goal which is minimizing the latency to 2 seconds, we had to come up with some enhancements. All the improvements aimed only at achieving better runtime on both sides: caller (encryption) and recipient (decryption).

Even with the powerful hardware configuration shown in the last row of Table 4.1, the runtime is still unacceptable. If the format of the recorded media file is gsm, an average phrase will need 30 to 40 KB. This means we have around 35,000 bytes that need to be encrypted. Querying the database 35,000 times cannot be done in real time using normal servers. There is an obvious need to implement a set of enhancements. Those enhancements are listed in Table 4.2 along with some details.

Table 4.2 – Enhancements to Decrease Encryption/Decryption Time

Enhancement | Details

Algorithm Fine Tuning | The main problem is that we need to access the database thousands of times. If there is a way to minimize this number, runtime will be better. For example, encrypting every other byte reduces the time by half. It is very important to decide on the percentage of the bytes that will be encrypted.

Parallel Execution | We implemented an array of servers, each of which will receive a part of the problem. Using .NET threading, we sent each chunk to a different server. That way, many servers will be working at the same time. We used 4 servers that helped us reduce the required runtime by 75%.

DB Design Improvement | Instead of storing the entire dictionary in the same table, we decided to partition the data. Since each byte is only a number between 0 and 255, we can create 256 tables, each of which will contain a long list of numbers equivalent to the name of the table. For example, there will be a table called 17. It might include 100,000 records. Every record is only one number. All of those numbers represent the encrypted 17. Whenever a byte is equal to 17, the first row in table 17 will be considered the encrypted byte. That way, we will not execute a statement that will retrieve data from a table containing millions of records. We will simply retrieve the first row. Numbers in all of the 256 tables should not overlap. Otherwise, we cannot know the original number if we have many corresponding values.

Code Optimization | We do not want to use the same record twice. If the value 17 is encrypted to 1234, there is no chance that the number 1234 will be used again. This dictates that we need to delete each number that is used in the encryption process. Such a deletion has a considerable impact and will slow down the encryption process. Postponing this deletion will save run time. When the encrypted file is sent, deletion should start since it will not slow down the encryption process anymore. As explained in the 3rd enhancement, we will have 256 tables. When we use the first number in a table, we will not delete this record. Instead, we will keep a counter (say, C) that tells the number of used records. When the encryption is done, we will delete the first C records from the table.

Using Text Files | Surprisingly, we discovered that using text files is much better than using a database to store and retrieve the encrypted/decrypted characters. Before the phone call starts, all text files will be opened. Whenever we need to encrypt a character, we get a number from the corresponding text file and move to the next number. This made the encryption process very simple and efficient. After using this method, the runtime decreased dramatically.

Loading Dictionary in Memory | The previous enhancement saved a lot and minimized the encryption runtime. But we still have a huge problem: the runtime of the decryption process is still unacceptable. We created an array and filled it with all the values of the text files before the phone calls start. We aimed at minimizing the time required to decrypt a given number. If we have an array called “values”, values[1234] should return the initial value of 1234 before encryption. Although the system will keep loading for around four minutes, using this method allowed us to decrypt the whole file in less than a second.
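Two of the enhancements in Table 4.2 lend themselves to a short illustration: the counter C that postpones deletion, and the in-memory array "values" used for decryption. The Python sketch below shows both on a toy dictionary; the function names, the list-based tables, and the use of -1 for unused slots are our own assumptions for the example.

```python
def load_reverse_table(tables, size):
    """Build a flat list so that values[number] is the original byte (-1 if unused)."""
    values = [-1] * size
    for byte_value, table in enumerate(tables):
        for number in table:
            values[number] = byte_value
    return values


def encrypt_with_counters(data, tables, used):
    """Encrypt without deleting anything: used[b] is the counter C of table b."""
    out = []
    for b in data:
        out.append(tables[b][used[b]])  # next unused number of table b
        used[b] += 1                    # remember that it has been consumed
    return out


def purge_used(tables, used):
    """Postponed deletion: drop the first C records of each table after sending."""
    for b in range(256):
        del tables[b][:used[b]]
        used[b] = 0


# Toy dictionary: table b holds the four numbers 4b .. 4b+3 (unique by construction).
tables = [[b * 4 + i for i in range(4)] for b in range(256)]
values = load_reverse_table(tables, 256 * 4)
used = [0] * 256

cipher = encrypt_with_counters(bytes([17, 17, 0]), tables, used)
plain = bytes(values[n] for n in cipher)  # one array lookup per number
purge_used(tables, used)                  # runs only after the file has been sent
print(cipher, plain)
```

Decryption then costs a single array access per number, which is consistent with the sub-second decryption time reported in the table, at the price of the loading time before the first call.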

When we applied the above enhancements, especially the last two, we got an acceptable latency. On average, it was less than 1 second; however, it occasionally went above 2 seconds. Plenty of factors might have caused this. The duration and the format of the sound file are among the most important ones.

4.1.4.5 Limitations

There are some limitations that affect the efficiency and responsiveness of our proposal. They are as follows:
• Latency – our approach does not provide real-time VoIP phone calls. The caller will say something and wait for the recipient's response. Then, the recipient will respond and wait for the caller's reply. This is not a real-time conversation, but it is close to real time as the latency is less than 2 seconds. Although this is considered a drawback, getting interception-proof VoIP phone calls can make up for such a latency.
• Another limitation is the fact that the dictionary should exist on both sides. Such a dictionary should not be sent over the internet. Instead, it should be passed by hand.
• The core component of the security of our encryption algorithm is the dictionary. If this dictionary is unveiled, the whole process is jeopardized.

• The last limitation is the need for multiple powerful servers. Based on many tests we have made, each server can be replaced by many PCs. The algorithm shown in Figure 4.7 can, for example, split the file into 30 chunks and send each chunk to a different PC. Each PC will work on encrypting just one chunk. That way, average PCs can replace powerful servers.
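The chunking idea can be sketched as follows. This is only an illustration under stated assumptions: worker threads stand in for the remote servers or PCs, and `split_chunks`, `process_in_parallel`, and the `worker` argument are hypothetical names rather than parts of the algorithm in Figure 4.7.

```python
from concurrent.futures import ThreadPoolExecutor


def split_chunks(data, n):
    """Split data into n contiguous chunks whose sizes differ by at most one byte."""
    size, extra = divmod(len(data), n)
    chunks, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def process_in_parallel(data, n, worker):
    """Hand each chunk to its own worker and return the results in chunk order."""
    chunks = split_chunks(data, n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        # map() preserves the input order, so the parts can be reassembled directly
        return list(pool.map(worker, chunks))
```

In a real deployment, `worker` would send a chunk to a remote machine and wait for the encrypted result; keeping the results in chunk order is what allows the file to be reassembled.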

4.1.4.6 Conclusion

In this section, we presented the way through which we could achieve encrypted VoIP phone calls. The caller will talk while the PBX will be recording the voice. Then, the recorded file will be encrypted based on a user-defined dictionary that has a corresponding value to each byte. The major problem was the runtime. However, we could introduce a set of enhancements that led to a better runtime. Those enhancements are encrypting one byte out of every 3, using 4 parallel servers each encrypting a part of the file, creating a table for each byte value, postponing the deletion, using text files instead of databases, and loading the dictionary into memory prior to making phone calls.

4.1.4.7 Future Work

Our lab tests were promising since we could achieve an acceptable latency, but we still have to get better results. Our simulated scenarios showed that our approach was valid; however, implementing such a solution in the real world is far more complicated than doing so in controlled environments. Our next step is to try to implement our approach in a call center to test how efficient, stable, and available it is. Another thing to investigate is loading the whole dictionary database in the RAM so that accessing it will be fast and efficient. This approach has a drawback which is consumption of most of the RAM. The size of the database, available RAM, number of used servers, dictionary loading time, and other factors need to be taken into consideration.

4.1.5 Using a Rule-based Engine to Detect Suspicious Calls

4.1.5.1 Abstract

Monitoring telecommunications systems is of crucial importance. Nowadays, PBX software packages can be instructed to store the CDRs (Call Detail Records) in a database. There is a permanent need to analyze those records and allow business owners to detect PBX misuse from inside and outside the company. The word "misuse" covers an employee making too many personal phone calls, a salesman making fewer phone calls than expected, a client making an excessive number of phone calls, and other suspicious calls. A data-mining package or any other analytical tool would not be as efficient because it would report what happened when it is too late to take action. In this study, we integrate Asterisk with a rule-based engine called InRule. Asterisk will consult InRule whenever a call is about to be made and thus take appropriate actions.

4.1.5.2 Introduction

The PBX might be misused by employees making personal calls, long local calls, many international calls, or mobile calls to clients that can be reached through land lines. Business owners need to analyze the CDRs (Call Detail Records) and act accordingly. In traditional scenarios, the dialplan developer might decide to let the CDR analysis logic be hard coded inside the dialplan. He might do so to make sure appropriate measures are immediately taken when a fraudulent activity is detected. Consequently, the following problems might arise:
• Every time the system administrator wants to implement a new rule, update an established one, or change an action, the dialplan needs to be changed and reloaded. This might introduce bugs to the dialplan. It will also require a system restart, which means that all current phone calls would be terminated.
• The dialplan is complicated since it includes both:
  o Instructions to lead the caller
  o Instructions to detect misuse
• The checking process inside the dialplan is slow because it is synchronous. This will exhaust the CPU (Central Processing Unit) of the PBX server.
• Business owners do not have the technical know-how to implement / update rules by themselves.

We will start by describing our proposed method. Then, we move to related work in this area. Next, we list the advantages of using an RBE and present the imposed challenges. Then, we describe the flow of events and describe sample rules and actions. After that, the configuration of the products is described. At the end, we present the results, the limitations, and our conclusion.

4.1.5.3 The Proposed Solution

In this study, we describe how to integrate Asterisk with an RBE (rule-based engine) called InRule. Other RBEs could have been selected. Some are open-source packages. Asterisk will receive the call, instruct the RBE to perform the required analysis, receive its decision, and take actions such as sending an email, making a phone call, recording the call, or ending it.

This approach will allow the following benefits:
- There will be no need to update the dialplan. Instead, RBE rules will be updated.
- The dialplan is simple and straightforward. It just calls the RBE, which in turn will decide what to do with the current call.
- The checking process can be asynchronous, which means that the call will be normally handled by the PBX even if the RBE did not yet return an action.
- The CPU of the PBX server will not be consumed by the analytical logic. Instead, the RBE server will handle the analysis.
- Individuals lacking technical expertise will be able to develop their own rules.
- Rules are documented and can be used with any other PBX.
- Analysis is made by an engine designed and optimized to enforce rules.

4.1.5.4 Related Work

One might think that data mining techniques and tools are an alternative to the methodology presented in this study. In fact, they are not because there is a need for an engine that will receive the call request and send the reply immediately. This package should test the request against a long list of rules, and if any of the rules is violated, the package should send a reply specifying the action to be taken. Thus, an RBE package is necessary; RBE is not just an analysis tool that will receive a copy of the Master.csv file and give a report of the patterns and violations. In a real-time response, the RBE package will propose taking one or many of the following measures on the current call: ending the call, recording it, forwarding it, sending an email, or calling the administrator. The RBE will be connected to the database that the PBX uses to store the CDRs. This engine will be consulted by the PBX each time a call is about to be made. The PBX will receive the reply from the RBE and take the appropriate actions.

Some researchers tackled the same problem but with a simpler architecture, a proprietary PBX, and a different scope (Ong & Cing, 2004). Our architecture makes use of a Web Service that can be called either synchronously or asynchronously depending on the performance results. This web service is a separate layer that can be replaced by any other web service dealing with a different RBE. Our architecture includes five servers: i) The PBX ii) Web Application iii) Web Service iv) RBE v) RDBMS.

That way, the load is better balanced; consequently, better performance is attained.

4.1.5.5 Advantages of Using RBE

The following are some advantages of using RBE:
- The individuals who will design, test, and verify the rules are not necessarily developers. They would not be required to have extensive knowledge about dialplans.
- A separate server will be used for the analysis.
- The RBE has a GUI (Graphical User Interface). It also has tools necessary to test and verify the rules.
- When updates are introduced, the dialplan is not affected.
- Business owners can at any time enable and disable a rule through the use of the RBE tools. Each rule has two properties: start date and end date. A rule might be used only for a period of time.
- When using an RBE, one tells "What to do" instead of telling "How to do it". This is called Declarative Programming.
- When rules are documented, knowledge is centralized.
- There will be better logging for decisions and the reasons why they have been made.
- Since rules change quite often, the RBE provides the required agility.
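The start date and end date properties mentioned above can be pictured with a small sketch. It does not reflect InRule's own object model; the `Rule` class, its `enabled` flag, and the `is_active` method are hypothetical names used only to show how a rule could be limited to a period of time.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Rule:
    name: str
    start: Optional[date] = None   # None means "no lower bound"
    end: Optional[date] = None     # None means "no upper bound"
    enabled: bool = True           # assumption: a manual on/off switch

    def is_active(self, today):
        """A rule applies only if it is enabled and today lies within its period."""
        if not self.enabled:
            return False
        if self.start is not None and today < self.start:
            return False
        if self.end is not None and today > self.end:
            return False
        return True
```

An engine built this way would simply skip every rule for which `is_active` returns `False` when a call request arrives.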

4.1.5.6 Challenges

The PBX stores valuable information about all calls passing through the server. However, it might 'miss' some calls. In such a case, it will not be able to log them. Suppose one allowed the SIP (Session Initiation Protocol) devices to 'reinvite'; this means that the devices will not need Asterisk once it has finished setting up the call. If CDRs are of crucial importance, IP devices should not be allowed to 'reinvite'. The option "canreinvite" is used to tell the server to never issue a reinvite to the client (Spencer et al, 2003). This feature can be turned off by disabling 'reinvites' (canreinvite=no) in the sip configuration file (sip.conf). With notransfer=yes, similar functionality can be controlled in iax.conf.

Many questions might arise when it is time to plan for the integration of Asterisk and InRule, such as:
- How will the PBX – running under Linux – call the RBE – running under Windows?
- How will the RBE check the CDRs which are stored in a Linux file instead of a database?
- How will the RBE reply to the PBX?
- How will the PBX notify system administrators?

The answer to all of the above questions is simply to use HTTP requests. A special web application can be developed to handle all communication between the PBX, the RBE, and the CDR database.

4.1.5.7 Flow of Events

The events will occur in the following order:
• The PBX receives a call and sends an HTTP request using the CURL function. CURL(URL) retrieves the given URL and returns the downloaded HTML. CURL() is usually called to signal external applications. If our approach dictates making synchronous calls, then the PBX will keep waiting until the called web page returns a reply. Only then can the PBX decide what to do. In case asynchronous calls are being made, it can continue without waiting for the HTTP reply. However, it will not be able to take action because the reply might be received after the call is terminated.
• A web page will receive a request from the PBX. This page might either consult the RBE immediately or connect to a Web Service that will do the job. Both options are valid, but we will use a Web Service to make sure our approach will work even when the RBE is running on a remote server, other than our web server. The web page can call the web service in synchronous or asynchronous mode.
• When it receives the request to make the current phone call, the RBE will check the rules and the CDR database. The rules will be stored in an XML file. The CDRs should be stored in an accessible database. By default, CDRs are stored in a text file, but Asterisk can be configured to store them in a MySQL database. Then, the RBE will use the ODBC driver of MySQL to connect to the database. Another alternative is to store the CDRs in a SQL Server 2000 database.
• After the RBE checks the rules, it sets a string of actions to be taken. This string is returned in the form of an HTTP reply to Asterisk.
• Asterisk will receive the reply and decide what to do.
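The chain of calls described above can be mimicked with plain functions. The sketch below is an assumption-laden stand-in, not the actual deployment: `rule_engine`, `web_service`, and `web_page` are hypothetical names, the single rule inside `rule_engine` is invented for illustration, and no HTTP traffic is involved. It only shows how the three request parameters travel down the chain and how a string of 2-letter action codes travels back.

```python
from urllib.parse import parse_qs


def rule_engine(caller, called):
    """Stand-in for the RBE: returns a string of 2-letter action codes."""
    # Hypothetical rule: reject calls from one monitored number, allow the rest.
    return "RC" if caller == "9999" else "AC"


def web_service(uniq_id, caller, called):
    """Stand-in for the web service: consults the RBE and appends the '!!' end marker."""
    return rule_engine(caller, called) + "!!"


def web_page(query_string):
    """Stand-in for the web page: parses the PBX request and returns plain text only."""
    params = parse_qs(query_string)
    uniq_id = params["UID"][0]   # unique ID of the call
    called = params["X"][0]      # dialed extension
    caller = params["C"][0]      # caller ID number
    return web_service(uniq_id, caller, called)


reply = web_page("UID=1300000000.1&X=1000&C=2001")
```

In the synchronous mode, the PBX would block on the equivalent of `web_page(...)` before routing the call; in the asynchronous mode, it would route the call first and could only log a late reply.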

Figure 4.8 shows the flow of events along with the used protocols.


Figure 4.8 – The PBX Working with an RBE | Flow of control

4.1.5.8 Rules and Actions

4.1.5.8.1 RBE Rules

This study does not focus on the rules themselves. It focuses on the way the integration is implemented. Rules will be developed at the discretion of the business owners. The need for better security and higher control plays a crucial role in creating more rules. The development of rules is an ongoing process. Therefore, the number of rules is definitely expected to increase with time. The RBE is instructed to take appropriate actions in the following scenarios:

1. Personal Calls
1.1 the same employee regularly calls the same number within the same time range
1.2 an outsider regularly calls within the same time range
1.3 a missed call occurs

2. Busy Numbers
2.1 an outsider keeps our fax number busy after business hours (Denial-of-Service attack)
2.2 all of our lines are busy

3. Salesmen Performance
3.1 an employee makes 25% more calls than what he/she had made during the previous month
3.2 an employee reaches x calls / week
3.3 an employee calls a client 25% above his/her average number of calls per month

4. Client Problem
4.1 a client calls but no one answers
4.2 a client makes 25% above his/her average calls per month
4.3 the duration of a client's call exceeds x minutes

5. Monitored Numbers
5.1 a specific client is called
5.2 a specific employee is called
5.3 two employees of different branches call each other
5.4 extension x is calling extension y

6. Local Calls
6.1 the total duration of all local phone calls made today exceeds x minutes
6.2 the total duration of all local phone calls made this week exceeds x minutes

7. Mobile Calls
7.1 number of calls made to a mobile phone by the same employee exceeds x / week
7.2 an employee calls a client's mobile phone when the client can be reached through land lines
7.3 the total duration of all calls made to mobile phones exceeds x minutes

8. Distance Calls
8.1 an employee makes international calls more than x times / week
8.2 the total duration of all international calls made this week exceeds x minutes

9. Off Hours
9.1 someone calls after business hours
9.2 someone calls during the weekend

10. Miscellaneous
10.1 the caller is the CEO
10.2 an employee is the most frequent caller
10.3 an employee is the most frequent call recipient

It is worth mentioning that there are other rules that can still be developed based on the needs of business owners. Our goal is to allow the business owners to detect fraudulent activity. Basically, there are two prerequisites to fraudulent activity detection. The first is to collect historical data about line activity. The second is to apply some statistical techniques on the data in order to build a model that tells which activity is fraudulent and which is not (ECOS 97). Data warehousing and data mining techniques can also be used to analyze the call records. Extreme care should be taken when using such CPU-intensive techniques.

4.1.5.8.2 The PBX Actions

For performance reasons, the web server, rather than the PBX, could be instructed to perform the actions that do not need the PBX's intervention. In our approach, however, the PBX performs all actions. Table 4.3 lists some of the actions that can be returned by the RBE.

Table 4.3 – Actions Taken by the PBX when Suspicious Calls are Detected

| Code | Meaning | Comments |
| --- | --- | --- |
| AC | allow call | Everything is OK. No action is to be taken. |
| RC | reject call | Something is wrong. Reject the call. |
| NM | non-critical mail | Send an email to the system administrator about this call. |
| CM | critical mail | Same as above. The email will be flagged as critical. |
| CA | call admin | Call and inform administrator about the call. |
| RK | record and keep | Record the call and keep it until the administrator checks it. |
| RP | record and play | Record the call. When done, call the administrator and play it for him / her. |
| BC | blacklist the caller | Do not accept any more calls from the caller. The caller ID will be stored in the RBE black list document. |
| NO | No CDR | Disable CDRs for the current call. |

4.1.5.9 Preparation and Configuration

4.1.5.9.1 Configuring Asterisk

4.1.5.9.1.1 Dialplan

When the call is about to be made, the dialplan should make a request to a web application. The called web page will return a string that includes a series of actions to be taken. Asterisk will take all of those actions. Figure 4.9 shows the dialplan that does so.

; call the web page and store the returned string of actions in variable A
; 3 parameters will be sent: the UNIQUEID of the call, the called person, and the caller
exten=1000,n,Set(A=${CURL(${Web}Call.asp?UID=${UNIQUEID}&X=${EXTEN}&C=${CALLERIDNUM})})
; assign the last 2 letters to variable TODO
exten=1000,n(again),SubString(TODO=A,-2,2)
; assign all but the last 2 letters to variable A
exten=1000,n,SubString(A=A,0,-2)
; if A is equal to !!, stop taking actions
; the web page will add !! to the end
exten=1000,n,GotoIf($[${A}!="!!"]?:NoMore)
; call the macro TakeAction
exten=1000,n,Macro(TakeAction, ${TODO})
; check next action
exten=1000,n,Goto(again)
; quit because no more actions are to be taken
exten=1000,n(NoMore),NoOp()

Figure 4.9 - The PBX Working with an RBE | Action-Taking Dialplan


The "TakeAction" macro used above is simple and straightforward. It will receive a string of actions and take them one after the other.
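To make the handling of the action string concrete, the following sketch decodes a reply into its 2-letter codes. It is written in Python rather than dialplan syntax, it reads the codes from the front of the string, and it treats "!!" as an end marker; the names `ACTIONS` and `parse_actions` are hypothetical, and the order in which the real macro consumes the codes may differ.

```python
# Codes and meanings as listed in Table 4.3.
ACTIONS = {
    "AC": "allow call",
    "RC": "reject call",
    "NM": "non-critical mail",
    "CM": "critical mail",
    "CA": "call admin",
    "RK": "record and keep",
    "RP": "record and play",
    "BC": "blacklist the caller",
    "NO": "no CDR",
}


def parse_actions(reply, terminator="!!"):
    """Split a reply such as 'NMRK!!' into the list ['NM', 'RK']."""
    if not reply.endswith(terminator):
        raise ValueError("reply is missing the end marker")
    body = reply[:-len(terminator)]
    if len(body) % 2 != 0:
        raise ValueError("action codes must be exactly 2 letters each")
    codes = [body[i:i + 2] for i in range(0, len(body), 2)]
    unknown = [c for c in codes if c not in ACTIONS]
    if unknown:
        raise ValueError("unknown action codes: %s" % unknown)
    return codes
```

Validating the reply before acting on it is a design choice of this sketch: a truncated or malformed HTTP reply then results in an error rather than in an unintended action on the call.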

4.1.5.9.1.2 CDR Configuration

CDR filtering starts with identifying the CDR data fields (Nikbakht & Tafti, 1989). The Asterisk CDR table has 16 columns. In this study, we focus on 2 of them: source and destination. All that the RBE requires is identifying the caller and the receiver. Asterisk makes use of the directory “/var/log/asterisk/cdr-csv” to store the CDR data. By default, this comma-separated text file is called Master.csv. We are not interested in this file since the RBE needs to read data records from a database. Although the RBE can use ODBC (Open Database Connectivity) to connect to a text file and deal with it the same way it deals with a database, it is not a recommended approach. The performance is critical. Thus, the data should be indexed and optimized for retrieval. Those capabilities are found in any RDBMS (Relational Database Management System). Therefore, there is a need to store the CDRs in a database. At this stage, there are plenty of options. InRule can connect to any database using ODBC connectivity. Asterisk can be instructed to follow one of the following approaches:
• To store the CDRs in MS SQL Server through the use of unixODBC. This is recommended because InRule can easily connect to MS SQL Server.
• To store the CDRs in any Asterisk-supported database including SQLite, PostgreSQL, MySQL, and others. If this is the case, then the RBE will use ODBC connectivity to connect to the CDRs database.

Both approaches are valid, but we will follow the first one because when a call takes place, Asterisk will have enough time to store the CDR of the call in a database using ODBC connectivity. Then, the RBE will work with MS SQL Server and perform the required queries faster than connecting to another RDBMS using ODBC connectivity. This might not be the case if other RBEs are in use.

4.1.5.9.2 Preparing the Web Application

The mission of the web application is easy. The main page will
i) receive a request from Asterisk
ii) send the request to the web service
iii) receive a reply as a list of actions to be taken
iv) send the action string back to Asterisk

The Page_Load code of the only page is presented in Figure 4.10.

// proxy generated from the web reference to the web service
CallMonitor.Service CM = new CallMonitor.Service();
// read the 3 parameters sent by the dialplan
string UniqID = Request.QueryString["UID"].ToString();
string Caller = Request.QueryString["C"].ToString();
string Called = Request.QueryString["X"].ToString();
// ask the web service for the string of actions and return it as is
string Actions = CM.GetActions(UniqID, Caller, Called);
Response.Write(Actions);

Figure 4.10 - The PBX Working with an RBE | ASP calling the Webservice

Before calling the web service from the web page, a web reference needs to be added. The web page should return only one string, which includes 2 letters for each action to be taken. In order to make sure that the web page will not return extra text such as HTML tags, all text from the aspx page should be erased except for the following line of code:

<%@ Page Language="C#" AutoEventWireup="true" CodeFile="Default.aspx.cs" Inherits="_Default" %>

4.1.5.9.3 Preparing the Web Service

The web service is closely related to the RBE. It does the following:
i) take 3 input parameters
ii) send them to the RBE
iii) instruct the RBE to do the required checking
iv) get feedback from RBE
v) send the reply back to the calling web page

Again, there is a need to add a reference to the RBE in order to be able to communicate with it. The WebMethod is presented in Figure 4.11.

[WebMethod]
public string GetActions(string UniqID, string Caller, string Called)
{
    RuleSession RBESession;
    string RulesFile = @"C:\Rules.ruleapp";
    // load the XML files of our rules
    RuleApp ruleApp = new FileSystemRuleApp(RulesFile);
    // create a new connection to the RBE
    InProcessConnection ruleConn = new InProcessConnection();
    RBESession = new RuleSession(ruleApp, ruleConn);
    // create entity + fill in values + apply rules
    Entity CDR = RBESession.CreateEntity("CDR_Entity");
    CDR.Fields["Caller"].SetValue(Caller);
    CDR.Fields["Called"].SetValue(Called);
    CDR.RuleSession.ApplyRules();
    return CDR.Fields["Actions"].Value.ToString() + "!!";
}

Figure 4.11 - The PBX Working with an RBE | GetActions Web Method

4.1.5.9.4 Configuring the RBE

Although we decided to use InRule, any other RBE can do the job. We aim at assigning the decision making to an external tool. InRule can be easily replaced by many open-source or proprietary rule-based engines. Like other RBEs, InRule has graphical tools that allow the user to add, amend, delete, test, and verify the rules. If the user is a programmer or a database developer, he/she can create rules by writing code. The user can also write rules using natural business language.

We will implement rule 3.1, which states that when an employee makes 25% more calls than what he/she had made during the previous month, a critical mail needs to be sent. To create this rule, we will use irAuthor, a tool that comes with InRule and helps in creating and maintaining rules. First, we need to create an entity called CDR_Entity. Next, we create 3 fields that belong to it. Those fields are listed in Table 4.4.

Table 4.4 - RBE Entity Fields

| Name | Data Type | I/O | Comment |
| --- | --- | --- | --- |
| Caller | Text | I | The phone number of the caller |
| Called | Text | I | The phone number of the receiver |
| Actions | Text | O | A string of characters. Each 2 letters will let the PBX take a specific action |

After we have created the entity, we create our rule. Figure 4.12 shows the irAuthor tool where the entity is created and our rule appears as an if-then-else node:


Figure 4.12 - InRule Interface

A number of arrows are placed in Figure 4.12. Table 4.5 explains those marks.

Table 4.5 - irAuthor Legend

| Mark | Description |
| --- | --- |
| A | This is the connection to the CDR database. As long as the DBMS has an ODBC driver, connectivity can be successfully achieved. |
| B | This is a query that accepts 2 parameters and returns the number of phone calls made by a certain user in the current month. The query is as follows: Select count(*) From CDRTable Where clid = @caller and month(calldate) = @curmonth. This query will be used in the if-then-else rule. |
| C | This is our entity. It has 3 fields. |
| D | The first two fields of our entity are supplied by the web service. The third is the output parameter, the value of which will be decided by the RBE and returned to the web service. Thus, the web service tells the RBE who is calling whom and the RBE tells the web service what to do. |
| E | Those are sets of rules. We decided to create a set for every category. |
| F | This is our rule. It is an if-then-else statement. The formula that uses the query explained in B is as follows: Calls_1User_Current_Month(CDR_Entity.Caller; Month()) > (Calls_1User_Current_Month(CDR_Entity.Caller; Month() - 1) * 1.25). The left-hand side returns the number of calls made by the caller (first field in our entity) in the current month. The right-hand side returns the number of calls made by the caller in the previous month * 1.25. When this condition is met, it means that the user made 25% more calls than the previous month. For simplicity's sake, this formula ignores the first month of the year. |
| G | This statement will be executed if the above condition is satisfied. It will let the value of the third field (Actions) equal to 'NM' which means Non-critical Mail. The Actions field will be returned to the web service. The web service can send the mail by itself or it can send the Actions string back to the PBX which in turn will send the mail message. |
| H | This statement will be executed if the above condition is not satisfied. It will let the value of the third field (Actions) equal to 'AC' which means Allow the Call. |
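The if-then-else rule described in the legend can be reproduced outside the RBE for experimentation. The sketch below is an approximation under stated assumptions: it uses an in-memory SQLite database instead of the CDR database behind ODBC, replaces the `month()` function of the query with SQLite's `strftime`, filters on a year-month string (so the January case needs no special handling), and uses hypothetical function names and sample data.

```python
import sqlite3


def calls_in_month(conn, caller, year_month):
    """Count the calls made by one caller in a month given as 'YYYY-MM'."""
    row = conn.execute(
        "SELECT COUNT(*) FROM CDRTable "
        "WHERE clid = ? AND strftime('%Y-%m', calldate) = ?",
        (caller, year_month),
    ).fetchone()
    return row[0]


def rule_3_1(conn, caller, current, previous):
    """Return 'NM' if the caller made over 25% more calls than last month, else 'AC'."""
    now = calls_in_month(conn, caller, current)
    before = calls_in_month(conn, caller, previous)
    return "NM" if now > before * 1.25 else "AC"


# Illustrative data only: 4 calls in May 2011 and 6 calls in June 2011.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CDRTable (clid TEXT, dst TEXT, calldate TEXT)")
rows = [("2001", "1000", "2011-05-%02d 10:00:00" % d) for d in range(1, 5)]
rows += [("2001", "1000", "2011-06-%02d 10:00:00" % d) for d in range(1, 7)]
conn.executemany("INSERT INTO CDRTable VALUES (?, ?, ?)", rows)

action = rule_3_1(conn, "2001", "2011-06", "2011-05")
```

With the sample data, the June count (6) exceeds 1.25 times the May count (4 * 1.25 = 5), so the rule fires for caller 2001.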

4.1.5.10 Results

We have developed a simple dialplan that keeps generating phone calls to test the performance of the servers. The goal is to measure the CPU load of each server. All servers have similar hardware. The processor is 2.4 GHz Pentium IV and the capacity of the RAM is 2 GB. Echo cancellation is turned off because it doubles the CPU consumption. Table 4.6 shows the configuration of the servers:

Table 4.6 – PBX, RBE, CDR, and Web Servers Components

| Server | Installed Components |
| --- | --- |
| PBX | AsteriskNOW 1.0.2.1 |
| Web App. | Win. Server 2003 + IIS |
| RBE | Win. Server 2003 + InRule |
| CDR Database | Win. Server 2003 + SQL Server 2000 |

Table 4.7 shows the CPU load of our four servers per the number of active channels. In this table, each server has a column specifying the percentage of the CPU usage for the corresponding number of active channels.

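Table 4.7 - Percentage of Servers’ CPU Usage / Active channels

| Concurrent Calls | Asterisk | Web App | RBE | SQL Server |
| --- | --- | --- | --- | --- |
| 10 | 5 | 2 | 8 | 11 |
| 20 | 10 | 8 | 17 | 20 |
| 30 | 15 | 10 | 26 | 33 |
| 50 | 25 | 15 | 45 | 52 |
| 100 | 50 | 35 | 87 | 95 |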

Figure 4.13 shows the CPU measurements.


Figure 4.13 - CPU Measurements

This system can handle up to 80 concurrent calls without a problem. The main factor that still affects the above measurements is the number of rules defined in the RBE. This is true because the RBE will check the call request against all defined rules. Another crucial factor is the logic of the rules. Needless to say, a data-mining rule might bring the server to its knees. All of our tests were made using only 30 uncomplicated rules.
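Since Table 4.7 has no measurement at 80 concurrent calls, the load at that point can only be estimated. The sketch below interpolates linearly between the measured points; the linearity is an assumption made here for illustration, not a measured property, and the names `CALLS`, `CPU`, and `estimate` are hypothetical.

```python
# Measured values from Table 4.7 (CPU usage in percent).
CALLS = [10, 20, 30, 50, 100]
CPU = {
    "Asterisk":   [5, 10, 15, 25, 50],
    "Web App":    [2, 8, 10, 15, 35],
    "RBE":        [8, 17, 26, 45, 87],
    "SQL Server": [11, 20, 33, 52, 95],
}


def estimate(server, calls):
    """Linearly interpolate the CPU usage of a server between two measured points."""
    loads = CPU[server]
    points = list(zip(CALLS, loads))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= calls <= x1:
            return y0 + (y1 - y0) * (calls - x0) / (x1 - x0)
    raise ValueError("number of calls is outside the measured range")
```

Under this assumption, `estimate("SQL Server", 80)` gives roughly 78% and `estimate("RBE", 80)` roughly 70%, which suggests that the database and the RBE, rather than the PBX, are the servers closest to saturation at that load.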

Response time is affected by many factors. We could achieve real-time response when the server was handling fewer than 25 concurrent calls. Above this threshold, the server became slower. In fact, our tests showed that when the server held 50 concurrent calls, we got a latency of 2 seconds. Such latency is not acceptable and will disturb the two communicating parties. We believe that load balancing can do a lot in this regard.

4.1.5.11 Limitations

The RBE-oriented approach has certain limitations that might make it less likely to be followed. Those limitations are the following:
• Five servers are used (Asterisk + Web App Server + Web Service Server + RBE Server + CDR Database Server). This implies that the whole setup will stop functioning if any of the servers goes down. Our objective in the near future is to reduce this number.
• The architecture requires experts in several technologies (VoIP, Web, Database, and RBE) to keep the servers running.
• The CPU load tests do not seem to be satisfactory although they are encouraging.

4.1.5.12 Conclusion

In this study, we described the integration of Asterisk and InRule. The goal is to allow our PBX to trigger some actions when specific events take place. We could have omitted the use of RBE by simply creating a tool that analyzes the CDRs, and reports suspicious calls to the user. This would mean that an incident – such as a client making an excessive number of phone calls – will occur but no notifications would be made unless someone specifically asks for a report. What we were able to achieve in this study is a real time response. Moreover, the use of RBE will allow business owners and non-technical users to express their needs in the form of rules since rule-based engines are made for this purpose.

Whenever a phone call is about to be made, the PBX will make an HTTP request to a web page that will call a web service, which in turn will call InRule. Although the web service or the web application could have been designed to take some actions, we decided that the PBX is the best "action taker" because some actions include recording or blocking the phone call. Our approach allows the use of both synchronous and asynchronous calls to the web service. If the synchronous approach slows down the performance of the PBX, it is advisable to follow the asynchronous one.

4.1.5.13 Future Work

To ensure the effectiveness of the solution proposed in this study, more tests and measurements need to be considered. The two main factors that need more emphasis are the number and the complexity of the rules. Real-life scenarios, rules, actions, and phone calls will lead to more accurate benchmarking. The following factors will also affect the curve:
• types of used phones
• number of external lines
• employed codecs
• voicemail availability
• conferencing
• recording
• faxing
• voice menu
• text-to-speech translation
• speech recognition
• load balancing
• encryption
• firewall configuration

4.2 Eye-like Algorithm to Produce Voice Web Pages

4.2.1 Abstract

Nowadays, internet web sites and applications are efficient tools with which users interact by downloading and uploading data through a browser. This same functionality is not available through the phone for all internet web sites and applications. In this chapter, we propose a solution through which callers can call a global provider and specify the web site they wish to surf. A voice application will lead them through the web site allowing them to receive and enter data. We would like to avoid techniques that require radical changes in existing web sites and applications. We propose to implement abstractions inspired from the behavior of the human eye. These abstractions should be modeled in an algorithm that will allow us to extract only the most relevant information which catches the attention of a user while surfing a web site.

4.2.2 Introduction

Internet users can simply write the URL (Uniform Resource Locator) of the web site they wish to visit in the address bar of their browsers in order to start interacting with the site. Then, they can have a look at the returned page, read some titles or menus, decide what their next destination is, and click some menu item or a hyperlink. They keep doing so until they reach the page that includes the content which they are interested in. Phone users cannot do the same with all web sites / web applications available on the internet. This chapter proposes a solution through which callers will be able to interact with all web sites / web applications available on the internet. The term 'interact' means that the users will be able to receive and enter data. Receiving data will occur in the form of a sound played by a text-to-speech engine. Entering data occurs through the use of the dial buttons or a speech-to-text engine.

4.2.3 Related Work

The IBM WTP (WebSphere Transcoding Publisher) provides transcoding from HTML (Hyper Text Markup Language) to VXML (Voice eXtensible Markup Language) (Lamb & Horowitz, 2001). WTP works well only for simple, well-structured HTML documents. For example, if a page has no heading tags, the result would be very low in terms of usability. Annotations were suggested as a method of specifying important sections of pages (Hori et al, 1999). We do not believe that this approach is optimal since it requires augmenting the original HTML file with annotations, and it is a significant burden to keep the HTML file synchronized with its associated annotation file. Many other suggestions and products attempted to offer a solution, such as Aurora transcoding system (Huang & Sundaresan, 2000), SALT (Wikipedia, 2008), XHTML+Voice (W3.org), Sisl (Ball et al, 2000), and UIML-to-voice transcoders (Plomp & Mayora-Ibarra, 2002). Those solutions require introducing changes to every single page on the internet. That is why we do not think those suggested approaches will be globally considered as efficient solutions.

Technologies such as VXML enhanced voice-based applications and made their development far easier and simpler. Nevertheless, they did not solve the whole problem because developers still need to create a separate application for each web site. This duplication of effort leads to unnecessary redundancy and requires the developers to keep changing the voice-based applications every time the web site content is changed and vice versa. This study proposes a solution that allows callers to visit and interact with the same web site surfed by normal internet users.

4.2.4 The Need

What the phone community needs is a voice internet that allows all phone users to reach most web sites / web applications around the world without needing to install anything on their phones or upgrade any software components. It is a voice-based solution through which callers receive sound and interact by saying their choices and/or using their dial keys. The need for such a solution is justified by the following reasons:
- The phone is more portable than the PC (Personal Computer).
- Connectivity through the phone is achieved faster and more easily than through the PC.
- Callers do not need to be sophisticated PC users.
- The number of phone users is far greater than the number of internet users (refer to Table 1.2 and Table 1.3).

Although this proposal applies to most web sites, some web applications rely heavily on a visual interface and cannot be explored through the phone. A good example is the famous web site of Google Earth (http://earth.google.com/).

4.2.5 Proposed Solution

Several transcoders convert HTML pages to corresponding VXML ones (Lamb & Horowitz, 2001). The main obstacle is not the straightforward translation from one language to another. Instead, it lies in how meaningful the output of such a translation is to the caller (Shao et al, 2002). To make the output more meaningful, many researchers decided to use annotations so that HTML pages can be dynamically analyzed. Adding annotations to the HTML page, or creating a new file that describes the different parts of the page and which parts might be of interest to the callers, enhances the way the HTML page can be processed by the voice application. However, this means that each HTML page needs to be changed to meet those new specifications. If each web site has a corresponding voice application, companies will need to spend more money in order to develop their voice-based web sites and keep both sites synchronized. This study aims at proposing a solution that does not require radical changes.

It would be an excellent solution if we could develop an algorithm that acts just like the human eye. This algorithm is supposed to 'look at' a web page and discover which parts of the page are most attractive to the human eye. For example, the human eye is more attracted to:
- Special fonts (bold, italic, underlined, big, and colored fonts)
- Marquee
- Blinking text
- Text inside a box / with a colored background
- Hyperlinks
- Menus
- Segments that can be recognized by horizontal lines, in-line frames, frame sets, tables, and div tags
- Other visual effects that will be ignored by our algorithm (animations, graphics, videos, …)
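The idea of ranking page fragments by how strongly they attract the eye can be illustrated with a minimal sketch. It is not the eye-like algorithm itself, which remains to be developed; the tag weights, the class name `SalienceParser`, and the function `major_parts` are hypothetical choices made only for this illustration, and only Python's standard `html.parser` module is used.

```python
from html.parser import HTMLParser

# Hypothetical weights: the text lists the element types but assigns no scores.
SALIENT_TAGS = {"h1": 5, "h2": 4, "h3": 3, "marquee": 3, "blink": 3,
                "b": 2, "strong": 2, "a": 2, "i": 1, "em": 1, "u": 1}
# Content that cannot be delivered to a caller is skipped entirely.
IGNORED_TAGS = {"script", "style", "object", "video"}
# Void elements never get a closing tag, so they are not tracked.
VOID_TAGS = {"img", "br", "hr", "input", "meta", "link", "embed"}


class SalienceParser(HTMLParser):
    """Collects text fragments together with a crude attractiveness score."""

    def __init__(self):
        super().__init__()
        self.stack = []      # currently open tags
        self.fragments = []  # (score, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating messy HTML.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if not text or any(t in IGNORED_TAGS for t in self.stack):
            return
        score = sum(SALIENT_TAGS.get(t, 0) for t in self.stack)
        if score > 0:
            self.fragments.append((score, text))


def major_parts(html, limit=5):
    """Return up to `limit` fragments, most attractive first."""
    parser = SalienceParser()
    parser.feed(html)
    ranked = sorted(parser.fragments, key=lambda f: f[0], reverse=True)
    return [text for _, text in ranked[:limit]]


page = ('<h1>News</h1><p>plain <b>Breaking</b></p><img src="x.png">'
        '<a href="/s">Sports</a><script>var x=1;</script>')
print(major_parts(page))
```

A real implementation would also have to account for colors, boxes, and page segmentation, which cannot be derived from tag names alone.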

To test the algorithm, typical users will be asked to look at a page for 5 seconds. Then, they will be asked to close their eyes and say what they saw in the page. The algorithm should say the same. This algorithm tells what a human eye can see when looking at a page. It should act the same way a human being does. Usually, we look at a page, discover five or six major parts of it, and then decide where to click. The algorithm should study the page and tell the caller about the major parts. The caller will then be able to decide where to 'focus'. When an area of interest is determined, the caller will be able to fine-tune his/her 'focus'. This bio-inspired abstraction agent will dynamically discover the most attractive parts of the page and tell the caller about them. A similar algorithm was developed by Feng GUI Lab (www.feng-gui.com). This algorithm might be a starting point. Then, some modifications need to be made so that the algorithm focuses more on parts of the page that can be delivered to the caller. For instance, the algorithm should ignore images since they cannot be delivered to phone users. Flash and any other animation media files should also be ignored. Attention and attraction should be analyzed with phone users in mind. Another important modification is to let the algorithm focus on menus and page structure. The algorithm should also be able to apply OCR to the text of clickable images. Figure 4.14 shows the architecture of the solution.

Figure 4.14 – Architecture of the Eye-Like Surfing Scenario

4.2.6 Analyzing Dynamic Pages

We believe that it is not fruitful to focus on the HTML file alone. Previous studies tried to explore annotated HTML pages (Asakawa & Takagi, 2000). They aimed at finding content that might be of interest to phone callers. Analyzing HTML pages is not an easy task because they can be messy and disorganized. When the structure of the HTML page is known, there is a good chance of presenting the page to the caller in an efficient way. The main drawback of the annotation approach is that it requires changing the web pages or adding new ones. Our proposal is to analyze the source page and the resulting HTML page in order to discover the structure. That way, there is no need to introduce any change to any page. We propose developing a tool that will be used by the web owner to create XML files that describe the structure of all the pages of the web site. Those XML files will be published so that they can be reached by the voice application. The eye-like algorithm will keep working even if those XML files are missing, but their presence will enhance its functionality. Being able to receive the source code of the page helps a lot: knowing that the page contains certain controls gives the algorithm a great advantage. The presence of any of the following controls gives the algorithm a considerable hint: repeater, grid, panel, check box list, radio button list, bulleted list, calendar, multi-view, data source, site map path, menu, tree view, ad rotator, and user-defined controls. To demonstrate the idea, we will explore the source file of an ASPX web page and the HTML output. Our goal is to show that analyzing the HTML page is often complex while analyzing the source file is much easier.

Figure 4.15 shows the source code of an ASPX web page that includes only a tree control. Needless to say, analyzing this piece of code is far easier and simpler than analyzing the HTML page that includes around 60 lines as a result of only two nodes in the tree. This is also true in case a repeater is used. Finding a repeater in the source page will simplify the analysis that aims at discovering the structure of the page. Analyzing the HTML code resulting from using a repeater might be a very complex process.
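To illustrate why the source file is easier to analyze, the following minimal sketch scans ASPX markup for server controls that hint at the page structure. The sample markup is hypothetical (it is not the content of Figure 4.15), the function name `find_structural_controls` is an assumption, and the mapping from the control names used in this section to ASP.NET tag names (for example, "grid" to `GridView`) is also an assumption.

```python
import re

# Assumed ASP.NET tag names for the structural controls listed in this section.
STRUCTURAL_CONTROLS = {
    "Repeater", "GridView", "Panel", "CheckBoxList", "RadioButtonList",
    "BulletedList", "Calendar", "MultiView", "SiteMapPath", "Menu",
    "TreeView", "AdRotator",
}

# Matches an opening <asp:Control ... ID="..."> tag; the ID attribute name
# is matched case-insensitively, the control name is not.
CONTROL_PATTERN = re.compile(r'<asp:(\w+)\b[^>]*?\b(?i:id)="([^"]+)"')


def find_structural_controls(aspx_source):
    """Return (control type, control ID) pairs that hint at page structure."""
    hints = []
    for match in CONTROL_PATTERN.finditer(aspx_source):
        control, control_id = match.group(1), match.group(2)
        if control in STRUCTURAL_CONTROLS:
            hints.append((control, control_id))
    return hints


# Hypothetical source of a page that contains a tree with two nodes.
sample = '''
<asp:TreeView ID="TreeView1" runat="server">
  <Nodes>
    <asp:TreeNode Text="Home" Value="Home" />
    <asp:TreeNode Text="About" Value="About" />
  </Nodes>
</asp:TreeView>
<asp:Label ID="Label1" runat="server" Text="Welcome" />
'''
print(find_structural_controls(sample))
```

A single pattern match is enough to learn that the page is organized around a tree, whereas the same conclusion would have to be inferred from dozens of generated HTML lines.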

Figure 4.15 – Source Code of a Sample ASPX Page

4.2.7 Dealing with Web Applications

In addition to being able to receive information from the website, the caller will also be able to interact with the web application. He/she will be able to:
- simply enter numeric data
- enter text through the use of his/her 9-button dial pad
- record a message instead of entering textual data
- tell what he/she wants to a Speech-to-Text engine
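Entering text through the dial pad can be sketched as multi-tap decoding, the scheme familiar from SMS typing. The sketch below assumes the conventional letter layout of phone keypads, uses a space in the input to stand for a pause between key groups, and maps the 0 key to a blank; the function name `decode_multitap` and the input format are hypothetical.

```python
# Conventional letter layout of a phone keypad; "0" as a blank is an assumption.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz", "0": " ",
}


def decode_multitap(presses):
    """Decode key groups such as '44 444' into text ('hi').

    Each group repeats one key; the number of repetitions selects the letter.
    """
    text = []
    for group in presses.split():
        key = group[0]
        if key not in KEYPAD or set(group) != {key}:
            raise ValueError(f"invalid key group: {group!r}")
        letters = KEYPAD[key]
        # Extra presses wrap around, as on a handset.
        text.append(letters[(len(group) - 1) % len(letters)])
    return "".join(text)


print(decode_multitap("44 444"))       # hi
print(decode_multitap("7777 6 7777"))  # sms
```

In a deployed system the pause between groups would be detected by a timeout on the DTMF input rather than by an explicit separator.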

Whenever the voice application finds a form that requires data entry from the caller, it will deal with the different types of controls found in the page. Table 4.8 shows how HTML controls can be handled.

Table 4.8 – Handling HTML Controls

| HTML Control | How the voice application handles it |
|---|---|
| Label | This control can be simply handled. The text of the label will be sent to a TTS engine which will read the text to the caller. |
| Text box | Since the web page will not include a specified grammar, speech-to-text engines cannot be used for free-form data entry in a text box control. This control requires special treatment since it accepts all types of data. To handle the text box control properly, we have two options: either we allow the user to spell his/her text one letter after another so that a speech-to-text engine can receive the letters and concatenate them, or we let the user use the nine buttons of the dial pad to send text the same way he/she sends an SMS message. |
| Drop down list | Handling this control is not complicated. There are some options within the list. A text-to-speech engine can read them to the caller, who will 'say' the option. A speech-to-text engine can then detect what he/she 'said'. The grammar will include only those terms found in the list. |
| List box | This control will be handled the same way as the drop down list. |
| Check box | The check box is a control that can be either true or false. Thus, it can be handled by asking the user what he/she wants and getting what he/she 'says'. The grammar includes two terms: yes / no. |
| Radio button | The radio button is very similar to the check box control, with one difference only: when the user checks a radio button, all other radio buttons in the same collection should be unchecked. |
| Button | The caller can click on a button by saying 'click'. |
| Submit button | The caller can submit a page by saying 'submit'. |
| Hyperlink | The caller can follow a hyperlink by saying 'follow'. |
| Grid | The caller can navigate a grid. He/she can scroll up and down by saying 'scroll up' and 'scroll down' respectively. A speech-to-text engine will receive his/her request and do the scrolling. The same approach applies to grid paging. A text-to-speech engine will read the text for the caller. The user can also say 'top' to go to the top of the list and 'bottom' to go to the bottom of the list. The user can use the same navigation commands to scroll inside a drop-down list or list box. |
| Validator | Validators can be used to make sure the user will not make an invalid entry. For example, the page might include JavaScript code that forces the user to enter a valid email address. The same JavaScript code can still be used to check the validity of the entry made by the caller. |
| JavaScript alert | The web page sometimes raises JavaScript alerts to tell the user about some issues. For example, an alert message may warn the user that a certain field is required. The same approach can still be followed by the voice application. Alerts will be read to the caller by a text-to-speech engine. |
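The handling rules of Table 4.8 can be sketched as a function that turns a control description into a spoken prompt and a speech-recognition grammar. This is only an illustration under stated assumptions: the dictionary layout of a control (`type`, `label`, `text`, `options`), the wording of the prompts, and the function name `voice_prompt` are hypothetical, and no real TTS or speech-to-text engine is called.

```python
# Spoken commands taken from Table 4.8.
COMMANDS = {"button": "click", "submit": "submit", "hyperlink": "follow"}


def voice_prompt(control):
    """Map a control description to what is read out and what may be said.

    Returns a dict with 'say' (text for the TTS engine) and 'grammar'
    (the words the speech-to-text engine should accept). A grammar of
    None means free-form entry: spelling letters or using the dial pad.
    """
    kind = control["type"]
    if kind == "label":
        return {"say": control["text"], "grammar": []}
    if kind == "textbox":
        return {"say": f"{control['label']}. Spell your entry or use the dial pad.",
                "grammar": None}
    if kind in ("dropdown", "listbox", "radio"):
        options = list(control["options"])
        return {"say": f"{control['label']}. Options: {', '.join(options)}.",
                "grammar": options}
    if kind == "checkbox":
        return {"say": f"{control['label']}. Say yes or no.",
                "grammar": ["yes", "no"]}
    if kind in COMMANDS:
        word = COMMANDS[kind]
        return {"say": f"{control['label']}. Say {word}.", "grammar": [word]}
    raise ValueError(f"unsupported control type: {kind}")


form = [
    {"type": "label", "text": "Newsletter registration"},
    {"type": "dropdown", "label": "Country", "options": ["Lebanon", "France"]},
    {"type": "checkbox", "label": "Receive weekly updates"},
    {"type": "submit", "label": "Register"},
]
for control in form:
    print(voice_prompt(control))
```

Treating a radio button group as a single choice among its options is a simplification: selecting one option implicitly unchecks the others, as the table requires.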

4.2.8 Challenges

We still have to face many challenges. One of those is the way a hyperlink is handled. There is a crucial need to audibly identify hyperlinks (Shao et al, 2002):
- Changing the voice and producing background sound are among the proposals to indicate that the words being read are part of a link (James, 1998).
- Determining the context where the hyperlink appears is of extreme importance (Shao et al, 2002).
- Determining the text to be presented to the caller is also critical (Shao et al, 2002).
- Determining which part of the link the caller will say to select the option is a big problem if the link contains a long sentence.

Another challenge is the run time of the eye-like algorithm. When websites such as Google, Yahoo, Gmail, Hotmail, or CNN are heavily contacted by phone users, the performance might be a major issue. The last and most challenging issue is the eye-like algorithm and the accompanying tool that produces the XML files which describe the structure of dynamic pages.

4.2.9 Limitations

There are some inevitable limitations to the approach presented in this section:
- Technology-dependent algorithm: Algorithms that analyze the source page of the web page are technology-dependent. Thus, each technology will have its own algorithm. For example, .NET controls have their own behavior and need to be handled accordingly. Other technologies such as PHP behave differently and thus require different mechanisms.
- Accessing the XML files: Owners of web applications will need to use the special tool that analyzes the source pages and produces the XML files that describe the structure of the web pages. Those XML files should be published so that they are accessible. An alternative solution might be to allow the voice provider to access the source pages and generate the XML files dynamically. Although this might be an efficient solution, there will be a considerable impact on the runtime because the XML files will be generated only when they are needed. Some web owners might decide not to publish the structure of their web pages.

4.2.10 Conclusion

Our main goal is to avoid techniques that require radical changes. We proposed to implement abstractions inspired from the behavior of the human eye. A similar algorithm was developed by Feng GUI Lab. This algorithm will allow phone users to access web sites and applications without the need for annotations. Then, we suggested analyzing the source files of web pages and creating XML files that describe the structure of the page. Analyzing the source file is easier than analyzing the HTML page. Finally, we presented ways of handling different web controls using voice interface and then we listed some challenges and limitations.

4.2.11 Future Work

A lot of work needs to be done on the method presented in this section. The eye-like algorithm needs to be developed and tested before it can be globally used. This algorithm will let phone users interact with the web page as if they were 'looking at' it. We also need to develop the tool that analyzes the source files of the web pages and generates XML files that describe the structure of the pages. Those XML files will give the eye-like algorithm important hints that will enhance its efficiency.

4.3 Academic VoIP Blog for Elementary Schools

4.3.1 Abstract

Computers have played an important role in educating nations in the last two decades. Nevertheless, computers are still not very affordable, especially when compared to phones. Besides, users are usually required to be mature and skilled enough to use a computer. In this paper, we propose a VoIP solution thanks to which students below 10 years of age would be provided with services similar to those provided by a typical forum / blog. Due to their simplicity, blogs enhance collaboration among students. The challenge is to be able to offer a user-friendly solution that can be used by students of grades 4, 5, and 6. Through this audio forum / blog, a student can use an IP phone to post questions related to an academic topic. He/she will also be able to receive answers from a classmate. The goal of the study that we conducted is to check whether such an audio alternative can replace a visual device. This study also aims at checking if this replacement is efficient, easy to use, and productive.

4.3.2 Introduction

In the education field, there is always an increasing need for innovation. That’s why educators are always on the move to come up with new tools and facilities. Undoubtedly, the computer is one of those most commonly used facilities. Most schools, nowadays, encourage their students to make extensive use of computers because computers provide excellent educational services such as offline educational packages. Although these packages can help a lot, allowing students to interact with each other is a more efficient learning methodology. Due to their interactive nature, blogs are explored as a collaborative learning tool because they provide a forum for learning (Williams & Jacobs, 2004).

Versatility is one of the advantages of blogs. They provide a wide range of uses and there is no rule which states that a blog should be operated or even owned by an individual. There are family blogs, group blogs, corporate blogs, and community blogs. There are also blogs defined by their content. There is even a new type of blog that emerged in educational circles (Williams & Jacobs, 2004).


Ten Lebanese elementary schools participated in a study that we conducted in the second quarter of 2010. At the end of the study, around 50 teachers were asked to fill out a questionnaire. The analysis of the results showed that students often start using computers for academic purposes when they reach the age of 10 and sometimes even later. Here, we emphasize the word “academic” since students might start using the computer earlier for entertainment or other non-academic activities. Our study focuses on students between the ages of 7 and 9. Our goal is to test whether audio systems (phones) can replace audio-visual devices (computers). Using Asterisk, we will develop an educational tool which will enable the students to communicate the way they do using forums / blogs. Our tool will be called “A3” (“Academic Audio Access”).

4.3.3 Typical Scenario

Instead of computers, schools would be equipped with IP phones - readily accessible in the entire building. The scenario starts with a student who dials 111 to record a question about one of the topics. The student is given a choice to specify the material and the chapter to which the question belongs. The simplicity of the process is essential to the success of “A3”. Another student might dial 222 to check recorded questions and provide answers. Later on, the question owner might dial 333 to check if anyone was able to provide an answer. To authenticate the students, callers will be required to supply a 4-digit password.
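The scenario above can be summarized as a small routing sketch: the dialed extension selects the service, and the 4-digit password gates access. The action names, the function `route_call`, and the way passwords are stored are hypothetical; in “A3” this logic would live in the Asterisk call-handling configuration rather than in a standalone script.

```python
# Extensions from the scenario; the action names are hypothetical labels.
EXTENSIONS = {
    "111": "record_question",
    "222": "answer_questions",
    "333": "check_answers",
}


def route_call(extension, password, passwords_by_student, student):
    """Return the action for a call, or a reason why it is rejected."""
    if extension not in EXTENSIONS:
        return "invalid_extension"
    well_formed = len(password) == 4 and password.isdigit()
    if not well_formed or passwords_by_student.get(student) != password:
        return "authentication_failed"
    return EXTENSIONS[extension]


passwords = {"student_17": "4821"}
print(route_call("111", "4821", passwords, "student_17"))  # record_question
print(route_call("333", "0000", passwords, "student_17"))  # authentication_failed
```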

4.3.4 Previous Work

Blogs are nowadays considered a valid collaborative learning tool. Since they provide a forum for academic discourse (Allen, 1999), a large number of academic forums / blogs were established in the last two decades for the sole purpose of helping students discuss academic topics. However, the school administration and the academic staff were not always able to benefit from those forums / blogs because they were not designed to help in achieving quality education. Besides, although some were used to monitor the academic level of the students, most of them did not target students below 10 years of age.

Encouraging young students to benefit from an academic tool that uses collaborative learning is crucial because this type of learning stimulates students to negotiate and discuss complex problems from their own perspectives. It also encourages learners to elaborate and evaluate information while attempting to solve problems (Baker, 1994), (Scardamalia & Bereiter, 1994), (Dillenbourg & Schneider, 1995), (Erkens, 1997) and (Veerman, 2000).

To secure a suitable environment for collaboration among students, blogs and forums, along with other electronic solutions, were used. Blogs and forums were successful in increasing collaborative activity, knowledge sharing, reflection and debate due to their simplicity and ease of use. However, complex and expensive technology has failed to be that efficient (Williams & Jacobs, 2004). Many field experts argue that traditional knowledge management tools are complicated to implement, whereas informal systems like blogs facilitate knowledge capturing (Bausch et al, 2002).

In spite of all the above-mentioned facts, in most developing countries, typical full-featured blogs are still considered complicated for students below 10 years of age, especially those who do not have great experience in using computers. Thus, there is a need for a simpler solution that will still support the concept of collaboration.

4.3.5 The XO Laptop

The OLPC (XO Laptop) project aims at providing educational opportunities for the poorest children of the world by giving each one of them a low-cost laptop. OLPC laptop is designed in a way that aims at enabling kids to learn, create, and collaborate. The main goal behind this project is to encourage kids to become connected to each other and to the world. This will eventually create educational opportunities for the poorest children of the world (Williams & Jacobs, 2004).

The OLPC laptop technology has been discussed in several publications (Hourcade et al, 2008), (Fox, 2009) and (One Laptop per Child, 2009). This laptop is designed in such a way that makes computer usage much easier. The simple fact that such an invention was made implies that traditional computers cannot be efficiently used by young children because of their complexity. This re-emphasizes the need to simplify computer usage so that children can benefit a lot from such an educational device. Thus, there is a need to provide simpler solutions, such as “A3” and the “XO laptop”. Although the “XO laptop” would solve many problems, especially the complexity issue, it would not be as simple as “A3”, which is fully controlled through the use of only 12 dial buttons. Moreover, phones are only interfaces through which students will interact with the VoIP server. There will be no need at all to upgrade them. If any new feature needs to be installed, the VoIP server will be updated. This is not the case with the “XO Laptop”. If, for example, a new plug-in is required, all the students will face upgrading / downloading problems. Although companies like Microsoft and Google did their best to overcome this obstacle, users still face the same problem of downloading a new ActiveX control or a required plug-in.

4.3.6 The Need

Students need a tool which they can use to discuss problems from different perspectives, ask questions, and receive answers. This tool should enable them to propose various solutions and evaluate them (Petraglia, 1997). They have been able to do so with a computer for a long period of time. However, in most cases, a 7-year-old student is not expected to be able to easily turn on a PC, turn it off, log in to the operating system by providing credentials, log out, open a browser, sign in to a forum by providing different credentials, sign out, connect to the LAN (Local Area Network) / WLAN (Wireless LAN) / Internet, use a mouse, input data using a keyboard, and react properly to some operating system events, such as:
- expired anti-virus package notification
- sudden system restart
- disabled firewall notification
- pending action awaiting administrator’s approval
- required plug-in / ActiveX control
- several other system messages

Although network administrators try their best to avoid situations where users face such problems, these problems still occur, disturb the users, and get in their way.

To achieve our goal, we need to use a small part of the services that the computer provides. This implies that there is a need to disable all unneeded services so that the users will not misuse them. However, attempting to do so is not always successful. This is why a phone that will only offer the required service (sign in, post a question, and hang up) is the right solution. It is important to mention that we are not suggesting that students quit using computers. Instead, we are trying to find an alternative for those who i) are not capable of using the PC, ii) do not like using it, or iii) cannot interact effectively with its sudden problematic requests. “A3” is most suitable for these three types of users. If students do not have any difficulty using the PC, they should be assisted and encouraged to use it because it can be very helpful in leading them towards better academic achievement.

According to the Soviet psychologist Lev Vygotsky, the zone of proximal development (ZPD) is the difference between what a learner can do with help and what he/she can do without any help. Vygotsky stated that after children follow adults' examples, they gradually become independent and develop the ability to do tasks without any help (ZPD, 2010).

Learning is a spiral loop, i.e. learning one thing leads to learning other things that lead, in turn, to learning more and more. Bearing this concept in mind, any educator will feel that he/she can assist students to reach a level where they become fully independent learners. Allowing young children to benefit from the easy-to-use phone will eventually prepare them to use computers in an effective way.

If there were an audio forum / blog that uses IP phones to help students post their academic questions, it would provide the students with the simplicity they need. In such a case, all that the students need to do is dial 111, choose the topic, and record the question. There would be no need for a PC, internet, typing skills, or crowded pages full of different pieces of information.

One of the major advantages of IP phones is their affordability. Current academic standards require a PC for every 4 students. Suppose a PC costs at least 800 USD. This means that the cost is 200 USD per student. An IP phone would cost around 140 USD, or only 35 USD per student at the same ratio of one device for every 4 students. The above ratio (1 PC for every 4 students) may vary greatly depending on the socio-economic level in a given country. However, regardless of any variance in this ratio, it will always reflect the same fact: the phone is cheaper to use than the PC.
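The arithmetic behind this comparison can be reproduced with a few lines. The prices and the 4-students-per-device ratio are the figures quoted above; the function name `cost_per_student` and the alternative ratios in the loop are illustrative assumptions.

```python
PC_COST_USD = 800     # lower bound assumed in the text
PHONE_COST_USD = 140  # approximate IP phone price from the text


def cost_per_student(device_cost_usd, students_per_device):
    """Hardware cost carried by each student who shares the device."""
    return device_cost_usd / students_per_device


# The ratio quoted in the text: one device for every 4 students.
print(cost_per_student(PC_COST_USD, 4))     # 200.0
print(cost_per_student(PHONE_COST_USD, 4))  # 35.0

# Hypothetical ratios: the phone stays cheaper whatever the ratio is,
# as long as both devices are shared by the same number of students.
for ratio in (2, 4, 8):
    assert cost_per_student(PHONE_COST_USD, ratio) < cost_per_student(PC_COST_USD, ratio)
```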

There would be no additional cost because our solution is based on open-source packages. Thus, there will be no need to buy any licenses:
- Operating system is Linux
- PBX is Asterisk
- Database is MySQL
- Web server is Apache
- Text-to-Speech engine is Festival

Using “A3” will provide various benefits for students and administration simultaneously. First, it provides promising academic services for minimum cost and minimum training. If we can prove that the phone can easily replace the computer, then the need for shifting from using a machine with considerable processing power to a less powerful device will be justified.

Another reason that justifies the need for “A3” is the fact that some students are overly concerned about face-to-face communication. They might feel shy, embarrassed, or uncomfortable to ask questions in front of other students. “A3” will help them focus on the content of the question rather than worrying about overcoming the pressure they are subjected to when engaging in face-to-face communication.

Moreover, the nature of blogging leads to the creation of a rich warehousing of captured knowledge (Bausch et al, 2002). This means that “A3” is needed to act as a content warehouse, where the students will store their questions and answers. Within two years, the database of “A3” would include thousands of questions and answers that would be documented and organized.

Finally, the academic metrics dictate that each question be related to several learning objectives. Such a link is of crucial importance, especially when the student posts a question or fails to provide an answer. This means that he/she did not grasp some of the learning objectives related to the question. A tool that checks the level of understanding of each student for the targeted learning objectives is necessary.

4.3.7 Stakeholders

The stakeholders who will play a key role in “A3” are the students, their parents, the teachers, the academic coordinators, and the administration.

a) The teachers and the academic coordinators will benefit the most because “A3” will provide them with the right answers to the most important questions, such as:
- Who missed what? If student S posts question Q, this means he/she has not fully understood chapter C. If the number of students who have not acquired the learning concepts exceeds a specific threshold, then the teacher is expected to re-explain whatever the students have missed. For example, if 20% of the students post questions related to the chapter, this means that the chapter needs to be taught further. This is not to say that the whole chapter is to be delivered again. It just means that the teacher needs to investigate which part of the chapter was not understood. “A3” is the right tool that will provide the teacher with this valuable piece of information.
- Who did what?
  1. If student S is asked question Q but cannot provide an answer, this means he/she has not fully understood chapter C.
  2. If student S is asked question Q, but cannot provide an answer the first time, this means that he/she has to do some research before being able to answer the question.
  3. If student S is asked question Q and is able to provide an answer right away, this means that he/she has fully understood the specific part of the required chapter.

b) The students will be able to post questions, provide answers, and receive answers to their posted questions. Those who provide answers will greatly benefit because they may listen to many questions that they did not have answers for initially. They will be curious to know, and will probably do some research to find the right answers in order to be rewarded.

c) The parents will be allowed to post questions and receive answers. They will also be allowed to check their children’s activity on “A3”:
- Are they using “A3”?
- Which questions were posted?
- Which answers did they receive?

d) The administration will own a bank of questions & answers. “A3” will provide answers for the following questions:
- Which learning objectives are not met?
- Why does chapter X have the highest number of posted questions?
- Does the teacher of section X have enough experience to teach material Y? Does his/her lack of experience cause his/her students to keep posting questions?
- Why do the students in section X post the highest number of questions about various chapters?
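The "Who missed what?" rule for teachers can be sketched as a simple count over posted questions. The 20% threshold comes from the example above; the data layout (pairs of student and chapter), the function name `chapters_to_revisit`, and the choice to count each student once per chapter are assumptions made for illustration.

```python
from collections import defaultdict


def chapters_to_revisit(posts, class_size, threshold=0.20):
    """Return chapters for which enough distinct students posted questions.

    posts: iterable of (student, chapter) pairs, one per posted question.
    A chapter is flagged when the share of distinct askers reaches the
    threshold (20% in the example given in the text).
    """
    askers = defaultdict(set)
    for student, chapter in posts:
        askers[chapter].add(student)  # a student counts once per chapter
    return sorted(chapter for chapter, students in askers.items()
                  if len(students) / class_size >= threshold)


# Hypothetical class of 10 students.
posts = [("S1", 3), ("S2", 3), ("S1", 3), ("S4", 5)]
print(chapters_to_revisit(posts, class_size=10))  # [3]
```

Chapter 3 is flagged because two of the ten students (20%) asked about it, while the single question on chapter 5 stays below the threshold.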

Feedback coming from all types of stakeholders will enrich the staff’s knowledge and guide the whole academic team through reaching a higher level of quality in education.

4.3.8 Encouraging Stakeholders to use “A3”

To make sure all types of users are capable of accessing “A3”, it should be set to be accessed from inside the school (through the LAN) and outside the school (through the PSTN). It should be accessible 24/7. This will allow stakeholders to post questions, receive answers, and/or retrieve any piece of information while being at school, at home, or even in the car.

To encourage students to use “A3”, teachers can give bonus points. Giving bonus points has been proven to be one of the most efficient strategies in motivating students. There are different ways of using a “bonus points system”. The academic staff can:
- inform the students that school tests will include some of the questions posted.
- give the student 1 bonus point whenever he/she posts a valid question.
- give the student 5 bonus points whenever he/she provides a valid answer.
- ask parents to encourage their children to use “A3” because this will help the parents identify the weaknesses of their children.

One disadvantage of using bonus points is that some students might try to take advantage of the bonus system. For example, a student might ask his/her classmate to post a very simple question to which he/she will provide an answer in order to get the 5 bonus points and vice versa. In order to prevent such a thing from happening, a random question will be picked each time a student tries to provide an answer. This way, the student cannot guarantee which questions he/she will be assigned to answer.
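The random assignment described above can be sketched as follows. Excluding the answering student's own questions is an assumption added for the illustration, as are the dictionary layout of a question and the function name `pick_question`.

```python
import random


def pick_question(questions, answering_student, rng=random):
    """Pick a random question for a student who wants to provide an answer.

    questions: list of dicts with at least an 'author' key.
    The student's own questions are excluded (an assumption of this sketch),
    so a pair of classmates cannot choose which question will be assigned.
    """
    candidates = [q for q in questions if q["author"] != answering_student]
    return rng.choice(candidates) if candidates else None


bank = [
    {"id": 1, "author": "S1"},
    {"id": 2, "author": "S2"},
    {"id": 3, "author": "S3"},
]
question = pick_question(bank, "S2")
print(question["id"])  # 1 or 3, never the student's own question
```

With a large enough question bank, the chance that a student is assigned the question planted by a cooperating classmate becomes small, which is the purpose of the random pick.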

Because offering bonus points does not necessarily help weaker students learn more, there was a need for a plan that will help us overcome this obstacle. After consulting several field experts, we could come up with a solution that would help motivate weaker students. The procedure that we used went as follows: once these students started using “A3”, they were told that the teacher had left a couple of questions for them and that they would be rewarded if they could provide an answer in 24 hours. The questions used were basic and easy so that they would provide an answer within the specified period of time. This trick played a very important role in motivating them to do their best in order to provide correct answers and ask their parents / classmates when they do not know.

4.3.9 Audio Issues

A traditional forum / blog will allow the students to post all kinds of questions / answers, which are of 2 types:
- questions / answers that do not require visual presentation
- questions / answers that require visual presentation

A question concerning science, history, geography, grammar, dictation, or vocabulary can usually be dealt with over the phone. A question regarding a mathematical problem is an example of the second type, which requires a visual presentation. For example, it is easy to post the expression 9+(15/3) on a visual forum, but how easy is it to do the same through an audio forum such as “A3”? If the receiver thinks it is (9+15)/3, he/she will then provide 8 (=24/3) as an answer while the correct answer is 14 (=9+5). Since the targeted students are not mature enough, it is not easy to train them to deal with questions that require a visual presentation. First, we tried making use of ASTER, a computing system used for rendering technical documents in audio. However, after trying many scenarios, we realized that a 9-year-old student would not be able to get the best out of such a scientific tool. Our proposal appeals to simplicity. This is why the decision was made not to include subject matter that relies extensively on visual presentation, such as Mathematics. An alternative for academic staff would be to schedule private sessions to answer questions that cannot be answered over the phone because they need illustration.
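The ambiguity can be made concrete with a short Python sketch. The tuple encoding of expressions and the "open group / close group" wording are assumptions made purely for illustration; they are not how ASTER renders mathematics.

```python
from fractions import Fraction

WORDS = {"+": "plus", "/": "divided by"}


def evaluate(expr):
    """Evaluate a nested (operator, left, right) tuple exactly."""
    if isinstance(expr, int):
        return Fraction(expr)
    op, left, right = expr
    a, b = evaluate(left), evaluate(right)
    return a + b if op == "+" else a / b


def speak(expr):
    """Render the expression as words, announcing every group explicitly."""
    if isinstance(expr, int):
        return str(expr)
    op, left, right = expr
    return f"open group {speak(left)} {WORDS[op]} {speak(right)} close group"


intended = ("+", 9, ("/", 15, 3))   # 9+(15/3)
misheard = ("/", ("+", 9, 15), 3)   # (9+15)/3
```

Both readings sound like "nine plus fifteen divided by three" when the grouping is not spoken, yet `evaluate` gives 14 for the first and 8 for the second. Announcing every group removes the ambiguity but produces long utterances, which is consistent with the observation that such rendering is hard for a 9-year-old to follow.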

4.3.10 Audio-to-Blog Call Flow

Asterisk is a fully-featured telephony engine that gives VoIP engineers the ability to fully program phone call handling and empowers developers to create advanced communication solutions for free. “A3” is based on this open-source software package and thus takes advantage of its flexibility to guide the students through a series of steps that include both taking input and returning output. “A3” takes input in the form of dialed numbers and returns output by playing recorded messages. To illustrate the proposed solution, the flow of the phone call is presented in the following series of steps (steps 5 to 9 are optional):
1. The student dials:
   a. 1 to record a question
   b. 2 to check for answers
   c. 3 to review previous questions & answers and explore the blog
   d. 4 to check bonus points
2. The student is asked to enter the secret PIN number.
3. If the PIN is not valid, the student is asked to try again (3 trials at most).
4. If there is a pending answer for a previously recorded question, the student is informed and will be given a chance to play the answer. When the student decides to listen to the answer, he/she is asked if the answer is satisfying. Later on, teachers can find out which students were not satisfied and thus take appropriate actions.
5. The student is asked to enter the number of the subject to which the question belongs. This is an optional step. The student can enter 0 if he/she wants to skip it. Alternatively, “A3” can list the subjects (for physics press 1, for chemistry press 2, for math press 3, and so on).
6. The student is asked to enter the number of the chapter to which the question belongs. This is an optional step. The student can enter 0 if he/she wants to skip it.
7. When subject and chapter are provided, “A3” will inform the student whether there are related questions and answers or not. The student will then be given the option to listen to them.
8. The student is given the option to retrieve questions based on a keyword. When the student supplies valid subject and chapter numbers, the search space is minimized. Within each chapter, there are usually half a dozen keywords that will be played to the caller. When the caller spells the keyword, the question and the answer will be linked to the corresponding keyword. This enhances the search and empowers “A3”.
9. When the student is done, he/she is told to record after the beep and then hang up.
10. The recorded question, the answers coming from many classmates, the teachers’ comments, and the students’ satisfaction flags will be stored as a part of the blog.
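The first steps of this call flow can be simulated with a small Python sketch. It is not the Asterisk dialplan used by “A3”; the event names, the `handle_call` function, and the way dialed input is passed in as a list are all assumptions made for illustration. Only the menu choice, the PIN check with at most 3 trials, and the optional subject and chapter entries (0 to skip) are modeled.

```python
MENU = {
    "1": "record_question",
    "2": "check_answers",
    "3": "browse_blog",
    "4": "bonus_points",
}
MAX_PIN_TRIALS = 3  # step 3: at most 3 trials


def handle_call(dialed, valid_pins):
    """Consume dialed entries in order and return the list of resulting events."""
    keys = iter(dialed)
    events = []

    # Step 1: menu choice.
    choice = next(keys, None)
    if choice not in MENU:
        return events + ["invalid_choice", "hangup"]
    events.append("menu:" + MENU[choice])

    # Steps 2 and 3: PIN entry, retried up to MAX_PIN_TRIALS times.
    for _ in range(MAX_PIN_TRIALS):
        if next(keys, None) in valid_pins:
            events.append("pin_ok")
            break
        events.append("pin_rejected")
    else:
        return events + ["hangup"]

    # Steps 5 and 6: optional subject and chapter, "0" (or no input) skips.
    subject = next(keys, "0")
    chapter = next(keys, "0")
    events.append("subject:" + ("skipped" if subject == "0" else subject))
    events.append("chapter:" + ("skipped" if chapter == "0" else chapter))

    events.append("run:" + MENU[choice])
    return events
```

For example, a student who dials 1, mistypes the PIN once, then enters subject 2 and skips the chapter produces the events `menu:record_question`, `pin_rejected`, `pin_ok`, `subject:2`, `chapter:skipped`, `run:record_question`.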

Figure 4.16 presents the architecture of “A3”:

Figure 4.16 - Architecture of “A3”

“A3” was installed in 3 elementary schools in Beirut. To make sure the sample is as random as possible, we chose three schools that are very different:
- School (A) is located in a very popular area outside the city and its 400 students are French educated.
- School (B) is located inside the city and its 900 students are English educated.
- School (C) is located in the hills surrounding Beirut. It has fewer than 300 students; two thirds of them are English educated, and the rest are French educated.

“A3” was implemented in a way that guarantees efficient and accurate results. In each school, the targeted classes were carefully selected. Our study included three sections of each class:
- Section A: Students in this section were encouraged to use “A3”.
- Section B: Students in this section were not allowed to use “A3”. Instead, they were encouraged to use a typical visual forum.
- Section C: Students in this section did not make use of any tool.

Our aim was to compare students of section A and those of section B in order to show whether “A3” could be a better solution than typical visual forums. Any comparison with students of section C will measure the efficiency of the idea of a blog at large, whether visual or audio.

In order to make sure the required functionalities were offered to the end users in a very user-friendly way, a complete set of different technologies was used. Again, the main goal was ensuring simplicity and accessibility. This is why we have set things so that “A3” would simply prompt the student recording a question to dial the number of the chapter that the question belongs to.

Each chapter includes several topics. “A3” would prompt the question owner to say the name of the topic. Through the use of a speech-to-text engine, “A3” would categorize questions and relate them to keywords. When this happens, “A3” would “advise” the question’s poster to listen to previous answers that are related to the same topic. Similarly, the teachers and the answer providers would answer a question based on a keyword.
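Relating a transcribed topic name to one of a chapter's keywords can be sketched as follows. The sketch assumes the speech-to-text engine returns a plain string and uses approximate string matching from Python's standard `difflib` module to tolerate small transcription errors. The keyword list and the cutoff value are hypothetical, and the thesis does not state which matching method “A3” uses.

```python
from difflib import get_close_matches


def match_keyword(transcript, chapter_keywords, cutoff=0.6):
    """Return the chapter keyword closest to the transcript, or None."""
    spoken = transcript.strip().lower()
    matches = get_close_matches(spoken, chapter_keywords, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Hypothetical keywords for one science chapter.
KEYWORDS = ["photosynthesis", "respiration", "food chain"]
```

A slightly wrong transcription such as "Fotosynthesis" still maps to `photosynthesis`, while an unrelated utterance yields `None`, in which case the caller could be asked to repeat the topic.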

4.3.11 Stakeholders’ Feedback

We have used several mechanisms, such as questionnaires, interviews, and voting, to get feedback from the different categories of users:
• Students
• Practitioners
• Teachers
• Coordinators
• Principals

The majority of those who provided feedback agreed that, although the PC offers a wider variety of services than a phone does, the phone offers just what is needed for limited users, who are expected to teach themselves how to move forward to the next level and become self-reliant. We could deduce that the phone is more efficient than a PC under certain circumstances. Although “A3” offers a solution for a limited population, it is a necessity because it enables young students to achieve what otherwise could not be done.

4.3.12 Results

All the results presented below are based on the selected samples. Some students are not included because they are considered extreme cases that required special care. Once those were excluded from the survey, all three sections appeared to have similar academic results. Our study covered:
- 3 schools for 1 academic year
- 3 classes per school (grade 3, grade 4, and grade 5)
- 3 sections per class (A: used “A3”, B: used a visual forum, and C: used none)
- 9 subjects for each class section
- 474 students, all having both parents
- 143 teachers and academic coordinators

The results of our study focus on a number of indicators, including the percentage of participation, missed learning objectives, academic staff quality, and academic performance. The results are shown in Tables 4.9, 4.10, 4.11, 4.12, and 4.13.

Table 4.9 – Academic VoIP Blog | Participation

Description                                                                   Sec. A   Sec. B
Percentage of students’ participation                                         61       17
Percentage of parents’ participation                                          9        21
Percentage of answer providers                                                29       10
Percentage of students who received valid answers                             63       41
Percentage of those who could provide a correct answer the first time around  41       29
Percentage of those who could NOT provide a correct answer                    27       32

In order to measure the usefulness and effectiveness of “A3”, we had to come up with our own methodology. Many indicators are shown in Tables 4.10, 4.11, and 4.12. In the following section, we explain how we reached the conclusions presented in these 3 tables.

Unmet / weakly delivered learning objectives
- Lack of correct answers is one of the most important indicators that the learning objectives are either weakly delivered or not met at all.
- “A3” sometimes included 6 wrong answers for the same question. In other cases, no answers were provided at all. This showed that i) many students had questions about a certain topic and ii) their classmates could not provide correct answers.
- The same questions were asked by most students, even the serious and smart ones. This indicated that the learning objective needed to be re-taught because it was not attained. The extensive use of “A3” by the students led to the discovery of these hidden facts.

Number of questions asked by students:
- The higher this number is, the worse the case is. When the students are asking too many questions about the same chapter, it can be an indicator of some serious problem. The ease-of-use and access of “A3” encouraged students to ask more questions. This allowed the educators to discover the problem that could not be unveiled in the other 2 sections, namely, B and C.

Ineffective teachers
- Inefficiency in delivering the content can sometimes arise because:
  a) The teacher could not be efficient in explaining the material.
  b) The teacher does not have the required knowledge.
  c) The teacher did not give enough time to cover the chapter.
  d) There is a commonly missing pre-requisite.
  e) There is a problem inside the classroom that prevents the students from focusing on the teacher's explanation.

All of the above points were thoroughly checked, including other hidden factors. Our plan was to use elimination to find out the reason. In most cases, all of the above points were eliminated except for the first one.

Table 4.10 – Academic VoIP Blog | Missed Learning Objectives

Description                                                                   Sec. A   Sec. B
Percentage of the learning objectives which were identified as unmet          6        2
Percentage of the learning objectives which were marked as weakly delivered   8        4

Table 4.10 shows that “A3” revealed more missed learning objectives than the visual forum did.

Table 4.11 – Academic VoIP Blog | Academic Staff Quality

Description                                                                               Sec. A   Sec. B
Number of teachers who were investigated for being ineffective in delivering the content  1        0

Table 4.11 shows that “A3” could help in detecting the existence of one inefficient teacher.

Table 4.12 – Academic VoIP Blog | Bank of Questions & Answers

Description                                               Sec. A   Sec. B
Average number of questions posted per section in 1 term  103      30
Number of questions compiled in each school               3000     1200

Table 4.12 shows that “A3” motivated the students to ask more questions. We believe this is due to the fact that “A3” is easy to use and access. All that is required is to pick up the phone, dial a number, record a question, and hang up, which can be done in less than a minute.

Table 4.13 – Academic VoIP Blog | Academic Performance Indicators

Description                                                          Sec. A   Sec. B
Percentage of improvement in academic results compared to Section C  21       10

The results presented in Table 4.13 show that students using “A3” improved by 21% relative to Section C, against 10% for students using a visual forum, a difference of 11 percentage points in favor of “A3”. We attribute this to the simplicity and the straightforwardness of “A3” as compared to a typical internet / intranet visual forum. The same table also shows that any forum, whether audio or visual, is likely to help the students perform better.
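The comparison between the two sections is a difference of percentages, so it is best read in percentage points. The short Python sketch below redoes the arithmetic from the figures reported in Table 4.13; the variable and function names are chosen here for illustration.

```python
# Improvement over Section C, in percent, as reported in Table 4.13.
IMPROVEMENT_OVER_SECTION_C = {"A": 21, "B": 10}


def percentage_point_gap(section_x, section_y, table=IMPROVEMENT_OVER_SECTION_C):
    """Difference between two sections' improvements, in percentage points."""
    return table[section_x] - table[section_y]


gap = percentage_point_gap("A", "B")
ratio = IMPROVEMENT_OVER_SECTION_C["A"] / IMPROVEMENT_OVER_SECTION_C["B"]
```

Here `gap` is 11 percentage points, and `ratio` shows that the improvement observed with “A3” is a little more than twice the improvement observed with the visual forum.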

4.3.13 Future Enhancements

The current features of “A3” enable it to help improve academic performance to a limited extent. In order to get more benefits, “A3” still needs one or more improvements. The following improvements are planned for the near future:

- The teacher will be able to provide answers through a special web application. He / she will listen to the question and input a text. Using a text-to-speech engine, “A3” will read this text to the student.
- Parents and students will have their own portals, through which they will be able to check for grades, exams, attendance, and academic performance. It would be a great opportunity to link “A3” to the web portals of parents and students. Similarly, “A3” would be linked to any typical academic forum already in use. The goal is to achieve full compatibility between the audio and the visual forums, so that a user can post an audio question and receive a written answer and vice versa. This way, users would benefit from the advantages of both audio and visual forums.

PART 5 – CONCLUSION & PERSPECTIVES

5.1 Conclusion

Different integration problems and solutions have been presented in this thesis. These solutions belong to three different categories. In what follows, we list all of the categories with all related topics along with our contribution, limitations of the proposed solution, challenges we had to face, and future perspectives.

5.1.1 VoIP and Typical Services

5.1.1.1 Mail Servers

5.1.1.1.1 Contribution

We provided a library of code that allows developers to integrate a VoIP PBX with an email server so that callers can check their emails, record the reply, and even send attachments. All of this has been done through the use of a couple of web pages. A Text-to-Speech engine has been used in order to read the email to the caller.
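As an illustration of the last step, the sketch below turns a raw email message into the text that would be handed to a Text-to-Speech engine. It uses Python's standard `email` package; the function name, the spoken wording, and the handling of multipart messages are assumptions, and the sketch is not the code of the web pages mentioned above.

```python
from email import message_from_string


def email_to_speech_text(raw_message):
    """Build the sentence(s) a TTS engine would read for one email."""
    msg = message_from_string(raw_message)
    sender = msg.get("From", "unknown sender")
    subject = msg.get("Subject", "no subject")
    if msg.is_multipart():
        # Attachments cannot be rendered in audio, so only announce them.
        body = "This message has several parts or attachments."
    else:
        body = msg.get_payload().strip()
    return f"Message from {sender}. Subject: {subject}. {body}"
```

The PBX side would then only need to fetch this text over HTTP and pass it to the TTS engine.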

In prior work, a model called “Email by phone” was suggested. Our proposed architecture is more general and solves many integration problems using the same integration mechanism. “Email by phone” requires a special browser while we rely on typical http requests.

5.1.1.1.2 Limitations

This solution suffers from a lack of visual presentation. If, for example, one of the email messages includes an attachment, the phone user will not be able to check the attached document, which might be an image or a video file. Another limitation is that the web pages which get information about the email messages should be installed on the same email server. This means that an IIS server should also be installed there.

5.1.1.1.3 Challenges

There are some problems that could not be easily solved. The accuracy of Speech-to-Text engines is a real obstacle. The phone caller should speak and the STT engine should capture the words so that the reply is sent in text format. Since such accuracy could not be achieved, we solved the problem by asking the caller to record the reply, which is then sent as an attachment. This led to another problem because the email messages are not searchable anymore. When all messages are stored as text, one can easily search for a word. When all replies are stored as sound files, searching for keywords is very limited.

Another challenge is the performance of the server delivering mail service over the phone. Such a server needs to be a very powerful machine. Depending on the number of concurrent users, a farm of servers might be required. This scalable architecture should be engineered with capacity in mind.

5.1.1.1.4 Perspectives

Two functionalities would enrich the solution if provided to the clients.

The first is the ability to receive the voice of the client, convert it to text, and send it as the body of the email. Such functionality is not so easy to fulfill because different people use different languages and accents. This functionality should be available to everyone and the system should receive no training at all.

The second functionality is the ability to allow users to receive the attachment as a Multimedia Messaging Service (MMS) message. This feature will allow the users to receive the attachment and see what’s inside it, even if the phone does not support the format of the attached file. Before sending the file to the client, it will be transformed to video, image, or sound. Then, it will be sent to the user as MMS, which is a service supported by almost every phone nowadays.

5.1.1.2 Unified Communication Servers

5.1.1.2.1 Contribution

We have built a rigid architecture that integrates different servers based on different platforms to achieve a complete unified communication solution. The goal was to integrate MS Office Communication Server (MS OCS) with Asterisk, the open source PBX. One problem was that MS OCS runs under Windows while Asterisk runs under Linux. Another was that MS OCS uses TLS/TCP while Asterisk uses SIP/UDP. Integrating two different solutions with different protocols and platforms is not a straightforward task. Our contribution comes in the form of code, scripts, and configuration guidelines. The necessary code is fully implemented and tested so that users of MS Office Communicator can make phone calls to the outside world through Asterisk in spite of all differences in platforms and protocols.

In prior work, an architecture called CINEMA was proposed. CINEMA is not easy to implement as it includes a lot of servers. Our proposal is simpler and uses the same integration mechanism to integrate VoIP with all other services, not only communication services.

5.1.1.2.2 Limitations

The solution has to be optimized as the call needs to pass through three translation servers before reaching the PBX. This is a considerable load, especially when there are plenty of concurrent users. Although translation is a CPU-consuming process, the number of servers doing the translation cannot be reduced because different protocols are being used. Our architecture includes four servers, but three of them are for translation only. This number is about to change because Asterisk has a new release that supports SIP/TCP. This will eliminate the need for one of our servers, which means a simpler architecture will keep doing the job.

5.1.1.2.3 Challenges

The main challenge is that the integration includes products based on different platforms and using different protocols. We made use of OCS Mediation Server to translate from TLS/TCP to SIP/TCP. We also used OpenSER, an open source solution, to translate from SIP/TCP to SIP/UDP. In the end, we fully achieved the multi-protocol & multi-platform integration.
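The chain of translations named above can be written down as data and checked for consistency. The Python sketch below models only the two hops mentioned in this section; the `Hop` type and the `route` function are hypothetical names, and nothing here talks to a real server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    server: str
    accepts: str   # transport expected on the incoming side
    emits: str     # transport produced on the outgoing side


CHAIN = [
    Hop("OCS Mediation Server", "TLS/TCP", "SIP/TCP"),
    Hop("OpenSER", "SIP/TCP", "SIP/UDP"),
]


def route(start, target, chain=CHAIN):
    """Follow the hops and return the list of transports seen along the way."""
    current = start
    path = [current]
    for hop in chain:
        if hop.accepts != current:
            raise ValueError(f"{hop.server} cannot accept {current}")
        current = hop.emits
        path.append(current)
    if current != target:
        raise ValueError(f"chain ends in {current}, not {target}")
    return path
```

With both hops, a call leaving MS OCS over TLS/TCP reaches Asterisk over SIP/UDP. If the PBX accepts SIP/TCP directly, `route("TLS/TCP", "SIP/TCP", CHAIN[:1])` shows that the OpenSER hop is no longer needed.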

Another challenge was the debugging of such an architecture that includes MS Office Communicator, MS OCS, OCS Mediation Server, OpenSER, Asterisk, and the PSTN. There are useful tools that could easily inform us about the problem or at least its location.

5.1.1.2.4 Perspectives

Minimizing the number of used servers is one of the important goals that need to be achieved in the future because this will improve the performance of the whole solution and minimize its down time.

Another future perspective is to simulate the functionality of the proposed architecture in a call center to discover how efficient the proposal is.

5.1.1.3 DBMS Servers

5.1.1.3.1 Contribution

Since we decided to use the http protocol for the communication between our PBX and the data stored in different databases, we could achieve the integration with many different RDBMS (Relational Database Management System) packages such as Oracle, MS Access, MS SQL Server, and MySQL. The same mechanism can be used with any other RDBMS package. Thus, our contribution is the required code to integrate an open-source software PBX with a wide range of database systems. That way, any data stored behind any RDBMS can be accessed by the PBX.
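The idea of placing an HTTP request between the PBX and the database can be sketched as follows. This is a minimal illustration under stated assumptions: SQLite (from Python's standard library) stands in for the RDBMS packages named above, and the URL, the `accounts` table, and the `BALANCE=` reply format are all hypothetical. No real web server is started; `handle_request` plays the role of the web page that the PBX would call.

```python
import sqlite3
from urllib.parse import urlparse, parse_qs


def make_demo_db():
    """Create an in-memory database with one hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO accounts VALUES ('1001', 250)")
    return conn


def handle_request(url, conn):
    """Answer a PBX request such as .../lookup?account=1001 with plain text."""
    params = parse_qs(urlparse(url).query)
    account = params.get("account", [""])[0]
    # A parameterized query keeps dialed input out of the SQL text.
    row = conn.execute(
        "SELECT balance FROM accounts WHERE id = ?", (account,)
    ).fetchone()
    return "NOT_FOUND" if row is None else f"BALANCE={row[0]}"
```

Because the PBX only sees a URL and a short text reply, swapping the database behind the web page does not require any change on the PBX side.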

In prior work, a suggested model allowed accessing a database using spoken language dialogue interface. It is a sophisticated, specific, and hard-to-build model that uses drivers, managers, parsers, handlers, and other tools. Our approach is general, simple to use, and easy to build.

5.1.1.3.2 Limitations

Our proposed architecture includes a web server in the middle between the PBX and the RDBMS server. The more servers that are included in the solution, the more vulnerable the system is to faults and problems. In addition, the system performance is negatively affected by the web server in the middle, which can be also seen as a single point of failure.

5.1.1.3.3 Challenges

Some RDBMS packages run on Windows while others run on Linux. This difference in platform is an obstacle because the result is a heterogeneous system. The architecture includes a Linux-based PBX, a Windows-based Web Server, and the RDBMS Server, which might be Linux- or Windows-based.

5.1.1.3.4 Perspectives

Eliminating the server-in-the-middle might enhance the whole architecture. Tests should be made to check if the performance will be better or worse when the web server is eliminated. Although the existence of the web server is providing load balancing, it might be the cause of some latency because each request to obtain data is starting at the PBX, passing by the web server, and ending at the RDBMS server. This is happening in both directions. Tests need to be conducted to see if it is better to get rid of the web server and if there is an easy way to do so.

5.1.1.4 File Servers & Office Tools

5.1.1.4.1 Contribution

We could provide an integrated solution, through which users can use their phones to navigate their hard disks, find a document, open it over the phone, fill in some blank fields, and eventually send it as an attachment. The names of the folders and files are read to the caller by a Text-to-Speech engine. Our contribution is in providing the code that enables users to use their phones just the same way typical users interact with visual solutions, like computers or smart phones. Such a contribution is of utmost importance to disabled or visually impaired people.

5.1.1.4.2 Limitations

The solution presented in this thesis focuses on i) navigating a Windows-formatted hard disk and ii) dealing with MS WinWord & Excel documents only. This is because those two products can be controlled using VBA code. The presented solution works only for MS Office applications.

5.1.1.4.3 Challenges

Usually, users recognize the file by the icon that appears next to its name. They can visually differentiate between a Word document and an Excel workbook. This is not the case with our audio proposal. We could solve this issue by changing the voice; for example, a male voice will read the name of an Excel workbook, while a female voice will read the name of a WinWord document. This was a temporary solution, since two voices cannot be used to differentiate between, for example, four different types of files.

5.1.1.4.4 Perspectives

Navigating the hard disk and dealing with documents is advantageous. Nonetheless, empowering users to have full control over their computers through the phone would help three categories of people: 1) visually impaired, 2) blind, and 3) disabled users. This includes the ability to perform Google search, printing, backup, restart, and many other functions over the phone.

5.1.2 VoIP & Security Services

5.1.2.1 Encryption Services

5.1.2.1.1 Contribution

We are not aware of any prior work that addresses encrypting VoIP communication based on substitution. Our algorithm converts the voice of the first party on the phone to a text file. Then, each byte within the file is replaced by a unique number using a specific dictionary. Afterwards, the file is transmitted to the second party on the phone, where each number is replaced by its original value using the same dictionary. When all original bytes are restored, the voice can be played. That way, the encryption can be reversed only by those who have the dictionary because no patterns can be detected. Our contribution is a complete architecture with the required code to perform dictionary generation, encryption, file transmission, and decryption.
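The three operations named above (dictionary generation, encryption, and decryption) can be sketched in Python as follows. This is a minimal illustration, assuming the dictionary is a one-to-one mapping from each of the 256 byte values to a distinct 5-digit number; the actual dictionary format, number range, and optimizations used in the thesis are not reproduced here, and the function names are hypothetical.

```python
import random


def generate_dictionary():
    """Map every byte value 0..255 to a distinct randomly drawn number."""
    rng = random.SystemRandom()  # OS-provided randomness
    codes = rng.sample(range(10_000, 100_000), 256)
    return dict(enumerate(codes))


def encrypt(data, dictionary):
    """Replace each byte by its number from the dictionary."""
    return [dictionary[byte] for byte in data]


def decrypt(codes, dictionary):
    """Replace each number by the original byte using the same dictionary."""
    reverse = {code: byte for byte, code in dictionary.items()}
    return bytes(reverse[code] for code in codes)
```

Both parties must hold the same dictionary before the call starts. Each lookup is a constant-time operation, so the running time grows linearly with the size of the recorded audio.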

Many researchers used the same methodology to encrypt text files only. Our approach is concerned with encrypting voice media, and not only text files. In prior work, the runtime was not a big issue because the problem being solved was not about real-time communication. In VoIP, there is an inevitable need for real-time response. What we have achieved is encryption and decryption using a dictionary in less than one second, which is an acceptable latency in VoIP.

5.1.2.1.2 Limitations

There are three obvious limitations that adversely affect our solution.

The first limitation is the need for a manual process to exchange the dictionary. Such a process cannot be done over the internet because it jeopardizes the security of the whole solution.

The second problem is that we could not achieve full duplex communication. Our solution could only provide half-duplex phone calls, which means that one party will be talking while the other will be listening and vice versa.

The third problem is that the dictionary needs to be protected from theft. If the dictionary is unveiled or lost, the security of the whole process is jeopardized.

5.1.2.1.3 Challenges

Our biggest challenge was the runtime. When we tried the first version of our algorithm, the encryption process took around five minutes. The same period of time was needed for the decryption process. We kept optimizing the algorithm and the way dictionaries are stored until we succeeded in making both encryption and decryption run in less than one second.

5.1.2.1.4 Perspectives

We believe that our solution is still in need of two major changes. First, the half-duplex phone call should be upgraded to full duplex because the users will be annoyed by the push-to-talk way of communication. Second, our solution focuses only on the privacy issue, while any encryption technique should provide a solution for data integrity, authentication, and other encryption-related problems.

5.1.2.2 Using RBE to Detect Suspicious Calls

5.1.2.2.1 Contribution

Controlling a PBX can happen in one of two ways: either upon-request or on-the-fly. In the first, a tool analyzes the records in the CDR file and outputs a report about what happened, by which time it is too late to take the proper action. Our contribution lies in presenting the second method so that whenever a suspicious phone call is detected, an immediate action is taken on the fly. The philosophy of our proposal is to enable users to enforce a policy that will control the behavior of the PBX. A complete implementation is provided with all necessary code and architectural details.

Field researchers have worked on similar integration, but we believe that our architecture is better because it makes use of a Web Service that can be called either synchronously or asynchronously based on the performance results. Our web service can be replaced by any other web service allowing the usage of other RBE packages. We also believe load is better balanced in our architecture.
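The on-the-fly approach can be illustrated with a small Python sketch in which every call record is checked against a list of rules before a decision is returned. The two rules, the field names, and the "hangup" / "allow" actions are hypothetical examples; the sketch does not use the InRule engine or its API, and it leaves out the web service that sits between the PBX and the rule engine.

```python
from dataclasses import dataclass


@dataclass
class Call:
    caller: str
    destination: str
    hour: int          # hour of day, 0..23
    duration_s: int    # duration so far, in seconds


# Each rule is a (description, predicate) pair. Both rules are illustrative only.
RULES = [
    ("international call outside office hours",
     lambda c: c.destination.startswith("00") and not 8 <= c.hour < 18),
    ("call longer than two hours",
     lambda c: c.duration_s > 2 * 3600),
]


def evaluate(call, rules=RULES):
    """Return the descriptions of all rules that the call violates."""
    return [name for name, predicate in rules if predicate(call)]


def on_call_event(call):
    """Decide immediately what the PBX should do with this call."""
    violations = evaluate(call)
    return ("hangup", violations) if violations else ("allow", [])
```

Since every rule is evaluated for every call, the cost per call grows with both the number of rules and their complexity.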

5.1.2.2.2 Limitations

Our model is limited to using InRule, which is a very expensive RBE. It is not a general model that can be used with any other RBE package.

Another limitation is that the Web service needs to be installed on the same server where the RBE is installed. This is not optimal in terms of performance.

The last limitation is that the architecture requires experts in several technologies (VoIP, Web, Database, and RBE) to maintain the servers.

5.1.2.2.3 Challenges

The whole process is slower than usual because the RBE will be consulted for each phone call. The runtime is extremely affected by the complexity of the rules. Although simple rules result in better performance, they still pose another challenge when there are many of them.

Another challenge is the heterogeneous nature of the architecture. The CDR is a text file stored on the hard disk of a Linux server, while everything else runs under Windows.

5.1.2.2.4 Perspectives

Plenty of rules have been developed to detect suspicious phone calls. Some of them are really complicated and will bring the CPU to its knees. A simulation needs to be conducted to determine the threshold after which the architecture stops being efficient. This simulation should include tens of concurrent users with a dozen complicated rules. The goal of this simulation is to identify which rules need to be eliminated because they are CPU killers.

5.1.2.3 Other Security Services

5.1.2.3.1 Contribution

Three additional ideas have been introduced under the title of security to show that the framework of the integration can be complete and that it can include all services available in a typical LAN. These ideas are Intrusion Prevention, Directory Services, and Monitoring Services. Our contribution is a complete implementation of the solution.

The Intrusion Prevention scenario presented in this thesis can be compared to the network appliance control provided as a part of the CINEMA architecture that was previously introduced. Our architecture focuses on achieving a higher level of security by isolating highly sensitive networks. The physical disconnection is controlled by hardware, not just a software package with possible backdoors. The wire will be physically unplugged to ensure no intruder is capable of penetrating.

The second scenario integrates the VoIP solution with MS Active Directory so that network administrators can control their networks over the phone, after being authenticated.

The third scenario is all about instructing the VoIP solution to monitor the network and report any misuse or violation.

A very long list of queries that return information about the Windows server is presented. Using any of these queries, the PBX solution will be able to know all details about the server, including the temperature of the CPU. The necessary code to execute any of these queries, check the result, and report back to the LAN administrator is presented in this thesis.

5.1.2.3.2 Limitations

The isolation scenario is very specific and is convenient only to those who can sacrifice availability to security. The main concept is to minimize the time the network is connected to the internet. This isolation is accepted in very rare cases, where internet is needed for a short period of time.

The other two scenarios, Monitoring and Directory Services, are also specific to Windows platform. The queries that have been listed run for Windows servers only and do not work for other operating systems.

5.1.2.3.3 Challenges

Our presented scenarios allow LAN administrators to do very critical tasks over the phone. Our biggest challenge is to secure the phone call and authenticate the caller. The caller ID is one of the most important features that helped us achieve a better level of security.

5.1.2.3.4 Perspectives

We need to enforce a one-time password policy as an additional line of defense. Even when the Caller ID is spoofed, the one-time password policy will keep the system sufficiently protected.
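A one-time password policy of this kind can be sketched as follows. The class and method names are hypothetical, the code is delivered to the administrator by some separate channel that is not modeled here, and the sketch is not part of the implementation presented in this thesis.

```python
import secrets


class OneTimePasswords:
    """Issue numeric codes that are accepted at most once."""

    def __init__(self):
        self._pending = {}

    def issue(self, admin_id, digits=6):
        """Create a fresh code for this administrator, replacing any old one."""
        code = "".join(secrets.choice("0123456789") for _ in range(digits))
        self._pending[admin_id] = code
        return code

    def verify(self, admin_id, dialed_code):
        """Check the dialed code; any attempt consumes the pending code."""
        expected = self._pending.pop(admin_id, None)
        return expected is not None and secrets.compare_digest(expected, dialed_code)
```

Because the pending code is removed on every attempt, a caller who spoofs the Caller ID and guesses wrongly cannot keep trying against the same code, and a code overheard during one call is useless for the next.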

5.1.3 VoIP & Web Services

5.1.3.1 Eye-like Algorithm

5.1.3.1.1 Contribution

Our proposal is to develop an algorithm that works just like the human eye. This algorithm will enable phone users to navigate the web using their phones as efficiently as with a visual browser. Programs that read the whole page to the caller are not efficient since the callers might keep waiting for a long time until the reading program reaches the part in which they are interested. In previous research work, annotations were suggested as a method of specifying important sections of pages. Any suggestion that requires introducing changes to every web page will not gain global acceptance.

Our proposal allows callers to visit and interact with the same web site surfed by typical internet users. Our contribution is a complete architecture with partial implementation. Several optimizations have been suggested to simplify the work of the proposed algorithm.

5.1.3.1.2 Limitations

There are two limitations to our proposal:

First, algorithms that analyze the source page of the web page are technology-dependent. For example, .NET controls have their own behavior and need to be handled accordingly. Other technologies such as PHP behave differently and thus require different algorithms.

Second, owners of web applications will need to use the special tool that analyzes the source pages and produces the XML files describing the structure of the web pages. Those XML files should be published so that they are accessible. An alternative solution might be to allow the voice provider to access the source pages and generate the XML files dynamically. This, however, would considerably increase the run time. Moreover, some web owners might decide not to publish the structure of their web pages.

5.1.3.1.3 Challenges

There are many challenges that need to be faced.

First, there is a crucial need to audibly identify hyperlinks. This might be done by changing the voice. For example, a male voice will read normal text while a female voice will read hyperlink text.

Second, the run time of the eye-like algorithm is a real challenge. When websites such as Google, Yahoo, Gmail, Hotmail, or CNN are heavily accessed by phone users, performance might be a major issue.

Third, all visual content will be ignored. This includes videos, images, and animation files.

The last and most challenging issue is the eye-like algorithm and the accompanying tool that produces the XML files which describe the structure of dynamic pages.

5.1.3.1.4 Perspectives

The eye-like algorithm needs to be developed and tested before it can be used globally. There is also a need to develop the tool that analyzes the source files of web pages and generates XML files that describe the structure of the pages. Those XML files will give the eye-like algorithm important hints that will enhance its efficiency.
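To make the idea of such a tool more concrete, the following Python sketch extracts headings and hyperlinks from an HTML source and emits a small XML description. It is only an illustration: the element names `page`, `section`, and `link`, the function `describe_page`, and the decision to keep only headings and links are assumptions, not the format used by the proposed tool.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

HEADINGS = ("h1", "h2", "h3")


class _StructureCollector(HTMLParser):
    """Collects headings and hyperlinks and skips purely visual content."""

    def __init__(self):
        super().__init__()
        self.root = ET.Element("page")
        self._current = None  # XML element that currently receives text

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self._current = ET.SubElement(self.root, "section", level=tag[1])
        elif tag == "a":
            href = dict(attrs).get("href") or ""
            self._current = ET.SubElement(self.root, "link", href=href)

    def handle_endtag(self, tag):
        if tag in HEADINGS or tag == "a":
            self._current = None

    def handle_data(self, data):
        if self._current is not None and data.strip():
            self._current.text = (self._current.text or "") + data.strip()


def describe_page(html: str) -> str:
    """Return an XML string that lists the headings and links of a page."""
    collector = _StructureCollector()
    collector.feed(html)
    collector.close()
    return ET.tostring(collector.root, encoding="unicode")
```

A description of this kind could let a voice front end announce the sections of a page first and read a section only when the caller selects it, instead of reading the page from top to bottom.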

5.1.3.2 Academic Audio Access “A3”

5.1.3.2.1 Contribution

Discussion forums and blogs are nowadays considered good surrogates for classroom interaction. This motivated us to work on “A3”, which makes use of VoIP technology in the education field. We suggested using a VoIP solution to provide the students with an audio forum. This solution was used in several schools to achieve collaboration. Our proposal was proven to be efficient and simple to use.

In previous research, a similar solution was proposed for e-Learning. Although that proposal has some similarities to ours, there are some differences. First, it focuses on higher education, while our focus is on elementary schools. Second, it tries to provide an efficient solution for visually impaired students, while our proposal focuses on young students who do not yet have sufficient computer skills.

Our contribution provided the concept of “A3”, the architecture, the complete configuration, and the required library of code.

5.1.3.2.2 Limitations

A traditional forum / blog will allow the students to post all kinds of questions / answers, which are of two types: i) questions / answers that do not require visual presentation and ii) questions / answers that require visual presentation. A question concerning science, history, geography, grammar, dictation, or vocabulary can usually be handled over the phone. A question regarding a mathematical problem is an example of the second type. The decision has been made not to include subject matters that rely extensively on visual presentation. An alternative for academic staff would be to schedule private sessions in order to answer some questions that need illustration and cannot be answered over the phone.

5.1.3.2.3 Challenges

Our biggest challenge was to convince the stakeholders to use “A3”. Giving bonus points has proven to be one of the most efficient strategies for motivating students. Another efficient strategy is to tell the students that school tests will include some of the posted questions.

5.1.3.2.4 Perspectives

The following improvements are scheduled in the near future:

The teacher will be able to provide answers through a special web application. He / she will listen to the question and input a text. Using a text-to-speech engine, “A3” will read this text to the student.

“A3” would be linked to any typical academic forum already in use. Our goal is to achieve full compatibility between the audio and the visual forums, so that a user can post an audio question and receive a written answer and vice versa. This way, users would benefit from advantages of audio and visual forums.

5.2 Findings and Recommendations

5.2.1 Introduction

The integration techniques, models, and implementations presented in this thesis are an introduction to further work in the field of VoIP integration. Integrating an IP PBX with other technologies and solutions still needs enhancement. VoiceXML, SALT, VoicePHP, and other integration solutions are one step towards achieving integration. Another step is needed if we want VoIP developers to use a high-level development language. VoiceXML and the like help developers focus on what features are required, rather than how they should be developed. Although these technologies have reduced the amount of code that voice application programmers need to write, some obstacles remain. Dealing with Active Directory, Office Communication Server, Exchange Server, file servers, rule-based engines, Word documents, or Excel sheets still requires considerable effort and is not easily achieved using the current integration solutions. We suggest using the techniques and the programs proposed in this thesis to create a very simple Domain Specific Language (DSL). Voice application programmers will use this language to develop their applications. They will not need to understand anything about Exchange Server, email retrieval, or the authentication process. Instead, they will have a small set of commands that are specific to Exchange Server and that give them full access to the emails. The code presented in this thesis can serve as a base for such a high-level language.

5.2.2 Domain Specific Language

5.2.2.1 What is DSL?

A DSL is a specification language dedicated to a particular problem domain. In our case, the DSL would be dedicated to the integration of any IP PBX with a collection of servers. Creating a DSL along with the software that supports it allows a particular type of solution to be expressed more clearly than other languages would allow. Unlike general-purpose languages, DSLs are developed to solve specific problems in a particular domain. DSLs are languages with specific goals in both design and implementation (DSL, 2010).

5.2.2.2 Advantages and Disadvantages

Generally speaking, DSLs have many advantages, some of which are presented in the following list:

• DSLs allow solutions to be expressed in the idiom of the problem domain. That is why field experts can understand and develop DSL programs.
• Generated code is always self-documenting.
• DSLs enhance:
o Code quality
o Code portability
o Code reusability
o Solution reliability
o Developers’ productivity
o Solution maintainability
• Sentences written with safe constructs of the language are also considered safe.

Although DSLs are in general considered useful tools, they suffer from the following drawbacks:

• Learning a new language is costly.
• The applicability of a DSL is limited to its domain.
• There is a considerable cost for designing, implementing, and maintaining a DSL, in addition to developing the Integrated Development Environment (IDE).
• There will be a need to overcome:
o the potential loss of some efficiency.
o the difficulty of balancing the domain-specificity.
o the difficulty of finding, setting, maintaining, and standardizing the proper scope.

In brief, although DSLs enhance quality, reliability, portability, productivity, maintainability, and reusability, this comes at a cost in terms of designing, implementing, maintaining, and standardizing the language.

5.2.2.3 DSL Life Cycle

The life cycle of a Domain-Specific Language consists of the following development phases (Mernik et al., 2005):

• Decision: A DSL starts with an existing code base. Then, this base will be used to start implementing a DSL and its associated execution engine. All the code presented in this thesis can be used as a code base for the proposed DSL.
• Analysis: The domain of the problem will be identified and the domain knowledge will be gathered in this phase. We believe that the analysis of the suggested DSL is almost done due to the many scenarios and implementations presented in this thesis.
• Design: A DSL can be designed from scratch. Under certain circumstances, it can be easier to start from an existing language. In our case, the DSL should be built from scratch.
• Implementation: There are many approaches with which a DSL can be implemented. Implementation includes creating the engine that will handle the programs developed using the DSL syntax.
• Deployment: In this phase, the language and the programs constructed with the DSL are used. Implementations will result in having a working software package which will be used by end-users. We believe that compiling all the programs presented in this thesis in a single library will provide the voice applications developers with the required features. This compiled library will be the engine that “understands” the new syntax and provides the needed functionalities.

Eelco Visser adds maintenance as a sixth phase (Johan den, 2009). Over time, the software changes, which eventually requires altering the DSL implementation; in other words, a DSL evolves over time. Thus, it is crucial to have a DSL migration strategy.

5.2.3 Proposed DSL Specifications

The main goal of the proposed DSL is to seamlessly implement an otherwise sophisticated integration. This language will hide the complexity of integrating the PBX with a wide range of Windows and Linux servers and enable developers to write dialplans in a normal structured language.

We propose using a DSL to simplify the process of integrating Asterisk with a wide range of servers. We also propose a set of commands that make it easier to contact MS Exchange Server, for example, and to retrieve information from it. Instead of telling voice applications developers how to contact the Active Directory, we will provide them with a language through which they get what they want without needing to know how it is done. That way, we create a new layer that hides the complexity of dealing with different architectures, technologies, and software packages.

We do not consider the proposed syntax a totally mature work. We just propose a methodology to be used by voice applications developers in order to achieve smooth integration and simple-to-use implementation.

Sample syntax is presented in Appendix E. Two files will be used. The first file is used only for setup. It will include the IP addresses of the servers that will be used, along with other required information. The second file will be used for the dialplan. Both files are heavily commented so that every line of code is explained. The two files are presented in Figure 6.3 and Figure 6.4 respectively in Appendix E.

The proposed architecture includes an integration server. This server hosts the code required to do the integration. It will include a web page for every provided service. Some pages will check mail, return messages, or send replies; other pages will explore a shared folder. VoIP engineers will be able to add new features. That way, the functionality is extensible, since users can add their own integration. A new functionality can be added by simply developing the corresponding web page. The PBX will call this page from within the dialplan to send input and get output.
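The one-page-per-service idea can be sketched as a small dispatcher. The Python example below models the integration server as a WSGI application; the page paths `/checkmail` and `/listfolder`, the placeholder replies, and the helper `call_page` are hypothetical and only stand in for the real pages, which would contact the mail server or the file server.

```python
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults


def check_mail(params):
    # Placeholder: a real page would query the mail server here.
    user = params.get("user", ["unknown"])[0]
    return f"0 new messages for {user}"


def list_folder(params):
    # Placeholder: a real page would list a shared folder here.
    return "no files"


# One entry per offered service; adding a feature means adding a page.
PAGES = {"/checkmail": check_mail, "/listfolder": list_folder}


def integration_server(environ, start_response):
    """WSGI application that dispatches each request to its service page."""
    page = PAGES.get(environ.get("PATH_INFO", ""))
    if page is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"unknown service"]
    body = page(parse_qs(environ.get("QUERY_STRING", "")))
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body.encode("utf-8")]


def call_page(path, query=""):
    """Simulate the request a PBX would send, without opening a socket."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"], environ["QUERY_STRING"] = path, query
    status = []
    chunks = integration_server(environ, lambda s, h: status.append(s))
    return status[0], b"".join(chunks).decode("utf-8")
```

For example, `call_page("/checkmail", "user=alice")` returns the status line and the plain-text reply that the dialplan would read back to the caller. The same application could be served over HTTP with `wsgiref.simple_server.make_server`; plain text is used for the reply here because it is easy to pass to a TTS engine.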

Having two separate files is a significant advantage. The setup file will include many technical details. Senior engineers are expected to prepare this file, while developers will work on the other one. This will enhance security and enable developers to focus on their work rather than getting stuck with setups and configurations.

5.2.3.1 Setup File Specifications

In the following, we suppose that the setup file will be prepared by VoIP engineers, whereas VoIP developers will be responsible for developing the dialplan:

• The setup file should include the IP address of the integration server.
• The integration server is a web server that includes a web page for each offered functionality.
• In case developers are expected to connect to a database, the connection string should be specified in the setup file.
• VoIP developers will deal with Exchange Server, Active Directory, or any other service just the same way they deal with a database. There will be several tables, each of which has several columns and rows.
• VoIP engineers will have the option to create their own functionalities.
• Each user-defined functionality is assigned to a web page developed by VoIP engineers. The URL of this page is specified in the setup file. This flexibility allows VoIP engineers to include any functionality, even those available over the internet.
• The setup file may also include a connection to a shared folder, so that VoIP developers will be able to reach a document that is stored in the folder.
• The setup file may also include a connection to a Microsoft Office Communication Server.
• The setup file may also include a connection to a Windows server. VoIP developers will be able to execute a Windows query and check the retrieved results.
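The role of the setup file can be illustrated with a short Python sketch. The INI-style layout, the section and key names, the IP addresses, and the functions `load_setup` and `page_url` below are assumptions chosen for illustration; they are not the syntax shown in Appendix E.

```python
import configparser

# Hypothetical setup file prepared by a VoIP engineer
SETUP_TEXT = """
[integration]
server = 192.168.10.20

[functions]
checkmail = /exchange/checkmail.aspx
weather = http://example.com/weather

[database]
connection = Driver={SQL Server};Server=192.168.10.30;Database=hr
"""


def load_setup(text):
    """Parse the setup text; interpolation is disabled to keep values verbatim."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


def page_url(setup, name):
    """Resolve a functionality name to the URL the dialplan should request."""
    target = setup["functions"][name]
    if target.startswith(("http://", "https://")):
        return target  # user-defined page hosted outside the integration server
    return "http://" + setup["integration"]["server"] + target
```

With such a file, a dialplan developer would refer only to names such as `checkmail`, while addresses and connection strings stay in the hands of the engineers who maintain the setup file.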

5.2.3.2 Dialplan File Specifications

The proposed DSL is a structured programming language. It includes procedures and functions that return values. It also includes control-flow statements such as if, else, while, for, switch, and break.

The DSL should include a function called 'input' that asks for the user's input and returns it to the dialplan. 'input' will take several parameters. The first parameter is the file to be played. If the file is omitted, the text of the second parameter will be sent to Festival (the TTS engine), which will read the text to the caller. The last parameter is the number of digits to be input. If the second and the third parameters are both empty, Festival will automatically play the following message to the caller: “Please enter a 6 digit number”.
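The parameter rules of 'input' can be summarized in a few lines of Python. This is one possible reading of the description, not the engine's actual code: the function name `resolve_prompt`, the returned tuple, and the default of six digits (inferred from the quoted message) are assumptions.

```python
DEFAULT_DIGITS = 6  # assumed default, inferred from the quoted message


def resolve_prompt(sound_file=None, text=None, digits=None):
    """Decide what the caller hears and how many digits are collected.

    Returns (action, payload, digits), where action is "play" for a
    recorded file or "tts" for text sent to the TTS engine.
    """
    count = digits if digits else DEFAULT_DIGITS
    if sound_file:
        return ("play", sound_file, count)
    if text:
        return ("tts", text, count)
    # Neither a file nor a text was supplied: fall back to the default message
    return ("tts", f"Please enter a {count} digit number", count)
```

Under this reading, a recorded file takes precedence over the text, and the default message is used only when neither is supplied.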

VoIP developers can retrieve data from tables defined in the setup file. For example, VoIP engineers might define a table called emails. The setup file will include enough information about the connectivity to the Exchange server. The integration server will also include a web page for each functionality.

VoIP developers will then be able to retrieve data from those tables the same way a SELECT statement is used. The VoIP developers will query a table to get information. The data source might be:

o MS SQL Server DB
o Oracle DB
o MySQL DB
o MS Access DB
o XML file
o Text file
o MS Excel sheet
o MS Word document
o MS Exchange Server
o Active Directory
o Logging entries of the Windows Event Viewer
o A folder on a Windows server
o A folder on a Linux machine
o A web service
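The idea of querying heterogeneous sources as if they were tables can be sketched as follows. The Python example uses two adapters, one for a delimited text file and one for an XML file, that both produce rows as dictionaries, and a `select` function that plays the role of the SELECT-like command. All function names and the sample data are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET


def rows_from_csv(text):
    """Adapter for a delimited text file: one dict per line."""
    return list(csv.DictReader(io.StringIO(text)))


def rows_from_xml(text):
    """Adapter for an XML file: one dict per child element of the root."""
    return [{field.tag: field.text for field in record}
            for record in ET.fromstring(text)]


def select(rows, columns, where=None):
    """Return the requested columns of every row accepted by `where`."""
    return [{c: row[c] for c in columns}
            for row in rows if where is None or where(row)]


# Usage: the same call works regardless of where the rows came from
extensions = rows_from_csv("ext,name\n101,alice\n102,bob\n")
emails = rows_from_xml(
    "<emails><email><sender>alice</sender><subject>Hi</subject></email></emails>")
```

Here `select(extensions, ["name"], lambda r: r["ext"] == "102")` and `select(emails, ["subject"])` use the same interface, which is the property the DSL aims to give dialplan developers: the data source changes, the query form does not.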

Appendix E presents a sample dialplan that includes different scenarios with significant details and descriptive comments.

5.2.4 Conclusion

Although technologies such as VXML have reduced the amount of code programmers need to write to develop voice applications, some obstacles remain. Dealing with MS Active Directory, MS OCS, MS Exchange Server, and file servers is not easily achieved using the current voice development languages. We suggest using the programs proposed in this thesis to create a very simple Domain Specific Language (DSL) that will simplify the targeted integration. When this DSL is used, VoIP developers will not need to understand anything about Exchange Server, email retrieval, or the authentication process. They will have a small set of commands, specific to Exchange Server, that gives them full access to emails. This applies to all types of services discussed in this thesis.

The language presented in Figures 6.3 and 6.4 requires an engine that “understands” the syntax used and provides the required functionalities. This engine may be based on the code presented in this thesis, VoiceXML, and/or VoicePHP. Our proposed solution is expected to be built on top of other technologies. That way, we benefit from existing solutions that will help us come up with new ones. The engine will be small in size. It takes as input the two files presented in Figures 6.3 and 6.4 and outputs the dialplan. It makes use of several web applications, each of which provides connectivity to a particular type of server. For example, there will be a set of web pages that provide the functionality required to communicate with MS Exchange Server. Another set of web pages will provide proper communication with MS OCS, and so on. To allow Asterisk, or any Asterisk-based IP PBX, to deal with any software package, the software provider should provide a list of web pages through which voice applications developers can communicate with the software and benefit from its different features. That way, any IP PBX that can make an HTTP request and receive the returned HTML can be fully integrated with this software.

In this thesis, we presented three different types of scenarios:

• Type A – Allowing an IP PBX in software to work with:
o Different DBMS packages
o MS Exchange server
o MS OCS
o Office applications
o File servers
o Windows queries
• Type B – Using the PBX to enhance information security, such as:
o Integrating the PBX with MS Active Directory
o Allowing the PBX to isolate highly secured LANs
o Using dictionary-based encryption to provide interception-proof VoIP
o Using a Rule-based Engine to detect suspicious calls
• Type C – Using the PBX to provide special services, such as:
o Eye-like algorithm to produce voice web pages
o Academic VoIP blog for elementary schools

Figure 5.1 shows the complete set of proposed solutions with a wide range of integrated services. All of those scenarios have been successfully implemented and tested. The integration presented in Figure 5.1 would be a complicated task if VoIP developers had to use a low-level language such as VXML or SALT. We want VoIP developers to use a high-level development language that helps them focus on what features are required, rather than how to implement these features.

Figure 5.1 - Broad Range of Integrated Services

In the above scenarios, we faced the need to integrate an open-source software PBX with many other software packages. We overcame this challenge by using a set of web pages that receive a request and return the required piece of information. This approach works because Asterisk is capable of making HTTP requests. Allowing the IP PBX to deal with different solutions through HTTP requests also enables security engineers to achieve a better level of security, which is related to the architecture of the solution. If the PBX can rely on HTTP requests to contact other solutions, it can belong to a separate DMZ and communicate with servers belonging to other DMZs. Firewalls between the different DMZs will be configured to allow only HTTP requests and replies between the PBX server and all other servers.

PART 6 – APPENDICES & REFERENCES

Appendices

Appendix A - Allow Callers to Input Letters

Sometimes, Asterisk developers might want to give the caller the opportunity to enter letters instead of numbers. Callers should be able to do so by using the number keys. Table 6.1 shows a list of characters along with the corresponding dialed digits. This table tells dialplan developers that a caller entering text with the digit keys will, for example, dial 222 to enter the letter “C”.

Table 6.1 - Digits and the Corresponding Letters

When the user dials    It means he/she wants to enter the letter
2                      A
22                     B
222                    C
3                      D
33                     E
333                    F
…                      …

Figure 6.1 shows a macro that will ask the caller to enter a letter and read it back to the caller.

; Play the file "enterletter", read up to 4 digits, and wait for 2 seconds
exten=s,1(again),Read(NUM,${media}enterletter,4,,,2)
exten=s,n,GotoIf($["${NUM}" == "2"]?:TryB)
exten=s,n,SayAlpha(A)
; Append the letter to Phrase (variable substitution concatenates strings)
exten=s,n,Set(Phrase=${Phrase}A)
exten=s,n,Goto(prompt)
exten=s,n(TryB),GotoIf($["${NUM}" == "22"]?:TryC)
exten=s,n,SayAlpha(B)
exten=s,n,Set(Phrase=${Phrase}B)
exten=s,n,Goto(prompt)
exten=s,n(TryC),GotoIf($["${NUM}" == "222"]?:TryD)
exten=s,n,SayAlpha(C)
exten=s,n,Set(Phrase=${Phrase}C)
exten=s,n,Goto(prompt)
exten=s,n(TryD),GotoIf($["${NUM}" == "3"]?:TryE)
exten=s,n,SayAlpha(D)
exten=s,n,Set(Phrase=${Phrase}D)
exten=s,n,Goto(prompt)
; …
exten=s,n(TryZ),GotoIf($["${NUM}" == "9999"]?:prompt)
exten=s,n,SayAlpha(Z)
exten=s,n,Set(Phrase=${Phrase}Z)
; Ask whether the caller wants to stop (0) or enter another letter
exten=s,n(prompt),Read(ZERO,${media}zero_to_stop,1,,,2)
exten=s,n,GotoIf($["${ZERO}" == "0"]?:again)
; The caller decided to stop. Do something with Phrase and exit.

Figure 6.1 - Macro that Asks Callers to Enter Letters by Dialing Numbers
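The digit-to-letter mapping of Table 6.1 can also be expressed as a lookup, which avoids one branch per letter. The Python sketch below assumes the usual telephone keypad layout (2 = ABC up to 9 = WXYZ, with four letters on keys 7 and 9); the function name `letter_for` is hypothetical, and the code is an illustration rather than a replacement for the dialplan macro.

```python
# Letters printed on each key of a standard telephone keypad
KEYPAD = {"2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
          "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ"}


def letter_for(dialed: str):
    """Map a multi-tap sequence such as "222" to its letter, or None."""
    if not dialed or len(set(dialed)) != 1:
        return None  # empty input or mixed digits
    letters = KEYPAD.get(dialed[0])
    if letters is None or len(dialed) > len(letters):
        return None  # key carries no letters, or too many presses
    return letters[len(dialed) - 1]
```

The number of presses selects the position of the letter on the key, so "222" yields the third letter of key 2. Invalid input returns None, which corresponds to the macro prompting the caller again.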

Appendix B - Exchange Message Fields

Each email message has many fields. For a complete list of these fields, developers can use the VBA code presented in Figure 6.2.

<%
' Connection string that points to the message to inspect
cs = "data source=http://192.168.10.10/exchange/a.hammoud/Inbox/testing.EML; user [email protected]; password=p@ssw0rd"

Set Conn = CreateObject("ADODB.Connection")
Conn.Provider = "ExOLEDB.DataSource"
Conn.Open cs

' Open the message as an ADO record
Set Rec = CreateObject("ADODB.Record")
Rec.Open "http://192.168.10.10/exchange/a.hammoud/Inbox/testing.EML", Conn, adModeReadWrite

' Write the name of every field of the message, one per line
For i = 0 To Rec.Fields.Count - 1
    Response.Write Rec(i).Name & "<br>"
Next
%>

Figure 6.2 - VBA Code to Get All Fields of Email Messages

Although one can obtain a comprehensive list of fields this way, we believe that the list presented in Table 6.2 is sufficient.

Table 6.2 - Some Fields of Email Messages

Field Index    Field Meaning    Field Name
1              Content          rec("urn:schemas:httpmail:textdescription")
3              CC               rec("urn:schemas:httpmail:cc")
5              To               rec("urn:schemas:httpmail:to")
6              Date             rec("urn:schemas:httpmail:datereceived")
7              From             rec("urn:schemas:mailheader:from")
12             Subject          rec("urn:schemas:mailheader:subject")
22             Importance       rec("urn:schemas:mailheader:importance")
25             Priority         rec("urn:schemas:mailheader:priority")
26             HasAttach        rec("urn:schemas:httpmail:hasattachment")
29             IsRead           rec("urn:schemas:httpmail:read")

Appendix C - Queries to Get Windows Information

The WMI queries listed in Table 6.3 can be used to get Windows information. These queries can be used within an ASP page to retrieve critical information about Windows servers.

Table 6.3 - Queries to Get Windows Information

1. select * from Win32_1394Controller

2. select * from Win32_1394ControllerDevice

3. select * from Win32_Account

4. select * from Win32_AccountSID

5. select * from Win32_ACE

6. select * from Win32_ActionCheck

7. select * from Win32_AllocatedResource

8. select * from Win32_ApplicationCommandLine

9. select * from Win32_ApplicationService

10. select * from Win32_AssociatedBattery

11. select * from Win32_AssociatedProcessorMemory

12. select * from Win32_BaseBoard

13. select * from Win32_BaseService

14. select * from Win32_Battery

15. select * from Win32_Binary

16. select * from Win32_BindImageAction

17. select * from Win32_BIOS

18. select * from Win32_BootConfiguration

19. select * from Win32_Bus

20. select * from Win32_CacheMemory

21. select * from Win32_CDROMDrive

22. select * from Win32_CheckCheck

23. select * from Win32_CIMLogicalDeviceCIMDataFile

24. select * from Win32_ClassicCOMApplicationClasses

25. select * from Win32_ClassicCOMClass

26. select * from Win32_ClassicCOMClassSetting

27. select * from Win32_ClassicCOMClassSettings

28. select * from Win32_ClassInfoAction

29. select * from Win32_ClientApplicationSetting

30. select * from Win32_CodecFile

31. select * from Win32_COMApplication

32. select * from Win32_COMApplicationClasses

33. select * from Win32_COMApplicationSettings

34. select * from Win32_COMClass

35. select * from Win32_ComClassAutoEmulator

36. select * from Win32_ComClassEmulator

37. select * from Win32_CommandLineAccess

38. select * from Win32_ComponentCategory

39. select * from Win32_ComputerSystem

40. select * from Win32_ComputerSystemProcessor

41. select * from Win32_ComputerSystemProduct

42. select * from Win32_COMSetting

43. select * from Win32_Condition

44. select * from Win32_CreateFolderAction

45. select * from Win32_CurrentProbe

46. select * from Win32_DCOMApplication

47. select * from Win32_DCOMApplicationAccessAllowedSetting

48. select * from Win32_DCOMApplicationLaunchAllowedSetting

49. select * from Win32_DCOMApplicationSetting

50. select * from Win32_DependentService

51. select * from Win32_Desktop

52. select * from Win32_DesktopMonitor

53. select * from Win32_DeviceBus

54. select * from Win32_DeviceMemoryAddress

55. select * from Win32_DeviceSettings

56. select * from Win32_Directory

57. select * from Win32_DirectorySpecification

58. select * from Win32_DiskDrive

59. select * from Win32_DiskDriveToDiskPartition

60. select * from Win32_DiskPartition

61. select * from Win32_DisplayConfiguration

62. select * from Win32_DisplayControllerConfiguration

63. select * from Win32_DMAChannel

64. select * from Win32_DriverVXD

65. select * from Win32_DuplicateFileAction

66. select * from Win32_Environment

67. select * from Win32_EnvironmentSpecification

68. select * from Win32_ExtensionInfoAction

69. select * from Win32_Fan

70. select * from Win32_FileSpecification

71. select * from Win32_FloppyController

72. select * from Win32_FloppyDrive

73. select * from Win32_FontInfoAction

74. select * from Win32_Group

75. select * from Win32_GroupUser

76. select * from Win32_HeatPipe

77. select * from Win32_IDEController

78. select * from Win32_IDEControllerDevice

79. select * from Win32_ImplementedCategory

80. select * from Win32_InfraredDevice

81. select * from Win32_IniFileSpecification

82. select * from Win32_InstalledSoftwareElement

83. select * from Win32_IRQResource

84. select * from Win32_Keyboard

85. select * from Win32_LaunchCondition

86. select * from Win32_LoadOrderGroup

87. select * from Win32_LoadOrderGroupServiceDependencies

88. select * from Win32_LoadOrderGroupServiceMembers

89. select * from Win32_LogicalDisk

90. select * from Win32_LogicalDiskRootDirectory

91. select * from Win32_LogicalDiskToPartition

92. select * from Win32_LogicalFileAccess

93. select * from Win32_LogicalFileAuditing

94. select * from Win32_LogicalFileGroup

95. select * from Win32_LogicalFileOwner

96. select * from Win32_LogicalFileSecuritySetting

97. select * from Win32_LogicalMemoryConfiguration

98. select * from Win32_LogicalProgramGroup

99. select * from Win32_LogicalProgramGroupDirectory

100. select * from Win32_LogicalProgramGroupItem

101. select * from Win32_LogicalProgramGroupItemDataFile

102. select * from Win32_LogicalShareAccess

103. select * from Win32_LogicalShareAuditing

104. select * from Win32_LogicalShareSecuritySetting

105. select * from Win32_ManagedSystemElementResource

106. select * from Win32_MemoryArray

107. select * from Win32_MemoryArrayLocation

108. select * from Win32_MemoryDevice

109. select * from Win32_MemoryDeviceArray

110. select * from Win32_MemoryDeviceLocation

111. select * from Win32_MethodParameterClass

112. select * from Win32_MIMEInfoAction

113. select * from Win32_MotherboardDevice

114. select * from Win32_MoveFileAction

115. select * from Win32_MSIResource

116. select * from Win32_NetworkAdapter

117. select * from Win32_NetworkAdapterConfiguration

118. select * from Win32_NetworkAdapterSetting

119. select * from Win32_NetworkClient

120. select * from Win32_NetworkConnection

121. select * from Win32_NetworkLoginProfile

122. select * from Win32_NetworkProtocol

123. select * from Win32_NTEventlogFile

124. select * from Win32_NTLogEvent

125. select * from Win32_NTLogEventComputer

126. select * from Win32_NTLogEventLog

127. select * from Win32_NTLogEventUser

128. select * from Win32_ODBCAttribute

129. select * from Win32_ODBCDataSourceAttribute

130. select * from Win32_ODBCDataSourceSpecification

131. select * from Win32_ODBCDriverAttribute

132. select * from Win32_ODBCDriverSoftwareElement

133. select * from Win32_ODBCDriverSpecification

134. select * from Win32_ODBCSourceAttribute

135. select * from Win32_ODBCTranslatorSpecification

136. select * from Win32_OnBoardDevice

137. select * from Win32_OperatingSystem

138. select * from Win32_OperatingSystemQFE

139. select * from Win32_OSRecoveryConfiguration

140. select * from Win32_PageFile

141. select * from Win32_PageFileElementSetting

142. select * from Win32_PageFileSetting

143. select * from Win32_PageFileUsage

144. select * from Win32_ParallelPort

145. select * from Win32_Patch

146. select * from Win32_PatchFile

147. select * from Win32_PatchPackage

148. select * from Win32_PCMCIAController

149. select * from Win32_Perf

150. select * from Win32_PerfRawData

151. select * from Win32_PerfRawData_ASP_ActiveServerPages

152. select * from Win32_PerfRawData_ASPNET_114322_ASPNETAppsv114322

153. select * from Win32_PerfRawData_ASPNET_114322_ASPNETv114322

154. select * from Win32_PerfRawData_ASPNET_ASPNET

155. select * from Win32_PerfRawData_ASPNET_ASPNETApplications

156. select * from Win32_PerfRawData_IAS_IASAccountingClients

157. select * from Win32_PerfRawData_IAS_IASAccountingServer

158. select * from Win32_PerfRawData_IAS_IASAuthenticationClients

159. select * from Win32_PerfRawData_IAS_IASAuthenticationServer

160. select * from Win32_PerfRawData_InetInfo_InternetInformationServicesGlobal

161. select * from Win32_PerfRawData_MSDTC_DistributedTransactionCoordinator

162. select * from Win32_PerfRawData_MSFTPSVC_FTPService

163. select * from Win32_PerfRawData_MSSQLSERVER_SQLServerAccessMethods

164. select * from Win32_PerfRawData_MSSQLSERVER_SQLServerBackupDevice

165. select * from Win32_PerfRawData_MSSQLSERVER_SQLServerBufferManager

166. select * from Win32_PerfRawData_MSSQLSERVER_SQLServerBufferPartition

167. select * from Win32_PerfRawData_MSSQLSERVER_SQLServerCacheManager

168. select * from Win32_PerfRawData_MSSQLSERVER_SQLServerDatabases

169. select * from Win32_PerfRawData_MSSQLSERVER_SQLServerGeneralStatistics

170. select * from Win32_PerfRawData_MSSQLSERVER_SQLServerLatches

171. select * from Win32_PerfRawData_MSSQLSERVER_SQLServerLocks

172. select * from Win32_PerfRawData_MSSQLSERVER_SQLServerMemoryManager

173. select * from Win32_PerfRawData_MSSQLSERVER_SQLServerReplicationAgents

174. select * from Win32_PerfRawData_MSSQLSERVER_SQLServerReplicationDist

175. select * from Win32_PerfRawData_MSSQLSERVER_SQLServerReplicationLogreader

176. select * from Win32_PerfRawData_MSSQLSERVER_SQLServerReplicationMerge

177. select * from Win32_PerfRawData_MSSQLSERVER_SQLServerReplicationSnapshot

178. select * from Win32_PerfRawData_MSSQLSERVER_SQLServerSQLStatistics

179. select * from Win32_PerfRawData_MSSQLSERVER_SQLServerUserSettable

180. select * from Win32_PerfRawData_NETFramework_NETCLRExceptions

181. select * from Win32_PerfRawData_NETFramework_NETCLRInterop

182. select * from Win32_PerfRawData_NETFramework_NETCLRJit

183. select * from Win32_PerfRawData_NETFramework_NETCLRLoading

184. select * from Win32_PerfRawData_NETFramework_NETCLRLocksAndThreads

185. select * from Win32_PerfRawData_NETFramework_NETCLRMemory

186. select * from Win32_PerfRawData_NETFramework_NETCLRRemoting

187. select * from Win32_PerfRawData_NETFramework_NETCLRSecurity

188. select * from Win32_PerfRawData_Outlook_Outlook

189. select * from Win32_PerfRawData_PerfDisk_PhysicalDisk

190. select * from Win32_PerfRawData_PerfNet_Browser

191. select * from Win32_PerfRawData_PerfNet_Redirector

192. select * from Win32_PerfRawData_PerfNet_Server

193. select * from Win32_PerfRawData_PerfNet_ServerWorkQueues

194. select * from Win32_PerfRawData_PerfOS_Cache

195. select * from Win32_PerfRawData_PerfOS_Memory

196. select * from Win32_PerfRawData_PerfOS_Objects

197. select * from Win32_PerfRawData_PerfOS_PagingFile

198. select * from Win32_PerfRawData_PerfOS_Processor

199. select * from Win32_PerfRawData_PerfOS_System

200. select * from Win32_PerfRawData_PerfProc_FullImage_Costly

201. select * from Win32_PerfRawData_PerfProc_Image_Costly

202. select * from Win32_PerfRawData_PerfProc_JobObject

203. select * from Win32_PerfRawData_PerfProc_JobObjectDetails

204. select * from Win32_PerfRawData_PerfProc_Process

205. select * from Win32_PerfRawData_PerfProc_ProcessAddressSpace_Costly

206. select * from Win32_PerfRawData_PerfProc_Thread

207. select * from Win32_PerfRawData_PerfProc_ThreadDetails_Costly

208. select * from Win32_PerfRawData_RemoteAccess_RASPort

209. select * from Win32_PerfRawData_RemoteAccess_RASTotal

210. select * from Win32_PerfRawData_RSVP_ACSPerRSVPService

211. select * from Win32_PerfRawData_Spooler_PrintQueue

212. select * from Win32_PerfRawData_TapiSrv_Telephony

213. select * from Win32_PerfRawData_Tcpip_ICMP

214. select * from Win32_PerfRawData_Tcpip_IP

215. select * from Win32_PerfRawData_Tcpip_NBTConnection

216. select * from Win32_PerfRawData_Tcpip_NetworkInterface

217. select * from Win32_PerfRawData_Tcpip_TCP

218. select * from Win32_PerfRawData_Tcpip_UDP

219. select * from Win32_PerfRawData_W3SVC_WebService

220. select * from Win32_PhysicalMemory

221. select * from Win32_PhysicalMemoryArray

222. select * from Win32_PhysicalMemoryLocation

223. select * from Win32_PNPAllocatedResource

224. select * from Win32_PnPDevice

225. select * from Win32_PnPEntity

226. select * from Win32_PointingDevice

227. select * from Win32_PortableBattery

228. select * from Win32_PortConnector

229. select * from Win32_PortResource

230. select * from Win32_POTSModem

231. select * from Win32_POTSModemToSerialPort

232. select * from Win32_PowerManagementEvent

233. select * from Win32_Printer

234. select * from Win32_PrinterConfiguration

235. select * from Win32_PrinterController

236. select * from Win32_PrinterDriverDll

237. select * from Win32_PrinterSetting

238. select * from Win32_PrinterShare

239. select * from Win32_PrintJob

240. select * from Win32_PrivilegesStatus

241. select * from Win32_Process

242. select * from Win32_Processor

243. select * from Win32_ProcessStartup

244. select * from Win32_Product

245. select * from Win32_ProductCheck

246. select * from Win32_ProductResource

247. select * from Win32_ProductSoftwareFeatures

248. select * from Win32_ProgIDSpecification

249. select * from Win32_ProgramGroup

250. select * from Win32_ProgramGroupContents

251. select * from Win32_ProgramGroupOrItem

252. select * from Win32_Property

253. select * from Win32_ProtocolBinding

254. select * from Win32_PublishComponentAction

255. select * from Win32_QuickFixEngineering

256. select * from Win32_Refrigeration

257. select * from Win32_Registry

258. select * from Win32_RegistryAction

259. select * from Win32_RemoveFileAction

260. select * from Win32_RemoveIniAction

261. select * from Win32_ReserveCost

262. select * from Win32_ScheduledJob

263. select * from Win32_SCSIController

264. select * from Win32_SCSIControllerDevice

265. select * from Win32_SecurityDescriptor

266. select * from Win32_SecuritySetting

267. select * from Win32_SecuritySettingAccess

268. select * from Win32_SecuritySettingAuditing

269. select * from Win32_SecuritySettingGroup

270. select * from Win32_SecuritySettingOfLogicalFile

271. select * from Win32_SecuritySettingOfLogicalShare

272. select * from Win32_SecuritySettingOfObject

273. select * from Win32_SecuritySettingOwner

274. select * from Win32_SelfRegModuleAction

275. select * from Win32_SerialPort

276. select * from Win32_SerialPortConfiguration

277. select * from Win32_SerialPortSetting

278. select * from Win32_Service

279. select * from Win32_ServiceControl

280. select * from Win32_ServiceSpecification

281. select * from Win32_ServiceSpecificationService

282. select * from Win32_SettingCheck

283. select * from Win32_Share

284. select * from Win32_ShareToDirectory

285. select * from Win32_ShortcutAction

286. select * from Win32_ShortcutFile

287. select * from Win32_ShortcutSAP

288. select * from Win32_SID

289. select * from Win32_SMBIOSMemory

290. select * from Win32_SoftwareElement

291. select * from Win32_SoftwareElementAction

292. select * from Win32_SoftwareElementCheck

293. select * from Win32_SoftwareElementCondition

294. select * from Win32_SoftwareElementResource

295. select * from Win32_SoftwareFeature

296. select * from Win32_SoftwareFeatureAction

297. select * from Win32_SoftwareFeatureCheck

298. select * from Win32_SoftwareFeatureParent

299. select * from Win32_SoftwareFeatureSoftwareElements

300. select * from Win32_SoundDevice

301. select * from Win32_StartupCommand

302. select * from Win32_SubDirectory

303. select * from Win32_SystemAccount

304. select * from Win32_SystemBIOS

305. select * from Win32_SystemBootConfiguration

306. select * from Win32_SystemDesktop

307. select * from Win32_SystemDevices

308. select * from Win32_SystemDriver

309. select * from Win32_SystemDriverPNPEntity

310. select * from Win32_SystemEnclosure

311. select * from Win32_SystemLoadOrderGroups

312. select * from Win32_SystemLogicalMemoryConfiguration

313. select * from Win32_SystemMemoryResource

314. select * from Win32_SystemNetworkConnections

315. select * from Win32_SystemOperatingSystem

316. select * from Win32_SystemPartitions

317. select * from Win32_SystemProcesses

318. select * from Win32_SystemProgramGroups

319. select * from Win32_SystemResources

320. select * from Win32_SystemServices

321. select * from Win32_SystemSetting

322. select * from Win32_SystemSlot

323. select * from Win32_SystemSystemDriver

324. select * from Win32_SystemTimeZone

325. select * from Win32_SystemUsers

326. select * from Win32_TapeDrive

327. select * from Win32_TemperatureProbe

328. select * from Win32_Thread

329. select * from Win32_TimeZone

330. select * from Win32_Trustee

331. select * from Win32_TypeLibraryAction

332. select * from Win32_UninterruptiblePowerSupply

333. select * from Win32_USBController

334. select * from Win32_USBControllerDevice

335. select * from Win32_UserAccount

336. select * from Win32_UserDesktop

337. select * from Win32_VideoConfiguration

338. select * from Win32_VideoController

339. select * from Win32_VideoSettings

340. select * from Win32_VoltageProbe

341. select * from Win32_WMIElementSetting

342. select * from Win32_WMISetting

Appendix D – Configuration File of OpenSER

The content of the OpenSER configuration file is listed below. This configuration handles the translation of SIP messages between Microsoft Office Communication Server (OCS) and Asterisk, which the comments abbreviate as *.

########################## Global Parameters ##############################
debug=3
fork=yes
log_stderror=yes
log_facility=LOG_LOCAL5
children=4
port=5060

########################### Modules Section ##############################
#set module path
mpath="/usr/local/lib/openser/modules/"
loadmodule "sl.so"
loadmodule "tm.so"
loadmodule "rr.so"
loadmodule "maxfwd.so"
loadmodule "textops.so"
loadmodule "mi_fifo.so"
loadmodule "uri.so"
loadmodule "xlog.so"
loadmodule "gflags.so"

##################### setting module-specific parameters #####################
# ----- mi_fifo params -----
modparam("mi_fifo", "fifo_name", "/tmp/openser_fifo")
# ----- rr params -----
modparam("rr", "enable_full_lr", 1)
modparam("rr", "enable_double_rr", 0)
# ----- gflags params -----
modparam("gflags", "initial", 1) # 1..enable extended debug

############################## Routing ##################################
############################## request routes #############################
route {
    if (msg:len >= 2048 ) {
        sl_send_reply("513", "Message too big");
        exit;
    }
    if (loose_route()) {
        append_hf("P-hint: rr-enforced\r\n");
        xlog("L_INFO", "Loose Route\n");
    }
    if (!uri==myself) {
        append_hf("P-hint: outbound\r\n");
        # xlog("L_INFO", "URI is not myself\n");
    }
    route(1);
}

route[1] {
    # various debug outputs
    xlog("L_INFO","$C(yb)$cs $rm: mF=$mF, bF=$bF, sF=$sF$C(xx)\n");
    if (is_gflag("0")) {
        if(!t_check_trans()) {
            $var(trans)=$var(trans)+1;
            xlog("L_INFO","$C(gx)[ New Transaction ($var(trans))]$C(xx)\n");
        }
        xlog("L_INFO","\n\n$C(py)[ Method $rm from $si:$sp ($var(trans)) ]$C(px)\n$mb$C(py)[ End of Request ($var(trans)) ]$C(xx)\n\n\n");
    }
    # use a non-generic reply route to be able to use transaction flags
    t_on_reply("1");
    if (src_ip == 1.1.1.4) {
        xlog("L_INFO", "~~~ OCS -> * ~~~\n");
        # remove misleading CONTACT header line
        remove_hf("CONTACT");
        # subst("/(CONTACT:).*/\1 /ig"); # use from address in contact header
        # remove UTF-8 information, as * is not able to process it properly (causes RTP problem)
        subst("/^(CONTENT-TYPE:.*);[ ]*charset=utf-8(.*)/\1\2/ig");
        # identify re-INVITEs
        if ((method == INVITE) && (search("^TO.*tag=.*"))) {
            # mark INVITE (hold) for latter modification of the reply "200 OK"
            if (search("a=inactive")) {
                xlog("L_INFO", "$C(rx)'a=inactive' found in INVITE!$C(xx)\n");
                setflag(5);
            }
        }
        # relay request to *
        if (!t_relay("udp:1.1.1.6:5060")) {
            xlog("L_ERR", "~~~ relay to udp:1.1.1.6:5060 failed!\n");
            sl_reply_error();
        }
    }
    ######################
    # request to OCS #
    ######################
    else {
        xlog("L_INFO", "~~~ * -> OCS ~~~\n");
        # relay request to OCS
        if (!t_relay("tcp:1.1.1.4:5060")) {
            xlog("L_ERR", "~~~ relay to tcp:1.1.1.4:5060 failed!\n");
            sl_reply_error();
        }
    }
    exit;
}

############################# reply routes #############################
onreply_route[1] {
    # various debug outputs
    xlog("L_INFO","$C(yr)$rs $rr ($cs $rm): mF=$mF, bF=$bF, sF=$sF$C(xx)\n");
    if (is_gflag("0")) {
        xlog("L_INFO","\n\n$C(bc)[ Reply $rs ($rr) from $si:$sp concerning $rm ]$C(bx)\n$mb$C(bc)[ End of Reply ]$C(xx)\n\n\n");
    }
    #######################
    # reply back to * #
    #######################
    if (src_ip == 1.1.1.4) {
        xlog("L_INFO", "$C(cx)reply OCS -> *$C(xx)\n");
        # remove misleading CONTACT header line
        remove_hf("CONTACT");
        # remove UTF-8 information, as * is not able to process it properly (causes RTP problem)
        subst("/^(CONTENT-TYPE:.*);[ ]*charset=utf-8(.*)/\1\2/ig");
    }
    #########################
    # reply back to OCS #
    #########################
    else {
        xlog("L_INFO", "### * -> OCS ###\n");
        # identify reply to call hold
        if (isflagset(5) && (status == "200") && (has_body("application/sdp"))) {
            # add missing "a=inactive", as OCS expects it
            if (subst_body("/(m=.*\r\n)/\1a=inactive\r\n/")) {
                xlog("L_INFO", "$C(rx)'a=inactive' inserted in 200 OK!$C(xx)\n");
            }
        }
    }
}
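The two message rewrites performed by this configuration (dropping the charset=utf-8 parameter from the Content-Type header, and adding a=inactive after an m= line of the SDP body) can be illustrated outside OpenSER. The following Python sketch is only an illustration and is not part of the deployed setup. It assumes that messages are plain strings with CRLF line endings and that the body substitution applies to the first match only; the function names strip_utf8_charset and mark_inactive are hypothetical.

```python
import re

# Same pattern as the subst() call in the configuration; the "i" and "g"
# flags are approximated with IGNORECASE, MULTILINE and a global sub().
CHARSET_RE = re.compile(r"^(CONTENT-TYPE:.*);[ ]*charset=utf-8(.*)",
                        re.IGNORECASE | re.MULTILINE)

# Same pattern as the subst_body() call: a media line ending in CRLF.
MEDIA_LINE_RE = re.compile(r"(m=.*\r\n)")


def strip_utf8_charset(message: str) -> str:
    """Remove the charset=utf-8 parameter from Content-Type header lines."""
    return CHARSET_RE.sub(r"\g<1>\g<2>", message)


def mark_inactive(sdp_body: str) -> str:
    """Insert an 'a=inactive' attribute right after the first m= line.

    Assumption: only the first match is rewritten (count=1).
    """
    return MEDIA_LINE_RE.sub(lambda m: m.group(1) + "a=inactive\r\n",
                             sdp_body, count=1)


if __name__ == "__main__":
    invite = ("INVITE sip:a@b SIP/2.0\r\n"
              "Content-Type: application/sdp; charset=utf-8\r\n\r\n")
    print(repr(strip_utf8_charset(invite)))

    sdp = "v=0\r\nm=audio 4000 RTP/AVP 0\r\na=rtpmap:0 PCMU/8000\r\n"
    print(repr(mark_inactive(sdp)))
```

In the configuration itself, the second rewrite is applied only to a 200 reply that carries an SDP body and whose transaction was flagged (flag 5) when the hold re-INVITE from OCS was seen.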

Appendix E – DSL Configuration & Dial Plan Files

The setup presented in Figure 6.3 introduces a new concept: all the available pieces of information are treated as tables and columns. This organization allows dialplan developers to deal with the different types of servers the same way a developer deals with a database. For example, a table called emails is created in the setup file. A dialplan can then execute a simple select statement to retrieve the unread emails of user A, whose password is X. In the same way, Active Directory is treated as a table with a number of columns. The same applies to any other server or software package.

// the first line in this file is the IP address of the integration server
IntegrationServer=192.168.10.10
// declare a variable of type 'database connection'
dbconnection AccessDB
// specify the connection string
// in the following example, the data is stored in a Microsoft Access database
// the following means that the Access database is on the C drive of the integration server
AccessDB.Provider = 'Microsoft.Jet.OLEDB.4.0'
AccessDB.DataSource = 'C:\OCSRequests.mdb'
// declare a variable of type 'table'
table tbl1
// create a table, name it users, and let the variable 'tbl1' point to it
// the first parameter is the name of the table
// this is the name that will be referenced in the dialplans
// the second parameter is the source table/view in the source database
tbl1 = AccessDB.CreateTable('users', 'UsersView')
// add a column to the table
// the first parameter is the name that will be referenced in the dialplans
// the second parameter is the source column in the source database
tbl1.addcolumn('username', 'LoginName')
tbl1.addcolumn('pincode', 'NumericPassword')
tbl1.addcolumn('active', 'IsActive')
//------
// declare a variable of type 'exchange connection'
xconnection xcon
// specify the IP address of the MS exchange server
xcon.IP = '192.168.10.20'
// declare a variable of type 'exchange table'
xtable tbl2
// create a table, name it emails, and let the variable 'tbl2' point to it
// the only parameter is the name of the table
// this is the name that will be referenced in the dialplans
tbl2 = xcon.CreateTable('emails')
// all of the following columns will be automatically available
// without the need to create them
// username, password, msg, cc, to, date, from, subject,
// importance, priority, hasattach, read
// an additional column will be available. it is called 'concatenated'
// this field will contain the result of concatenating three fields: from + subject + body
// this concatenated string will be sent to Festival to be read to the caller
// the caller will hear the following:
// you received an email from ($from) + titled + ($subject) + it says + ($body)
//------
// declare a variable of type 'user-defined connection'
udfconnection udf
// specify the URL of the user-defined Web site that will provide the required service
udf.webservice = 'http://192.168.10.30/Warehouse/'
// declare a variable of type 'table'
table tbl3
// each page in the above website will represent a table.
// in the following, we will create a new table called products
// this table will include the productid, productname, price, and availableQty
// suppose the user wants to retrieve productname from
// products table where productid = 11
// the 'products.php' page will be called with the following parameters supplied:
// 1- where = 'productid=11'
// 2- return = 'productname'
// the setup builder should develop a page that behaves this way
// it should receive these 2 parameters (where & return) and return 1 text value
tbl3 = udf.CreateTable('products', 'products.php')
// add a column to the table
// the first parameter is the name that will be referenced in the dialplans
// the second parameter is what will be sent as a parameter to the web page
tbl3.addcolumn('productid', 'prodserial')
tbl3.addcolumn('productname', 'prodname')
tbl3.addcolumn('price', 'price')
tbl3.addcolumn('availableqty', 'qty')
//------
// declare a variable of type 'directory connection'
dirconnection dcon
// specify the folder to be explored
dcon.path = '\\192.168.10.40\shared\'
dcon.type = windows
// declare a variable of type 'dir table'
dirtable tbl4
// the only parameter of the CreateTable command is
// the name of the table to be used in the dialplan
// there is no need to add new columns
// this table will automatically have the following 5 columns:
// 1- owner (numeric value that is usually the password of the caller)
// 2- fname (name of file / folder)
// 3- type (1 for folders, 2 for files)
// 4- parent (path of parent folder)
// 5- ext (extension of the file)
tbl4 = dcon.CreateTable('dir')
// the following code will insert one row for user 123456 to specify the starting folder
// there is no need to specify the values of the other 2 columns
// type will be '1' (folder), parent will be empty '' (starting folder),
// and ext is also an empty string
tbl4.insert('123456', 'My Offers', 1, '', '')
//------
// declare a variable of type 'OCS connection'
ocsconnection ocscon
// specify some settings of the Office Communication Server
ocscon.ServerIP = '192.168.10.50'
ocscon.Domain = 'scopepbx.com'
ocscon.Transport = 'TCP'
// declare a variable of type 'ocs table'
ocstable tbl5
// the first parameter is the name that will be referenced in the dialplans
// the second parameter is the required service
tbl5 = ocscon.CreateTable('IM', 'IM')
//------
// create a new connection to a windows server
// specify the IP of the server and credentials
winconnection wincon
wincon.serverIP = "192.168.10.50"
wincon.user = "admin"
wincon.pwd = "secret"
// declare a variable of type 'wintable'
wintable tbl6
// fill the table with the result of the following windows query
tbl6 = wincon.CreateTable("shares", "Select * from Win32_Share")

Figure 6.3 - Proposed DSL Setup File
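The column mapping declared for the users table in Figure 6.3 can be pictured as a small translation layer between dialplan names and source names. The Python sketch below is a hypothetical illustration under stated assumptions, not the implementation of the proposed framework: the class TableMapping and its methods are invented names, only equality conditions are supported, and the result is a parameterized SQL string rather than an executed query.

```python
from dataclasses import dataclass, field


@dataclass
class TableMapping:
    """Maps a dialplan table and its columns to a source table/view (hypothetical)."""
    name: str                      # name referenced in the dialplans
    source: str                    # table or view in the source database
    columns: dict = field(default_factory=dict)

    def add_column(self, alias: str, source_column: str) -> None:
        # alias is the dialplan name, source_column the name in the database
        self.columns[alias] = source_column

    def select(self, wanted: str, conditions: dict):
        """Return (sql, params) for an equality-only lookup."""
        sql = f"SELECT {self.columns[wanted]} FROM {self.source}"
        if conditions:
            where = " AND ".join(f"{self.columns[c]} = ?" for c in conditions)
            sql += f" WHERE {where}"
        return sql, list(conditions.values())


# Mirror of the 'users' declarations in the setup file.
users = TableMapping("users", "UsersView")
users.add_column("username", "LoginName")
users.add_column("pincode", "NumericPassword")
users.add_column("active", "IsActive")

if __name__ == "__main__":
    # Rough equivalent of tables.users.username[active = 'Y' && pincode = $pin]
    sql, params = users.select("username", {"active": "Y", "pincode": "123456"})
    print(sql)
    print(params)
```

Keeping the values in a separate parameter list, instead of pasting the caller's digits into the query text, is a design choice of this sketch; the source does not state how the integration server builds its queries.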

The code presented in Figure 6.4 shows the dialplan that VoIP application developers will write. These developers do not have to deal with any low-level details of the integration. Once the setup file is properly configured, they use the available commands to specify what they need, without regard to how the integration is achieved.

//each dialplan starts with 'begin' and ends with 'end'
Begin
// this line is considered a comment because it starts with 2 forward slashes
// this language uses simpler syntax than traditional dialplan
// it is also a structured programming language
// it includes procedures and functions that return values
// it includes branching commands such as if, else, while, for ...

// the caller can try his pin 3 times
$trials = 0
while ($trials < 3) {
    $trials = $trials + 1
    // the first parameter in the following command is the file to be played.
    // if file is omitted, the text of the 2nd parameter will be sent to festival
    // festival will read the text to the caller
    // the last parameter is the number of digits to be input
    // if the 2nd and the 3rd parameters are both empty,
    // festival will automatically play the following message to the caller:
    // please enter a 6 digit number
    $pin = input(, please enter your 6 digit pin code, 6)
    // find the pin in the users table
    // table should have been defined during the setup process
    // the setup builder will specify the source of data, including
    // IP, server name, database, DBMS engine, username, password, table, ...
    // the search condition is put between 2 square brackets []
    // the data source might be:
    // MS SQL Server DB, Oracle DB, MySQL DB, MS Access DB, others
    // XML file, a Text file, MS Excel Sheet, MS Winword document
    // MS exchange server, where each mail message is considered a row
    // Active Directory, where each user is considered a row
    // Logging entries of windows Event Viewer
    // a folder on a HDD of a windows server, where each file is considered a row
    // a folder on a HDD of a linux machine
    // a web service
    // set the username. it will be used to retrieve the mails
    $user = tables.users.username[active = 'Y' && pincode = $pin]
    if ($user != "") {
        // exit the loop because the pin is found
        break
    }
    elseif ($trials == 3) {
        // this following line of code will cause the following to occur:
        // 1- escape the rest of the code
        // 2- play the file sent as 1st parameter.
        //    if file name is not sent, festival will play a goodbye message
        // 3- end the call
        exit(goodbye)
    }
}
// the caller will not reach here unless he/she supplied a valid pin number
// the following switch command will play the file sent as 1st parameter
// the switch command will allow the user to enter a digit
// and branch to the corresponding line
// 1st parameter can be escaped
// if this is the case, the 2nd parameter will be sent to festival
switch(what_do_u_want) {
    case 1:
        // connect to MS Exchange server
        checkmail($user)
        break
    case 2:
        // use a user-defined object
        checkProduct()
        break
    case 3:
        // use the Instant Messaging feature of MS Office Communication Server
        IM($user)
        break
    case 4:
        // the caller will explore his/her hard disk
        // he/she will keep navigating from a folder to another until he/she finds the file
        // when done, the found file might be a Word document or an Excel sheet
        // the found file might be read by festival
        // an empty string is supplied as parent because we want the first level
        ExploreHDD($user, '')
        break
    case 5:
        // run a windows query to get the shares
        // and tell the caller how many shares there are
        festival.say(tables.shares.count)
        break
    // if none of the above keys are pressed, execute bye() procedure
    else:
        bye()
}
// the following line ends the dialplan
End
//------
void checkmail($user) {
    $password = input(, please enter your password, 8)
    // the following command accepts up to 3 parameters
    // the first is the name of the file that will be played
    // if 1st parameter is missing, 2nd parameter, which is a text, will be sent to festival
    // if both are missing, the 3rd parameter is read to the caller
    // the term 'count' is reserved.
    // setup builder will not be allowed to use this term as the name of any column
    $cnt = tables.emails.count[read = 'N' && username = $user && password = $password]
    SayDigit (,'number of your emails not yet read is', $cnt)
    if ($cnt == 0) {
        // the following command will cause control to go back to the switch statement
        return
    }
    for (int $i = 0; $i < $cnt; $i = $i +1) {
        // get the subject of the ith email message
        // select subject from emails
        // where read = 'N' and username = $user and password = $password
        // and rownumber = $i
        // the 1st filtering condition [between square brackets] returns an array of values
        // the 2nd filtering condition [$i] returns the ith record
        $msg = tables.emails[read='N' && username=$user && password=$password][$i]
        // the following command will send the text to festival,
        // which will read it to the caller
        festival.say($msg)
    }
}
//------
void checkProduct() {
    $prod = input(, please enter the number of the product, 5)
    $avail = tables.products.availableqty[productid = $prod]
    if ($avail == "") {
        festival.say("no such product")
        return
    }
    festival.say ('available quantity is' + $avail)
    // when done, control will go back to the switch statement
}
//------
void IM($user) {
    // the following command allows the caller to input letters
    // by using the digits buttons of the phone, the same way one writes SMS message
    // read up to 5 letters and store them in the variable called $destination
    $destination = readletters(5)
    // now the ($user) wants to send an IM message to the ($destination)
    // the message should appear in the OC of the ($destination)
    tables.IM.insert['please call me when u can' + $user, $destination]
}
//------
void ExploreHDD($user, $parent) {
    $type = input(, please press 1 for folders 2 for files and 3 for both, 1)
    if ($type == 3) {
        $tbl = tables.dir[owner=$user && parent=$parent]
    }
    else {
        $tbl = tables.dir[owner=$user && parent=$parent && type=$type]
    }
    for (int $i = 0; $i < $tbl.count; $i = $i +1) {
        festival.say($tbl[$i].fname)
        $found = input(, please press 1 if you want this or 2 to ignore, 1)
        // if type = folder, explore subfolders
        if ($found == 1 && $tbl[$i].type == 1) {
            ExploreHDD($user, $tbl[$i].fname)
            break
        }
        // read content of the file
        else {
            festival.readfile($parent, $tbl[$i].fname)
            break
        }
    }
    // the caller will reach here only if none of the entries inside
    // the $parent folder are what he/she wants. The content of the folder is read again
    ExploreHDD($user, $parent)
}

Figure 6.4 - Proposed DSL Dialplan File
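The PIN check at the start of the dialplan (at most three attempts, then the call ends) can be restated in a general-purpose language. The following Python sketch is an illustration only: read_pin and find_user are hypothetical callables standing in for the DSL's input command and the tables.users lookup, and returning None stands in for exit(goodbye).

```python
from typing import Callable, Optional


def authenticate(read_pin: Callable[[], str],
                 find_user: Callable[[str], str],
                 max_trials: int = 3) -> Optional[str]:
    """Ask for a PIN up to max_trials times.

    Returns the matching user name, or None when every attempt failed
    (the dialplan would play a goodbye message and end the call).
    """
    for _ in range(max_trials):
        user = find_user(read_pin())
        if user:            # a non-empty result means the PIN was found
            return user
    return None


if __name__ == "__main__":
    # Hypothetical in-memory stand-in for the 'users' table.
    active_users = {"123456": "alice"}
    typed = iter(["000000", "123456"])
    print(authenticate(lambda: next(typed),
                       lambda pin: active_users.get(pin, "")))
```

Passing the two operations in as functions keeps the retry logic independent of where the PIN comes from and where the users are stored, which mirrors the separation between the dialplan and the setup file.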

Appendix F – Countries by Number of Mobile Phones & Internet Users

Table 6.4 lists top countries by the number of mobile phones in use.

Table 6.4 - Countries by the number of mobile phones in use

Rank   Country or region   Number of mobile phones   Population   % of population   Last updated

1. China 804,400,000 1,338,610,000 61.5 Aug. 2010

2. India 670,600,000 1,185,000,000 56.6 Aug. 2010

3. United States 285,610,580 308,505,000 91.0 Dec. 2009

4. Russia 213,900,000 141,940,000 147.3 Jun. 2010

5. Brazil 189,400,000 191,480,630 98.9 Sep. 2010

6. Indonesia 168,264,000 229,965,000 73.1 May 2009

7. Japan 107,490,000 127,530,000 84.1 Mar. 2009

8. Germany 107,000,000 81,882,342 130.1 2009

9. Pakistan 99,185,844 168,500,500 60.4 Aug. 2010

10. Italy 88,580,000 60,090,400 147.4 Dec. 2008

11. Mexico 83,500,000 111,212,000 75.0 Apr. 2010

12. Philippines 78,000,000 92,226,600 73.6 Jan. 2010

13. United Kingdom 75,750,000 61,612,300 122.9 Dec. 2008

14. Vietnam 70,000,000 87,375,000 80.1 2009

15. Turkey 66,000,000 71,517,100 92.2 2009

16. Nigeria 76,000,000 144,339,000 50.3 Dec. 2009

17. France 58,730,000 65,073,842 90.2 Dec. 2008

18. Thailand 56,170,908 65,001,021 81.0 2009

19. Ukraine 54,377,000 46,143,700 117.9 Apr. 2009

20. Spain 50,890,000 45,828,172 111.0 Dec. 2008

21. Bangladesh 50,400,000 162,221,000 31.1 Aug. 2009

22. South Korea 47,000,000 48,333,000 97.2 2009

23. Argentina 40,402,000 40,482,000 99.8 2007

24. South Africa 42,300,000 47,850,700 82.9 2007

25. Iran 39,400,000 71,208,000 54.2 2008

26. Poland 36,746,000 38,115,967 96.4 2006

27. Colombia 40,300,000 45,393,050 88.7 2009

28. Egypt 30,065,000 75,498,000 23.8 2007

29. Algeria 28,500,000 33,858,000 92.0 2006

30. Venezuela 27,400,000 28,200,000 98.0 2008

31. Peru 27,000,000 29,000,000 93.1 Sep. 2010

32. Taiwan 25,412,000 22,974,347 110.6 2008

33. Romania 22,800,000 21,438,000 108.5 Mar. 2008

34. Canada 21,455,000 33,487,208 64.2 2008

35. Morocco 20,029,000 34,343,000 58.4 2007

36. Australia 19,760,000 21,179,211 93.3 2006

37. Saudi Arabia 19,663,000 24,735,000 79.5 2006

38. Malaysia 19,464,000 27,484,000 70.8 2006

39. Netherlands 18,914,000 16,402,414 115.3 Sept. 2007

40. Chile 15,768,000 16,598,074 95.0 July 2008

41. Portugal 14,500,000 10,632,000 137.0 2008

42. Hungary 11,732,000 10,020,000 115.1 Dec. 2009

43. Bulgaria 10,655,000 7,600,000 140.2 2008

44. Hong Kong 10,550,000 7,008,900 150.5 2009

45. Israel 9,319,000 7,310,000 127.5 2008

46. Azerbaijan 7,000,000 8,900,000 31.4 Nov. 2009

47. Jordan 6,010,000 5,950,000 101.0 Mar. 2010

48. Singapore 4,770,000 6,400,000 74.5 Nov. 2009

49. New Zealand 4,620,000 4,252,277 108.6 2008

50. Estonia 1,982,000 1,340,602 147.8 Apr. 2009

51. Lebanon 1,260,000 4,017,095 31.4 2007

52. Lithuania 4,960,000 3,341,966 148.4 Feb. 2010

53. Yemen 8,312,773 22,492,035 36.958 2010

Table 6.5 lists the top countries by number of Internet users.

Table 6.5 - Countries by the number of Internet users

Rank Country Internet Users % Pop. Date

1. China 425,000,000 31.8% 2010

2. European Union 337,779,055 67.6% 2010

3. United States 240,000,000 77.4% 2010

4. Japan 99,150,000 78.2% 2010

5. India 81,000,000 6.9% 2010

6. Brazil 75,943,600 37.8% 2010

7. Mexico 68,430,000 61.5% 2010

8. Germany 65,200,000 79.1% 2010

9. Russia 59,850,000 42.8% 2010

10. United Kingdom 51,450,000 82.5% 2010

11. France 44,630,000 68.9% 2010

12. Nigeria 43,985,000 28.9% 2010

13. South Korea 39,500,000 81.1% 2010

14. Turkey 35,000,000 45.0% 2010

15. Italy 34,000,000 54.0% 2010

16. Iran 33,200,000 43.2% 2010

17. Indonesia 30,000,000 12.3% 2010

18. Philippines 29,750,000 29.7% 2010

19. Spain 29,095,000 62.6% 2010

20. Argentina 26,615,000 64.4% 2010

21. Canada 26,224,900 77.7% 2010

22. Vietnam 24,269,083 27.1% 2010

23. Poland 22,450,600 58.4% 2010

24. Colombia 21,529,415 48.7% 2010

25. Pakistan 18,500,000 10.4% 2010

26. Thailand 17,486,400 26.4% 2010

27. Australia 17,033,826 80.1% 2010

28. Egypt 17,060,000 21.2% 2010

29. Malaysia 16,902,600 64.6% 2010

30. Taiwan 16,130,000 70.1% 2010

31. Ukraine 15,400,000 33.7% 2010

32. Netherlands 14,890,200 88.7% 2010

33. Morocco 10,450,000 33.0% 2010

34. Saudi Arabia 9,800,000 38.1% 2010

35. Venezuela 9,306,916 34.2% 2010

36. Sweden 8,400,000 92.5% 2010

37. Chile 8,370,000 50.0% 2010

38. Belgium 8,113,200 77.8% 2010

39. Peru 8,085,000 27.0% 2010

40. Romania 7,790,000 35.5% 2010

41. Czech Republic 6,700,000 66.5% 2010

42. Hungary 6,176,400 61.8% 2010

43. Austria 6,143,600 74.8% 2010

44. Switzerland 5,739,300 75.3% 2010

45. South Africa 5,300,000 10.8% 2010

46. Kazakhstan 5,300,000 34.3% 2010

47. Israel 5,263,146 71.6% 2010

48. Portugal 5,168,800 48.1% 2010

49. Greece 4,970,700 46.2% 2010

50. Hong Kong 4,878,713 68.8% 2010

51. Denmark 4,750,500 86.1% 2010

52. Algeria 4,700,000 13.6% 2010

53. Uzbekistan 4,689,000 16.8% 2010

54. Finland 4,480,900 85.3% 2010

55. Belarus 4,436,800 46.2% 2010

56. Norway 4,431,100 94.9% 2010

57. Sudan 4,200,000 10.0% 2010

58. Serbia 4,107,000 55.9% 2010

59. Slovakia 4,065,000 74.3% 2010

60. Kenya 3,995,500 10.0% 2010

61. Syria 3,935,000 17.7% 2010

62. United Arab Emirates 3,777,900 75.9% 2010

63. Azerbaijan 3,689,000 44.4% 2010

64. Singapore 3,658,400 77.8% 2010

65. New Zealand 3,600,000 85.4% 2010

66. Tunisia 3,600,000 34.0% 2010

67. Bulgaria 3,395,000 47.5% 2010

68. Uganda 3,200,000 9.6% 2010

69. Ireland 3,042,600 65.8% 2010

70. Dominican Republic 3,000,000 30.5% 2010

71. Ecuador 2,359,710 16.0% 2010

72. Guatemala 2,280,000 16.8% 2010

73. Croatia 2,244,400 50.0% 2010

74. Kyrgyzstan 2,194,400 39.8% 2010

75. Lithuania 2,103,471 59.3% 2010

76. Costa Rica 2,000,000 44.3% 2010

77. Uruguay 1,855,000 52.8% 2010

78. Sri Lanka 1,776,200 8.3% 2010

79. Jordan 1,741,900 27.2% 2010

80. Cuba 1,605,000 14.0% 2010

81. Jamaica 1,581,100 55.5% 2010

82. Latvia 1,503,400 67.8% 2010

83. Bosnia-Herzegovina 1,441,000 31.2% 2010

84. Zimbabwe 1,422,000 12.2% 2010

85. Albania 1,300,000 43.5% 2010

86. Georgia 1,300,000 28.3% 2010

87. Slovenia 1,298,500 64.8% 2010

88. Ghana 1,297,000 5.3% 2010

89. Moldova 1,295,000 30.0% 2010

90. Oman 1,236,700 41.7% 2010

91. Bolivia 1,102,500 11.1% 2010

92. Kuwait 1,100,000 39.4% 2010

93. Macedonia 1,057,400 51.0% 2010

94. Afghanistan 1,000,000 3.4% 2010

95. Haiti 1,000,000 10.4% 2010

96. Lebanon 1,000,000 24.3% 2010

97. Paraguay 1,000,000 15.7% 2010

98. Puerto Rico 1,000,000 25.1% 2010

99. El Salvador 975,000 16.1% 2010

100. Estonia 969,700 75.1% 2010

101. Cote d'Ivoire 968,000 4.6% 2010

102. Panama 959,900 28.1% 2010

103. Honduras 958,500 12.0% 2010

104. Senegal 923,000 6.6% 2010

105. Zambia 816,700 6.9% 2010

106. Cameroon 750,000 3.9% 2010

107. Malawi 716,400 4.6% 2010

108. Tajikistan 700,000 9.3% 2010

109. Tanzania 676,000 1.6% 2010

110. Bahrain 649,300 88.0% 2010

111. Nepal 625,800 2.2% 2010

112. Bangladesh 617,300 0.4% 2010

113. Mozambique 612,500 2.8% 2010

114. Nicaragua 600,000 10.0% 2010

115. Angola 607,400 4.6% 2010

116. Laos 527,400 7.5% 2010

117. Trinidad and Tobago 485,000 39.5% 2010

118. Rwanda 450,000 4.1% 2010

119. Ethiopia 445,400 0.5% 2010

120. Qatar 436,000 51.8% 2010

121. Cyprus 433,800 39.3% 2010

122. Luxembourg 424,500 85.3% 2010

123. Yemen 420,000 1.8% 2010

124. Kosovo 377,000 20.8% 2010

125. Democratic Republic of the Congo 365,000 0.5% 2010

126. Palestine | West Bank 356,000 14.2% 2010

127. Togo 356,300 5.7% 2010

128. Libya 353,900 5.5% 2010

129. Mongolia 350,000 11.3% 2010

130. Iraq 325,000 1.1% 2010

131. Madagascar 320,000 1.5% 2010

132. Brunei 318,900 80.7% 2010

133. Iceland 301,600 97.6% 2010

134. Réunion 300,000 36.5% 2010

135. Montenegro 294,000 44.1% 2010

136. Mauritius 290,000 22.4% 2010

137. Macao 280,900 49.5% 2010

138. Eritrea 250,000 4.3% 2010

139. Mali 250,000 1.8% 2010

140. Republic of the Congo 245,200 5.9% 2010

141. Malta 240,600 51.9% 2010

142. Guyana 220,000 29.4% 2010

143. Armenia 208,200 7.0% 2010

144. Benin 200,000 2.2% 2010

145. Chad 187,800 1.8% 2010

146. Burkina Faso 178,200 1.1% 2010

147. Martinique 170,000 41.9% 2010

148. Suriname 163,000 33.5% 2010

149. Cape Verde 150,000 29.5% 2010

150. Saint Lucia 142,900 88.8% 2010

151. Barbados 142,000 49.7% 2010

152. Gambia 130,100 7.1% 2010

153. Namibia 127,500 6.0% 2010

154. Botswana 120,000 5.9% 2010

155. Papua New Guinea 120,000 2.0% 2010

156. Niger 115,900 0.7% 2010

157. Bahamas 115,800 37.3% 2010

158. Myanmar 110,000 0.2% 2010

159. Somalia 106,000 1.0% 2010

160. Fiji 103,000 10.9% 2010

161. Guadeloupe 103,000 23.2% 2010

162. Gabon 98,800 6.4% 2010

163. Guinea 95,000 0.9% 2010

164. French Polynesia 90,000 31.4% 2010

165. Swaziland 90,000 6.6% 2010

166. Maldives 87,900 22.2% 2010

167. Guam 85,000 47.6% 2010

168. New Caledonia 85,000 37.4% 2010

169. Turkmenistan 80,400 1.6% 2010

170. Cambodia 78,000 0.5% 2010

171. Lesotho 76,800 4.0% 2010

172. Saint Vincent and the Grenadines 76,000 72.9% 2010

173. Mauritania 75,000 2.3% 2010

174. Antigua and Barbuda 65,000 74.9% 2010

175. Burundi 65,000 0.7% 2010

176. Andorra 67,200 79.5% 2010

177. Belize 60,000 19.1% 2010

178. French Guiana 58,000 24.6% 2010

179. Bermuda 54,000 79.1% 2010

180. Greenland 52,000 90.2% 2010

181. Bhutan 50,000 7.1% 2010

182. Guernsey 48,300 74.6% 2010

183. Faroe Islands 37,500 76.4% 2010

184. Guinea-Bissau 37,100 2.4% 2010

185. Seychelles 33,900 38.4% 2010

186. U.S. Virgin Islands 30,000 27.3% 2010

187. Jersey 29,500 31.6% 2010

188. Dominica 27,500 37.8% 2010

189. Grenada 27,000 25.0% 2010

190. São Tomé and Príncipe 26,700 15.2% 2010

191. Djibouti 25,900 3.5% 2010

192. Comoros 24,300 3.1% 2010

193. Aruba 24,000 22.9% 2010

194. Cayman Islands 24,000 47.8% 2010

195. Liechtenstein 23,000 65.7% 2010

196. Monaco 23,000 75.2% 2010

197. Liberia 20,000 0.5% 2010

198. Central African Republic 22,600 0.5% 2010

199. Saint Kitts and Nevis 17,000 34.1% 2010

200. San Marino 17,000 54.0% 2010

201. Vanuatu 17,000 7.8% 2010

202. Federated States of Micronesia 16,000 14.9% 2010

203. Sierra Leone 14,900 0.3% 2010

204. Equatorial Guinea 14,400 2.2% 2010

205. Northern Mariana Islands 10,000 19.4% 2010

206. Solomon Islands 10,000 1.7% 2010

207. Gibraltar 9,853 34.2% 2009

208. Samoa 9,000 4.1% 2009

209. Tonga 8,400 6.9% 2008

210. Palau 5,400 26.0% 2007

211. Cook Islands 5,000 42.1% 2009

212. Anguilla 4,500 31.2% 2009

213. Tuvalu 4,200 33.9% 2008

214. British Virgin Islands 4,000 16.3% 2002

215. Falkland Islands 2,483 100.0% 2009

216. Marshall Islands 2,200 3.4% 2007

217. Kiribati 2,000 1.8% 2001

218. Netherlands Antilles 2,000 0.9% 1999

219. Timor Leste 1,800 0.2% 2009

220. Montserrat 1,200 23.5% 2009

221. Wallis and Futuna 1,200 7.8% 2009

222. Niue 1,000 62.6% 2010

223. Saint Helena 800 10.4% 2010

224. Tokelau 800 58.4% 2010

225. Norfolk Island 700 27.4% 2010

226. Christmas Island 464 33.1% 2010

227. Nauru 300 2.1% 2010

228. Vatican City 93 11.2% 2010
