A Collaborative Interface for Multimodal Ink and Audio Documents

A Collaborative Interface for Multimodal Ink and Audio Documents

2009 10th International Conference on Document Analysis and Recognition A Collaborative Interface for Multimodal Ink and Audio Documents Amit Regmi and Stephen M. Watt Department of Computer Science University of Western Ontario London, Ontario, Canada N6A 5B7 {aregmi,watt}@csd.uwo.ca Abstract • known data formats for cross-platform support • use widely available communications infrastructure With the increased availability of pen-based devices, it • can be used from a range of devices from PDAs to dig- becomes interesting to conduct and to archive multi-party ital whiteboards communication sessions that involve audio and digital ink • sessions can be recorded and later played back on a shared canvas. Collaborative whiteboards do exist today but typically use complex or closed protocols for • recorded sessions are in a form suitable for later pro- communication. As a rule, existing whiteboards are not cessing, document analysis and recognition. interoperable across multiple platforms and do not sup- port archival of collaborative sessions for later reference After considering various alternatives, arrived at a de- or analysis. We explore how various data formats may be sign using InkML [1, 2] to represent digital ink, MP3 used to represent, to transmit, to record and to synchronize to represent the audio data and the Skype backbone for ink and audio channels. We find InkML to be a suitable rep- communications. While each of these choices has advan- resentation to support platform-independent digital ink in a tages and disadvantages, we elected to use the most widely form supporting both transmission and higher-level seman- supported cross-platform options over single-platform pro- tic analysis. To test our ideas we have developed a com- prietary formats and protocols or multi-platform but less- plete software implementation as a Skype add-on. This has adopted choices. In particular, the wide use of Skype pro- revealed possible improvements to the page and streaming vides immediate access to a wide user base familiar with the models of InkML. operation of the audio software. The primary contributions of the paper are: 1 Introduction • validation of InkML as a suitable basis for multi-party, real-time collaboration, Computing and communications technologies are now • certain specific observations on how InkML should be inseparable. Communication networks rely in essential enhanced before its final form in order to better support ways on computers and personal computers are rarely used real-time, multi-party collaboration, without being networked. We now use computers for both • the design and public availability of a proof-of- voice and written communications. At the same time, com- concept, but already useful, package for ink and audio puting devices are more commonly providing support for collaboration, satisfying our criteria above and avail- digital pen input. We see this with Tablet PCs, PDAs and able as a Sykpe add-on, digital whiteboards. • a design for the archival of ink and audio conferences For certain forms of collaboration, a combination of that allows later meaningful analysis of the digital ink. voice and pen-input is ideal. Collaborative annotation of architectural drawings or working with mathematical for- The remainder of the paper is organized as follows: Sec- mulae are just two examples. We are interested in how to tion 2 places our work in context, presenting related work best support this. and the necessary background on ink formats and the Skype We desire a solution with the following characteristics: architecture. Section 3 presents our architecture and dis- • multi-party collaboration, as for a conference call cusses its more important aspects. Section 4 concludes with • support for audio and ink input, with a shared canvas a brief discussion and observations and suggests a few di- • no special equipment required rections for future work. 978-0-7695-3725-2/09 $25.00 © 2009 IEEE 901 DOI 10.1109/ICDAR.2009.205 SKYPE4COMLib.Skype objSkype; 2 Background SKYPE4COMLib.Call objCallOne, objCallTwo; objSkype = new SKYPE4COMLib.Skype(); 2.1 Related Work objSkype.Attach(7,true); Whiteboard sharing is one of the widely used applica- objCallOne = objSkype.PlaceCall( "test","","",""); tions in pen computing. In the earliest forms of white- while(objCallOne.Status != SKYPE4COMLib. boards, portability and efficiency of data exchange were not TCallStatus.clsInProgress) { } a prime concern. Multimodal systems with pen and voice objCallOne.Hold(); interfaces were in use in the 1990s and earlier. QuickSet objCallTwo = objSkype.PlaceCall( [3] was a multimodal framework used by the US Navy and "echo123", "", "", ""); US Marine Corps to set up training scenarios and to control while(objCallTwo.Status != SKYPE4COMLib. the virtual environment. Recent work [4, 5] at the HP Labs, TCallStatus.clsInProgress) { } describe a heterogeneous collaborative environment for dig- objCallTwo.Join(objCallOne.Id); ital ink exchange. There are several other applications that Figure 1. Conf. call using Skype4COM in C#. use digital ink and/or audio as input modes. Some of them include applications for classroom presentation and capture [6], meetings [7, 8] etc. over 300 million registered users [17]. Its application inter- A number of Skype add-ons have been developed for face is familiar to a large population and presents a number collaboration. Yugma (with cross-platform support) [9], of convenient features. Mikogo [10], TalkAndWrite [11], WhiteBoardMeeting [12] Text-based cross-platform API provided by Skype works are a few that are commercially marketed. These tools sup- based on Operating System messaging with Windows Mes- port collaboration through screen sharing, audio/video con- saging on Windows; Cocoa, Carbon or AppleScript in Mac ferencing and digital ink exchange. However, we are un- OSX and X11 or D-BUS messaging in Linux environment. aware of any prior work that archives the ink exchange for The Skype public API has been exported in various forms later reference and analysis. such as Skype4COM, Skype4Py and Skype4Java. Such pre- Multimodal interactions, including collaboration, are the built libraries make it easier to handle the native Skype focus of a W3C working group [13]. A recent thesis [14] API calls written in C++. Skype4COM was our choice provides a summary of issues and current trends in multi- for the Windows client implementation as it could be used modal collaboration by certain vendors. in any ActiveX environment, such as Visual Studio, Del- phi etc. Figure 1 illustrates an example C# code that uses 2.2 InkML Skype4COM. InkML [1] is the Ink-Markup Language that provides a data format for the representation of the digital ink gener- 3 System Architecture ated by a stylus or an electronic pen. Applications such as handwriting recognizers, paint programs etc. can process the digital ink with InkML as the common format. InkML Most of the available whiteboard applications used for also has a support for a wide range of hardware devices. collaboration are built on their own complex protocols for With the emergence of InkML as an open standard, pro- data exchange [18, 4]. Instead of re-inventing a new proto- posed by the W3C Multimodal Interaction Activity Group col, our system design takes the Skype peer-to-peer protocol [13], a wide variety of applications have been using it for as the data exchange back-end after a careful analysis on its data representation. It is a platform-independent standard advantages. The Skype client can run on different mobile that can be used by applications on different operating sys- devices and on various platforms. The availability of pub- tems. Hardware and software vendors had previously used lic API to interact with the Skype network makes it a good proprietary formats for ink representation. choice for the implementation. InkML is the medium for digital ink representation in 3.1 InkAndAudio Client for Windows our InkAndAudio multimodal chat application. Although Microsoft’s Tablet PC SDK had features to support dig- other digital ink formats such as UNIPEN [15], Jot [16] etc. ital ink for Windows. Since its first release in 2001, it has do exist, our choice was InkML based on its being an open been improved significantly. Windows Presentation Foun- and an up-to-date standard. dation (WPF), on the other hand, is a graphical subsystem in the Microsoft .NET framework 3.0. Digital ink collection 2.3 Skype and the Skype API and rendering, functions traditionally present only on the We have chosen to use Skype as a backbone for ink and Tablet PC platform have been incorporated into WPF as first audio collaboration at a distance. Skype is at present the class member of the framework. The InkCollector, InkPic- global leader in VoIP and has an increasing user-base with ture and InkOverlay controls found in the classic Tablet PC 902 Figure 3. Audio compression with LAME. # python > > > import objc > > > objc.loadBundle("SkypeAPI", globals(), bundle_path= objc.pathForFramework( "/Applications/Skype.app/Contents /Frameworks/Skype.framework")) NSBundle </Users/amit/Library/Frameworks /Skype.framework> (loaded) > > > print SkypeAPI <objective-c class SkypeAPI at 0xd0004140> > > > SkypeAPI.isSkypeRunning() 1 Figure 2. The InkAndAudio Chat application. Figure 4. Using PyObjC bridge Mac OS X. platform API are put together in the <InkCanvas> element InkCanvas functionality that our Windows client has. with improved functionalities. A suitable application development environment for Figure 2 shows how two

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    5 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us