
AN IMPROVED MODULAR FRAMEWORK FOR DEVELOPING MULTI-SURFACE INTERACTION

A Paper Submitted to the Graduate Faculty of the North Dakota State University of Agriculture and Applied Science

By

Jed Patrick Limke

In Partial Fulfillment of the Requirements for the Degree of MASTER OF SCIENCE

Major Program: Software Engineering

December 2015

Fargo, North Dakota

North Dakota State University

Graduate School

Title AN IMPROVED MODULAR FRAMEWORK FOR DEVELOPING MULTI-SURFACE INTERACTION

By Jed Patrick Limke

The Supervisory Committee certifies that this disquisition complies with North Dakota

State University’s regulations and meets the accepted standards for the degree of

MASTER OF SCIENCE

SUPERVISORY COMMITTEE:

Dr. Jun Kong

Chair

Dr. Juan Li

Dr. Jin Li

Approved:

12/21/2015
Date

Dr. Brian Slator
Department Chair

ABSTRACT

Advances in multi-touch capabilities have led to their use in a vast array of devices, but usable interactions that span across devices are less frequently encountered. To address this concern, a framework was created for the development of interaction spaces which gave developers the tools to write applications which united tabletop and handheld devices, allowing each device to utilize its inherent interaction style yet still communicate effectively. However, while this framework provided proof that such interactions were possible, it failed to prove itself as easily reusable by subsequent developers. To address this concern, we have created an improved framework that both fulfills the goals of the original framework and confronts its shortcomings.

Our improved framework features a new intra-component communication system, mobile device independence, configurable user interfaces, and automatic exposure of available interactions. All of these features coalesce to fulfill the goals of the original framework while improving its reusability.

TABLE OF CONTENTS

ABSTRACT ...... iii

LIST OF FIGURES ...... v

1. INTRODUCTION ...... 1

2. PROBLEM STATEMENT ...... 6

3. APPROACH OVERVIEW...... 11

4. BUILDING THE COMMUNICATION LAYER ...... 14

5. SOLVING THE MOBILE ECOSYSTEM PROBLEM ...... 22

6. CONSTRUCTING MOBILE INTERFACES REMOTELY ...... 26

7. GENERATING TOPOLOGIC REPRESENTATIONS OF THE TABLETOP’S INTERACTION POINTS ...... 33

8. CONCLUSION AND FUTURE WORK ...... 37

REFERENCES ...... 39

LIST OF FIGURES

Figure Page

1. Improved framework overall structure ...... 11

2. High-level message bus example ...... 14

3. Abstract MessageBase class ...... 15

4. Message bus service interface, IMessageBusService ...... 16

5. Message bus client interface, IMessageBusClient ...... 17

6. MessageReceivedEventArgs ...... 17

7. Service implementation ...... 18

8. Client implementation ...... 19

9. MessageBase and its descendants ...... 21

10. Mobile interaction sequence ...... 23

11. Experience class ...... 26

12. Passive classes ...... 27

13. Interaction classes ...... 28

14. Screenshot of sample user interface ...... 30

15. Experience object and corresponding mobile interface ...... 31

16. The ResponseCollection class ...... 32

17. Screenshot of sample auto generated topological interface ...... 34

18. A hand-held token ...... 35

19. The virtual rectangle representing the capture area ...... 35

20. Topologic generation when Bravo, Foxtrot, Hotel, and India are captured ...... 36

1. INTRODUCTION

Since the advent of telecommunications, physical proximity has become less and less necessary to satisfy the needs of interaction; however, in-person, face-to-face interaction is still important [Wu03], particularly in collaborative endeavors. While handheld devices provide the ability for users to interact virtually, co-located users also have the ability to interact together with tabletop computers. Given their large screens, tabletops are especially suited to such collaborative work and play. However, when collaborating in a shared space, focus becomes a concern, especially when working together on a single large screen [Sch12]. This poses many challenges for multi-user simultaneous interaction.

For example, on tabletop computers, only a single virtual keyboard can be utilized at one time given the personal nature of computers, even in collaborative environments, which limits the ability for multiple users to simultaneously interact directly with a single system. Furthermore, a shared tabletop provides a large enough screen for multiple users, but there is still only one speaker, allowing for, at most, a single audio source to play for the collaborators. While this may generally be preferred, it's unfortunate that such a limitation exists. In addition, there may be instances where users wish or are required to interact with the collaborative space privately, and in an environment with multiple users and one shared collaborative tabletop, privacy is challenging if not impossible.

Though the use of a large tabletop screen makes information accessible to multiple users, there are also physical considerations to be contemplated aside from the aforementioned challenges, such as physical reachable distance and content orientation [Shen06], given users will have disparate physical orientations while seated around a tabletop. These differing orientations preclude the ability for multiple users to view the tabletop's contents at an optimal orientation (e.g. the orientation of text on the surface that is suitable for users seated on one side of the tabletop cannot be simultaneously suitable for users seated on the opposite side). Even if the orientation issues are somehow overcome, the minimum size necessary for a tabletop screen to be large enough for collaboration may preclude a smaller user from being able to reach different points upon the tabletop without moving.

Previous studies have developed methods and interaction strategies to overcome these challenges. For example, Schmidt et al. [Sch12] proposed a cross-device interaction style for mobile devices and tabletops which uses the mobile device itself to provide tangible input on the tabletop in a stylus-like fashion. Manipulating the mobile device through different predefined gestures and movements allowed the user to interact with the tabletop to perform different tasks, all based upon data being collected simultaneously from not only the tabletop's internal cameras, but also the physical orientation sensors on the mobile device (e.g. accelerometers, gyroscopes).

A drawback of this interaction style is that selection of objects on the tabletop screen sometimes erred due to collisions of simultaneous touch events between mobile devices [Sch12]. Our approach associates each device with a unique token or ByteTag to ensure the correct association between a user's mobile device and their interactions with the tabletop.

Another approach which was developed in order to overcome these challenges was a cross-device framework (called MobiSurf) between the tabletop, a Pixelsense computer, and Android smartphones. This framework, from Roudaki et al. [Rou14], was designed to facilitate the operations of co-located users who share a common goal but have unique needs in fulfilling their individual roles in group collaboration. However, this approach suffers in several areas.

One of these problems was that the system was hardcoded in advance for a specific quantity of simultaneous users, and those users were forced to use specific Microsoft ByteTags (hand-held/device-affixed tokens which represented the user's interactions when placed against the tabletop surface). This unfortunate situation meant that whenever additional users were desired, or different ByteTags were employed, the Pixelsense application had to have its source code manually edited, rebuilt, and deployed to the tabletop. Another problem with the existing framework was that interactions required an app, designed specifically and only for Android smartphones, to be installed on the users' devices; this app also contained more hardcoded settings, again requiring alteration and redeployment of the applications to the devices whenever changes were desired. In addition, another problem surfaced in the difficulty software developers had in managing and generating interfaces for the smartphones to display from the Pixelsense application; no reusable framework was in place to make this task straightforward or repeatable — each device-tabletop-interaction pairing required a device-specific user interface to be manually coded and deployed to the device. This is not only cumbersome in the short term, but unsustainable for long-term software development.

Given the challenges brought forth by the implementation of Roudaki et al. [Rou14], we implemented our improved framework with several features to mitigate said challenges. One of these mitigations is to provide interaction with the tabletop via a web interface that is not device-specific, and therefore can be used on any device that has a network connection to the tabletop application, without the need to build, test, and deploy native applications for each device we wish to support. The tabletop for which both frameworks have been developed is the Microsoft Pixelsense, which is powered by Microsoft .NET and, specifically, the language C#. The original prototype system utilized an Android application which had to be developed using a different language, namely Java. The surface similarities of the languages may impede rapid development of advanced features as the quirks of each language become more prominent during development. Furthermore, given the nature of device-specific applications, these applications must be specifically deployed to said devices; a mitigable problem.

While interacting with both the tabletop and the personal device, messages can be sent between the devices, triggering UI and data updates as appropriate. In the original prototype, these UI and data updates had to be meticulously and manually developed on a case-by-case basis. In other words, the personal device UIs were hardcoded for specific hardcoded messages as sent by the tabletop application. This plagued the framework with inflexibility while also making new UIs and interactions very expensive to produce and maintain.

Furthermore, we have taken advantage of the Windows Communication Foundation (WCF) framework to facilitate simultaneous multi-user interaction, and have developed a library of resources to be used when developing user interfaces for connected personal devices. The original prototype utilized a hand-crafted codebase riddled with hardcoded port and protocol assignments, making refactoring extremely difficult to execute. In addition, simultaneous user interaction suffered from showstopper bugs resulting from the framework's hand-crafted nature.

This should not be read as a critique of the prototype developer's ability, however — building cross-device communication is exceedingly difficult to do by hand and requires extensive testing, whereas leveraging existing features of .NET takes some of the burden off the developer so other problems can be explored with more care.

The remainder of the paper is organized as follows: Chapter 2 discusses the background of the problem. Chapter 3 overviews our approach. Chapter 4 covers the communication framework for the components of the system. Chapter 5 discusses the problem presented by mobile applications across application ecosystems. Chapter 6 covers automatic generation of interfaces on a mobile device. Chapter 7 discusses topological generation of mobile interfaces based upon the tabletop interface and is followed by our conclusion and recommendations for future work in Chapter 8.

2. PROBLEM STATEMENT

The existing framework for developing multi-surface environments showed that it is possible for multiple users to privately interact with a single, shared, tabletop environment. The approach utilized, however, was fraught with issues which precluded the ability for developers to reuse the framework to build new applications without significant 'heavy lifting'. These problems include:

• Internal and external communication. In order to facilitate the communication of distinct components (Android smartphones and the Pixelsense application) of the overall system with one another, the communication code was hand-developed from scratch. This code, while a true feat of engineering, relied upon a plethora of hard-coded ports, addresses, and identifiers in order to operate, which were tangled throughout the entire system. Changes to a port given unforeseen circumstances, such as a competing but business-necessary application on the same network as the system, would require meticulous examination and recompilation of the source code in order to rectify, not only for the Pixelsense application, but also for the Android application. Furthermore, for the Android application, such changes would require, at a minimum, redeployment and installation of the application on each and every Android device that will be used with the system. In the worst case, Android applications would have to be updated via the Google Play store, which could take several hours [Google15] to become available to end-users.

• Android OS specificity. An Android application was developed to facilitate multi-surface interaction with the Pixelsense tabletop computer. While Android smartphone marketshare is at a staggering 79 percent of all active smartphones in the world [Olenick15], the decision to develop an Android-only approach still leaves out roughly one-fifth of all potential users of the system. Furthermore, within the United States, the market share of Android is considerably lower, at roughly 47 percent of users, which is effectively equal to the user base of iOS [Leswing15]. In addition, fragmentation, while being a problem for every major smartphone operating system, is especially problematic for an Android-centric approach, given that, at the time of this writing, there are statistically significant percentages of Android OS devices in use across not only three major versions (2, 4, and 5) but 8 minor versions [Swanner15]. Developing and testing for this many disparate operating systems, even while under the shared umbrella provided by the Android kernel, can be a prohibitively difficult task to perform. In addition to all of the aforementioned concerns, it should also be noted that Pixelsense development requires C# language development on the Microsoft .NET framework, while Android development requires Java. This multi-lingual necessity may prove to be problematic for some developers. Interactions between the smartphone application and the tabletop application, while always benefitting from testing no matter which ecosystems are involved, suffer even more so given the time-consuming nature of deploying changes to each physically separate test device.

• Generation of smartphone interfaces. Smartphone interfaces for each interaction with the Pixelsense application had to be hardcoded into the Android application. In other words, whenever a new interaction between the smartphone and the tabletop was desired, the triggering message had to be built into the Pixelsense application and the Android application had to be refactored to accommodate this new message along with whatever interface was desired. This adversely affected the ability of the software developer to respond to changing requirements with timeliness and agility. In addition to the difficulty in facilitating the ability of the smartphone to show the correct interface for a tabletop interaction, it was difficult to actually create the interface for the smartphone at all. Each smartphone interface had to be individually and manually designed and integrated into the application.

• Interacting with the tabletop via the smartphone. In order to interact with the tabletop in the existing framework, a Microsoft ByteTag is affixed to the rear of the smartphone. This ByteTag is a Pixelsense-readable black-and-white representation of a uniquely-identifiable number that the Pixelsense application can associate with the smartphone user. A screenshot of the tabletop is sent to the smartphone and displayed fullscreen, aligned so as to make the smartphone screen appear to be transparent, or see-through to the tabletop surface. Then, the smartphone must be moved about the tabletop surface until controls on the Pixelsense app are seen through the smartphone's screen. These controls can then be clicked 'through' the smartphone screen, which triggers actions in the tabletop application, possibly triggering further communication with the smartphone (e.g. opening a form, vibrating, etc.). The problem with this interaction is that it requires the Pixelsense application's interface to be designed in such a way that it is usable, or at least partially usable, when interacted with exclusively via a physically small portion of the overall interface. In other words, the developer has to design tabletop controls to be clickable on a smartphone's screen due to this 'user experience mismatch'.

In addition to the aforementioned issues, the nature of the original framework was plagued by the hallmarks of throwaway prototyping. Given its intent as a proof-of-concept, and as an investigation of what was possible as the problem space of the original research was explored, this is not surprising. Fortunately, the original framework succeeded in showing what could be done in the space and generating momentum for further development. Unfortunately, the hardcoded nature of the original framework made it supremely inflexible for new development. For example, in order to add a new interaction to the Pixelsense application, existing interaction code had to be examined from the user interface layer all the way through to the communication layer in order to determine how to implement the new interaction. Once this had been examined, the new interaction had to essentially be copied-and-pasted with variable and function name changes in order to facilitate new functionality. Notwithstanding the same types of changes that were also required in the Android application, this was a tedious, error-prone process.

It cannot be overstated that, given the desire to make the tabletop application as accessible as possible, developing and maintaining multiple device-specific apps across several application ecosystems is a time-wasting task at this stage. In the future it is possible that the framework could be called upon to provide native-app-level graphics acceleration or processing, but at this time, accessibility of the users to the tabletop without specific app development, testing, and deployment is an absolute necessity. Developing and testing for many disparate operating systems can be a prohibitively difficult task to perform.

It should also be mentioned that, in order to facilitate the communication of distinct components (Android smartphones and the Pixelsense application) of the overall system with one another, the communication code was hand-developed from scratch and filled with hardcoded ports, protocols, and directives which were strewn throughout the entire system. The overall structure of the framework did very little to facilitate extension and reuse. The only way to create any interactions that cross from the tabletop to the personal devices was to carefully examine existing interactions in the prototype's sample application and perfectly mimic the sample's implementation. While it certainly can be stated that software development thrives from the multitude of examples available which illustrate implementations and concepts in any given language or ecosystem, the prototype framework didn't provide such examples per se; it provided inflexible end-to-end examples which tromped across the entire framework with specific and necessary adjustments at each step of the way, in order to accomplish preconceived objectives. These didn't teach a developer how to implement the framework on one's own — these provided arcane blueprints which had to be followed to the letter lest one be plagued by seemingly nonsensical errors and shortcomings if even one, seemingly unimportant, step was missed during implementation.

Redevelopment of the framework from the ground-up, using the original as reference for specific techniques when possible, was a logical path forward.

3. APPROACH OVERVIEW

Our approach to developing the next generation of the framework was redevelopment from the ground-up, using the original as reference for specific techniques when possible.

The framework's existing communication layer was hand-developed from scratch. We concluded we could utilize pre-existing .NET libraries to develop a clean, untangled communication layer for both inter-component and intra-component communication. For this, we developed a message bus and related contracts to define the operations and interactions necessary to utilize it. By constructing this message bus and defining client objects which could interact through it, we were able to establish simple communication across the framework and also expose them for use by users of the framework in their own applications (see Figure 1).

Figure 1. Improved framework overall structure

After creating the communication components, we had to move to one of the next major problems we identified in the existing framework: its dependence upon user smartphones being Android devices, since the only smartphone interface into the framework was written as an Android-specific app. To solve this issue, we made the decision to take a webpage-based approach rather than an app-specific approach, allowing users to interact via their web browsers. To this end, a web application server was developed to produce a web-based interface through which users can interact with tabletop applications built using the framework.

Once the web application server and rudimentary web-based interface were developed, we needed to allow the tabletop application developer, via this framework, to specify what will be shown to and gathered from a mobile user at any given time. To this end, we developed 'experiences' to encapsulate this information as it is sent back and forth between the tabletop application and the mobile device and give the application developer customizable but limited control over the mobile interfaces, allowing them to be effectively auto-generated for the web browser.

This approach involved developing objects which could be sent along the message bus from the tabletop to the mobile devices which contained barebones data on what interactions were required, letting the web application server worry about generating the interfaces for the mobile devices. Furthermore, the experiences deployed to the mobile devices can also, optionally, have responses which can be sent from the mobile devices back to the tabletop application, containing requested data or events causing further interaction.

The last major hurdle was solving the problem of user experience mismatch wherein developers of the tabletop application had to construct interfaces that could be utilized on both the tabletop and the mobile device while simultaneously having the same physical form factor and styling. In order to solve this problem, we decided to steer clear of the need for the tabletop developer to worry about the smartphone's interface at all by developing a system to automatically generate a topological representation of the area of interest around the user's physical interaction point with the tabletop surface. This meant that only the names and types of the controls truly matter, not their exact locations on the tabletop application. Our framework's control library was developed to expose developer-configurable properties which allow the developer to define an area-of-interest around a user's physical tabletop interaction point that exposes manipulatable controls to the user's mobile device, allowing the user to interact with the aforementioned controls via a simple list interface that ignores the tabletop application controls' unnecessary physical and stylistic details, focusing only upon relevant information and available interactions.

4. BUILDING THE COMMUNICATION LAYER

The existing communication layer in the framework, which was constructed to facilitate the communication of distinct components (Android smartphones and the Pixelsense application) of the overall system with one another, was hand-developed from scratch. This part of the framework was built from hard-coded ports, identifiers, and addresses which were tangled throughout the entire system. Such hand-spun development, while oftentimes necessary, was not in this case — at least not to the depth that the original framework had done. We determined that we could leave the physical concerns of ports and addressing to parts of the .NET framework that already existed for these purposes and focus on implementing a clean, untangled communication layer for both inter-component and intra-component communication.

Figure 2. High-level message bus example

To this end, we decided to approach our communication needs through the implementation of a message bus. A message bus (see Figure 2) is an architecture which is able to unite disparate software components through a shared set of interfaces, yet keep the components decoupled enough so that clients of the bus can be easily added and removed without adversely affecting those which remain. In order to develop our message bus, we needed to define contracts specifying the base behavior of not only the clients of the bus, but the bus mechanism itself. But first, the messages we transport needed to be defined.

Figure 3. Abstract MessageBase class

Messages were determined to need an ID, the time they were sent, and a sender. In addition, they needed recipients, but we also considered that messages would sometimes need to be broadcast to multiple recipients simultaneously, so specifying a recipient was made optional by making recipientId nullable (see Figure 3). For example, while many messages would be sent between two specific clients, messages announcing that client "X" is about to disconnect would be useful for any other client who has a relationship with client "X"; so instead of sending individual messages to inform each previously contacted client, messages can be broadcast to clients which have already expressed an interest in client "X" by subscribing to it. We also developed the MessageBase class as an abstract class to prevent it from being instantiated on its own — the payload of a particular message needed to be defined on a case-by-case basis, and cluttering the base class with anything more specific than sending and addressing information would be unnecessary given our ability to utilize inheritance.
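A minimal sketch of such a base class follows. MessageId, SenderId, and recipientId are referenced elsewhere in this paper; the Guid-based identifier types and the SentAt name are assumptions made for illustration.

using System;

// Sketch of the abstract MessageBase class described above. RecipientId is nullable
// so that broadcast messages need not name a specific recipient.
public abstract class MessageBase
{
    public Guid MessageId { get; set; }      // unique identifier of this message
    public DateTime SentAt { get; set; }     // time the message was sent
    public Guid SenderId { get; set; }       // client that sent the message
    public Guid? RecipientId { get; set; }   // null when the message is a broadcast

    protected MessageBase()
    {
        MessageId = Guid.NewGuid();
        SentAt = DateTime.UtcNow;
    }
}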

Figure 4. Message bus service interface, IMessageBusService

The second contract (see Figure 4) we developed was the contract to be utilized by the message bus service itself, IMessageBusService. Clients of the message bus need to be able to register, unregister, send messages, and check their own messages. In addition, clients should also be able to subscribe to specific senders, and retrieve their subscriptions or the list of subscribers to another sender. Registering a recipient is for announcing to the service that the client is instantiated, set up, and ready to send and receive messages. Unregistering means that any cached messages for that client should be discarded and the client should be removed from any service-side lists of available clients. SendMessage allows a client to send a message via the bus, while CheckClientMessages is used to retrieve any messages sent to the client that have not yet been delivered. The subscription and subscriber methods are for managing subscriptions to clients, wherein, for example, clients "A" and "B" can subscribe to broadcast messages from client "C" — this allows "A" and "B" to receive messages sent by "C" when neither "A" nor "B" has been explicitly specified as a recipient of the messages. As has been explained previously, this is a necessary ability for a service to have.
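A minimal sketch of this service contract, assuming WCF-style operation contracts and Guid-based client identifiers, could look like the following. SendMessage, CheckClientMessages, and UnregisterMessageRecipient are named in the text; the remaining member names are assumptions.

using System;
using System.Collections.Generic;
using System.ServiceModel;

// Sketch of the IMessageBusService contract described above.
[ServiceContract]
public interface IMessageBusService
{
    [OperationContract]
    void RegisterMessageRecipient(Guid clientId);             // announce a ready client

    [OperationContract]
    void UnregisterMessageRecipient(Guid clientId);           // discard cached messages, remove client

    [OperationContract]
    void SendMessage(MessageBase message);                     // send a message via the bus

    [OperationContract]
    IList<MessageBase> CheckClientMessages(Guid clientId);     // undelivered messages for a client

    [OperationContract]
    void Subscribe(Guid subscriberId, Guid senderId);          // receive another client's broadcasts

    [OperationContract]
    IList<Guid> GetSubscriptions(Guid subscriberId);           // senders this client follows

    [OperationContract]
    IList<Guid> GetSubscribers(Guid senderId);                 // clients following this sender
}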

Figure 5. Message bus client interface, IMessageBusClient

The final important contract which needed to be defined was the client contract, IMessageBusClient. In this contract (see Figure 5) we defined the operations and fields necessary to adequately communicate with the service by sending and receiving messages. To this end, a client needed an ID, the ability to send a message, and the ability to subscribe to and unsubscribe from another client. In addition, the client, as we are using polling for our message bus, needed to be able to have its polling turned on and off. Furthermore, we needed users of the client to be able to react to messages received by the client, and hence a MessageReceived event (see Figure 6) was also added to the client, wherein its event arguments would contain the message which was received.

Figure 6. MessageReceivedEventArgs
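A sketch of the client contract and its event arguments, following the description above, might be as shown below; member names other than SendMessage, StartMessagePolling, and MessageReceived are assumptions.

using System;

// Sketch of MessageReceivedEventArgs and the IMessageBusClient contract described above.
public class MessageReceivedEventArgs : EventArgs
{
    public MessageBase Message { get; set; }   // the message that was received
}

public interface IMessageBusClient
{
    Guid ClientId { get; }                                         // identifies this client on the bus
    event EventHandler<MessageReceivedEventArgs> MessageReceived;  // raised for each received message

    void SendMessage(MessageBase message);
    void Subscribe(Guid senderId);        // follow another client's broadcasts
    void Unsubscribe(Guid senderId);
    void StartMessagePolling();           // polling can be turned on and off
    void StopMessagePolling();
}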

Once the service and client contracts were created, concrete implementations were able to be developed. The service implementation, Service (see Figure 7), implemented the IMessageBusService (see Figure 4). In addition to concrete implementations of the methods described by the contract, a private list of clients, a dictionary of client subscriptions, and a dictionary of client messages were needed to facilitate the implementations of the necessary methods. Dictionaries are collections of key and value pairs, and as such, were ideal for our implementation. For example, _clientMessages is implemented as a Dictionary<Guid, SortedList<DateTime, MessageBase>>. In our use, the key to the dictionary is a client ID while the value is a SortedList of messages which are sorted based upon the time at which the message was received.

Figure 7. Service implementation
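The following sketch illustrates how such a store can back CheckClientMessages. Only this one method of the service is shown, and the locking and helper structure are assumptions rather than the framework's actual implementation.

using System;
using System.Collections.Generic;
using System.Linq;

// Sketch of the Service's per-client message store described above: each client ID maps
// to a SortedList keyed by the time the message was received, so messages come back in
// chronological order.
public class Service
{
    private readonly Dictionary<Guid, SortedList<DateTime, MessageBase>> _clientMessages =
        new Dictionary<Guid, SortedList<DateTime, MessageBase>>();

    public IList<MessageBase> CheckClientMessages(Guid clientId)
    {
        lock (_clientMessages)
        {
            SortedList<DateTime, MessageBase> pending;
            if (!_clientMessages.TryGetValue(clientId, out pending) || pending.Count == 0)
            {
                return new List<MessageBase>();
            }

            List<MessageBase> delivered = pending.Values.ToList();   // oldest first
            pending.Clear();                                          // each message is delivered once
            return delivered;
        }
    }
}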

Figure 8. Client implementation

The concrete implementation of the IMessageBusClient (see Figure 5), Client, was next developed (see Figure 8). In addition to concrete implementations of the methods and properties described by the contract, a background worker was used to facilitate the implementations of the necessary methods. We determined a background worker was necessary due to our desire to implement the clients as non-blocking. That is, we utilize background workers to ensure that Client instances run on separate threads from the threads from which they are invoked. For example, without multi-threading being explicitly implemented in the Client, invocation of the StartMessagePolling method from the UI thread of the tabletop application, such as in the OnLoad event handler of the tabletop application, would block the UI thread from continuing execution, preventing the application from being usable.

While developing our implementation, we realized that, during application termination, managed Client instances were not gracefully detaching from the Message Bus server. In order to prevent this issue from occurring, we chose to also have the IMessageBusClient interface implement the IDisposable interface, and thereby had to also implement this in our concrete implementation by providing cleanup via the Dispose method to stop polling the Message Bus service and call the UnregisterMessageRecipient method on the service. This not only prevented errors during termination, but also upon routine Client instance destruction. In order to take advantage of this, when removing a Client instance, users of Client objects must call the Dispose method. The advantage of this implementation versus having Client simply implement a destructor is that the .NET runtime's garbage collector schedule is out of the user's hands, and as such, objects may not be cleaned up immediately when they fall out of use. Use of IDisposable allows us to release the objects' resources immediately, freeing up threads and server connections as quickly as possible.
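A condensed sketch of such a client, combining the background-worker polling and the IDisposable cleanup described above, is shown below; the service proxy, the polling interval, and any member names not given in the text are assumptions.

using System;
using System.ComponentModel;
using System.Threading;

// Sketch of the polling Client described above: a BackgroundWorker polls the bus off
// the UI thread, and Dispose stops polling and unregisters the client from the service.
public class Client : IDisposable
{
    private readonly IMessageBusService _service;   // proxy/channel to the message bus service
    private readonly BackgroundWorker _worker;

    public Guid ClientId { get; private set; }
    public event EventHandler<MessageReceivedEventArgs> MessageReceived;

    public Client(IMessageBusService service)
    {
        _service = service;
        ClientId = Guid.NewGuid();
        _service.RegisterMessageRecipient(ClientId);

        _worker = new BackgroundWorker { WorkerSupportsCancellation = true };
        _worker.DoWork += Poll;
    }

    public void StartMessagePolling() { _worker.RunWorkerAsync(); }
    public void StopMessagePolling() { _worker.CancelAsync(); }

    public void SendMessage(MessageBase message)
    {
        message.SenderId = ClientId;
        _service.SendMessage(message);
    }

    // Runs on a worker thread so polling never blocks the thread that created the Client.
    private void Poll(object sender, DoWorkEventArgs e)
    {
        while (!_worker.CancellationPending)
        {
            foreach (MessageBase message in _service.CheckClientMessages(ClientId))
            {
                EventHandler<MessageReceivedEventArgs> handler = MessageReceived;
                if (handler != null)
                {
                    handler(this, new MessageReceivedEventArgs { Message = message });
                }
            }
            Thread.Sleep(100);   // polling interval; the actual value is an assumption
        }
    }

    // Stops polling and detaches from the bus so the service can discard cached messages.
    public void Dispose()
    {
        StopMessagePolling();
        _service.UnregisterMessageRecipient(ClientId);
    }
}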

Throughout development of the framework, the diverse needs of the disparate components which utilize client objects came to light. Given the extendable design of the original, abstract MessageBase class, several unforeseen message types were created in order to facilitate intra-component operations (see Figure 9). Each of the illustrated messages has a specific, narrow purpose to fill within the overall framework, but they work together in conjunction to support overall framework operation. The KeepAlive message is sent from the web server to the bus as an extension of keep-alive messages sent from the web application to the web server in order to maintain the network connection. The TextMessage and ImageMessage types are sent from the tabletop application and/or the Control Library to transmit strings and images via the bus to the web clients. The ImageMessage64 is a specialized case: in order to deliver messages to the web application which contain image data without resorting to streaming or multiple simultaneous connections, the byte data from an ImageMessage is converted into a base64 string which can be sent to the web application and decoded for direct display on the mobile clients. Once these images are displayed, sometimes an ImageFocusMessage is utilized to direct the web client to focus upon a specific area of the aforementioned image without sending a new image as a replacement, thereby saving processing power. Another set of messages that should be looked at together are the Experiences and the ResponseCollection. Their purposes and contributions to the framework are discussed at length in Chapter 6.
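As a small illustration of the ImageMessage-to-ImageMessage64 conversion described above, the byte payload is simply re-encoded as base64 text; the property names shown here are assumptions.

using System;

// Sketch of the base64 conversion behind ImageMessage64.
public class ImageMessage : MessageBase
{
    public byte[] ImageBytes { get; set; }      // raw image/screenshot data
}

public class ImageMessage64 : MessageBase
{
    public string ImageBase64 { get; set; }     // text-safe payload for the web application

    public static ImageMessage64 FromImageMessage(ImageMessage source)
    {
        return new ImageMessage64
        {
            SenderId = source.SenderId,
            RecipientId = source.RecipientId,
            ImageBase64 = Convert.ToBase64String(source.ImageBytes)
        };
    }
}

On the web page, the resulting string can then be displayed directly, for example as a data URI assigned to an image element, without any additional connection to the server.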

Figure 9. MessageBase and its descendants

By developing the framework with this message bus and defining client objects which could interact through it, we are able to facilitate communication throughout the framework and also expose clients for use by users of the framework in their own applications, through both direct use and extension.

5. SOLVING THE MOBILE ECOSYSTEM PROBLEM

As mentioned previously, one of the first major problems we identified in the existing framework was its dependence upon user smartphones being Android devices, since the only smartphone interface into the framework was written as an Android-specific app. In order to provide the ability for as many users as possible to use applications developed using our framework, we made the decision to take a webpage-based approach rather than an Android-app-specific approach in our improved framework. Given that Android smartphones have modern mobile browsers, and that modern mobile browsers are available on other types of devices, such as those in the iOS and Windows ecosystems, it is clear that we can cast a wider net for potential users by targeting browsers rather than Android specifically.

To this end, a web application was developed which utilizes ASP.NET (C#), HTML, CSS, and JavaScript to produce a web-based interface through which the end user can interact with the tabletop application, using our framework. Let us illustrate with an example of interaction where Sue is using the system (see Figure 10).

When Sue approaches the system, she picks up a token which has both a QR-code and URL on one side and a Microsoft ByteTag on the other. By scanning the QR-code with her smartphone's QR-code reader or typing the URL into her smartphone's browser, she is taken to a page on our web server specifically associated with the ByteTag on the reverse of the token. Before this page is constructed and returned to Sue's web browser, a new communication session which associates her smartphone with that ByteTag is created within the web server. After this session has been created, the web server also creates a new AsyncClient instance with a unique GUID for communicating with the tabletop application via the Message Bus. Furthermore, the Client instance registers with the MessageBus and subscribes to all broadcast messages sent by the tabletop application on behalf of her associated ByteTag. Once the AsyncClient is registered, it begins to poll the Message Bus for new messages, and if any are received, they are stored in a list on the AsyncClient instance. After all of these steps are complete, the web page is returned to Sue's browser, allowing rendering to take place and the next step of the process to begin.

Figure 10. Mobile interaction sequence
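A rough sketch of such an AsyncClient, built on the bus Client sketched in Chapter 4, is given below; the member names shown here, including TakePendingMessages, are assumptions.

using System;
using System.Collections.Generic;

// Sketch of the per-session AsyncClient described above: it wraps a bus Client, stores
// every message that arrives for this user's ByteTag in a list, and hands the stored
// messages to the web page when it polls PullMessagesForTag.
public class AsyncClient
{
    private readonly List<MessageBase> _pending = new List<MessageBase>();
    private readonly Client _busClient;

    public AsyncClient(Client busClient)
    {
        _busClient = busClient;
        _busClient.MessageReceived += delegate(object sender, MessageReceivedEventArgs e)
        {
            lock (_pending) { _pending.Add(e.Message); }   // held until the page polls
        };
        _busClient.StartMessagePolling();
    }

    // Called by the web server's PullMessagesForTag endpoint roughly every 100 ms;
    // returns the stored messages in the order received and clears the list.
    public List<MessageBase> TakePendingMessages()
    {
        lock (_pending)
        {
            List<MessageBase> drained = new List<MessageBase>(_pending);
            _pending.Clear();
            return drained;
        }
    }
}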

Once the HTML web page has been returned to Sue's browser, the HTML instructs the browser to asynchronously load external Cascading Style Sheets (CSS) for styling the page, as well as the jQuery JavaScript library for manipulating the page. The page already contains our JavaScript, which is waiting for jQuery to be loaded at the end of the page rendering process.

Once the page has rendered, CSS has been applied to style the HTML, and jQuery has been loaded, the jQuery 'loaded' event fires, which signals our script to begin executing. Our script immediately begins to poll our web server's PullMessagesForTag method every 100 milliseconds, which returns all messages for our Client from the AsyncClient message list in chronological order. These messages are parsed by the JavaScript, one-by-one in chronological order from oldest to newest. Multiple messages of the same type are generally discarded in favor of the most recent message, and once these messages have been gathered, the UI is updated appropriately to reflect the directives contained in the messages.

From this point, if interaction with the smartphone is appropriate, Sue may do so. Whether she does or not, the JavaScript continues to poll PullMessagesForTag every 100 milliseconds asynchronously. In the case that she does nothing, the UI continues to be updated appropriately by the received and parsed messages. If she does interact, the JavaScript detects this interaction, keeping the relevant UI elements in place. For example, if Sue wishes to fill out a form which has been generated by the UI on her smartphone, the messages her phone receives will not usurp her ability to fill out the form. She will receive subtle notification of her option to change what she is doing, but may fill out the form if she so chooses. Once she has filled out the form in our example and submits it, the data is posted to the web server along with the MessageId of the message which generated the form in the first place as well as the original SenderId so that it may be sent as a response to the original sender. At this point, the form closes on Sue's smartphone's screen and the message, received by the SendMessage method on the web server, is sent along as a response to the sender identified by the SenderId, via the AsyncClient and the Message Bus, to the Application Server to be processed.

6. CONSTRUCTING MOBILE INTERFACES REMOTELY

As referenced in the previous section, foregoing the development of mobile ecosystem-specific applications meant we needed to interact with mobile devices via a web interface. In order to facilitate this and allow the tabletop application developer, via this framework, to specify what will be shown to and gathered from a mobile user at any given time, we developed 'experiences' to encapsulate this information as it is sent back and forth between the tabletop application and the mobile device.

6.1. Experiences and Components

We created the Experience class (see Figure 11) as a child of MessageBase, as experiences need to be sent just like messages. In this class is a single data member: a sorted list of Components, these being interface components for the mobile user experience. When Components are added to this list, they are given a numeric order so the web server knows in which order they should be displayed from top to bottom. Furthermore, Components have ID values which must be accounted for by the tabletop application. These ID values will allow the tabletop application to appropriately handle mobile user input as it is sent back to the tabletop application via user interaction.

Figure 11. Experience class

Components are broken down into two groups: Passives and Interactions. Passive components (see Figure 12) are for non-interactive items that still need to be delivered to the user. Examples of these are blocks of text, images, and vibrations utilizing the vibrate feature of compatible phones.
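A minimal sketch of these types is shown below; the member names Components, ComponentId, Text, and Label are assumptions, while TextFieldInteraction is named later in this chapter.

using System;
using System.Collections.Generic;

// Sketch of the Experience message and two example component types described above.
public abstract class Component
{
    public Guid ComponentId { get; set; }    // ID the tabletop application must account for
}

public class TextPassive : Component
{
    public string Text { get; set; }         // block of text to display; no interaction
}

public class TextFieldInteraction : Component
{
    public string Label { get; set; }        // prompt shown beside the generated text input
}

public class Experience : MessageBase
{
    // Key = display order (top to bottom on the mobile page), value = the component itself.
    public SortedList<int, Component> Components { get; set; }

    public Experience()
    {
        Components = new SortedList<int, Component>();
    }
}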

Figure 12. Passive classes

Interaction components are for items with interactivity. Examples of Interaction components are text fields, checkboxes, and buttons (see Figure 13). Using these components, a User Experience can be created which allows the user to input data and have it sent back to the server. In order to facilitate general use of Interaction components (but also Passives if necessary), a child of Experience was developed: FormExperience. When a FormExperience is utilized, the mobile interface automatically adds Submit, Cancel, and Clear buttons to facilitate interaction.

Once these objects are understood, the tabletop application developer then uses these specialized Messages and components to build rudimentary interfaces for mobile users and delivers them to users via the Message Bus and web server as necessary.
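The following sketch shows how a tabletop application might assemble and send a FormExperience using the types sketched earlier; the empty FormExperience subclass and the helper itself are purely illustrative.

using System;

// Illustrative use of FormExperience: two components are added in display order and the
// message is sent over the bus to the mobile user's client.
public class FormExperience : Experience { }

public static class ExperienceExample
{
    public static void SendNameForm(Client busClient, Guid mobileClientId)
    {
        FormExperience form = new FormExperience();
        form.RecipientId = mobileClientId;

        form.Components.Add(1, new TextPassive { ComponentId = Guid.NewGuid(), Text = "Please enter your name:" });
        form.Components.Add(2, new TextFieldInteraction { ComponentId = Guid.NewGuid(), Label = "Name" });

        // The web server renders the components in order and, because this is a
        // FormExperience, automatically appends Submit, Cancel, and Clear buttons.
        busClient.SendMessage(form);
    }
}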

Figure 13. Interaction classes

6.2. Generating Experiences

To deliver Experience messages (see Figure 9) to the mobile device, they first must be appropriately generated. In order to facilitate this generation, we developed an interface and control library for .NET for use in the Visual Studio XAML (Extensible Application Markup Language) editor and interface designer, which is part of the Windows Presentation Foundation (WPF).

For Microsoft Pixelsense (formerly Microsoft Surface) applications, .NET developers are provided with a library of Pixelsense-specific controls which must be used in order for applications to take advantage of the unique qualities of the tabletop interface. These controls expose a number of tabletop-specific events, methods, and properties which are necessary for application development. For example, the SurfaceWindow class extends the stock WPF Window class and, while in use on a tabletop device, exposes TouchEventArgs objects which have tabletop-specific extensions, such as the GetIsFingerRecognized method and the GetTagData method. These methods allow developers using the SurfaceWindow as their application's main window to detect when fingers are used on the tabletop, or when a ByteTag is placed upon it, as well as which ByteTag it is.

We chose to extend Pixelsense-specific controls as necessary to produce our own reusable library for developing applications with our framework. This extension meant performing appropriate actions for certain tabletop-triggering events, as well as adding user-configurable properties on the controls themselves in the visual XAML Visual Studio designer.

Looking at the SurfaceWindow example again, we created our own SurfaceWindow class which extends SurfaceWindow. In order to utilize ours, one simply needs to include a namespace declaration in the application's main window (e.g. xmlns:mobisurf="clr-namespace:MobiSurf.ControlLibrary;assembly=MobiSurf.ControlLibrary") and then declare the namespace for the SurfaceWindow element itself (defaulting to nothing in new Pixelsense applications) to 'mobisurf'. Our SurfaceWindow then exposes the properties CaptureAreaWidth (int) and CaptureAreaHeight (int), which, when combined with another property, SendCaptureAreaImage (boolean), define the width and height of a screenshot of the window which is sent to the appropriate mobile device whenever the corresponding ByteTag is manipulated on the SurfaceWindow or any of its children. These screenshots are sent via the aforementioned Experience objects.
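As a sketch of how such an extended window can expose these settings to the XAML designer, the three properties could be registered as WPF dependency properties. The base-class namespace, the default values, and the registration details below are assumptions rather than the framework's actual source.

using System.Windows;

namespace MobiSurf.ControlLibrary
{
    // Sketch of the framework's extended SurfaceWindow described above; only the three
    // capture-area properties named in the text are shown. Default values are placeholders.
    public class SurfaceWindow : Microsoft.Surface.Presentation.Controls.SurfaceWindow
    {
        public static readonly DependencyProperty CaptureAreaWidthProperty =
            DependencyProperty.Register("CaptureAreaWidth", typeof(int), typeof(SurfaceWindow), new PropertyMetadata(400));

        public static readonly DependencyProperty CaptureAreaHeightProperty =
            DependencyProperty.Register("CaptureAreaHeight", typeof(int), typeof(SurfaceWindow), new PropertyMetadata(300));

        public static readonly DependencyProperty SendCaptureAreaImageProperty =
            DependencyProperty.Register("SendCaptureAreaImage", typeof(bool), typeof(SurfaceWindow), new PropertyMetadata(true));

        public int CaptureAreaWidth
        {
            get { return (int)GetValue(CaptureAreaWidthProperty); }
            set { SetValue(CaptureAreaWidthProperty, value); }
        }

        public int CaptureAreaHeight
        {
            get { return (int)GetValue(CaptureAreaHeightProperty); }
            set { SetValue(CaptureAreaHeightProperty, value); }
        }

        public bool SendCaptureAreaImage
        {
            get { return (bool)GetValue(SendCaptureAreaImageProperty); }
            set { SetValue(SendCaptureAreaImageProperty, value); }
        }
    }
}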

6.3. Interpreting Experiences

Many notions have been implied by previous paragraphs in this section about how the user will interact with the tabletop via the mobile device. Here we will explain how Experiences and Components are interpreted by the web server and web page to generate a user interface for the user (see Figure 14).

Figure 14. Screenshot of sample user interface

When the web page receives Experience objects, they contain ordered lists of Components. One-by-one, each component's type is determined and then a corresponding HTML element is generated, in order, on the page by JavaScript utilizing the jQuery library. For instance, an ImagePassive component will cause an <img> element to be generated, while a TextFieldInteraction component will cause an <input> element to be generated. Utilizing the HTML5 feature of 'data-' attributes on the HTML itself, each of these elements is given a 'data-id' value of the ID value from their corresponding C# object. Furthermore, an event listener is attached to the element, if necessary, to capture interactions for later interpretation. Values, if any, are set on the object, and then the next element in the list is generated. Once all of these elements have been generated, they are rendered to the page (see Figure 15).

Figure 15. Experience object and corresponding mobile interface

When Interaction elements are appropriately interacted with (e.g. buttons are clicked, text inputs lose focus, etc.), the page's SendInteraction method is invoked, sending along the new value of the element (if any) and the `data-id` of the element. This ID allows the server to know which element this interaction corresponds to and store the input or otherwise react appropriately. For example, the tabletop application instructs the mobile device to display a UI with a 'party' button with an ID of a6e87038-a81d-44f5-91ff-2713d5d6e3d0. This ID is stored on the HTML button when the interface is generated. When the user presses the button, the JavaScript event handler for the button invokes the SendInteraction JavaScript method, causing a JSON object with that ID to be sent from the web page to the web server's SendInteraction controller method. The web server's SendInteraction first creates a new ResponseCollection message object (see Figure 9) and addresses it to the clientId corresponding to the web session's current user. Then, we convert this JSON object into a list of key-value pairs (see Figure 16) and add them to the ResponseCollection's Responses list. As a ResponseCollection inherits from MessageBase, it can be sent via the Message Bus to the corresponding Client instance on the tabletop. From here, the tabletop unpackages the message and parses the Responses in the ResponseCollection. There is only one Response, signifying that the button corresponding to ID a6e87038-a81d-44f5-91ff-2713d5d6e3d0 was clicked. In reaction to this, the tabletop's background begins to rapidly flash through the colors of the rainbow and play techno music, indicating that it is, indeed, time to party.
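A condensed sketch of the server-side packaging step described above follows. The key-value shape of Responses follows Figure 16; the helper method and its parameter names are assumptions.

using System;
using System.Collections.Generic;

// Sketch of ResponseCollection and of packaging one interaction into it before it is
// sent over the Message Bus to the tabletop's Client instance.
public class ResponseCollection : MessageBase
{
    public List<KeyValuePair<string, string>> Responses { get; set; }

    public ResponseCollection()
    {
        Responses = new List<KeyValuePair<string, string>>();
    }
}

public static class SendInteractionExample
{
    // dataId is the HTML element's data-id value; senderId identifies the tabletop client
    // that originally sent the Experience.
    public static ResponseCollection BuildResponse(Guid webClientId, Guid senderId, string dataId, string value)
    {
        ResponseCollection response = new ResponseCollection();
        response.SenderId = webClientId;
        response.RecipientId = senderId;
        response.Responses.Add(new KeyValuePair<string, string>(dataId, value));
        return response;   // then sent via the Message Bus to the tabletop application
    }
}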

Figure 16. The ResponseCollection class

7. GENERATING TOPOLOGIC REPRESENTATIONS OF THE TABLETOP'S INTERACTION POINTS

To interact with the tabletop, a Microsoft ByteTag is used. This ByteTag can be affixed to the user's smartphone, attached to a hand-held token, or printed onto a business card. The important thing is that the ByteTag is easily moved. In the original framework, this ByteTag had to be affixed to the user's smartphone. When the system was in use, a screenshot of the tabletop was sent to the smartphone and displayed fullscreen and aligned as such to make the smartphone screen appear to be transparent, or see-through to the tabletop surface. Then, the smartphone could be moved about the tabletop surface until controls on the tabletop application were seen 'through' the smartphone's screen. These controls could then be clicked 'through' the smartphone screen, which triggered actions in the tabletop application, possibly triggering further communication with the smartphone (e.g. opening a form, vibrating, etc.). The problem with this interaction is that it requires the tabletop application's interface to be designed in such a way that it is usable, or at least partially usable, when interacted with exclusively via a physically small portion of the overall interface. In other words, the developer has to design tabletop controls to be sized appropriately and usable via a smartphone's screen (see Figure 17).

In order to mitigate this problem of designing for two interfaces at once (i.e. 'user experience mismatch'), we decided to eschew the need for the tabletop developer to worry about the smartphone's interface at all by automatically generating a topological representation of the area of interest around the user's ByteTag. By topological, in our sense, we mean the properties of the tabletop application that do not change as the ByteTag is moved about the surface; only the names and types of the controls truly matter, not their exact locations on the tabletop application.

Figure 17. Screenshot of sample auto generated topological interface

As briefly illustrated in section 6.3, our control library has a SurfaceWindow class with CaptureAreaWidth (int) and CaptureAreaHeight (int) properties that define the width and height of a screenshot of the window, which is sent by way of Experience messages to the appropriate mobile device whenever the corresponding ByteTag is manipulated on the SurfaceWindow or any of its children. These CaptureAreaWidth and CaptureAreaHeight properties, however, are also important for our topological representation generation.

To illustrate, our tabletop application has controls Alpha, Bravo, Charlie, Delta, Echo, Foxtrot, Golf, Hotel, and India; Charlie, Echo, and Golf are disabled and cannot be interacted with. A rectangle, defined by the CaptureAreaWidth and CaptureAreaHeight properties, is virtually generated around the user's hand-held token (see Figure 18). When the physical, visual representations of controls from our control library are physically contained or intersected by this virtual rectangle (see Figure 19), they are marked for potential inclusion in our topological representation. Then, the properties of these potential inclusions are checked to ensure that they are configured as manipulatable by mobile device users (as it is possible for controls to be unintended for mobile user use), and, if so, they are added to a finalized list of control Interactions. This list is then sorted alphabetically, packaged into a specialized Experience message, and sent via the message bus to the appropriate user's mobile device in order to have its options displayed for perusal.

Figure 18. A hand-held token

Figure 19. The virtual rectangle representing the capture area
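The selection logic described above can be sketched as follows; the MobileControl type and its members stand in for our control library's actual classes and are assumptions.

using System.Collections.Generic;
using System.Linq;
using System.Windows;

// Sketch of topological generation: build the virtual capture rectangle around the token,
// keep controls that are contained in or intersected by it and that are configured as
// manipulatable from a mobile device, and sort the survivors alphabetically by name.
public class MobileControl
{
    public string Name { get; set; }                  // e.g. "Bravo", "Foxtrot"
    public Rect Bounds { get; set; }                   // on-screen bounds of the control
    public bool IsMobileManipulatable { get; set; }    // developer-configurable property
}

public static class TopologyGenerator
{
    public static List<MobileControl> GetCapturedControls(
        Point tokenCenter, int captureAreaWidth, int captureAreaHeight, IEnumerable<MobileControl> controls)
    {
        Rect captureArea = new Rect(
            tokenCenter.X - captureAreaWidth / 2.0,
            tokenCenter.Y - captureAreaHeight / 2.0,
            captureAreaWidth,
            captureAreaHeight);

        return controls
            .Where(c => captureArea.IntersectsWith(c.Bounds))   // contained or intersected
            .Where(c => c.IsMobileManipulatable)                 // intended for mobile use
            .OrderBy(c => c.Name)                                // listed alphabetically
            .ToList();
    }
}

The resulting list is what gets packaged into the specialized Experience message and sent over the message bus to the user's mobile device.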

Once this message has been received by the appropriate user's mobile device, it is then interpreted. Passive experiences, such as directives to make the mobile device vibrate when its corresponding ByteTag token is in close physical proximity to a specific control on the tabletop, are automatically executed. Interactions are then listed, alphabetically, from top to bottom, as buttons on the mobile device's interface. These buttons are sized and colored appropriately for the device; that is, they eschew any design properties they had on the tabletop application's UI and exhibit a look-and-feel specific to mobile-device interaction (see Figure 20). This automatic translation from tabletop UI to mobile-device UI releases the tabletop application developer from the responsibility of designing for multiple interfaces simultaneously.

Figure 20. Topologic generation when Bravo, Foxtrot, Hotel, and India are captured

8. CONCLUSION AND FUTURE WORK

Our approach of redeveloping the next generation of the framework from the ground up was realized through this work. Our new framework can further help to facilitate the fluid and seamless interaction between mobile devices and a shared tabletop workspace, as the original framework sought to do.

A new communications layer, by way of a message bus system, was developed to facilitate both inter-component and intra-component communication by leveraging pre-existing .NET libraries and techniques. By constructing this message bus and defining client objects and messages (see Figure 9) which can interact through it, we are now able to easily communicate with disparate components within any application developed using our framework.

We removed the original framework's reliance upon user-handled Android devices by utilizing a device-agnostic webpage-based approach rather than an app-specific approach. To this end, we developed a web server and corresponding web pages to produce a web-based interface through which users can interact with tabletop applications built using our framework. On top of this, we also removed the necessity for the application developer to focus upon mobile interfaces, instead allowing developers to create rudimentary 'experience' objects that are automatically turned into the necessary mobile interfaces when used.

The user experience mismatch problem, wherein developers of the tabletop application had to construct interfaces that could be utilized on both the tabletop and the mobile device simultaneously, was also eliminated through our development of auto-generated topological interaction interfaces. This development automatically generates a topological representation of the area of interest around the user's physical interaction point with the tabletop surface and focuses only upon relevant information and available interactions, freeing the developer from the burden presented by user experience mismatch between the tabletop and mobile device.

With these four problems addressed, we have future work ahead. Our propositions for future work involve evaluating the usability of this new framework with real-world applications, providing easier mechanisms for developers to change the general look-and-feel of the auto-generated mobile interfaces, and, if upgrades to newer versions of the .NET library are available for the Pixelsense (Surface) SDK, redeveloping communication throughout the framework to be driven by something other than polling, such as long-polling, server-sent events, or WebSockets.

REFERENCES

[Wu03] M. Wu and R. Balakrishnan, "Multi-Finger and Whole Hand Gestural Interaction Techniques for Multi-User Tabletop Displays", Proc. UIST '03, pp. 193-202, 2003.

[Sch12] D. Schmidt, J. Seifert, E. Rukzio, and H. Gellersen, "A Cross-Device Interaction Style for Mobiles and Surfaces", Proc. DIS, 2012.

[Rou14] A. Roudaki, J. Kong, G. Walia, and Z. Huang, "A Framework for Bimanual Inter-Device Interactions", Journal of Visual Languages and Computing, 25(6), pp. 727-737, 2014.

[Shen06] C. Shen, K. Ryall, C. Forlines, A. Esenther, F. D. Vernier, K. Everitt, M. Wu, D. Wigdor, M. R. Morris, M. Hancock, and E. Tse, "Informing the Design of Direct-Touch Tabletops", IEEE Computer Graphics and Applications, 26(5), pp. 36-46, 2006.

[Google15] Google. (2015, Nov. 10). Upload & Distribute Apps [Online]. Available: https://support.google.com/googleplay/android-developer/answer/113469?hl=en

[Olenick15] D. Olenick. (2015, May 27). Apple iOS And Google Android Smartphone Market Share Flattening: IDC [Online]. Available: http://www.forbes.com/sites/dougolenick/2015/05/27/apple-ios-and-google-android-smartphone-market-share-flattening-idc/

[Leswing15] K. Leswing. (2015, Feb. 4). Android and iOS are nearly tied for U.S. smartphone market share [Online]. Available: https://gigaom.com/2015/02/04/android-and-ios-are-nearly-tied-for-u-s-smartphone-market-share/

[Swanner15] N. Swanner. (2015, Aug. 5). This is what Android fragmentation looks like in 2015 [Online]. Available: http://thenextweb.com/insider/2015/08/05/this-is-what-android-fragmentation-looks-like-in-2015/