special REPORTS John Edwards [ ]

Telepresence: in the Real World

magine a teleconferencing system inconvenient travel itiner- being adopted by enterprises that’s so realistic, so natural, that aries—the ability to reach in sectors ranging from gov- you feel you could reach out and out to anyone, anywhere as ernment to retail to manu- shake a participant’s hand, even if really there. “Our mind facturing to finance to though the individual may be sitting often has sparks, and those utilities. Currently available, Iin a room hundreds or even thousands ideas need to be immedi- telepresence systems are of miles away. That’s the idea behind ately debated and further usually designed to join an telepresence, an emerging teleconfer- brainstormed with col- Henry Fuchs of the enterprise’s existing com- encing that’s designed to not leagues and friends; other- University of North munications portfolio— only connect people together but to wise, they will be just gone Carolina. including Web confer encing, make them feel like they’re all collabo- forever,” Zhang observes. telephony, instant rating together inside the same room. messaging, and communica- Telepresence provides an immersive READY FOR THE REAL WORLD tion tools—to help employees boost meeting experience based on the highest Once the fodder of science fiction writers performance, increase sales, protect possible levels of audio and clarity. and filmmakers, telepresence is already investments, and create rich collaborative Ideally, participants are life-sized. Ide - ally, every sound, gesture, and facial expression are replicated in high-defini- tion (HD) video and high-fidelity sur- round sound. The ultimate goal is to make telepresence and in-person meet- ings virtually indistinguishable from each other. “Telepresence has a great potential for getting people to work together collaboratively, efficiently, and naturally regardless of their physical loca- tion,” observes Zhengyou Zhang, a princi- pal researcher focusing on telepresence at Microsoft Research in Redmond, Washington. With business becoming increas- ingly global, and travel getting ever more expensive and cumbersome, tele- presence’s popularity is soaring. According to a December 2010 study by Wintergreen Research of Lexington, Massachusetts, worldwide telepresence sales are projected to reach US$6.7 bil- lion by 2016. Yet the technology offers an important benefit that goes far beyond eliminating expensive and

Digital Object Identifier 10.1109/MSP.2011.941853 Ryan Schubert and Peter Lincoln (right) are grad students working on the University of Date of publication: 1 November 2011 North Carolina mobile project.

1053-5888/11/$26.00©2011IEEE IEEE SIGNAL PROCESSING MAGAZINE [9] NOVEMBER 2011 [special REPORTS] continued

leagues, friends, and colleagues from a remote location while family members to pop also being able to hear and see them. an electronic representa- Trevor Blackwell, Anybot’s chief tion of themselves into executive officer, says that ongoing places as easily as dialing advancements in video, broad- a phone. Researchers are band, and robotics technology are also working on trans- making it easier and more practical for porting realistic repre- people to project their presence to sentations of entire remote sites. “It’s now becoming meeting rooms and possible for to control offices to distant loca- bodies in much more interesting ways tions. Zhang says telep- than in the 1980s or 1990s,” he says. resence’s potential uses “So it was time to try to build a gen- are far-reaching and lim- eral-purpose robot that would be use- Nasser Peyghambarian, a professor of optical sciences at the University of Arizona in Tucson is shown with the ited only people’s imagi- ful in the office.” hologram. nation. “App lications Blackwell states that Anybots can include effective team be used to support a variety of train- experiences with colleagues and custom- , immersive entertainment ing, sales and customer service activi- ers. Communications integration means experiences, and improved customer ties. He believes that the technology is that participants can work face to face in satisfaction in marketing,” he says. particularly well suited for technology a virtual meeting room while also invit- support services, such as helping an ing in colleagues who are limited to ANYONE FOR ANYBOTS? employee master a new appli- phone- or Web-based conferencing While most telepresence technologies cation or fix a minor glitch. systems. are developed for incorporation into ded- “Most small offices these days need a Telepresence systems, including icated conference rooms, Mountain tech support person maybe one or two , displays, speakers, and other View, California-based Anybots has creat- hours a week,” he explains. He notes hardware and software components, are ed a system that’s designed to bring tele- that small organizations often can’t currently available from several presence sessions directly to people at afford the expense of hiring a full-time vendors, including , their desks or in virtually any other support professional. Meanwhile, a part- Tandberg, and Polycom. Prices can run indoor location. The company’s mobile time hire may not be available on site anywhere from several thousand dollars Anybots robot is designed to function as when needed. Blackwell says that using to upwards of a quarter of a million dol- a seeing and personal avatar that a telepresence robot allows a business to lars. Basic components in a high-end can freely roll around an office, produc- combine the cost and availability bene- system typically include at least three tion site, sales floor, convention center, fits of phone support with the interac- HD 65-in or larger displays, several HD or lobby. The system is operated via a tivity attributes of a personal visit. cameras, multiple microphones and Web browser using a wireless fidelity, “Being able to support people through a speakers, special lighting arrays, two or third generation, or fourth generation robot means that instead of being on more 1-Gb ports in the room, wireless connection. A video and the phone with someone, you can send and approximately 2–3 Mb/s of band- speaker mounted inside the robot’s head your robot to stand behind the person, width per active screen. allows operators to converse with remote look at what they’re doing on their While telepresence is already computer, talk to them about it, meeting the needs of a rapidly look at where the wires are growing number of organiza- going, and do whatever else is tions, researchers worldwide are necessary to troubleshoot the developing new technologies with problem,” he says. the aim of making electronic The technology can also be meetings even more like “being put to use as an office greeter, there.” Among the innovations allowing a human receptionist being investigated are robot ava- to remotely welcome guests tars that will allow people to cre- without leaving the front desk ate a virtual physical presence vacant. “Around our office, our almost anywhere. Another impor- robot greets visitors at the door tant research area is three- and shows them where the diet dimensional (3-D) technology, An Anybot telepresence robot lets users project their Cokes are and leads them to a which promises to allow col- presence to a work site. seat in the conference room,”

IEEE SIGNAL PROCESSING MAGAZINE [10] NOVEMBER 2011 Blackwell says. “And then it goes and being developed by the gets me.” BeingThere Centre, a joint Blackwell notes that besides project of the University of enabling efficient video and audio North Carolina (UNC) in streaming, signal processing played an Chapel Hill, North Carolina, important role in developing the Nanyang Technological robot’s physical attributes. “Our focus University (NTU) in on signal processing has been mainly Singapore, and the Swiss in the mobility and movement of the Federal Institute of robot,” he says. “The new technology Technology (ETH) in is all about controlling the way the Zurich. The project unites robot moves, making sure it doesn’t 32 leading telepresence crash into things, making it move research leaders working smoothly and as quietly as possible, across three continents and making it seem very responsive to A concept developed by the remote driver even though there is Gregory Welch, a UNC some small delay over the network as research professor, is to they’re driving it.” create a technology that Such attention is vital since physical gives people the option of movements that are trivial for most sending a mobile manne- people can be major challenges for a quin to represent them- robot. “There are all kinds of things that selves at a conference table. can happen as you’re rolling around an Such a unit would act as an office—like one wheel hits a bump and avatar that could freely nav- then all of a sudden it’s off the ground,” igate to a distant environ- Blackwell says. “Then, if you try to con- ment and take on the An Anybot telepresence robot allows remote workers trol the robot by spinning that wheel, appearance and gestures of to collaborate with on-site personnel directly and it’s going to do the wrong thing.” Signal its far-away human host. informally. processing algorithms help the robot The project’s ultimate goal know what to do in specific situations. is creating an autonomous virtual create the illusion that the rooms are Blackwell says that he and his team human with memory and awareness adjacent to each other and separated gain inspiration by observing the real capabilities that can take the place of its only by glass walls, even though they world. “We spend a lot of time looking at host whenever he or she is absent. may be located in different countries. what people do in offices and how much Another mobile-oriented telepresence Such “glass wall” displays will provide of that could be done remotely,” he says. application being developed by each person in the room with a correct, “We’re pretty optimistic that we can BeingThere Centre researchers is a porta- personalized stereo view into the remote replace a significant fraction of office ble display that can be used to bring a 3-D rooms, giving the illusion that all the of jobs with people controlling a robot graphical representation of a friend, rela- participants, both local and distant, from home.” tive, or colleague to a meeting room, share one common space. The idea is to office, home, or other location. Like the allow enterprises with offices located JUST LIKE BEING THERE mobile mannequin, the semitransparent around the world to function as if every- Robot avatar technology also figures display is designed to allow a user to cre- one were located inside the same into a telepresence system currently ate a virtual physical presence in a place building. perhaps hundreds or even “Teleconferencing is not taking a thousands of miles away. camera image and sending it through a Yet another telepres- and putting it onto ence model being investi- a TV screen,” says Henry Fuchs, a UNC gated by BeingThere Centre computer science professor and an researchers is a conference adjunct professor of biomedical room that connects with . Fuchs, who is also one of other similarly equipped the BeingThere Centre’s three codirec- rooms to form an entire tors (along with Prof. Markus Gross of virtual office suite. Each ETH and Prof. Nadia Thalmann of Zhengyou Zhang with John Apostolopoulos facility would feature wall- NTU), feels that to provide a truly real- Microsoft. with HP. sized displays designed to istic immersive experience, cameras

IEEE SIGNAL PROCESSING MAGAZINE [11] NOVEMBER 2011 [special REPORTS] continued

the ability to tell what someone is look- ing at by watching the direction of their eyes. Zhang says the challenge boils down to a matter of user perception. “People who are sitting at the same table see things differently: If I look at you, you see my face and your partner may see the side view, so your partner knows I am looking at you,” he says. “If I turn my head and speak to your partner, then he sees my face, and you see the side view, and you know I’m not looking at you any more.” Conquering gaze awareness would go a long way toward improving telep- resence realism and making sessions more like real world get-togethers. “In a virtual meeting, if you misinterpret a cue, you make the whole collaboration less effective,” Zhang explains. “That’s why, many times, people ask again and again for someone to clarify what’s going on.” Nasser Peyghambarian, a professor of optical sciences at the University of Arizona in Tucson, believes that hologra- phy could provide the key to creating a truly immersive 3-D telepresence experi- At the BeingThere Centre, researchers are looking into the possibility of using ence. “Holographic is closest to the way telepresence to link geographically isolated offices into collaborative virtual suites. (Images used with permission from the BeingThere Centre.) humans see their surroundings,” he says. “It’s also an approach that doesn’t require any eyeglasses or other special and other sensors need to be able to motion and stereoscopic parallax, and eye wear, unlike when you go to see a capture the 3-D geometry of every- immersive spatial audio.” 3-D movie or watch 3-D TV.” thing inside a room and to project that Zhang notes that his research is in Peyghambarian notes that holo- representation as accurately as possible a preliminary stage and is encounter- graphic presentation differs from today’s to the remote destination. “Not just ing the same road blocks that most movie and TV 3-D offerings in another the people, but the furniture, books, advanced telepresence projects are important way. “If you go to Avatar, you writing on the white board, waste bas- bumping into: restricted Internet con- see 3-D, but it has a very limited number ket, , coffee mugs—every- nection speeds and display resolution of perspectives; something like two per- thing,” he says. limits. “The current Internet band- spectives for one eye, and one for the width is not high enough, so we can- other eye.” THE 3-D FUTURE not send the full resolution of our , on the other hand, Three-dimensional technology has entire 360° panoramic video,” Zhang promises a vastly expanded number of become one of the most active areas of explains. “And even if we did send this perspectives, which would be handy for telepresence research. Microsoft’s in full resolution, a normal screen does addressing challenges like gaze aware- Zhang is one of many researchers not have enough resolution to display ness. “Let’s say it’s a live object and there focusing on creating a 3-D virtual all of this.” are 100 cameras taking 100 pictures audio-visual environment that’s both Telepresence researchers also face a from different angles,” Peyghambarian life-sized and photorealistic. “Such as subtler challenge, one that most end says, “so there are 100 different perspec- that when you meet remote people in users would have trouble defining, yet tives that are coming in to provide that environment, you feel as if they which gives many virtual conference detail.” Using Peyghambarian’s approach, were sitting with you at the same participants the feeling that something the data generated by the camera array is table,” he says. “This means you have just isn’t “quite right.” Telepresence the correct mutual gaze, accurate experts call the problem gaze awareness: (continued on page 142)

IEEE SIGNAL PROCESSING MAGAZINE [12] NOVEMBER 2011 [applications CORNER] continued

important diagnostic con- IQA applications, provided instructive [3] H. R. Sheikh and A. C. Bovik, “Image information and visual quality,” IEEE Trans. Image Processing, tained in medical images. It is therefore examples, and also disc ussed the main vol. 15, pp. 430–444, Feb. 2006. important to provide specific objective challenges. In the future, it is expected [4] H. R. Wu and K. R. Rao, Ed., Digital Video Image Quality and Perceptual Coding. Boca Raton: IQA measures that can help maximize that the development and application CRC, 2005. the level of compression, but without sides of objective IQA measures will [5] M. Elad and M. Aharon, “Image denoising via aff ecting the diagnostic value of medical mutually benefit each other. On one sparse and redundant representations over learned dictionaries,” IEEE Trans. Image Processing, vol. images. Moreover, many modern medi- hand, more accurate and more efficient 15, pp. 3736–3745, Dec. 2006. cal imaging devices acquire images with IQA measures will certainly enhance [6] X. Zhu and P. Milanfar, “Automatic parameter selection for denoising algorithms using a no- much higher dynamic range of inten- their applicability in real-world applica- reference measure of image content,” IEEE Trans. Image Processing, vol. 19, pp. 3116–3132, Dec. sity levels than what can be appropri- tions. On the other hand, new challenges 2010. ately shown on standard dynamic range arising from real applications (e.g., [7] Z. Wang, G. Wu, H. R. Sheikh, E. P. Simoncelli, displays. Therefore, it is desirable to desired mathematical properties for opti- E.-H. Yang, and A. C. Bovik, “Quality-aware images,” IEEE Trans. Image Processing, vol. 15, pp. 1680– employ those IQA measures that can mization purposes) will impact the new 1689, June 2006. provide meaningful quality evaluations development of future IQA measures. [8] M. C. Q. Farias, S. K. Mitra, M. Carli, and A. Neri, “A comparison between an objective quality of the images after dynamic range com- measure and the mean annoyance values of water- marked ,” in Proc. IEEE Int. Conf. Image Pro- pression. Furthermore, both data rate AUTHOR cessing, Rochester, MN, Sept. 2002, pp. 469–472. and dynamic range compression of Zhou Wang ([email protected]) is an [9] A. Rehman and Z. Wang, “Reduced-reference SSIM estimation,” in Proc. IEEE Int. Conf. Image medical images should be optimized for associate professor in the Department of Processing, Hong Kong, China, Sept. 2010, pp. the IQA measures specif ically designed Electrical and Computer Engineering, 289–292. for medical applications. University of Waterloo, Canada. [10] D. Brunet, E. R. Vrscay, and Z. Wang, “A class of image metrics based on the structural similarity quality index,” in Proc. Int. Conf. Image Analysis and Recognition (Lect. Notes Comput. Sci.), vol. 6753, OUTLOOK REFERENCES Burnaby, BC, Canada, June 2011, pp. 100–110. [1] Z. Wang and A. C. Bovik, “Mean squared error: We have discussed the application Love it or leave it? A new look at signal fidelity mea- [11] J. Fierrez-Aguilar, Y. Chen, J. Ortega-Garcia, sures,” IEEE Signal Processing Mag., vol. 26, pp. and A. K. Jain, “Incorporating image quality in aspects of modern objective IQA meth- 98–117, Jan. 2009. multi-algorithm fingerprint verification,” Lect. Notes Comput. Sci., vol. 3832, pp. 213–220, 2005. ods. Rather than providing an exhaustive [2] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. survey of all applications, we have Simoncelli, “Image quality assessment: From error [12] Video Quality Experts Group Web site visibility to structural similarity,” IEEE Trans. Image [Online]. Available: www.vqeg.org emphasized on the great potentials of Processing, vol. 13, pp. 600–612, Apr. 2004. [SP]

[special REPORTS] (continued from page 12)

fed into a spatial light modulator, a device vide the information it gathered in 3-D nal-processing algorithms for multicore that can modulate light spatially in to doctors.” The technology could, for and GPU systems and so on,” he says. “I amplitude and phase. “You could get all example, help surgeons performing believe that advances in signal processing that information and display it in 3-D and brain surgery or other types of delicate will continue to be central to improving it can actually be in real time,” he says. operations. “They could use that [tech- telepresence in the future.” While most existing telepresence sys- nology] to see the information as they None of these improvements will tems are geared toward conferencing do the operation,” Peyghambarian says. come too soon for Microsoft’s Zhang, who applications, Peyghambarian feels that John Apostolopoulos, director of the admits that he has a personal interest in real-time holography has the potential Mobile and Immersive Experience Lab at seeing sophisticated telepresence systems to drive the technology into a wider Hewlett-Packard (HP) Laboratories in becoming commonplace. “I have fre- range of fields. “Benefits include 3-D Palo Alto, California, believes that signal quent phone calls with my parents and social networking, 3-D , processing will be vital to overcoming family members in China as well as my and 3-D collaborative research,” he says. many of the challenges telepresence research collaborators at Microsoft “The advantage of our technology is that researchers currently face. “This includes Research Asia in Beijing,” he says. it can continuously read and replace video and audio capture, noise reduction, “Telephony is a great invention, but data, so you can use it in an magnetic compression, transmission over a packet leaves much more to be desired com- resonance imaging or computer assisted network, packet-loss concealment, multi- pared with a face-to-face meeting.” tomography scan system that would pro- channel echo cancellation, efficient sig- [SP]

IEEE SIGNAL PROCESSING MAGAZINE [142] NOVEMBER 2011