ABSTRACT:

We present a system that exploits advanced Virtual Reality technologies to create a surveillance and security system. Surveillance cameras are carried by a mini-blimp which is tele-operated using an innovative Virtual Reality interface with haptic feedback. An interactive control room (CAVE) receives multiple video streams from airborne and fixed cameras. Eye-tracking technology allows for turning the user's gaze into the main interaction mechanism: the user in charge can examine, zoom and select specific views by looking at them. Video streams selected at the control room can be redirected to agents equipped with a PDA. On-field agents can examine the video sent by the control center and locate the actual position of the airborne cameras on a GPS-driven map. The PDA interface reacts to the user's gestures: a tilt sensor recognizes the position in which the PDA is held and adapts the interface accordingly. The prototype we present shows the added value of integrating VR technologies into a complex application and opens up several research directions in the areas of tele-operation, multimodal interfaces, etc.

INTRODUCTION:

Information technology (IT) plays an important role in security and surveillance initiatives. Security is a major concern for governments worldwide, which must protect their populations and the critical infrastructures that support them [Reiter and Rohatgi 2004]. Together with IT, Virtual Reality offers a promising future as a key component of surveillance and security systems. VR can be used in a number of tasks: tele-operation of the actual data acquisition systems (cameras, vehicles, etc.); providing multimodal interfaces for control rooms where surveillance information is analyzed; and finally, empowering on-field agents with multimedia information to ease their tasks of localizing problematic zones, etc.

The approach to security that we follow in our work is based on video surveillance. Our system is based on mobile cameras that can be directed to a particular location while keeping an overview of the surveyed site. Airborne cameras are the most flexible solution for a mobile surveillance system. This approach has been successfully applied during public events with large audiences and high security demands, such as the Athens 2004 Olympic Games [Airship Management Services]. The Greek Government awarded a contract to US defence giant Science Applications International Corporation (SAIC) for a Command, Control, Communications, Computer and Intelligence (C4I) infrastructure. A Skyship 600 blimp (Global Skyship Industries, Inc.) was the central element for providing airborne surveillance over the Olympic sites. The airship was equipped with a highly sophisticated suite of sensors, including infra-red cameras for day and night use. Yet with a security budget of more than USD 2 billion, the hi-tech security systems deployed in Athens were still far from being automated and fully optimized: the blimp alone required a 20-person crew to be operational. In consequence, such a system is error-prone and expensive in terms of human resources.

We believe the basic principles of airborne surveillance systems have proved their efficacy, but there are still issues to be addressed concerning the interaction between command, control and on-field personnel. Our goal is to enhance the communication channels and operation interfaces by means of VR technology. The prototype we present in this paper shows that efficient, cost-effective surveillance systems based on VR technologies can operate in complex environments with relatively low requirements in terms of equipment and personnel. One of our main contributions is the integration of different VR and multimedia technologies into a coherent surveillance system. The paper is organized as follows: first we present a short overview of surveillance and security systems; we continue by describing the system architecture of our prototype; the paper ends with a discussion of each of the system components and perspectives for future work.

SURVEILLANCE AND SECURITY SYSTEMS:

A general surveillance and security system is composed of three main parts: data acquisition, information analysis and on-field operation. Any surveillance system requires means to monitor the environment and acquire data in the form of video, still images, audio, etc. Such data is to be processed and analyzed by a human, a computer or a combination of both at the command center (analysis). A commander can then take the decision of performing an on-field operation to put the environment back into a situation considered as normal. On-field operations are performed by on-field agents, who require efficient communication channels to keep a close interaction with the command center. The current state of our research focuses on enhancing the interaction infrastructure of the three main parts composing a command-and-control surveillance system. As stated in the introduction, the system we present in this paper is based on airborne cameras for surveillance of large sites; aerial monitoring of ground sites using video cameras is playing an increasingly important role in autonomous surveillance applications. Following the Command and Control notion, we have designed a surveillance and security system composed of three main parts covering the basic functions presented before: data acquisition, information analysis and on-field operation. The next subsections present an overview of the state of the art of the technologies applied in each area.

DATA ACQUISITION:

Data acquisition is performed by means of a set of video cameras. Several kinds of cameras can be distinguished: fixed, orientable and mobile. Fixed cameras are an efficient alternative for outdoor use, for surveying car traffic or crowds in public sites. They can also be used indoors to survey critical zones such as restricted-access areas in a building or office. Orientable cameras (see figure 1) have additional functionalities (zoom, 2-DOF orientation) that can be controlled at a distance by an operator or in automatic mode. They allow for focusing the attention on a particular point, for instance taking a closer look at a particular person in a crowded site such as a stadium. However, there are circumstances under which there is no possibility to fix a camera in advance, due to cost restrictions or lack of appropriate locations for wide visibility. In this case it is necessary to use mobile cameras, which are usually airborne.

Figure 1: Left: a UAV; center: an orientable camera; right: a helicopter Wescam.

Airborne cameras (see figure 1) can be embedded in helicopters, reconnaissance airplanes, blimps, etc. Flying vehicles can be either manned or unmanned. Unmanned flying vehicles have the advantage of reduced size and increased performance: they can fly longer and through places that would be difficult or impossible to access with manned vehicles. In urban environments, blimps are the safest aerial devices for many reasons: they are light, easy to operate, and they fall slowly in case of problem, minimizing the risk of injuring people. We decided to base our system on a tele-operated mini-blimp and focused on implementing an intuitive and flexible interface that could take advantage of VR technologies. "Feeling" a control tool is essential; otherwise the manipulation requires too much effort and becomes imprecise. Haptic technologies aim at solving this problem by enabling virtual objects to provide tangible feedback to the user. Moreover, virtual interfaces allow for implementing a variety of feedback mechanisms to ease the tele-operation, such as vibrating controls and audiovisual signals that inform the user about the vehicle status and the surrounding environment.

INFORMATION ANALYSIS:

Information analysis is the central part of a surveillance system. In order to provide an appropriate response to a given incident within reasonable timing, all the information about the whole situation needs to be gathered in one unique place. A control room for surveillance is composed, in most cases, of a large video wall and multiple screens displaying views from surveillance cameras, allowing a proper interpretation of the situation. A set of buttons and joysticks is used to select, move and set up appropriate views.

There is an important amount of research concerning tele-operation interfaces. A common approach consists in implementing physical controls such as joysticks, steering wheels, handles, buttons, and so on. The problem with physical interfaces is that they are expensive to implement and difficult to reconfigure to match different user requirements and/or applications. Virtual entities (3D models) can solve the problem of reconfiguration and adaptation, but they also have some drawbacks: the main disadvantage of an interface based on 3D models is the absence of physical feedback.

One of the main centers of interest concerns the treatment and rendering of various types of data (maps, graphics, text, photos, video, sound...) through a convivial and ergonomic interface, and permanent multimodal communication. The US Army's research on C3I studies models and interfaces to manipulate and interact with the flow of information. Companies like the System Design Center (DCS) from EADS are specialized in building such systems; DCS has developed a multimodal platform, NetCOS, for visualizing, organizing, linking and simulating various types of input data. The complexity inherent in manipulating all the control panels of such applications generally requires several human operators. The supervisor mainly uses oral communication with the control room operators and the on-field agents.

Figure 2: Various control rooms (top-left: EADS; right: Oakland airport; bottom-left: Willow Creek community).

We decided to use VR devices to improve the ergonomics of existing systems. Visualization systems for video surveillance based on an Augmented Virtual Environment (AVE) are an important topic nowadays. AVE fuses dynamic imagery with 3D models in a real-time display to help observers comprehend multiple streams of temporal data and imagery from arbitrary views of the scene. We provide a novel interface based on eye-tracking technologies which minimizes the use of keyboards and other interaction devices. The commander is immersed in a CAVE which displays live video enhanced with graphic overlays.

SYSTEM ARCHITECTURE:

This section describes the overall surveillance system that we have created. Figure 3 shows the different modules that are used for surveillance. We can distinguish three main parts:

- Control of the aerial device (the blimp) supporting the video cameras. This task is done by a single pilot seated at a Haptic Workstation™ inside a distant and closed environment. The pilot can control the blimp as if he were inside it.
- The surveillance control room, where the supervisor analyzes the incoming video streams and selects the information to forward.
- On-field agents: they are equipped with handheld devices in order to receive precise orders including multimedia content (text, images, sound).

Figure 3: The overall system architecture.
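The three parts above can be pictured as stages of a simple pipeline: video and telemetry flow from the blimp to the control room, and a selected excerpt flows on to the agents' handhelds. The sketch below is purely illustrative; the class names and fields are our own, not part of the system described here:

```python
# Illustrative data flow between the three modules; all names are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Frame:
    camera_id: str
    timestamp: float
    gps: tuple          # (lat, lon) of the airborne camera
    pixels: bytes       # encoded image data

@dataclass
class ControlRoom:
    """Receives every stream; the supervisor forwards only a selection."""
    received: list = field(default_factory=list)

    def ingest(self, frame: Frame):
        # Every camera stream arrives at the control room.
        self.received.append(frame)

    def select_for_agents(self, camera_id: str):
        """Forward the latest frame of one chosen camera to the on-field agents."""
        for frame in reversed(self.received):
            if frame.camera_id == camera_id:
                return frame
        return None
```

The point of the sketch is the asymmetry the paper describes: the control room sees everything, while agents receive only what the supervisor explicitly selects.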

ON-FIELD OPERATION:

On-field operation is the result of decisions taken at the control center, and requires a team of surveillance agents to take action for controlling the situation on the ground. Common communication devices include pagers, in-vehicle systems, radios and headsets. Recent security studies and initiatives have pointed out the importance of such communication infrastructure.

R/C BLIMP:

Figure 4: Photo of the R/C blimp.

Our blimp, shown in figure 4, is a low-cost Unmanned Aerial Vehicle (UAV) that we use in our tele-operation research. The R/C blimp is composed of a 6.75 m long and 2.20 m diameter envelope filled with 11 m³ of helium. The total weight, including the standard flight equipment, is 9 kg, so there is around 2 kg of maximum payload for the cameras and the video transmission system. Below the envelope, a gondola contains the electronics and the power supply (6600 mAh, allowing 1 h of flight at half speed) and supports the two electric motors. Each motor provides 1.5 kg of thrust, allowing the blimp to fly at 35 km/h when there is no wind. The transmission range of the radio controller is 1.5 km, but it can be extended with repeaters.

Figure 6 describes the computers that are used to control and gather information from the blimp. All this equipment is located on the roof of our laboratory in order to communicate without interference. All these communications are near real-time because everything is done in hardware: the video is delayed by less than 50 ms, and the control of the blimp via the servo controller is in real time. In the next subsection we describe the Virtual Cockpit system used to control this blimp.

VIRTUAL COCKPIT:

In this subsection we describe how we have implemented the cockpit using VR devices. The R/C blimp is not easy to pilot, even with the remote controller, which is usually the classic device for this purpose.

Figure 5: Actuators of the blimp.

Figure 5 shows the five actuators controlling the R/C blimp. The aerodynamic stabilizers are effective only once the blimp has reached a certain speed (approximately 8 km/h). Moreover, the Virtual Cockpit is in an isolated room without any direct view of the R/C blimp. Therefore the controlling interface must be precise (for fine control), instructive (to give location information) and intuitive (to avoid manipulation errors).
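Because the stabilizers only bite above roughly 8 km/h, low-speed maneuvering has to rely on the motors. The paper does not detail its control mixing, so the following is only a minimal sketch of one plausible gating rule; the function name, the linear authority ramp and the [-1, 1] command range are our own assumptions:

```python
# Hypothetical sketch: fade stabilizer authority in as airspeed approaches
# the ~8 km/h effectiveness threshold mentioned in the text.
# This is NOT the paper's actual control mixer.

STABILIZER_SPEED_KMH = 8.0  # approximate speed at which stabilizers become effective

def mix_commands(airspeed_kmh: float, motor_cmd: float, stabilizer_cmd: float):
    """Return (motor, stabilizer) commands, each clamped to [-1, 1].

    Below the threshold, stabilizer deflection is scaled down linearly,
    since it would have little aerodynamic effect anyway.
    """
    authority = min(max(airspeed_kmh / STABILIZER_SPEED_KMH, 0.0), 1.0)
    clamp = lambda v: max(-1.0, min(1.0, v))
    return clamp(motor_cmd), clamp(stabilizer_cmd) * authority
```

At hover the stabilizer command is zeroed entirely; at or above 8 km/h it passes through unchanged.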

Figure 6: The blimp communication system.

The wide-angle camera used by the pilot is also connected to the Futaba radio, so the pilot has to control seven channels. There is also a second radio controller, a Graupner MC-10, that is used by the Control Room to move the high-quality camera. We have used two USB SC-8000 Servo Controllers that allow a PC to control a radio controller.

Figure 7: The blimp's Virtual Cockpit.

Figure 7 shows the representation of the blimp inside the virtual environment. The user is seated inside a 3D gondola made of glass, and the video stream coming from the pilot camera is mapped on a polygon. Since the camera follows the head movements, and there is less than 5 ms of ping between the Control Server computer (see figure 6) and the Haptic Workstation™ server, there is a good matching between the head orientation and the real scene displayed. Finally, the GPS information transmitted to the pilot is overlaid on the window. We chose not to integrate it inside the 3D virtual environment in order to keep it always available to the pilot. This information is represented as a 2D map indicating the blimp's location and heading; altitude and speed (horizontal and vertical) are displayed as well.
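The modules-intercommunication section below states that each servo command travels as a 5-byte packet: one byte for the servo number and four bytes for the position as a float. As an illustration, such a packet could be encoded like this; the big-endian byte order and any layout detail beyond "1 byte + 4-byte float" are our assumptions, not specified in the paper:

```python
import struct

def encode_servo_packet(servo_id: int, position: float) -> bytes:
    """Pack a servo command into the 5-byte layout described in the text:
    one byte identifying the servo, four bytes for its position as a float.
    Byte order (big-endian here) is an assumption."""
    if not 0 <= servo_id <= 255:
        raise ValueError("servo id must fit in one byte")
    return struct.pack(">Bf", servo_id, position)

def decode_servo_packet(packet: bytes):
    """Inverse operation: recover (servo_id, position) from a 5-byte packet."""
    return struct.unpack(">Bf", packet)
```

With the blimp's five actuators, a full update is only 25 bytes, consistent with the sub-millisecond control latency reported later in the paper.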

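The video streams in this system are carried over RTP (detailed in the modules-intercommunication section). For illustration, the fixed 12-byte RTP header defined by RFC 3550 can be parsed as follows; this is a generic sketch of the standard header, not code from the system described here:

```python
import struct

def parse_rtp_header(packet: bytes) -> dict:
    """Parse the fixed 12-byte RTP header (RFC 3550, network byte order)."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the fixed header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,            # always 2 for RTP
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": b0 & 0x0F,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,     # identifies the codec in use
        "sequence": seq,               # detects loss and reordering
        "timestamp": timestamp,        # media clock for playout timing
        "ssrc": ssrc,                  # identifies the stream source
    }
```

The sequence number and timestamp are what give RTP its "asynchrony solving" and multiplexing properties mentioned later: receivers can reorder packets and schedule playout per stream.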
Figure 8: Haptic Workstation™ and HMD used for rendering the virtual cockpit.

Figure 10: The four-sided CAVE.
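The control room section below smooths the eye-picked selection by attaching the fixed 320×240 cropping box to the gaze target with a damped-spring scheme. The paper gives no equations or gains, so this is only a plausible sketch of such a filter; the spring constant, damping coefficient and time step are invented for illustration:

```python
# Hypothetical damped-spring follower for the gaze-driven cropping box.
# The crop size (320x240, the PDA resolution) comes from the paper;
# the dynamics constants below do not.

SPRING_K = 30.0   # pull toward the gaze target (assumed gain)
DAMPING = 8.0     # velocity damping, suppresses jitter and overshoot (assumed)
CROP_W, CROP_H = 320, 240

class CropFollower:
    def __init__(self, x: float, y: float):
        self.x, self.y = x, y      # crop-box center
        self.vx = self.vy = 0.0    # current velocity

    def update(self, target_x: float, target_y: float, dt: float):
        """Advance the box toward the gaze target (semi-implicit Euler step)."""
        ax = SPRING_K * (target_x - self.x) - DAMPING * self.vx
        ay = SPRING_K * (target_y - self.y) - DAMPING * self.vy
        self.vx += ax * dt
        self.vy += ay * dt
        self.x += self.vx * dt
        self.y += self.vy * dt
        return self.x, self.y
```

Stepped once per frame with the latest gaze sample, a filter like this turns abrupt saccades into a short glide instead of making the selection box jump.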

SURVEILLANCE CONTROL ROOM:

In order to place the supervisor in the best disposition for taking decisions, we focused on designing an ergonomic, efficient and intuitive interface using Virtual Reality devices. Our system displays several video streams in a CAVE, and allows the supervisor to select and send visual information to the On-field Agents intuitively and instantaneously.

Figure 9: The control room system.

The CAVE offers the possibility to display multiple video streams on four projection screens (see figure 10), providing full immersion of the user in the scene. The system is composed of four video projectors and four associated computers for recovering the video streams, displaying an eye-picking area and sending the selected part of the image to the On-field Agents. The video streams of the distant cameras are transmitted and rendered via the local area network using RTP.

Figure 11: Top and left: the eye-tracking system. Bottom right: the camera-control joystick.

Since the user is able to move inside the CAVE, we cannot rely only on the eye orientation to determine the gaze target: a motion-capture sensor sends the head position and orientation to the central computer via the local network, while the eye tracker continuously sends the gaze direction through the serial port. The central computer gathers the information from the eye tracker and from the motion capture, and corrects the inherent instability of the measurements by filtering the input data. Based on those two measurements, the exact gaze vector is obtained by combining the orientation basis from the magnetic sensor with the deviation of the gaze from the head. Knowing the exact configuration of the CAVE, the system detects which plane is observed and the exact pixel looked at. Threshold values and a time delay prevent the target from blinking between two planes at the singularities. To smooth the movement of the selection zone, the cropping box is attached to the gaze target with a damped-spring algorithm; its size is fixed at 320×240 pixels, the maximum resolution of the PDA screen.

A joystick allows the supervisor to move the mobile cameras to get the most appropriate point of view (see figure 11). Orientation is controlled by the pad, and two buttons are used to zoom in and out. By pressing a single button, the user validates his eye selection and sends it to the On-field Agents. The picture is sent via the Internet using a virtual private network for security purposes. We chose to send a single relevant picture rather than the video stream viewed by the supervisor, for readability reasons and for network resource management. Passing the right information at the right moment is crucial: since our camera is aerial, it gets a field of view covering a large area, whereas agents on the ground get a local view of the situation. It is important for them to be able to relate the received image to their field of action. For these reasons we needed, in addition to the zoom, to be able to send a cropped part of the full image. This system provides the supervisor with powerful tools to send visual and vocal information to the On-field Agents.

ON-FIELD AGENT EQUIPMENT:

Handheld communication equipment (a PDA) provides a dynamic interface that reacts to the way it is held by the agent. The system is based on commercial PDA devices using the PocketPC operating system with wireless LAN capabilities. By means of a tilt sensor, the PDA detects whether it is positioned horizontally or vertically and presents different information accordingly. Parts of the live video stream received at the control room can be sent to the handheld, as well as a map of the site enhanced with GPS information coming from the aerial monitoring system (the R/C blimp).

Figure 12: On-field agent communication equipment.

When held vertically (see figure 12), the interface shows a reduced image of the live video stream selected at the control room, as well as a map of the surveyed site; GPS information is used to point out on the map the current position of the blimp. When held horizontally (figure 12), the interface shows a higher-resolution view of the live video stream. This way, On-field Agents can better appreciate specific details while maintaining communication with the control room.

The PDA application is loaded as a web page using the default Internet browser pre-installed on the PDA. One of two web pages is automatically loaded according to the PDA position detected by the tilt sensor: an application running on the PDA reads the tilt sensor values and generates a web page that redirects the browser to the corresponding page generated at the web server. A PC-based application receives information from the control room (the selected region of the video stream) and data from the airborne GPS, and generates the two web pages corresponding to the two modalities (horizontal, vertical). The pages are published through a conventional Apache web server. This way, the alternative data displays are accessible to handheld devices and to any PC connected to the Virtual Private Network of the site under surveillance.

MODULES INTERCOMMUNICATION:

Real-time video transport over the Internet is a complex problem that has often been addressed. Today RTP, the Real-time Transport Protocol, fulfills most of the requirements for this purpose, such as compensating for asynchrony, regulating server/client traffic, and multiplexing different kinds of streams. Many implementations of RTP are available; we chose the one included in the Sun Java Media Framework (JMF) for three reasons: it is free, it contains image-compression utilities useful for reducing the bit rate of the stream, and it is easier to program with than other free libraries.

A dedicated channel connects the Virtual Cockpit and the R/C Blimp Servo Controller Server. It performs in real time, with a latency of less than 1 ms, using 5-byte packets: one byte for the servo number and four bytes for the servo position, represented as a 4-byte float value. Figure 13 summarizes the lag in the communication streams.

Figure 13: The lag times per communication channel.

CONCLUSION:

This paper briefly discussed a full surveillance and security system based on advanced Virtual Reality technologies. The system can be applied to the surveillance of public sites such as stadiums, universities, parks, etc.

REFERENCES:

www.hitl.washington.edu/scivw/EVE/II.G.Military.html
www.portal.acm.org/citation.cfm
www.computerworlduk.com/.../servers-data-centre/infrastructure-management
www.infosectoday.com/Articles/Managed_Services.htm
www.science.howstuffworks.com/virtual-military.html
www.military.com/soldiertech/0,14632,Soldiertech_Science,,00.html