Projectable Interactive Surface Using Microsoft Kinect V2
2015 IEEE/SICE International Symposium on System Integration (SII), December 11-13, 2015, Meijo University, Nagoya, Japan

P. Sharma, R. P. Joshi, R. A. Boby, S. K. Saha, Member, IEEE, and T. Matsumaru, Member, IEEE

*Research supported by Waseda University (2014K-6191, 2014B-352, 2015B-346), Kayamori Foundation of Informational Science Advancement (K26kenXIX453), SUZUKI Foundation (26-Zyo-I29) and BRNS, India. P. Sharma, R. P. Joshi, R. A. Boby and S. K. Saha are with the Dept. of Mechanical Engineering, IIT Delhi, Hauz Khas, New Delhi 110016, India (e-mail: [email protected], [email protected], [email protected], [email protected]). T. Matsumaru is with the Bio-Robotics & Human-Mechatronics Laboratory, Graduate School of IPS, Waseda University, Kitakyushu, Fukuoka, Japan (e-mail: [email protected]).

Abstract—An Image-projective Desktop Varnamala Trainer (IDVT) called SAKSHAR has been designed to improve children's learning through an interactive audio-visual feedback system. This device uses a projector to render a virtual display, which permits the production of large interactive screens. The user's touch is recognized with the help of a Microsoft Kinect Version 2. The entire system is portable, i.e., it can be projected on any planar surface. Since the Kinect does not give precise 3D coordinates of points for detecting a touch, a model that recognizes a touch purely from the position of a single point would not yield accurate results. We have instead modeled the touch action by using multiple points along the trajectory of the hand as it interacts with the surface. Fitting a curve through these points and analyzing the errors is used to make the detection of touch accurate.

I. INTRODUCTION

Primary education is necessary for the overall development of a person as well as of a society. There are major improvements in social indicators when the quality of education improves [1]. The traditional education system, based on procedural and rote learning, does not help children to improve basic skills such as literacy and arithmetic [2]. The use of incomprehensible books makes children shy away from learning and may not be suitable for primary education. This might be a major contributing factor for school dropouts in developing countries [3].

In this paper, we propose a device which can make learning interactive and full of fun. We suggest the use of a virtual touch screen in place of the computer for the primary education of children. This device facilitates a new concept of human-machine interaction for helping younger students to learn.

In recent years, touch screens have become very popular. The devices described in [4], PixelSense [5], and Surface Table [6] are examples of large touchscreens. These devices recognize fingers, hands and objects placed on the screen, enabling vision-based interaction. Though they deliver high-quality graphics with a good interactive experience, they are expensive (about Indian ₹3,00,000 or US $5000) and not easily transportable. Low cost and portability are important requirements for a system to be used in fields like education in developing and underdeveloped countries. With this purpose in mind, "SAKSHAR: Image-projective Desktop Varnamala Trainer (IDVT)" has been designed. The device has been named SAKSHAR, which means "literacy" in Hindi (an Indian language). "Varnamala" is also a Hindi word, which stands for the complete set of alphabets. We feel that devices like SAKSHAR will give a boost to reading skills and improve literacy.

SAKSHAR is based on the SOI (step-on interface) concept, which uses the projected screen as a bidirectional interface: information is transferred from a robot to the user, and the user's instructions are delivered to the robot [7, 8]. Based on the SOI concept, a device named IDAT [9-11] was earlier designed for upper-limb rehabilitation through training to improve eye-hand coordination. SAKSHAR: IDVT is the next generation of IDAT, but designed for educational purposes. To lower the total cost, a cheaper RGBD sensor, the Kinect V2, is proposed in this paper. To use this sensor, a touch-detection method based on statistical information and a calibration procedure were developed. The other difference between the IDAT device and the current device is that IDAT uses only 2D data along a plane close to the projection surface, whereas the proposed setup uses 3D information about the hand movement.

Many earlier implementations of interactive surfaces have used a camera for sensing and thus are not portable [12, 13]. In the proposed setup, we have used an RGBD sensor instead and have overcome the shortcoming of inaccurate detection due to low resolution by using regression analysis. There is one implementation of a projectable touch surface using a Kinect, called Touchless Touch [14], but its calibration was complex and the results were not satisfactory even after multiple trials of calibration. Unlike other devices such as the high-precision multi-touch sensing system [15], which are custom made for LCD-type displays, we are trying to do away with these expensive displays. The devices in [16, 17] can detect touches only on a plane, since their formulation is based on the homography between the camera and the projected plane. RGBD-based sensing can possibly be used to model gestures in 3D as well; this, along with the modeling of 2D gestures, may be considered in the future.

The structure of the paper is as follows: Section II discusses the system, whereas the regression analysis used for touch detection is explained in Section III. Later, sensor calibration is discussed in Section IV. Section V talks about the game that has been developed for educational purposes. The performance of the system and its future scope are discussed in Section VI. Finally, the conclusions are presented in Section VII.

II. SAKSHAR SYSTEM

SAKSHAR, as shown in Fig. 1, consists of a projector, a Microsoft Kinect Version 2 and a PC. The projector projects graphics rendered by the PC on a projection surface. The Kinect, which is mounted on a stand, provides the coordinates at which the surface has been touched. The Kinect [18, 19] is an RGBD sensor made by Microsoft. It has built-in APIs for the detection of various joints of the body and a detection range of 0.8 to 3 m [20]. Currently the projector is mounted on an aluminum stand and the Kinect on a camera stand, both of these being portable. A Microsoft .NET based [21] application was used to build the interface between the sensor and the audio/visual system. Multiple threads run simultaneously to gather and analyze sensor data from the user and to give suitable feedback to the user. A game aiding the learning of alphabets by children was programmed.

Fig. 1. The current setup (user, projector, PC and Kinect).

We have used the Kinect Version 2 for this purpose [22]. The advantage of using a Kinect is that it is about 1/8th of the price of a scanning laser range finder (a laser sensor costs about Indian ₹85,000 (US $1284.10), whereas a Kinect costs about Indian ₹10,000 (US $151.07)) or of highly sensitive cameras, and is thus more affordable. More importantly, we get 3D information rather than just the 2D information available in the case of the IDAT devices. The ability to gather 3D information enables the application of algorithms to correctly detect the gesture of the user or to select one user out of many. On the other hand, we have overcome the disadvantage of a poorer data resolution, compared to a laser sensor, by supporting the sensor with a suitable algorithm. The algorithm used is explained in Section III.
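Section II describes a multi-threaded application in which sensor data is gathered on one thread while another analyzes it and drives the audio/visual feedback. The following is a minimal sketch of that producer/consumer structure, assuming a simulated frame source; the names read_hand_joints, detect_touch and give_feedback are illustrative placeholders, not the authors' .NET implementation, which interfaces with the Kinect V2 SDK directly.

```python
# Minimal sketch of the gather/analyze/feedback threading pattern described in
# Section II. The real system is a Microsoft .NET application talking to a
# Kinect V2; here the sensor is simulated so the structure can run stand-alone.
import queue
import random
import threading
import time

frame_queue = queue.Queue(maxsize=64)   # sensor frames waiting to be analyzed
stop_event = threading.Event()

def read_hand_joints():
    """Placeholder for the Kinect V2 body tracking: returns (t, hand, thumb, tip)."""
    point = lambda: (random.uniform(-0.5, 0.5),
                     random.uniform(-0.5, 0.5),
                     random.uniform(0.8, 3.0))
    return time.time(), point(), point(), point()

def detect_touch(hand, thumb, tip):
    return False                         # stand-in for the regression-based test of Section III

def give_feedback(position):
    print("touch at", position)          # stand-in for the game's audio/visual response

def sensor_thread():
    """Producer: continuously gathers joint data from the sensor."""
    while not stop_event.is_set():
        frame_queue.put(read_hand_joints())
        time.sleep(1 / 30)               # Kinect V2 delivers roughly 30 frames per second

def analysis_thread():
    """Consumer: analyzes frames and triggers feedback when a touch is found."""
    while not stop_event.is_set():
        try:
            t, hand, thumb, tip = frame_queue.get(timeout=0.5)
        except queue.Empty:
            continue
        if detect_touch(hand, thumb, tip):
            give_feedback(hand)

if __name__ == "__main__":
    threads = [threading.Thread(target=sensor_thread, daemon=True),
               threading.Thread(target=analysis_thread, daemon=True)]
    for th in threads:
        th.start()
    time.sleep(2)                        # run briefly, then shut down
    stop_event.set()
```

A bounded queue keeps the analysis from falling behind the roughly 30 fps frame stream without buffering stale data indefinitely.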
III. DETECTION OF TOUCH USING REGRESSION ANALYSIS

A. Purpose

Building a sensitive projectable interactive surface that can compete, in terms of cost, with the existing expensive touch tables without compromising on the user experience was of primary importance. One serious issue with using an RGBD sensor was the compromise on the resolution of the data obtained from the depth camera used in the sensor. To make the device sensitive to touch without compromising on the accuracy of different gestures, we built an algorithm based on regression analysis performed on the data points obtained from the sensor.

A touch was regarded as a hand approaching a surface, touching it and then moving away from it. We can therefore use the data points along the direction of movement of the hand towards the point of contact and then away from it, as shown in Fig. 2. A best-fitting curve through the heights of the points during approach (Figs. 2(a) and 2(b)), touch (Figs. 2(c) and 2(d)) and separation (Figs. 2(e) and 2(f)) gives us a larger number of relevant data points to analyse statistically. This helps us to make a decision based on the motion of the hand, and not on the location of the hand at any particular instant of time alone.
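As a concrete illustration of this idea, the sketch below fits a quadratic to the height of the hand above the projection plane over a short approach-touch-separation window and accepts a touch only if the fitted minimum comes close to the plane and the fit residuals are small. The polynomial degree, the thresholds contact_height and max_residual, and the function name detect_touch are assumptions made for illustration; the paper does not specify these values.

```python
# Illustrative sketch of the curve-fitting idea in Section III.A: fit a curve
# through the height of the hand above the projection plane over a short time
# window and decide "touch" from the fitted minimum and the fit residuals,
# rather than from any single noisy depth sample.
import numpy as np

def detect_touch(times, heights,
                 contact_height=0.02,    # fitted minimum must come within 2 cm of the plane
                 max_residual=0.01):     # fit must describe the samples well (metres, RMS)
    """Return True if the approach-touch-separation trajectory indicates a touch."""
    t = np.asarray(times, dtype=float)
    z = np.asarray(heights, dtype=float)

    # A downward-then-upward motion is roughly parabolic in height over time.
    coeffs = np.polyfit(t, z, deg=2)
    fitted = np.polyval(coeffs, t)

    # Analyze the errors of the fit: large residuals mean the motion was not a
    # clean approach/separation (e.g., a hover or a jittery pass over the plane).
    rms_residual = np.sqrt(np.mean((z - fitted) ** 2))

    a, b, _ = coeffs
    if a <= 0:                           # the curve must open upwards (a dip toward the plane)
        return False
    t_min = -b / (2.0 * a)
    z_min = np.polyval(coeffs, t_min)

    inside_window = t.min() <= t_min <= t.max()
    return inside_window and z_min < contact_height and rms_residual < max_residual

# Example with synthetic data: the hand dips to the plane and rises again.
if __name__ == "__main__":
    t = np.linspace(0.0, 1.0, 30)
    z = 0.3 * (t - 0.5) ** 2 + 0.005 + np.random.normal(0, 0.002, t.size)
    print(detect_touch(t, z))            # expected: True
```

Basing the decision on the whole fitted trajectory, rather than on any single depth sample, is what allows the comparatively noisy Kinect data to support a reliable touch decision.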
B. Overview of the Method Used

The first stage was calibration. Calibration is necessary for the projected screen to behave in accordance with the decisions taken based on the data obtained from the Kinect. The Kinect has its own coordinate system defined with respect to its camera, while for the projected surface we defined another coordinate system. The transformation of coordinates from one system to the other was taken care of by the calibration (Section IV). The calibration of the device, after setting it up, fixes the plane of projection, the coordinate axes and the origin.

When the user's hand was moved over the plane, the joints of the hand were tracked by the Kinect using the inbuilt Kinect Version 2 API. The points of concern were the centre of the hand, the thumb and the tip of the hand. The centroid of the triangle formed by joining these three points
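A minimal sketch of the coordinate handling just described: the three tracked joints are combined into a centroid, and a calibrated frame attached to the projection surface converts Kinect camera coordinates into in-plane coordinates plus a height above the plane. The assumption that the surface frame is built from three calibration points, and the helper names surface_frame, to_surface and hand_centroid, are illustrative; the authors' actual calibration procedure is the subject of their Section IV.

```python
# Sketch of the coordinate handling in Section III.B: the Kinect reports joints
# in its own camera frame, while touch decisions need coordinates in a frame
# attached to the projection surface. Here the surface frame is assumed to be
# defined by three calibration points (origin, a point along the x-axis, and a
# third point in the plane).
import numpy as np

def surface_frame(origin, x_point, plane_point):
    """Build an orthonormal frame (rotation rows, origin) attached to the projection plane."""
    origin = np.asarray(origin, dtype=float)
    x_axis = np.asarray(x_point, dtype=float) - origin
    x_axis /= np.linalg.norm(x_axis)
    normal = np.cross(x_axis, np.asarray(plane_point, dtype=float) - origin)
    normal /= np.linalg.norm(normal)                 # axis perpendicular to the plane
    y_axis = np.cross(normal, x_axis)                # completes a right-handed frame
    rotation = np.vstack([x_axis, y_axis, normal])   # rows: surface axes in camera coordinates
    return rotation, origin

def to_surface(point_camera, rotation, origin):
    """Express a Kinect camera-frame point as (x, y) on the surface plus height above it."""
    return rotation @ (np.asarray(point_camera, dtype=float) - origin)

def hand_centroid(hand_center, thumb, hand_tip):
    """Centroid of the triangle formed by the three tracked hand joints."""
    joints = np.array([hand_center, thumb, hand_tip], dtype=float)
    return joints.mean(axis=0)

if __name__ == "__main__":
    # Assumed calibration points given in the Kinect camera frame (metres).
    R, o = surface_frame([0.0, 0.0, 1.5], [0.5, 0.0, 1.5], [0.0, 0.4, 1.6])
    c = hand_centroid([0.10, 0.05, 1.52], [0.12, 0.04, 1.53], [0.11, 0.07, 1.51])
    print(to_surface(c, R, o))                       # in-plane x, y and height above the plane
```

With points expressed in this surface frame, the height coordinate is exactly the quantity fed to the regression-based touch test sketched above.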