Facial-Expression Analysis for Predicting Unsafe Driving Behavior
A system for tracking driver facial features aims to enhance the predictive accuracy of driver-assistance systems. The authors identify key facial features at varying pre-accident intervals and use them to predict minor and major accidents.

Maria E. Jabon, Jeremy N. Bailenson, Emmanuel Pontikakis, Leila Takayama, and Clifford Nass, Stanford University

Every year in the US alone, more than 42,000 Americans die as a result of 6.8 million automobile accidents.1 Consequently, driver-safety technology is an active research area in both industry and academia. Pervasive computing environments, with integrated sensors and networking, can provide an ideal platform for developing such technology. Taking effective countermeasures to enhance safe vehicle operation requires merging information from diverse sources.

As a first step, an active driver-safety system (a system designed to prevent accidents) must monitor the vehicle's state and surroundings (see the “Related Work in Active Driver Safety Systems” sidebar). However, to fully transform the vehicle into a smart environment,2 the driver must also be monitored. Human-factors researchers have long studied the driver's role in causing and preventing accidents and have found that the driver's physical and emotional state, including fatigue3 and stress levels,4 plays a role in a significant number of traffic accidents. Thus, many researchers have begun developing active driver-safety systems that monitor the driver as well as the vehicle.

We propose an active driver-safety framework that captures both vehicle dynamics and the driver's face. We then merge the two levels of data to produce an accident prediction and investigate the framework's performance. Our study differs from previous work in active driver safety in four ways. First, we use a bottom-up approach, analyzing the movement of a comprehensive set of 22 raw facial features rather than simple metafeatures such as eye gaze or head orientation. Second, we evaluate a range of time- and frequency-domain statistics to determine the most valuable statistics for driving-accident prediction. Third, we predict major and minor accidents directly, not intermediate driver states such as fatigue. Finally, we explore the use of the face and car outputs at varying pre-accident intervals, uncovering important temporal trends in predictive accuracy for each feature subset.

Sidebar: Related Work in Active Driver Safety Systems

Much work has been done in the area of holistic vehicle sensing for active driver safety, including systems for monitoring the vehicle environment, the vehicle state, and, more recently, the driver state. Systems developed to monitor the vehicle environment include pedestrian and obstacle detectors, lane-guidance systems, rear-bumper proximity sensors, blind-spot car detectors, automatic windshield wipers, and surround-imaging systems for parking assistance.1–6 Systems for monitoring the vehicle state include tracking vehicle location via GPS and using accelerometers and other sensors to monitor driving speed, steering wheel angle, braking, and acceleration.7,8 Systems for monitoring the driver state include frameworks that gauge driver fatigue, drowsiness, or stress levels.9–14

In this article, we extend previous work by using computer-vision algorithms to directly map specific facial features to unsafe driving behavior. We use a comprehensive set of raw facial-feature points that are absent in previous work, including points around the nose. Furthermore, we don't infer any specific mental states such as fatigue but rather implement a more empirical approach that uses machine-learning algorithms to find and use the facial features that are most correlated with accidents. In addition, we identify important temporal trends in predictive accuracy for each feature subset, revealing how to best use the face to improve the predictive accuracy of classifiers up to four seconds before a driving accident.
References

1. L. Li et al., “IVS 05: New Developments and Research Trends for Intelligent Vehicles,” IEEE Intelligent Systems, vol. 20, no. 4, 2005, pp. 10–14.
2. T. Gandhi and M. Trivedi, “Vehicle Surround Capture: Survey of Techniques and a Novel Omni Video-Based Approach for Dynamic Panoramic Surround Maps,” IEEE Trans. Intelligent Transportation Systems, vol. 7, no. 3, 2006, pp. 293–308.
3. J. McCall and M. Trivedi, “Video-Based Lane Estimation and Tracking for Driver Assistance: Survey, System, and Evaluation,” IEEE Trans. Intelligent Transportation Systems, vol. 7, no. 1, 2006, pp. 20–37.
4. M. Bertozzi, A. Broggi, and A. Lasagni, “Infrared Stereo Vision-Based Pedestrian Detection,” Proc. IEEE Intelligent Vehicles Symp., IEEE Press, 2005, pp. 24–29.
5. Y. Suzuki, T. Fujii, and M. Tanimoto, “Parking Assistance Using Multi-Camera Infrastructure,” Proc. IEEE Intelligent Vehicles Symp., IEEE Press, 2005, pp. 106–110.
6. H. Kurihata, T. Takahashi, and I. Ide, “Rainy Weather Recognition from In-Vehicle Camera Images for Driver Assistance,” Proc. IEEE Intelligent Vehicles Symp., IEEE Press, 2005, pp. 205–210.
7. J. Waldo, “Embedded Computing and Formula One Racing,” IEEE Pervasive Computing, vol. 4, no. 3, 2005, pp. 18–21.
8. M. Bertozzi et al., Knowledge-Based Intelligent Information and Engineering Systems, B. Apolloni et al., eds., Springer, 2007, pp. 704–711.
9. M. Trivedi, T. Gandhi, and J. McCall, “Looking-In and Looking-Out of a Vehicle: Computer-Vision-Based Enhanced Vehicle Safety,” IEEE Trans. Intelligent Transportation Systems, vol. 8, no. 1, 2007, pp. 108–120.
10. A. Williamson and T. Chamberlain, Review of On-Road Driver Fatigue Monitoring Devices, NSW Injury Risk Management Research Centre, Univ. of New South Wales, 2005.
11. J.A. Healy and R.W. Picard, “Detecting Stress During Real-World Driving Tasks Using Physiological Sensors,” IEEE Trans. Intelligent Transportation Systems, vol. 6, no. 2, 2005, pp. 156–166.
12. A. Doshi and M.M. Trivedi, “On the Roles of Eye Gaze and Head Dynamics in Predicting Driver's Intent to Change Lanes,” IEEE Trans. Intelligent Transportation Systems, vol. 10, no. 3, 2009, pp. 453–462.
13. L. Fletcher, L. Petersson, and A. Zelinsky, “Road Scene Monotony Detection in a Fatigue Management Driver Assistance System,” Proc. IEEE Intelligent Vehicles Symp., IEEE Press, 2005, pp. 484–489.
14. Q. Ji, Z. Zhu, and P. Lan, “Real-Time Nonintrusive Monitoring and Prediction of Driver Fatigue,” IEEE Trans. Vehicular Technology, vol. 53, no. 4, 2004, pp. 1052–1068.

Experimental Testbed

We recruited 49 undergraduate students to drive a 40-minute simulated course in a STISIM driving simulator. We set up the simulator, developed by Systems Technology,5 to run on a single PC and project the simulated image of the roadway onto a white wall in the lab (see Figure 1).

Figure 1. A study participant being monitored in the STISIM driving simulator. The simulator runs on a single PC and projects the image of the roadway onto a white wall in the lab.

During the driving course, we projected the sounds of the car, the road, and events happening around the car into the simulator room via four PC speakers. The course simulated driving in a suburban environment with conditions varying from light to intense traffic. We included such challenging situations as busy intersections, unsafe drivers, construction zones, sharp turns, and jaywalking pedestrians in an effort to increase the drive's complexity (see Table 1).

We used a virtual driving simulator instead of real cars in order to safely collect a large sample size of accidents to use in our analyses. This let us generate separate models for major accidents (for example, hitting pedestrians, other vehicles, or off-road objects) and minor accidents (such as unwarranted lane changes, driving off the road, or running a stoplight).

During the experimental sessions, we recorded participants' faces with two Logitech Web cameras at a rate of 15 frames per second, compressing the videos to AVI format in real time using DirectX and DivX technology. Although many technologies can capture facial movements, we opted for image-based capture because it doesn't require special markers or user intervention; our system is thus less intrusive, increasing transparency. We also recorded the simulator's output during the driving sessions: a text-based log file listing road conditions, steering wheel angle, lane-tracking information, car speed, longitudinal acceleration (in feet/second²), braking information, and the number and type of accidents.
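Because the face video and the simulator log are captured by separate processes, the two streams must share a common timeline before any per-accident data can be pulled out. The article doesn't specify the log's exact layout, so the following Python sketch is a minimal illustration only: it assumes a comma-separated log with a header row and a time column in seconds, and every name in it (load_simulator_log, frame_to_log_row, the time field) is our own rather than part of STISIM's format.

```python
import csv

FPS = 15  # video capture rate reported in the article


def load_simulator_log(path):
    """Parse the text-based simulator log into time-sorted rows.

    Assumes one sample per line with a 'time' column in seconds plus
    named columns (speed, wheel angle, braking, ...); the real STISIM
    log format may differ.
    """
    with open(path, newline="") as f:
        rows = [(float(rec["time"]), rec) for rec in csv.DictReader(f)]
    rows.sort(key=lambda r: r[0])
    return rows


def frame_to_log_row(frame_idx, log_rows):
    """Return the simulator sample closest in time to a video frame."""
    t = frame_idx / FPS
    return min(log_rows, key=lambda r: abs(r[0] - t))[1]
```

A linear scan per frame is enough for a sketch; a real pipeline would binary-search the sorted timestamps (for example, with Python's bisect module).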
Analysis Procedure

To extract facial feature points from the collected videos, we used the Neven Vision library.6 For each video frame, the Neven Vision library automatically detects, without any preset markers worn on the user's face, the x and y coordinates of 22 points on the face, eye- and mouth-openness levels, and head movements.

We then synchronized this facial data with the simulator's log so that we could extract the pre-accident facial geometry and vehicle information. Our goal was to determine the optimal way to combine the facial movements and vehicle outputs to predict when accidents would occur. Thus, we sampled several pre-accident time intervals beginning between one and four seconds before the accident and ranging from one to 10 seconds long. For each interval, we extracted the data preceding every major and minor accident in our dataset, along with a random number of nonaccident intervals, to use in our analyses.

Time Series Statistics Calculation

After data synchronization, we computed a series of time-domain statistics on the coordinates in each interval to use as inputs to our classifiers. We calculated averages, velocities, maximums, minimums, standard deviations, and ranges for each of the Neven Vision outputs and for the vehicle outputs such as speed, wheel angle, throttle, and braking.
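To make these two steps concrete, the sketch below extracts one pre-accident window from a per-frame feature series and computes the six statistics named above. The 15 fps rate and the window parameters come from the article; reading "velocity" as the mean absolute frame-to-frame change is our assumption, as are all function and variable names.

```python
import numpy as np

FPS = 15  # facial features arrive at the 15 fps video rate


def extract_interval(series, accident_t, offset_s=1.0, length_s=4.0):
    """Slice the window ending offset_s seconds before an accident.

    series is an (n_frames, n_features) array of per-frame outputs
    (facial coordinates or vehicle signals); accident_t is the accident
    time in seconds from the start of the drive.
    """
    end = max(0, int(round((accident_t - offset_s) * FPS)))
    start = max(0, end - int(round(length_s * FPS)))
    return series[start:end]


def time_domain_stats(window):
    """Concatenate the per-feature statistics listed in the article."""
    velocity = np.abs(np.diff(window, axis=0)) * FPS  # change per second
    return np.concatenate([
        window.mean(axis=0),     # average
        velocity.mean(axis=0),   # velocity (mean absolute change)
        window.max(axis=0),      # maximum
        window.min(axis=0),      # minimum
        window.std(axis=0),      # standard deviation
        np.ptp(window, axis=0),  # range (max minus min)
    ])
```

Even for the 22 tracked facial points alone, this produces six statistics for each of the 44 x and y coordinates (264 candidate inputs per window) before the openness, head-movement, and vehicle outputs are added, which is why finding the features most correlated with accidents matters.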