Plenoptic Imaging and Vision Using Angle Sensitive Pixels
PLENOPTIC IMAGING AND VISION USING ANGLE SENSITIVE PIXELS

A Dissertation Presented to the Faculty of the Graduate School of Cornell University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

by Suren Jayasuriya
January 2017

© 2017 Suren Jayasuriya
ALL RIGHTS RESERVED
This document is in the public domain.

PLENOPTIC IMAGING AND VISION USING ANGLE SENSITIVE PIXELS
Suren Jayasuriya, Ph.D.
Cornell University 2017

Computational cameras with sensor hardware co-designed with computer vision and graphics algorithms are an exciting recent trend in visual computing. In particular, most of these new cameras capture the plenoptic function of light, a multidimensional function of radiance for light rays in a scene. Such plenoptic information can be used for a variety of tasks including depth estimation, novel view synthesis, and inferring physical properties of a scene that the light interacts with.

In this thesis, we present multimodal plenoptic imaging, the simultaneous capture of multiple plenoptic dimensions, using Angle Sensitive Pixels (ASP), custom CMOS image sensors with embedded per-pixel diffraction gratings. We extend ASP models for plenoptic image capture, and showcase several computer vision and computational imaging applications. First, we show how high-resolution 4D light fields can be recovered from ASP images, using both a dictionary-based machine learning method and deep learning. We then extend ASP imaging to include the effects of polarization, and use this new information to image stress-induced birefringence and to remove specular highlights from light field depth mapping. We explore the potential for ASPs to perform time-of-flight imaging, and introduce the depth field, a combined representation of time-of-flight depth with plenoptic spatio-angular coordinates, which is used for applications in robust depth estimation. Finally, we leverage ASP optical edge filtering as a low-power front end for an embedded deep learning imaging system. We also present two technical appendices: a study of using deep learning with energy-efficient binary gradient cameras, and a design flow to enable agile hardware design for computational image sensors in the future.

BIOGRAPHICAL SKETCH

The author was born on July 19, 1990. From 2008 to 2012 he studied at the University of Pittsburgh, where he received a Bachelor of Science in Mathematics with departmental honors and a Bachelor of Arts in Philosophy. He then moved to Ithaca, New York to study at Cornell University from 2012 to 2017, where he earned his doctorate in Electrical and Computer Engineering.

To my parents and brother.

ACKNOWLEDGEMENTS

I am grateful to my advisor Al Molnar for all his mentoring over these past years. It was a risk to accept a student whose background was mathematics and philosophy, and who had no experience in ECE or CS, but Al was patient enough to allow me to make mistakes and grow as a researcher. I also want to thank my committee members Alyssa Apsel and Steve Marschner for their advice throughout my graduate career.

Several people made important contributions to this thesis. Sriram Sivaramakrishnan was my go-to expert on ASPs, and co-author on many of my imaging papers. Matthew Hirsch and Gordon Wetzstein, under the supervision of Ramesh Raskar, collaborated on the dictionary-based learning for recovering 4D light fields in Chapter 3.
Arjun Jauhari, Mayank Gupta, and Kuldeep Kulkarni, under the supervision of Pavan Turaga, performed deep learning experiments to speed up light field reconstructions in Chapter 3. Ellen Chuang and Debashree Guruaribam performed the polarization experiments in Chapter 4. Adithya Pediredla and Ryuichi Tadano helped create the depth fields experimental prototype and worked on applications in Chapter 5. George Chen realized ASP Vision from conception with me, and Jiyue Yang and Judy Stephen collected and analyzed the real digit and face recognition experiments in Chapter 6. Ashok Veeraraghavan helped supervise the projects in Chapters 5 and 6.

The work in Appendix A was completed during a summer internship at NVIDIA Research under the direction of Orazio Gallo, Jinwei Gu, and Jan Kautz, with the face detection results compiled by Jinwei Gu, the gesture recognition results run by Pavlo Molchanov, and the gradient images for intensity reconstruction captured by Orazio Gallo. The design flow in Appendix B was developed in collaboration with Chris Torng, Moyang Wang, Nagaraj Murali, Bharath Sudheendra, Mark Buckler, Einar Veizaga, Shreesha Srinath, and Taylor Pritchard under the supervision of Chris Batten. Jiyue Yang also helped with the design and layout of ASP depth field pixels, and Gerd Kiene designed the amplifier presented in Appendix B.

Special thanks go to Achuta Kadambi for many long discussions about computational imaging. I also enjoyed being a fly on the wall of the Cornell Graphics/Vision group, especially chatting with Kevin Matzen, Scott Wehrwein, Paul Upchurch, Kyle Wilson, and Sean Bell.

My heartfelt gratitude goes to my friends and family who supported me while I undertook this journey. I couldn't have done it without Buddy Rieger, Crystal Lee Morgan, Brandon Hang, Chris Hazel, Elly Engle, Steve Miller, Kevin Luke, Ved Gund, Alex Ruyack, Jayson Myers, Ravi Patel, and John McKay. Finally, I'd like to thank my brother Gihan and my parents for their love and support all these years. This thesis is dedicated to them.

TABLE OF CONTENTS

Biographical Sketch
Dedication
Acknowledgements
Table of Contents
List of Tables
List of Figures

1 Introduction
  1.1 Dissertation Overview
2 Background
  2.1 Plenoptic Function of Light
  2.2 Applications of Plenoptic Imaging
  2.3 Computational Cameras
  2.4 Angle Sensitive Pixels
    2.4.1 Background
    2.4.2 Applications
    2.4.3 Our Contribution
3 Angle
  3.1 Introduction
  3.2 Related Work
  3.3 Method
    3.3.1 Light Field Acquisition with ASPs
    3.3.2 Image and Light Field Synthesis
  3.4 Analysis
    3.4.1 Frequency Analysis
    3.4.2 Depth of Field
    3.4.3 Resilience to Noise
  3.5 Implementation
    3.5.1 Angle Sensitive Pixel Hardware
    3.5.2 Software Implementation
  3.6 Results
  3.7 Compressive Light Field Reconstructions using Deep Learning
    3.7.1 Related Work
    3.7.2 Deep Learning for Light Field Reconstruction
    3.7.3 Experimental Results
    3.7.4 Discussion
    3.7.5 Limitations
    3.7.6 Future Directions
4 Polarization
  4.1 Polarization Response
  4.2 Applications
  4.3 Conclusions and Future Work
5 Time-of-Flight
  5.1 Motivation for Depth Field Imaging
  5.2 Related Work
  5.3 Depth Fields
    5.3.1 Light Fields
    5.3.2 Time-of-Flight Imaging
    5.3.3 Depth Fields
  5.4 Methods to Capture Depth Fields
    5.4.1 Pixel Designs
    5.4.2 Angle Sensitive Photogates
    5.4.3 Experimental Setup
  5.5 Experimental Setup
  5.6 Applications of Depth Fields
    5.6.1 Synthetic Aperture Refocusing
    5.6.2 Phase Wrapping Ambiguities
    5.6.3 Refocusing through Partial Occluders
    5.6.4 Refocusing past Scattering Media
  5.7 Discussion
    5.7.1 Limitations
  5.8 Conclusions and Future Work
6 Visual Recognition
  6.1 Introduction
    6.1.1 Motivation and Challenges
    6.1.2 Our Proposed Solution
    6.1.3 Contributions
    6.1.4 Limitations
  6.2 Related Work
  6.3 ASP Vision
    6.3.1 Hardcoding the First Layer of CNNs
    6.3.2 Angle Sensitive Pixels
  6.4 Analysis
    6.4.1 Performance on Visual Recognition Tasks
    6.4.2 FLOPS Savings
    6.4.3 Noise Analysis
    6.4.4 ASP Parameter Design Space
  6.5 Hardware Prototype and Experiments
  6.6