Combining Open-Source Programming Languages with GIS for Spatial Data Science
Technische Universität München
Department of Civil, Geo and Environmental Engineering
Chair of Cartography
Prof. Dr.-Ing. Liqiu Meng

Combining Open-source Programming Languages with GIS for Spatial Data Science

Maja Kalinic

Master's Thesis

Duration: 01.04.2017 - 30.09.2017
Study Course: Cartography M.Sc.
Supervisors: Dr. Mathias Jahnke (TUM)
             Prof. Dr. Menno-Jan Kraak (ITC)
             Dr. Jan Wilkening (Esri Deutschland GmbH)
Cooperation: Esri Deutschland GmbH, Kranzberg

2017

Table of Contents

DECLARATION OF AUTHORSHIP
ACKNOWLEDGMENTS
LIST OF FIGURES
ABSTRACT
CHAPTER
I INTRODUCTION
    1.1. Motivation and problem statement
    1.2. Research identification
    1.3. Research questions
II RELATED WORK
    2.1. Open-source
    2.2. Programming languages
    2.3. GIS software
    2.4. Data Science and Big Data
    2.5. Crime analysis - overview
        2.5.1. ArcGIS Desktop in Crime analyses
        2.5.2. Tools – Python and R for Crime Analysis
III CASE STUDY
    3.1. Data
    3.2. Methodology
    3.3. Workflow
        3.3.1. Strategic analysis in Python
        3.3.3. Tactical analysis with R-ArcGIS Bridge
        3.3.4. Predictive policing with Python scikit-learn
IV OUTPUTS
V DISCUSSION
    5.1. Discussion of the findings
    5.2. Finding about San Francisco crime data
VI CONCLUSIONS AND FINAL REMARKS
    6.1. Summary
    6.2. Conclusions
    6.3. Limitations and Further study recommendations
BIBLIOGRAPHY
ABBREVIATIONS
APPENDIX 1
APPENDIX 2
APPENDIX 3
APPENDIX 4
APPENDIX 5
APPENDIX 6
APPENDIX 7

DECLARATION OF AUTHORSHIP

I hereby certify that this thesis has been composed by me and is based on my own work, unless stated otherwise. No other person’s work has been used without due acknowledgement in this thesis. All references and verbatim extracts have been quoted, and all sources of information have been specifically acknowledged.

Date: 22.09.2017.
Signature: Maja Kalinic

ACKNOWLEDGMENTS

After an intensive period of six months, today is the day: writing this note of thanks is the finishing touch of my master's thesis work. It has been a period of intense learning for me, not only in the scientific domain, but also on a personal level.
I would like to reflect on the people who have supported and helped me so much throughout this period.

I would like to express my sincere gratitude to my supervisors, Dr. Mathias Jahnke, Prof. Dr. Menno-Jan Kraak and Dr. Jan Wilkening, for their continuous support during my thesis research, and for their patience, motivation, and immense knowledge. Your guidance helped me in all aspects of this research and the writing of this thesis. I could not have imagined having better supervisors for my master's thesis work.

Special thanks to Corné van Elzakker, Master’s thesis semester coordinator, for his insightful comments and encouragement, but also for the motivating questions which stimulated me to widen my research from various perspectives.

My special thanks go to my friend, muse and one of the greatest people that I have ever met, Juline Cron, our program coordinator. Thank you for all your support, patience, generosity and kindness. None of this would have been possible without you always being there for me, saying the right words at the right time and making me believe in myself to the very end.

I would like to thank all my friends, my boyfriend and my loved ones, who have supported me throughout the entire process, both by keeping me harmonious and by helping me put the pieces together. I will be grateful forever for your love and support. Special thanks to my friend Victoria Curl, for proofreading and inspirational words.

Final special thanks go to my mom, who has been with me in the hardest moments and inspired me to never give up.

Thank you all for being with me on this journey.
Maja Kalinic

LIST OF FIGURES

Figure 1: Thesis structure
Figure 2: GIS software overview and ranking
Figure 3: Data analysis pipeline
Figure 4: Overview of the crime occurrence in San Francisco
Figure 5: Crime counts per district
Figure 6: Number of reported crimes per Year
Figure 7: Number of reported crimes per Month
Figure 8: Number of reported crimes per Day of Week
Figure 9: Number of reported crimes per Day of Month
Figure 10: Number of reported crimes per Hour
Figure 11: Relative number of top crimes per weekday
Figure 12: Crime occurrence by Hour
Figure 13: Map of crime occurrence per District
Figure 14: Density map of crime occurrence per District
Figure 15: Relationship between day of the week and crime volume
Figure 16: Relationship between crime rates and districts
Figure 17: Relation between crime types and day of the week
Figure 18: Relation between crime types and months
Figure 19: Larceny/Theft per District
Figure 20: Larceny/Theft on Friday