Boman, A., Nilsson, E., Go with the Flow, 2020.Pdf

Total Page:16

File Type:pdf, Size:1020Kb

Boman, A., Nilsson, E., Go with the Flow, 2020.Pdf Examensarbete 30 hp Juni 2020 Go with the flow A study exploring public transit performance using a flow network model Axel Boman Erik Nilsson Acknowledgments For 22 weeks in the spring of 2020, we have had the honor of working with Samtrafiken and the Swedish Public Transport Data Lab (KoDa). We gratefully want to acknowledge Jerry L¨ofvenhaft for providing us with this meaningful context, network, and direction, forming the foundation of this project. Additionally, we want to express our sincere appreciation to our supervisor, Associate Professor Kristiaan Pelckmans, who has the substance of a genius: he convincingly led us through the world of graph mining. Without his clear-sightedness and persistent vision, the goal of this project would not have been realized. Moreover, we would like to recognize Gabriella Canas and Timo Palokangas at UL's traffic unit for accommodating us with a warm welcome and insights into the public transport industry. Also, special thanks to Jonathan Gustafsson and Oscar M¨orke | the beloved janitors at The Park | for always delivering invaluable advice on life and making us laugh during the late working nights. Lastly, we would like to thank room 13:135 at Bl˚asenhus where most of this thesis was written. To most people, it's just a room with four walls, a table, and a whiteboard. To us, it's a parallel universe providing a level of embowerment and focus-instillment we did not know physical space was capable of. Thank you, Akademiska Hus. Axel & Erik June 2020, Uppsala 1 Popul¨arvetenskaplig sammanfattning Idag bor halva v¨arldensbefolkning i stadsomr˚aden. Ar˚ 2050 f¨orv¨antas an- talet ¨oka till 70 procent [1]. V¨arldengenomg˚aren strukturell urbanisering som leder till ekonomiska, sociala och milj¨om¨assigakonsekvenser. St¨aders infrastruktur pressas h˚artmed att bem¨otabefolknings¨okningensinneboende utmaningar; tr¨angseloch f¨ororening[2]. F¨oratt n˚ade globala m˚als¨attningarnainom h˚allbarhet, kr¨avsen omdef- inition av hur vi r¨oross, och en vidareutveckling av systemen vi r¨oross med. Det beh¨ovsen ny verklighet. En verklighet d¨armobilitet samordnas, utv¨arderasoch utformas i linje med regionernas utveckling [2]. Nordiska l¨anderhar generellt s¨atth¨ogtuppsatta m˚alf¨ordenna f¨or¨andring,och i syn- nerhet n¨ardet kommer till kollektivtrafik. K¨openhamn har som m˚als¨attning att 75% av alla resor 2025 ska genomf¨orasmed kollektivtrafik, och Sverige har ett gemensamt m˚alom att dubbla antalet kollektivtrafikresor mellan 2006 och 2020. Man ¨ar¨overens om att systemen bakom kollektivtrafiken m˚astebli mer robusta, effektiva och datadrivna [1]. Denna studie ¨amnaratt ta fasta vid de ledorden, och ta ett steg i samma riktning. Detta arbete unders¨oker de outnyttjade potentialerna i Googles GTFS1- format. Ett format f¨orkollektivtrafikdata inneh˚allandesbl.a. positionering i realtid, information kring schemaavikelser och n¨atverkets organisation. At- traktiviteten i GTFS bottnar framf¨orallti dess imponerande anv¨andarbas; 1240 transitn¨atverk i 672 regioner. Denna utbredning har gjort formatet till en de facto-standard och l¨osningarsom bygger ovanp˚aformatet ¨oppnar indirekt upp d¨orrartill kollektivtrafikn¨ati hela v¨arlden. Formatets design ¨arprim¨artgjord med m˚als¨attningenatt tillgodose applika- tioner som utf¨orreseplanering ˚atslutanv¨andare.D¨aremotkommer detta ar- bete att unders¨oka ifall samma data (dessutom) har potential i termer av att utv¨arderakollektivtrafikn¨atetsprestanda. D.v.s. fr˚anen n¨at¨agaresperspek- 1Ett format f¨orkollektivtrafikdata som g¨ordet m¨ojligtf¨orakt¨oreri kollektivtrafiksek- torn att publicera sina data och f¨orutvecklare att skriva applikationer som konsumerar datan. 2 tiv. Mer specifikt kommer denna uppsats unders¨oka hur lokala s˚arbarheter i n¨atetkan modelleras m.h.a. GTFS-data insamlad fr˚anUL-n¨atetunder Januari 2020. Syftet med detta arbete ¨artv˚afaldigt;F¨orstunders¨okshur specifika s˚arbarhetsegenskaper i ett kollektivtrafikn¨atkan utv¨arderas med hj¨alpav algoritmer inom graph mining2. Slutligen kommer vi unders¨oka m¨ojlighetenatt utveckla en pipeline som aggregerar GTFS data till att passa in i en fl¨odesn¨atsmodell. Resul- tatet visar att det ¨arm¨ojligt,genom det f¨oreslagnaanalytiska ramverket, att modellera och bed¨omas˚arbarheter nod-till-nod i ett kollektivtrafikn¨at baserat p˚aGTFS data. Dessutom visualiseras resultaten i kontext genom Uppsalas n¨atverk (UL) med hj¨alpav ett webbaserat verktyg. 2Graph mining ¨arden upps¨attningverktyg och tekniker som anv¨andsf¨oratt (a) anal- ysera egenskaperna f¨orverkliga grafer, (b) f¨oruts¨agahur strukturen och egenskaperna f¨or en given graf kan p˚averka vissa till¨ampningar,och (c) utveckla modeller som kan generera realistiska grafer som matchar de m¨onstersom finns i verkliga grafer av intresse. 3 Distribution of work This work was created by Axel Boman and Erik Nilsson. All areas covered in this thesis have been researched, written, and reviewed in collaboration. Individual responsibilities were apportioned during the course of the project, which was then audited and aligned every week by both authors. Towards the end of the project, Axel focused on preparing the thesis writing while Erik developed the web-based tool for visualizing the results [5.5]. As both writers have worked with all parts of the thesis, the overall distri- bution of work is close to 50/50. 4 Abbreviations GTFS - General Transit Feed Specification PTN - Public Transit Networks RISE - Research Institutes of Sweden UL - Uppsala Lokaltrafik (Uppsala's transit network) API { Application Programming Interface HI-HS { High impact, high serviceability HI-LS { High impact, low serviceability LI-HS { Low impact, high serviceability LI-LS { Low impact, low serviceability 5 Contents 1 Introduction 8 1.1 Problem formulation . .9 1.2 Purpose . .9 1.2.1 Research questions . .9 1.2.2 Goals . 10 1.3 Thesis outline . 10 2 Background 12 2.1 Industry setting . 12 2.1.1 Coordination in a fragmented sector . 12 2.1.2 From an agency perspective . 13 2.1.3 Infrastructure for distribution of PT data . 14 2.2 The potential of modeling PTN as a graph . 15 2.2.1 Assessing vulnerability with topological features . 16 2.2.2 Assessing vulnerability with serviceability . 16 3 Theory 18 3.1 Key definitions in graph theory . 18 3.2 The concept of flow networks . 19 3.3 Modeling the real-world as flow networks . 22 3.4 Analytical framework . 23 4 Methods 26 4.1 Residual capacity model on an edge-level . 27 6 4.2 The data . 28 4.2.1 Data collection together with RISE . 29 4.2.2 Data preprocessing and cleaning . 30 4.2.3 Data modeling using NetworkX . 34 4.3 Vulnerability assessment . 35 4.4 Visualization using Mapbox . 36 4.5 General presumptions . 36 4.6 Data exploration . 37 5 Results 40 5.1 Capacity model in a PTN context . 40 5.2 Vulnerability assessment using the min-cut algorithm . 42 5.3 Fitting raw GTFS data in a flow network model . 43 5.4 In-context results from Uppsala's transit network (UL) . 44 5.4.1 HI-HS: High impact, high serviceability . 45 5.4.2 LI-HS: Low impact, high serviceability . 46 5.4.3 LI-LS: Low impact, low serviceability . 46 5.4.4 HI-LS, High impact, low serviceability . 47 5.4.5 HI-RLS, High impact, relatively low serviceability . 48 5.5 Interactive web application to visualize results . 49 6 Discussion 51 6.1 Beyond the complete disruption . 51 6.2 Comparison with previous report . 52 6.3 Vulnerability characteristics { not a distinct classification . 53 6.3.1 HI-LS: High impact, low serviceability . 53 6.3.2 HI-RLS: High impact, relatively low serviceability . 54 7 Conclusion 57 7.1 Future work . 58 8 Bibliography 59 7 Chapter 1 Introduction Efficient, flexible, and robust public transport (PT), especially along major commuting arteries, will be needed to reduce traffic congestion caused by urbanization [2]. In order to future proof cities, governments must explore ways of enhancing PT so that it remains an appealing alternative to private transportation and meets the mobility needs of people who depend on it [2]. In recent years, an increasing volume of data related to public transit net- works (PTN) is being generated in real-time [3]. Data following the same General Transit Feed Specification (GTFS1) format [4] and is distributed openly by 1240 transit agencies in 672 locations worldwide. This broad adaptation of GTFS has made it a de facto standard, making a product built on it inherently scalable as it potentially could be deployed in PTN all over the world [3]. 1A format for real-time public transportation data and associated geographic infor- mation allowing public transit agencies to publish their data and developers to write applications to consume it. 8 1.1 Problem formulation As opposed to transit agencies' well-developed data generation capabilities [3], their utilization of their data is often overlooked. Although agencies' data utilization capabilities differ between countries and regions, qualitative legacy methods for evaluating PTN are still being used to a great extent [5][6]. Additionally, the GTFS data format is originally designed to be sufficient for trip planning functionality, rather than for transit performance measures. As a consequence, additional data processing is required to expand the data potential in this context[7]. In this study, we will tap into the potential of using GTFS data from an agency stakeholder perspective to assess transit performance. More specifi- cally, we will outline a data-driven approach for quantifying service disrup- tions to assess network vulnerabilities in a flow network model2. Lastly, our approach will be applied to a real-world context. 1.2 Purpose The academic purpose of this study is (1) to explore how to assess specific properties in terms of transit performance using flow network algorithms and (2) develop a pipeline for processing GTFS data to fit in a flow network model.
Recommended publications
  • Sagemath and Sagemathcloud
    Viviane Pons Ma^ıtrede conf´erence,Universit´eParis-Sud Orsay [email protected] { @PyViv SageMath and SageMathCloud Introduction SageMath SageMath is a free open source mathematics software I Created in 2005 by William Stein. I http://www.sagemath.org/ I Mission: Creating a viable free open source alternative to Magma, Maple, Mathematica and Matlab. Viviane Pons (U-PSud) SageMath and SageMathCloud October 19, 2016 2 / 7 SageMath Source and language I the main language of Sage is python (but there are many other source languages: cython, C, C++, fortran) I the source is distributed under the GPL licence. Viviane Pons (U-PSud) SageMath and SageMathCloud October 19, 2016 3 / 7 SageMath Sage and libraries One of the original purpose of Sage was to put together the many existent open source mathematics software programs: Atlas, GAP, GMP, Linbox, Maxima, MPFR, PARI/GP, NetworkX, NTL, Numpy/Scipy, Singular, Symmetrica,... Sage is all-inclusive: it installs all those libraries and gives you a common python-based interface to work on them. On top of it is the python / cython Sage library it-self. Viviane Pons (U-PSud) SageMath and SageMathCloud October 19, 2016 4 / 7 SageMath Sage and libraries I You can use a library explicitly: sage: n = gap(20062006) sage: type(n) <c l a s s 'sage. interfaces .gap.GapElement'> sage: n.Factors() [ 2, 17, 59, 73, 137 ] I But also, many of Sage computation are done through those libraries without necessarily telling you: sage: G = PermutationGroup([[(1,2,3),(4,5)],[(3,4)]]) sage : G . g a p () Group( [ (3,4), (1,2,3)(4,5) ] ) Viviane Pons (U-PSud) SageMath and SageMathCloud October 19, 2016 5 / 7 SageMath Development model Development model I Sage is developed by researchers for researchers: the original philosophy is to develop what you need for your research and share it with the community.
    [Show full text]
  • Networkx Tutorial
    5.03.2020 tutorial NetworkX tutorial Source: https://github.com/networkx/notebooks (https://github.com/networkx/notebooks) Minor corrections: JS, 27.02.2019 Creating a graph Create an empty graph with no nodes and no edges. In [1]: import networkx as nx In [2]: G = nx.Graph() By definition, a Graph is a collection of nodes (vertices) along with identified pairs of nodes (called edges, links, etc). In NetworkX, nodes can be any hashable object e.g. a text string, an image, an XML object, another Graph, a customized node object, etc. (Note: Python's None object should not be used as a node as it determines whether optional function arguments have been assigned in many functions.) Nodes The graph G can be grown in several ways. NetworkX includes many graph generator functions and facilities to read and write graphs in many formats. To get started though we'll look at simple manipulations. You can add one node at a time, In [3]: G.add_node(1) add a list of nodes, In [4]: G.add_nodes_from([2, 3]) or add any nbunch of nodes. An nbunch is any iterable container of nodes that is not itself a node in the graph. (e.g. a list, set, graph, file, etc..) In [5]: H = nx.path_graph(10) file:///home/szwabin/Dropbox/Praca/Zajecia/Diffusion/Lectures/1_intro/networkx_tutorial/tutorial.html 1/18 5.03.2020 tutorial In [6]: G.add_nodes_from(H) Note that G now contains the nodes of H as nodes of G. In contrast, you could use the graph H as a node in G.
    [Show full text]
  • Networkx: Network Analysis with Python
    NetworkX: Network Analysis with Python Salvatore Scellato Full tutorial presented at the XXX SunBelt Conference “NetworkX introduction: Hacking social networks using the Python programming language” by Aric Hagberg & Drew Conway Outline 1. Introduction to NetworkX 2. Getting started with Python and NetworkX 3. Basic network analysis 4. Writing your own code 5. You are ready for your project! 1. Introduction to NetworkX. Introduction to NetworkX - network analysis Vast amounts of network data are being generated and collected • Sociology: web pages, mobile phones, social networks • Technology: Internet routers, vehicular flows, power grids How can we analyze this networks? Introduction to NetworkX - Python awesomeness Introduction to NetworkX “Python package for the creation, manipulation and study of the structure, dynamics and functions of complex networks.” • Data structures for representing many types of networks, or graphs • Nodes can be any (hashable) Python object, edges can contain arbitrary data • Flexibility ideal for representing networks found in many different fields • Easy to install on multiple platforms • Online up-to-date documentation • First public release in April 2005 Introduction to NetworkX - design requirements • Tool to study the structure and dynamics of social, biological, and infrastructure networks • Ease-of-use and rapid development in a collaborative, multidisciplinary environment • Easy to learn, easy to teach • Open-source tool base that can easily grow in a multidisciplinary environment with non-expert users
    [Show full text]
  • Graph Database Fundamental Services
    Bachelor Project Czech Technical University in Prague Faculty of Electrical Engineering F3 Department of Cybernetics Graph Database Fundamental Services Tomáš Roun Supervisor: RNDr. Marko Genyk-Berezovskyj Field of study: Open Informatics Subfield: Computer and Informatic Science May 2018 ii Acknowledgements Declaration I would like to thank my advisor RNDr. I declare that the presented work was de- Marko Genyk-Berezovskyj for his guid- veloped independently and that I have ance and advice. I would also like to thank listed all sources of information used Sergej Kurbanov and Herbert Ullrich for within it in accordance with the methodi- their help and contributions to the project. cal instructions for observing the ethical Special thanks go to my family for their principles in the preparation of university never-ending support. theses. Prague, date ............................ ........................................... signature iii Abstract Abstrakt The goal of this thesis is to provide an Cílem této práce je vyvinout webovou easy-to-use web service offering a database službu nabízející databázi neorientova- of undirected graphs that can be searched ných grafů, kterou bude možno efektivně based on the graph properties. In addi- prohledávat na základě vlastností grafů. tion, it should also allow to compute prop- Tato služba zároveň umožní vypočítávat erties of user-supplied graphs with the grafové vlastnosti pro grafy zadané uži- help graph libraries and generate graph vatelem s pomocí grafových knihoven a images. Last but not least, we implement zobrazovat obrázky grafů. V neposlední a system that allows bulk adding of new řadě je také cílem navrhnout systém na graphs to the database and computing hromadné přidávání grafů do databáze a their properties.
    [Show full text]
  • Networkx Reference Release 1.9.1
    NetworkX Reference Release 1.9.1 Aric Hagberg, Dan Schult, Pieter Swart September 20, 2014 CONTENTS 1 Overview 1 1.1 Who uses NetworkX?..........................................1 1.2 Goals...................................................1 1.3 The Python programming language...................................1 1.4 Free software...............................................2 1.5 History..................................................2 2 Introduction 3 2.1 NetworkX Basics.............................................3 2.2 Nodes and Edges.............................................4 3 Graph types 9 3.1 Which graph class should I use?.....................................9 3.2 Basic graph types.............................................9 4 Algorithms 127 4.1 Approximation.............................................. 127 4.2 Assortativity............................................... 132 4.3 Bipartite................................................. 141 4.4 Blockmodeling.............................................. 161 4.5 Boundary................................................. 162 4.6 Centrality................................................. 163 4.7 Chordal.................................................. 184 4.8 Clique.................................................. 187 4.9 Clustering................................................ 190 4.10 Communities............................................... 193 4.11 Components............................................... 194 4.12 Connectivity..............................................
    [Show full text]
  • Exploring Network Structure, Dynamics, and Function Using Networkx
    Proceedings of the 7th Python in Science Conference (SciPy 2008) Exploring Network Structure, Dynamics, and Function using NetworkX Aric A. Hagberg ([email protected])– Los Alamos National Laboratory, Los Alamos, New Mexico USA Daniel A. Schult ([email protected])– Colgate University, Hamilton, NY USA Pieter J. Swart ([email protected])– Los Alamos National Laboratory, Los Alamos, New Mexico USA NetworkX is a Python language package for explo- and algorithms, to rapidly test new hypotheses and ration and analysis of networks and network algo- models, and to teach the theory of networks. rithms. The core package provides data structures The structure of a network, or graph, is encoded in the for representing many types of networks, or graphs, edges (connections, links, ties, arcs, bonds) between including simple graphs, directed graphs, and graphs nodes (vertices, sites, actors). NetworkX provides ba- with parallel edges and self-loops. The nodes in Net- sic network data structures for the representation of workX graphs can be any (hashable) Python object simple graphs, directed graphs, and graphs with self- and edges can contain arbitrary data; this flexibil- loops and parallel edges. It allows (almost) arbitrary ity makes NetworkX ideal for representing networks objects as nodes and can associate arbitrary objects to found in many different scientific fields. edges. This is a powerful advantage; the network struc- In addition to the basic data structures many graph ture can be integrated with custom objects and data algorithms are implemented for calculating network structures, complementing any pre-existing code and properties and structure measures: shortest paths, allowing network analysis in any application setting betweenness centrality, clustering, and degree dis- without significant software development.
    [Show full text]
  • Radisson Blu Arlanda Airport Conference Brochure
    P WELCOME P Conveniently situated at Stockholm Arlanda Airport, the Radisson Blu SkyCity Hotel HOTEL GPS COORDINATES RADISSON BLU & Arlanda Airport Conference is perfect for effective meetings in an inspiring environment. 59°38′59.74″N, 17°55′47.45″E ARLANDIA HOTEL Located in the heart of the airport in SkyCity, between the international and domestic terminals, the hotel is a natural meeting place – whether you arrive by plane, regional train, airport bus 273 to / from or commuter train. It only takes 20 minutes to Stockholm City by express train. NORRTÄLJE / ALMUNGE Guests arriving by car can park in the garage directly below the hotel. Our modern conference facilities and flexible rooms make your meeting at Stockholm Arlanda Airport easy to plan and attend. State-of-the-art equipment, innovative food concept and professionally trained E4 to / from staff will make your meeting at Radisson Blu SkyCity Hotel & Arlanda Airport Conference a pleasure. STOCKHOLM, UPPSALA P P BUILDING P P P EXPERIENCE MEETINGS TERMINAL 5 Experience Meetings by Radisson Blu is a concept that ensures successful meetings SAS INTERNATIONAL P BUILDING by harmonizing the tangibles: breakout rooms, connectivity and food, P BUILDING with the intangibles such as: service, satisfaction and sustainability. RADISSON BLU SKYCITY HOTEL MEETINGS An integral part of Experience Meetings, Brain Food is perfect meeting food that improves efficiency, ability to concentrate and mental speed; while lowering stress levels. Designed to elevate your efficiency TERMINAL 2 TERMINAL 4 and creativity, Brain Box touches upon all the senses with walls you can write on, colors that spark SAS DOMESTIC your brain and flexible furniture that gives you the space to think.
    [Show full text]
  • Information on Venue, Abstracts, Deadlines, Accomodation and Transportation
    COST Action CM1104 REDUCIBLE OXIDE CHEMISTRY, STRUCTURE AND FUNCTIONS 2nd General Meeting Uppsala, 06. – 08. November 2013 INFORMATION ON VENUE, ABSTRACTS, DEADLINES, ACCOMODATION AND TRANSPORTATION MEETING VENUE: The Ångström laboratory, Uppsala University (street address: Lägerhyddsvägen 1). The conference hall is The Siegbahn Hall on the entrance floor. The laboratory is located about 2.5 km from the center of town. See the map at http://goo.gl/aLoE03 MEETING TIME : Start "after lunch" on Nov 6 th , end at or before 1 p.m. on Nov 8 th . ABSTRACTS : We welcome all abstracts, and particularly those that concern joint endeavors between different member groups of our COST action. We also especially welcome efforts and applications that emphasize the interconnections and synergies between the areas of our different Working Groups. So if you think that your contribution does this - it is good if you highlight it. The abstact template can be downloaded on the website internal area in the “Meetings” page. DEADLINES : The deadline is Tuesday 10 th SEPTEMBER for registration and abstract submission via the Action website. Within 1-2 weeks you will receive an official invitation from the Chair of our Action through the eCOST system, including information about reimbursement or non-reimbursement. After this you should immediately confirm/cancel your participation via the eCOST link provided in the official invitation mail, NO LATER THAN 25 th SEPTEMBER. ACCOMODATION : There are rather many hotels in Uppsala: Hotel Svava, Grand HotelHörnan and Clarion are good hotels in the vicinity of the Central station and there are several others (see e.g.
    [Show full text]
  • Graph and Network Analysis
    Graph and Network Analysis Dr. Derek Greene Clique Research Cluster, University College Dublin Web Science Doctoral Summer School 2011 Tutorial Overview • Practical Network Analysis • Basic concepts • Network types and structural properties • Identifying central nodes in a network • Communities in Networks • Clustering and graph partitioning • Finding communities in static networks • Finding communities in dynamic networks • Applications of Network Analysis Web Science Summer School 2011 2 Tutorial Resources • NetworkX: Python software for network analysis (v1.5) http://networkx.lanl.gov • Python 2.6.x / 2.7.x http://www.python.org • Gephi: Java interactive visualisation platform and toolkit. http://gephi.org • Slides, full resource list, sample networks, sample code snippets online here: http://mlg.ucd.ie/summer Web Science Summer School 2011 3 Introduction • Social network analysis - an old field, rediscovered... [Moreno,1934] Web Science Summer School 2011 4 Introduction • We now have the computational resources to perform network analysis on large-scale data... http://www.facebook.com/note.php?note_id=469716398919 Web Science Summer School 2011 5 Basic Concepts • Graph: a way of representing the relationships among a collection of objects. • Consists of a set of objects, called nodes, with certain pairs of these objects connected by links called edges. A B A B C D C D Undirected Graph Directed Graph • Two nodes are neighbours if they are connected by an edge. • Degree of a node is the number of edges ending at that node. • For a directed graph, the in-degree and out-degree of a node refer to numbers of edges incoming to or outgoing from the node.
    [Show full text]
  • Networkx: Network Analysis with Python
    NetworkX: Network Analysis with Python Dionysis Manousakas (special thanks to Petko Georgiev, Tassos Noulas and Salvatore Scellato) Social and Technological Network Data Analytics Computer Laboratory, University of Cambridge February 2017 Outline 1. Introduction to NetworkX 2. Getting started with Python and NetworkX 3. Basic network analysis 4. Writing your own code 5. Ready for your own analysis! 2 1. Introduction to NetworkX 3 Introduction: networks are everywhere… Social Mobile phone networks Vehicular flows networks Web pages/citations Internet routing How can we analyse these networks? Python + NetworkX 4 Introduction: why Python? Python is an interpreted, general-purpose high-level programming language whose design philosophy emphasises code readability + - Clear syntax, indentation Can be slow Multiple programming paradigms Beware when you are Dynamic typing, late binding analysing very large Multiple data types and structures networks Strong on-line community Rich documentation Numerous libraries Expressive features Fast prototyping 5 Introduction: Python’s Holy Trinity Python’s primary NumPy is an extension to Matplotlib is the library for include primary plotting mathematical and multidimensional arrays library in Python. statistical computing. and matrices. Contains toolboxes Supports 2-D and 3-D for: Both SciPy and NumPy plotting. All plots are • Numeric rely on the C library highly customisable optimization LAPACK for very fast and ready for implementation. professional • Signal processing publication. • Statistics, and more…
    [Show full text]
  • A New Way to Measure Homophily in Directed Graphs
    The Popularity-Homophily Index: A new way to measure Homophily in Directed Graphs Shiva Oswal UC Berkeley Extension, Berkeley, CA, USA Abstract In networks, the well-documented tendency for people with similar characteris- tics to form connections is known as the principle of homophily. Being able to quantify homophily into a number has a signicant real-world impact, ranging from government fund-allocation to netuning the parameters in a sociological model. This paper introduces the “Popularity-Homophily Index” (PH Index) as a new metric to measure homophily in directed graphs. The PH Index takes into account the “popularity” of each actor in the interaction network, with popularity precalculated with an importance algorithm like Google PageRank. The PH Index improves upon other existing measures by weighting a homophilic edge leading to an important actor over a homophilic edge leading to a less important actor. The PH Index can be calculated not only for a single node in a directed graph but also for the entire graph. This paper calculates the PH Index of two sample graphs, and concludes with an overview of the strengths and weaknesses of the PH Index, and its potential applications in the real world. Keywords: measuring homophily; popularity-homophily index; social interaction networks; directed graphs 1 Introduction Homophily, meaning “love of the same,” captures the tendency of agents, most com- monly people, to connect with other agents that share sociodemographic, behavioral, or intrapersonal characteristics (McPherson et al, 2001). In the language of social interaction networks, homophily is the “tendency for friendships and many other arXiv:2109.00348v1 [cs.SI] 29 Aug 2021 interpersonal relationships to occur between similar people” (Thelwall, 2009).
    [Show full text]
  • PDF Reference
    NetworkX Reference Release 1.0 Aric Hagberg, Dan Schult, Pieter Swart January 08, 2010 CONTENTS 1 Introduction 1 1.1 Who uses NetworkX?..........................................1 1.2 The Python programming language...................................1 1.3 Free software...............................................1 1.4 Goals...................................................1 1.5 History..................................................2 2 Overview 3 2.1 NetworkX Basics.............................................3 2.2 Nodes and Edges.............................................4 3 Graph types 9 3.1 Which graph class should I use?.....................................9 3.2 Basic graph types.............................................9 4 Operators 129 4.1 Graph Manipulation........................................... 129 4.2 Node Relabeling............................................. 134 4.3 Freezing................................................. 135 5 Algorithms 137 5.1 Boundary................................................. 137 5.2 Centrality................................................. 138 5.3 Clique.................................................. 142 5.4 Clustering................................................ 145 5.5 Cores................................................... 147 5.6 Matching................................................. 148 5.7 Isomorphism............................................... 148 5.8 PageRank................................................. 161 5.9 HITS..................................................
    [Show full text]