Mining Time-Stamped Co-Editing Networks from Large Git Repositories
Total Page:16
File Type:pdf, Size:1020Kb
Load more
										Recommended publications
									
								- 
												  Networkx: Network Analysis with PythonNetworkX: Network Analysis with Python Salvatore Scellato Full tutorial presented at the XXX SunBelt Conference “NetworkX introduction: Hacking social networks using the Python programming language” by Aric Hagberg & Drew Conway Outline 1. Introduction to NetworkX 2. Getting started with Python and NetworkX 3. Basic network analysis 4. Writing your own code 5. You are ready for your project! 1. Introduction to NetworkX. Introduction to NetworkX - network analysis Vast amounts of network data are being generated and collected • Sociology: web pages, mobile phones, social networks • Technology: Internet routers, vehicular flows, power grids How can we analyze this networks? Introduction to NetworkX - Python awesomeness Introduction to NetworkX “Python package for the creation, manipulation and study of the structure, dynamics and functions of complex networks.” • Data structures for representing many types of networks, or graphs • Nodes can be any (hashable) Python object, edges can contain arbitrary data • Flexibility ideal for representing networks found in many different fields • Easy to install on multiple platforms • Online up-to-date documentation • First public release in April 2005 Introduction to NetworkX - design requirements • Tool to study the structure and dynamics of social, biological, and infrastructure networks • Ease-of-use and rapid development in a collaborative, multidisciplinary environment • Easy to learn, easy to teach • Open-source tool base that can easily grow in a multidisciplinary environment with non-expert users
- 
												  STATS 701 Data Analysis Using Python Lecture 27: Apis and Graph Processing Some Slides Adapted from CSTATS 701 Data Analysis using Python Lecture 27: APIs and Graph Processing Some slides adapted from C. Budak Previously: Scraping Data from the Web We used BeautifulSoup to process HTML that we read directly We had to figure out where to find the data in the HTML This was okay for simple things like Wikipedia… ...but what about large, complicated data sets? E.g., Climate data from NOAA; Twitter/reddit/etc.; Google maps Many websites support APIs, which make these tasks simpler Instead of scraping for what we want, just ask! Example: ask Google Maps for a computer repair shop near a given address Three common API approaches Via a Python package Service (e.g., Google maps, ESRI*) provides library for querying DB Example: from arcgis.gis import GIS Via a command-line tool Ultimately, all three of these approaches end up submitting an Example: twurl https://developer.twitter.com/ HTTP request to a server, which returns information in the form of a Via HTTP requests JSON or XML file, typically. We submit an HTTP request to a server Supply additional parameters in URL to specify our query Example: https://www.yelp.com/developers/documentation/v3/business_search * ESRI is a GIS service, to which the university has a subscription: https://developers.arcgis.com/python/ Web service APIs Step 1: Create URL with query parameters Example (non-working): www.example.com/search?key1=val1&key2=val2 Step 2: Make an HTTP request Communicates to the server what kind of action we wish to perform https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol#Request_methods Step 3: Server returns a response to your request May be as simple as a code (e.g., 404 error)..
- 
												  Fastnet: an R Package for Fast Simulation and Analysis of Large-Scale Social NetworksJSS Journal of Statistical Software November 2020, Volume 96, Issue 7. doi: 10.18637/jss.v096.i07 fastnet: An R Package for Fast Simulation and Analysis of Large-Scale Social Networks Xu Dong Luis Castro Nazrul Shaikh Tamr Inc. World Bank Cecareus Inc. Abstract Traditional tools and software for social network analysis are seldom scalable and/or fast. This paper provides an overview of an R package called fastnet, a tool for scaling and speeding up the simulation and analysis of large-scale social networks. fastnet uses multi-core processing and sub-graph sampling algorithms to achieve the desired scale-up and speed-up. Simple examples, usages, and comparisons of scale-up and speed-up as compared to other R packages, i.e., igraph and statnet, are presented. Keywords: social network analysis, network simulation, network metrics, multi-core process- ing, sampling. 1. Introduction It has been about twenty years since the introduction of social network analysis (SNA) soft- ware such as Pajek (Batagelj and Mrvar 1998) and UCINET (Borgatti, Everett, and Freeman 2002). Though these software packages are still existent, the last ten years have witnessed a significant change in the needs and aspiration of researchers working in the field. The growth of popular online social networks, such as Facebook, Twitter, LinkedIn, Snapchat, and the availability of data from large systems such as the telecommunication system and the inter- net of things (IoT) has ushered in the need to focus on computational and data management issues associated with SNA. During this period, several Python (Van Rossum et al. 2011) and Java based SNA tools, such as NetworkX (Hagberg, Schult, and Swart 2008) and SNAP (Leskovec and Sosič 2016), and R (R Core Team 2020) packages, such as statnet (Hunter, Handcock, Butts, Goodreau, and Morris 2008; Handcock et al.
- 
												  Network Visualization with IgraphUsing igraph for Visualisations Dr Jamsheed Shorish The Australian National University [email protected] 15 July 2013 - Beihang University Introduction I igraph is a network analysis and visualisation software package, currently for R and Python. I It can be found at igraph.sourceforge.net. I For our course, we will be using igraph for R . Screenshot from the igraph website, depicting an Erdős - Rényi graph and associated colour-coded components, or clusters. Installing igraph for R I Installation of igraph for R is very simple–the command is: > install.packages( 'igraph ') I You may need to specify a particular directory if e.g. you don’t have privileges to install to the system location for R : > install.packages( 'igraph ', lib='my/ package/location ') I To load the library, use > library( 'igraph ',lib) I or > library( 'igraph ', lib. loc='my/package/ location ') Loading a Dataset I The first thing to do is to get some data! I For consistency I’ll assume that all data is loaded in graphml format. I This can be exported by the Python networkx package. I To load a network dataset from a file, use: > G = read.graph( 'network.graphml ', format ='graphml ') I Confirm that your dataset is recognised by igraph : > G IGRAPHD-W- 560 1257 -- \ + attr: label (v/c), id (v/c), weight (e/n ), id (e/c) Layout of a Graph I To visualise a network well, use the layout function of igraph to specify the layout prior to plotting. I Different networks work best with different layouts–this is more art than science.
- 
												  A Comparative Analysis of Large-Scale Network Visualization ToolsA Comparative Analysis of Large-scale Network Visualization Tools Md Abdul Motaleb Faysal and Shaikh Arifuzzaman Department of Computer Science, University of New Orleans New Orleans, LA 70148, USA. Email: [email protected], [email protected] Abstract—Network (Graph) is a powerful abstraction for scalability for large network analysis. Such study will help a representing underlying relations and structures in large complex person to decide which tool to use for a specific case based systems. Network visualization provides a convenient way to ex- on his need. plore and study such structures and reveal useful insights. There exist several network visualization tools; however, these vary in One such study in [3] compares a couple of tools based on terms of scalability, analytics feature, and user-friendliness. Due scalability. Comparative studies on other visualization metrics to the huge growth of social, biological, and other scientific data, should be conducted to let end users have freedom in choosing the corresponding network data is also large. Visualizing such the specific tool he needs to use. In this paper, we have chosen large network poses another level of difficulty. In this paper, we five widely used visualization tools for an empirical and identify several popular network visualization tools and provide a comparative analysis based on the features and operations comparative analysis. Our comparisons are based on factors these tools support. We demonstrate empirically how those tools such as supported file formats, scalability in visualization scale to large networks. We also provide several case studies of and analysis, interacting capability with the drawn network, visual analytics on large network data and assess performances end user-friendliness (e.g., users with no programming back- of the tools.
- 
												  Luatex LunaticE34 MAPS 39 Luigi Scarso LuaTEX lunatic And Now for Something Completely Different Examples are now hosted at contextgarden [35] while – Monty Python, 1972 [20] remains for historical purposes. Abstract luatex lunatic is an extension of the Lua language of luatex to permit embedding of a Python interpreter. Motivations & goals A Python interpreter hosted in luatex allows macro pro- TEX is synonymous with portability (it’s easy to im- grammers to use all modules from the Python standard li- plement/adapt TEX the program) and stability (TEX the brary, allows importing of third modules, and permits the language changes only to Vx errors). use of existing bindings of shared libraries or the creation of We can summarize by saying that “typesetting in T X new bindings to shared libraries with the Python standard E module ctypes. tends to be everywhere everytime.” Some examples of such bindings, particularly in the area of These characteristics are a bit unusual in today’s scientific graphics, are presented and discussed. scenario of software development: no one is surprised Intentionally the embedding of interpreter is limited to the if programs exist only for one single OS (and even for python-2.6 release and to a luatex release for the Linux op- a discontinued OS, given the virtualization technology) erating system (32 bit). and especially no one is surprised at a new release of a program, which actually means bugs Vxed and new Keywords features implemented (note that the converse is in some Lua, Python, dynamic loading, ffi. sense negative: no release means program discontinued).
- 
												  A Walk on Python-IgraphA Walk on Python-igraph Carlos G. Figuerola - ECyT Institute University of Salamanca - 17/04/2015 Carlos G. Figuerola: A walk on Python-igraph i-graph · i-graph is a library · written in C++ (fast and memory e�cient) · a tool for programmers · it works from programs in R, Python (and C++, btw) here, we are going to work with Python + i-graph A walk on Python-igraph 3/46 Carlos G. Figuerola: A walk on Python-igraph Python · a programming language intended for scripting · interpreted language (semi-compiled/bytecodes, actually) · it is free software (runing on Linux, Win, Mac machines) · it has a lot of modules for no matter about wich task · it is very intuitive but ... it has powerfull data structures, but complex A walk on Python-igraph 4/46 Carlos G. Figuerola: A walk on Python-igraph Installing i-graph · Web Home at http://igraph.org/ · in general you will need Python and a C/C++ compiler. Use of pip or easy_install is more convenient · Linux: it is in main repositories, better install from there · Win: there are (unno�cial) installers in: http://www.lfd.uci.edu/~gohlke /pythonlibs/#python-igraph (see instructions on top of this page) · Mac: see https://pypi.python.org/pypi/python-igraph A walk on Python-igraph 5/46 Carlos G. Figuerola: A walk on Python-igraph i-graph: pros & cons · it can deal with big amounts of data · it has a lot of measures, coe�cients, procedures and functions · it is �exible and powerfull, can be used with ad-hoc programs/scripts or in interactive way (commands from a text terminal) · no (native) graphic interface, no mouse nor windows · not very good branded graphics (although yo can import/export easily from/to another software) A walk on Python-igraph 6/46 Carlos G.
- 
												  Constructing 5-Chromatic Unit Distance Graphs Embedded in the Euclidean Plane and Two-Dimensional SpheresConstructing 5-chromatic unit distance graphs embedded in the Euclidean plane and two-dimensional spheres V.A. Voronov,∗ A.M. Neopryatnaya,† E.A. Dergachev‡ [email protected] June 24, 2021 Abstract This paper is devoted to the development of algorithms for finding unit distance graphs with chromatic number greater than 4, embedded in a two-dimensional sphere or plane. Such graphs provide a lower bound for the Nelson–Hadwiger problem on the chromatic number of the plane and its generalizations to the case of the sphere. A series of 5-chromatic unit distance graphs on 64513 vertices embedded into the plane is constructed. Unlike previously known examples, these graphs do not contain the Moser spindle as a subgraph. The construction of 5-chromatic graphs embedded in a sphere at two values of the radius is given. Namely, the 5-chromatic unit distance graph on 372 vertices embedded into the circumsphere of an icosahedron with a unit edge length, and the 5-chromatic graph on 972 vertices embedded into the circumsphere of a great icosahedron are constructed. 1 Introduction This paper is devoted to one of the questions related to the Nelson–Hadwiger problem on the chromatic number of the plane [15, 36], namely the development of algorithms that allow constructing new examples of 5-chromatic unit distance graphs embedded in a plane or a sphere, based on a given 4-chromatic unit distance subgraph. The chromatic number of a subset of the Euclidean space X ⊆ Rn is the minimum number of colors needed to color X, so that the points u, v at a unit distance apart are arXiv:2106.11824v2 [math.CO] 23 Jun 2021 colored differently, i.e.
- 
												  Ucod - Simplifying Supply Chain Structures in the BrowserUCoD - Simplifying Supply Chain Structures in the Browser Alex Trostanovsky and Nikola Cucuk Fig. 1: The Underutilized Constraint Detector. Abstract— Visualizing supply chain networks is difficult but important. Analyzing supply chains as node-link graphs can reveal topological information about the dependencies that exist between parts and the resources those parts depend on. In previous work, we developed a heuristic algorithm that detects underutilized components in a supply chain. In this work, we present UCoD: the Underutilized COnstraint Detector, a tool that highlights the output of this heuristic algorithm, and allows supply chain planners to visualize the structure of their supply chain, and help them identify underutilization and anomalous patterns and trends in their network. UCoD consists of a juxtaposed view of a node-link diagram and an adjacency matrix of the supply chain graph, and a superimposed line chart that allows supply chain planners to inspect time-series of utilization for the nodes in their network. This functionality allows supply chain planners to verify the output of our heuristic algorithm and simplify their product structures. Index Terms—Supply chain management, Supply chain visualization 1 INTRODUCTION Kinaxis is a supply chain management company that models the prod- into structures known as Calculation Families. uct structures of its customers using graphs. A supply chain product Inclusion in calculation families is a transitive operation that can re- structure is a complete description of how something gets made. To sult in the merging of families. For example, if two calculation families schedule a product structure, one must consider all of its constituent contain parts that share common constraints, then the two families will parts.
- 
												  Igraph – a Package for Network Analysisigraph { a package for network analysis G´abor Cs´ardi [email protected] Department of Medical Genetics, University of Lausanne, Lausanne, Switzerland Outline 1. Why another graph package? 2. igraph architecture, data model and data representation 3. Manipulating graphs 4. Features and their time complexity igraph { a package for network analysis 2 Why another graph package? • graph is slow. RBGL is slow, too. 1 > ba2 # graph & RBGL 2 A graphNEL graph with undirected edges 3 Number of Nodes = 100000 4 Number of Edges = 199801 ? Why another graph package? • graph is slow. RBGL is slow, too. 1 > ba2 # graph & RBGL 2 A graphNEL graph with undirected edges 3 Number of Nodes = 100000 4 Number of Edges = 199801 5 > system.time(RBGL::transitivity(ba2)) ? 6 user system elapsed 7 7.517 0.000 7.567 Why another graph package? • graph is slow. RBGL is slow, too. 1 > ba2 # graph & RBGL 2 A graphNEL graph with undirected edges 3 Number of Nodes = 100000 4 Number of Edges = 199801 5 > system.time(RBGL::transitivity(ba2)) ? 6 user system elapsed 7 7.517 0.000 7.567 8 > summary(ba) # igraph 9 Vertices: 1e+05 10 Edges: 199801 11 Directed: FALSE 12 No graph attributes. 13 No vertex attributes. 14 No edge attributes. Why another graph package? • graph is slow. RBGL is slow, too. 1 > ba2 # graph & RBGL 2 A graphNEL graph with undirected edges 3 Number of Nodes = 100000 4 Number of Edges = 199801 5 > system.time(RBGL::transitivity(ba2)) ? 6 user system elapsed 7 7.517 0.000 7.567 8 > summary(ba) # igraph 9 Vertices: 1e+05 10 Edges: 199801 11 Directed: FALSE 12 No graph attributes.
- 
												![Rcy3: Network Biology Using Cytoscape from Within R[Version 1; Peer Review: 2 Approved]](https://docslib.b-cdn.net/cover/1542/rcy3-network-biology-using-cytoscape-from-within-r-version-1-peer-review-2-approved-3331542.webp)  Rcy3: Network Biology Using Cytoscape from Within R[Version 1; Peer Review: 2 Approved]F1000Research 2019, 8:1774 Last updated: 04 AUG 2021 SOFTWARE TOOL ARTICLE RCy3: Network biology using Cytoscape from within R [version 1; peer review: 2 approved] Julia A. Gustavsen 1, Shraddha Pai 2, Ruth Isserlin 2, Barry Demchak3, Alexander R. Pico 4 1University of British Columbia, Vancouver, B.C., Canada 2The Donnelly Centre, University of Toronto, Toronto, ON, Canada 3Department of Medicine, University of California at San Diego, La Jolla, CA, USA 4Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA v1 First published: 18 Oct 2019, 8:1774 Open Peer Review https://doi.org/10.12688/f1000research.20887.1 Second version: 27 Nov 2019, 8:1774 https://doi.org/10.12688/f1000research.20887.2 Reviewer Status Latest published: 04 Dec 2019, 8:1774 https://doi.org/10.12688/f1000research.20887.3 Invited Reviewers 1 2 3 Abstract RCy3 is an R package in Bioconductor that communicates with version 3 Cytoscape via its REST API, providing access to the full feature set of (revision) Cytoscape from within the R programming environment. RCy3 has 04 Dec 2019 been redesigned to streamline its usage and future development as part of a broader Cytoscape Automation effort. Over 100 new version 2 functions have been added, including dozens of helper functions (revision) report report specifically for intuitive data overlay operations. Over 40 Cytoscape 27 Nov 2019 apps have implemented automation support so far, making hundreds of additional operations accessible via RCy3. Two-way conversion with version 1 networks from \textit{igraph} and \textit{graph} ensures 18 Oct 2019 report report interoperability with existing network biology workflows and dozens of other Bioconductor packages.
- 
												  Counting to One: Reducibility of One-And Two-Loop Amplitudes at ThePublished for SISSA by Springer Received: July 16, 2012 Revised: October 8, 2012 Accepted: November 9, 2012 Published: December 7, 2012 Counting to one: reducibility of one- and two-loop amplitudes at the integrand level JHEP12(2012)038 Ronald H.P. Kleiss,a Ioannis Malamos,b,1 Costas G. Papadopoulosc,d and Rob Verheyena aRadboud University, Nijmegen, The Netherlands bInstituto de F´ısica Corpuscular, Consejo Superior de Investigaciones Cient´ıficas-Universitat de Val`encia, Parc Cient´ıfic, E-46980 Paterna (Valencia), Spain. cNCSR ‘Demokritos’, Agia Paraskevi, 15310, Greece dDepartment of Physics, Theory Division, CERN, 1211 Geneva 23, Switzerland E-mail: [email protected], [email protected], [email protected], [email protected] Abstract: Calculation of amplitudes in perturbative quantum field theory involve large loop integrals. The complexity of those integrals, in combination with the large number of Feynman diagrams, make the calculations very difficult. Reduction methods proved to be very helpful, lowering the number of integrals that need to be actually calculated. Especially reduction at the integrand level improves the speed and set-up of these calcu- lations. In this article we demonstrate, by counting the numbers of tensor structures and independent coefficients, how to write such relations at the integrand level for one− and two−loop amplitudes. We clarify their connection to the so-called spurious terms at one loop and discuss their structure in the two−loop case. This method is also applicable to higher loops, and the results obtained apply to both planar and non-planar diagrams. Keywords: QCD Phenomenology, NLO Computations ArXiv ePrint: 1206.4180 1Supported by REA Grant Agreement PITN-GA-2010-264564 (LHCPhenoNet), by the MICINN Grants FPA2007-60323, FPA2011-23778 and Consolider-Ingenio 2010 Programme CSD2007-00042 (CPAN).