Gephi and Cytoscape

Network Visualization: Gephi and Cytoscape
Caf'E.phe, February 2016
Pablo Ruiz Fabo, LATTICE

Network visualization
• Requires relational data [ http://cvcedhlab.hypotheses.org/125 ]

Network analysis
• Some terminology: [ http://cvcedhlab.hypotheses.org/106 ]

Network analysis
• Network: composed of nodes, linked by edges
• Nodes represent actors in our domain
– People, characters, concepts, places, …
• Edges encode the relation between the nodes
– Interacting with someone, citing someone’s work, occurring in the same paragraph, …
• Edges can be weighted: the weight encodes the importance of the link
– E.g. How many times did this link occur?
• Edges can be directed or undirected:
– [Being a correspondent] vs. [being the sender vs. being the addressee of a letter]

Objectives
• Create a co-occurrence network visualization with Gephi and Cytoscape, for two corpora:
– History corpus on the American crisis of 2008
• A CSV file representing the network’s edges was used
– Philosophy corpus: Jeremy Bentham’s manuscripts
• For Gephi, a GEXF file representing the network was used
• For Cytoscape, a GraphML file representing the network was used (it can also be used for Gephi)
• Export a navigable network so that it can be visualized outside these tools
[ Some example files to import or create networks with, and example exported networks, are available at apps.lattice.cnrs.fr/nav/cafephe11 ]

2008 Crisis Corpus: PoliInformatics
Smith et al. (2014) [12]

Bentham Corpus
Jeremy Bentham: Philosopher, social reformer (1748-1832, London)
Transcribe Bentham (Causer & Terras, 2014) [13]
• UCL (London)
• Unpublished manuscripts transcribed by volunteers (crowdsourcing)
• 30,000 pages
Image: blogs.ucl.ac.uk/transcribe-bentham/

Gephi version
• This presentation covers Gephi 0.9, which came out in December 2015 and works with Java 8 or 7
• Most training materials on Gephi are about version 0.8.2 (which worked with Java 7, NOT 8)
• Small UI changes between 0.8.2 and 0.9

Cytoscape version
• Cytoscape 3.3.0, works with Java 8, NOT 7

Import Edges table (1)
• Start Gephi and go to Data Laboratory. You may need to close the Projects popup. Do File / New Project
• Click on Import Spreadsheet and search in the materials for a file whose name ends with “edges.csv”. Import it as an Edges table

Import Edges table (2)
1. Import Edge Table: Weight and Create missing nodes must be checked in the dialogue
2. Once the table is imported, create labels by copying ID with the “Copy data to another column” tab in the bottom row

Initial Network
• Click on the Overview tab to see the initial network, which is not yet spatialized

Saving and exporting a project
• It is advisable to both save and export a project
• To save a project, just click on Save. It will be saved as a project file with the .gephi extension (it’s a sort of zip file)
• Additionally, export the network as a graph file for safety

Network Layout (1)
• Run the Force Atlas layout, with these settings:
1. Choose the Layout
2. Specify Settings
Settings highlighted in the screenshot:
– One determines how far apart nodes will be, thus affecting the readability of the network (how wide it will spread)
– One helps avoid label overlap (but there are other means for this too)
In force-based layouts (like Force Atlas or Force Atlas 2), linked nodes attract each other and unrelated nodes are placed further apart. See [3] and [8].
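
The edges table imported in the steps above is a plain CSV file. As a minimal sketch, assuming that Gephi's spreadsheet importer recognizes the column headers Source, Target and Weight, the following Python counts how often pairs of terms occur in the same paragraph and writes the result as a weighted, undirected edges table. The sample paragraphs, the term list and the function names are hypothetical and do not come from the workshop corpora; the substring matching is a deliberate simplification.

```python
import csv
import io
from collections import Counter
from itertools import combinations


def cooccurrence_edges(paragraphs, terms):
    """Count how often each pair of terms occurs in the same paragraph."""
    weights = Counter()
    for paragraph in paragraphs:
        text = paragraph.lower()
        # Terms present in this paragraph (naive substring match, for illustration only).
        present = sorted(t for t in terms if t.lower() in text)
        # Each unordered pair gets one count per paragraph: this is the edge weight.
        for source, target in combinations(present, 2):
            weights[(source, target)] += 1
    return weights


def write_edges_csv(weights, handle):
    """Write the edges with Source, Target and Weight columns (assumed Gephi headers)."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(weights.items()):
        writer.writerow([source, target, weight])


# Hypothetical data, not taken from the corpora.
paragraphs = [
    "The Treasury and the Federal Reserve met with Congress.",
    "Congress questioned the Treasury.",
    "The Federal Reserve lowered rates.",
]
terms = ["Treasury", "Federal Reserve", "Congress"]

edges = cooccurrence_edges(paragraphs, terms)
buffer = io.StringIO()
write_edges_csv(edges, buffer)
edges_csv = buffer.getvalue()
print(edges_csv)
```

To produce a file for the Import Spreadsheet step, the same function can be given a file handle instead of the in-memory buffer, e.g. one opened with open("my_edges.csv", "w", newline="", encoding="utf-8"), where the file name is only an example.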

Network Layout (2)
• Once the network stabilizes, you can stop Force Atlas.
• The initial layout will look similar to the screenshot on this slide
• The zoom slider can be used to see more or less of the network
[Screenshot callouts: Toggle bottom pane, Zoom]

Node and Edge Appearance
• In Gephi 0.9, unlike in 0.8.2, there are two modes for node and edge appearance, Unique and Attribute-based
• Colour
• Size
• Label colour
• Label size
Attributes correspond to properties of nodes and edges, reflecting their role in the network as per different metrics

Node Size
• Different types of metrics can be encoded in the node size. Here, we use a node’s Degree (how many nodes it is connected to)
In the Appearance tab, choose the Nodes and Attribute buttons, and then:
- Degree in the dropdown menu
- The CIRCLES icon for node size in the button bar, hit Apply
After applying the ranking, node size will reflect the ranking criterion. In this case, more strongly connected nodes will be bigger
For information on other ranking criteria, see [4]

Node Labels (1)
• Other Node Label settings can be accessed from the bottom panel, which can be toggled with the button shown in the screenshot
• If at any point node labels overlap, this can be fixed by running the Label Adjust layout

Node Labels (2)
• Label Sizes are defined with the leftmost button
- In scaled mode, all labels bear the same size, scaled for readability
- In fixed mode, all labels bear the size specified in the font dropdown (Dialog bold 32 in the example)
- In node size mode, label size matches node size
- Run Label Adjust Layout in case of label overlap
• Label Colour is defined with the rightmost button

Community Detection: Modularity
• The modularity tool can be run to detect communities, i.e. groups of nodes that are more strongly connected among themselves than they are to other groups of nodes [9].
1. In the Statistics pane on the right, look for Modularity and hit Run
2. Go to the Partition tab on the left, select Modularity Class from the dropdown, and hit Apply. The colors can be changed by clicking and right-clicking inside the colored square, or with the Palette button

Preview Pane
Preview after applying a node size criterion and community detection
Settings are default, unless specified on the screenshot.
- Show Labels was activated
- Edge Thickness was reduced to 0.2 to avoid overly thick edges on highly connected nodes
Hit Refresh after any changes to the Settings or to reset an unreadable preview pane

Filters
• The network can be filtered according to many criteria (see [6]). Here, we filter out nodes that have fewer than six connections, to get rid of generally less relevant nodes and edges
- Expand the Topology dropdown
- Double click on Degree Range
- Move the slider at the bottom up to the desired minimum degree

Exporting visualization as PDF or image
• In the Preview pane, there’s a button to export the visualization (bottom left)

Export visualization as an interactive website: sigma.js exporter (1)
• Gephi has several plugins that allow exporting the network in an interactive website format.
– The website allows zooming in and out
– In some cases, the user can selectively focus a part of the network and run searches for nodes
• We’ll be using the sigma.js exporter plugin [10], which has all of the functions above.
Depending on your browser, the exported site may need to be run inside a web server (Apache, XAMPP, Wamp, EasyPHP etc.)
• Other plugins allowing some of the above functions:
– Seadragon plugin
– Google Maps Exporter

Network as a website (2): sigma.js
• We need to do three things:
– Install the sigma.js exporter plugin
– Export the network as a sigma.js site
– Make the site available from a web server
• To install the plugin:
– Go to Tools/Plugins, and select Sigma Exporter in the Available Plugins tab (once installed, it will move to the Installed tab)

Network as a website (3): Exporting
1. Export the network from File/Export and Sigma.js template
2. Fill in the dialogue: give the path to the folder to export the site to, and the legend to be displayed for the site’s data

Network as a website (4): Web Server
• We need to take the exported site from the previous step and put it in a web server. Note: some browsers (e.g. Firefox) allow seeing the networks just by opening the index.html file, with no need for a local web server
• If you don’t have a web server installed, a possibility is to install XAMPP https://www.apachefriends.org
– Windows:
• https://blog.udemy.com/xampp-tutorial/
• https://www.apachefriends.org/faq_windows.html
– Linux: https://www.apachefriends.org/faq_linux.html
– Mac: https://www.apachefriends.org/faq_osx.html
• Once you have the web server, to see the network, point a browser to http://localhost/XXX , where XXX corresponds to the name of your sigma.js network (by default the name is network when Gephi exports it).

Network as a website (5): Config
If edges on the exported network are too thin and node labels are not visible:
Look for config.json inside the folder where the sigma.js site was exported (network by default)
- Increase minEdgeSize and maxEdgeSize for thicker edges
- Decrease labelThreshold to see more labels

Import the network or edges file
The example involves the GraphML network for the Bentham corpus. Other GraphML networks are available in the materials and can be manipulated similarly.
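
Before importing a GraphML file, it can help to check what it contains. The sketch below is a rough illustration and not part of the workshop materials: it parses a small hypothetical GraphML string with Python's standard library, computes each node's degree (how many distinct nodes it is connected to, treating the graph as undirected), and lists the nodes that would survive a minimum-degree filter like the one described for Gephi. The sample network, node ids and function names are all invented for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Standard GraphML namespace, needed to find <node> and <edge> elements.
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Hypothetical miniature network; a real file would be read with ET.parse(path).
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="undirected">
    <node id="law"/>
    <node id="punishment"/>
    <node id="pleasure"/>
    <node id="pain"/>
    <edge source="law" target="punishment"/>
    <edge source="punishment" target="pain"/>
    <edge source="pleasure" target="pain"/>
    <edge source="law" target="pain"/>
  </graph>
</graphml>
"""


def node_degrees(graphml_text):
    """Return {node id: number of distinct neighbours} for a GraphML document."""
    root = ET.fromstring(graphml_text)
    neighbours = defaultdict(set)
    for node in root.iterfind(".//g:node", GRAPHML_NS):
        neighbours[node.get("id")]  # touch the key so isolated nodes are listed too
    for edge in root.iterfind(".//g:edge", GRAPHML_NS):
        source, target = edge.get("source"), edge.get("target")
        neighbours[source].add(target)
        neighbours[target].add(source)
    return {node_id: len(linked) for node_id, linked in neighbours.items()}


def filter_by_degree(degrees, minimum):
    """Keep only the nodes whose degree is at least `minimum`, sorted by id."""
    return sorted(n for n, d in degrees.items() if d >= minimum)


degrees = node_degrees(SAMPLE)
kept = filter_by_degree(degrees, 2)
print(degrees)
print(kept)
```

With a minimum degree of 2, the sample keeps every node except the one with a single connection. On a real network the threshold would be chosen by inspection, as with the Degree Range slider.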
