Recent Large Graph Visualization Tools : Areview
Total Page:16
File Type:pdf, Size:1020Kb
Information and Media Technologies 8(4): 944-960 (2013) reprinted from: Computer Software 30(2): 159-175 (2013) © Japan Society for Software Science and Technology Recent Large Graph Visualization Tools : AReview Sorn Jarukasemratana Tsuyoshi Murata Large graph visualization tools are important instruments for researchers to understand large graph data sets. Currently there are many tools available for download and use under free license,others in research papers or journals, each with its own functionalities and capabilities. This review focuses on giving an intro- duction to those large graph visualization tools and emphasizes their advantages over other tools. Criteria for selection of the tools being reviewed are it was recently published (2009 or later), or a new version was released during the last two years. The tools being reviewed in this paper are igraph, Gephi, Cytoscape, Tulip, WiGis, CGV, VisANT, Pajek, In Situ Framework, Honeycomb and two visualization toolkits which are JavaScript InfoVis Toolkit and GraphGL. The last part of the review presents our suggestion on building large graph visualization platform based on advantages of tools and toolkits that are being reviewed. platform. In the first part (section 1), visualization 1 Introduction tools are reviewed. The review does not cover ev- Large graph visualization and analysis are proven erything that the tool can do, but focuses more on to be vital in many research fields such as sociology, its advantages compared to other tools. In toolkits biology, and computer science. Visualization tools part (section2), introduction to very recent toolkits are important instruments for researchers to make are given. The toolkits being reviewed are already an understanding of large graph data sets. Cur- creating an impact in visualization field and are still rently, there are many tools available with each its in very active development. The last part (section own strengths and weaknesses, advantages and dis- 3) presents our suggestion on large graph visualiza- advantages. Some tools are available to download tion platform which is based on advantages of tools under free license. Some tools have appeared in re- and toolkits that being reviewed. search papers but are not available for public use. There is a related research about visualization In this review, we introduced the tools and toolkits tools; GSVR by Bruno Pinaud and Pascale Kuntz that are in active development which are capable of [32], that provides an on-line guide for choosing handling graphs with more than 10,000 nodes. The graph visualization software. The website asks for selection criteria are that it was published recently requirements such as graph size, graph type, or de- (2009 or later) or within two years of a new version sired analysis functionalities, then shows the search release. results of suitable visualization tools. However, The review consists of three parts; tools, toolk- GSRV only provides tools that are available for its and our suggestion on large graph visualization download and excludes those that only appear in research papers or academic conferences. More- 最近の大規模グラフ可視化ツールレビュー over, some tools are outdated, not being actively Sorn Jarukasemratana, 村田剛志, 東京工業大学,Tokyo developed, or no longer available for download. Institute of Technology. Therefore, this review focuses on recent tools from コンピュータソフトウェア,Vol.30, No.2 (2013),pp.159–175. research papers and conferences or tools which are [解説論文] 2012 年 3 月 30 日受付. still being actively developed that contain advan- 944 Information and Media Technologies 8(4): 944-960 (2013) reprinted from: Computer Software 30(2): 159-175 (2013) © Japan Society for Software Science and Technology tages that distinguish it from others. their needs. In this section, we aim to provide a In addition to the visualization tools that we re- short introduction of visualization tools and their viewed in this paper, there are some other tools capabilities in order to assist other researchers to that are worth mentioning, although they do not select the right visualization tools for their projects. satisfy our selection criteria. LGL (Large Graph Each tool has its own characteristics such as Layout) [43][48] is one of the oldest tools which can input types of the dataset, visual representation be used to dynamically visualize large networks on styles, interaction techniques, analytic tasks, or al- the order of hundreds of thousands of vertices and gorithms that it can perform. Graphs with time at- millions of edges. Web Community Chart [47] is a tribute or dynamic graphs are also important when tool for navigating the Web and for observing its selecting a tool because dynamic graphs will require evolution through web communities. It shows the much more visualization space. evolutions of communities that are extracted from The challenge of visualizing large graphs is that the snapshots of the whole Japanese Web archives there are too many nodes and edges to be displayed. that are crawled every year. Koshida et al. [44] de- This can lead to many problems; for example, mem- veloped Social Cosmo Browser, a tool for visualiz- ory shortage, long computation time, or insufficient ing large graphs on the order of millions of vertices. display space. Users’ cognitive ability is also lim- Communities are extracted from given graph, and ited, which can be an obstacle for visualizing large they are projected to 3D space based on MDS (mul- graphs. Two main strategies for countering com- tiple dimensional scaling). As an attempt for visu- putational problems are 1) finding more process- alizing temporal changes of social media, Itoh et ing power and 2) inventing better methods or algo- al. [45] developed a 3D visualization system which rithms to compute the data. Finding better hard- enables users to select a topic, find some events ware is also an alternative; such as parallel comput- that are related to the topic within a specific time ing with super computer, grid or cloud computing, span, and drill down the details of the events. Mat- or using GPU to assist drawing. Inventing new subayashi et al. [46] developedaninteractivereal- methods mostly means either creating new algo- time graph visualization system using HTML5 and rithms or modifying existing ones to achieve better WebCL. Their fast approximate method of force- results with the same processing power. Lack of directed placement enables the visualization with display space can be easily solved by making the time complexity O(n), where n is the number of display bigger such as wall display or table top dis- vertices in a graph. play. These strategies have their advantages be- cause users can observe the entire graph while see- ing all the details. However, some very huge data 2 Large graph visualization tools sets such as Facebook accounts or Twitter users Visual representation of graphs has high impact are not able to be displayed even with a very large on graph analysis. New visualization can bring new screen. Therefore, the next strategy for displaying discovery or new knowledge about the data set [41]. large data sets is to divide those data into many Each year, according to state-of-the-art report on pieces and display only the part that is critical visual analysis of large graphs by Landesberger et for users, such as selecting one node and display- al. [40], new graph visualizing applications or new ing only the nodes that are connected with that visualization techniques are presented, each with node. With this method, we have the local view its own purposes and advantages. Moreover, some of a small fraction of the entire graph but the full visualization tools can be enhanced with the plug- view of the graph is lost. Another method is to ins that are constructed to add more functionality. abstract the data and display the whole graph by Many researchers construct plugins for visualiza- sacrificing some or most of the details. Many tech- tion tools and then use that plugin to achieve their niques are used for this strategy such as clustering goals. the nodes and displaying the clusters, or transform- Constructing new visualization tools or plugins is ing data into charts or matrixes. With this method, a time consuming and complex task. It is perhaps overview of the graph is visible but the details are better to use existing tools in research if they meet lost. 945 Information and Media Technologies 8(4): 944-960 (2013) reprinted from: Computer Software 30(2): 159-175 (2013) © Japan Society for Software Science and Technology Furthermore, time-varying graphs or dynamic graphs are gaining much more attention in research fields recently. In normal graphs, connections be- tween nodes and edges are the main point of con- cern. However, in dynamic graphs, time is another element that needs to be considered. In dynamic graphs, node and edge attributes can change over time or even graph structures can evolve. Display- Fig. 1 Example of Cytoscape images (a) ing the large static graphs in full detail is diffi- main window (b) large graph visual- cult enough, displaying the whole dynamic graph is ization by Cytoscape much harder. In order to display dynamic graphs in a limited space, some strategies are needed. The Cytoscape is its large user society that keeps con- first strategy is to select time, and display only the tributing for making more plugins. Cytoscape also graphs of that time. This strategy is mostly used received huge compliments from outside of bioinfor- with animations to display the changing of graphs matics field when it was chosen as the winner for in each specific time or with time scroll bar and large scale graph visualization tools held by UM- let users control the flow of time themselves while BEL [3] from 26 large graph visualization tools. showing the changes in graphs. Another strategy Cytoscape’s core functions are mainly about vi- is to abstract the time such as changing from 2D sualization and basic analysis of graphs.