Mastering Gephi Network Visualization
Produce advanced network graphs in Gephi and gain valuable insights into your network datasets

Ken Cherven

BIRMINGHAM - MUMBAI

Copyright © 2015 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: January 2015
Production reference: 1220115

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.

ISBN 978-1-78398-734-4
www.packtpub.com

Credits

Author: Ken Cherven
Reviewers: Ladan Doroud, Miro Marchi, David Edward Polley, Mollie Taylor, George G. Vega Yon
Commissioning Editor: Ashwin Nair
Acquisition Editor: Sam Wood
Content Development Editor: Amey Varangaonkar
Technical Editors: Shruti Rawool, Shali Sasidharan
Copy Editors: Rashmi Sawant, Stuti Srivastava, Neha Vyas
Project Coordinator: Leena Purkait
Proofreaders: Cathy Cumberlidge, Paul Hindle, Samantha Lyon
Indexer: Monica Ajmera Mehta
Graphics: Abhinash Sahu
Production Coordinator: Conidon Miranda
Cover Work: Conidon Miranda

About the Author

Ken Cherven is a Detroit-based data visualization and open source enthusiast, with 20 years of experience working with data and visualization tools. In addition to Gephi, he has worked with a variety of open source tools, including MySQL, SpagoBI, JasperServer, D3, Protovis, Omeka, QGIS, Leaflet, and Exhibit. He also has considerable experience using corporate software tools from Microsoft, Cognos, Tableau, and Oracle.

An automotive analyst and visualizer by day, he spends much of his personal time turning baseball data into web-based visualizations housed on his website, http://visual-baseball.com. He has previously authored Network Graph Analysis and Visualization with Gephi, Packt Publishing, as well as a self-published book, MLB Pennant Races, 1901-1968: A Visual Analysis of Baseball's Pennant Races, Visual-Baseball Press. His current areas of interest include visual dashboards, interactive networks, and anything involving geographic information.

Acknowledgments

I would like to thank the members of my family for their patience and understanding over the course of several months spent working on this book. This always starts with my wife, Karen, and extends to my children, Kellen, Kristopher, and Katie, as well as my always helpful mother-in-law, Carole Young.

This book would not have been possible without the considerable efforts of a group of thorough technical and content editors. I would like to sincerely thank Mollie Taylor, Ladan Doroud, Miro Marchi, Ted Polley, George Vega Yon, Marta Castellani, and Manasi Pandire for their considerable efforts to make this the best possible book.
All of your input has been noted, and many improvements have been incorporated.

A special thanks also to Amey Varangaonkar at Packt Publishing for managing the entire process while also making recommendations that will result in a more enjoyable reading experience.

Thanks also to others who helped in the early stages by providing useful feedback to get the book started. This list includes Joanne Fitzpatrick and Richard Gall at Packt Publishing, plus Gephi community members, Randy Novak, Mike Hughes, Matthieu Totet, Marco Valli, Gerry Wilson, and Carlos Benito Amat.

Finally, I would like to thank the creators and maintainers of Gephi for providing such a powerful tool that allows users to explore the fascinating world of network science. Thanks also to the growing community of enthusiasts who use Gephi to create some remarkable visualizations. My hope is that this book will make it easier for you to tap into the power of Gephi and, perhaps, even provide a few new approaches to leverage this powerful tool.

About the Reviewers

Ladan Doroud is a PhD candidate at the University of California, Davis. She received her master's degree in computer science from the same university in 2013. She is currently working on her PhD in computer science in Prof. Eisen's lab as a computational biologist and data scientist. Her research interests mainly lie in the area of large-scale network analysis, clustering and data mining with special focus on community detection, and function prediction of protein sequences in large-scale biological networks. She has an extensive background in learner-centered education, including her collaboration with Udacity, Inc. in 2014 as a course manager on the data science track, as well as her collaboration with the California State Summer School for Mathematics and Science (COSMOS) in 2011. She can be reached at [email protected].

Miro Marchi is a PhD candidate at the University of Verona, Italy.
He received his master's degree in cultural anthropology, ethnology, and ethnolinguistics from Ca' Foscari University of Venice in 2010. He has authored Self-Governance Lessons from Bali and Stephen Lansing, Cangiani M. (ed.), Alternative Approaches to Development, Cleup, 2012, in which he reviewed the research of the interdisciplinary team coordinated by the anthropologist Stephen J. Lansing on farmers' cooperation networks for rice cultivation in Bali. His current research focuses on finding practical ways to foster the emergence of self-organization in social-economic networks. He is applying ethnographic methods coupled with community-based online network visualization, which is built with Drupal and D3 and available at www.retebuonvivere.org/rete, and he is interested in the use of complexity theory for sustainability and the commons. He can be reached at [email protected].

David Edward Polley is a social sciences librarian at Indiana University-Purdue University Indianapolis (IUPUI). Prior to joining IUPUI, he worked as a researcher at the Cyberinfrastructure for Network Science Center in the Indiana University School of Informatics and Computing, Bloomington. He is interested in the various ways people use data generated in social science research. He is the coauthor, with Dr. Katy Börner, of a book on data visualization titled Visual Insights: A Practical Guide to Making Sense of Data.

Mollie Taylor is the President of Proximity Viz LLC, located in Atlanta, Georgia, USA, which provides data visualization and mapping services to a wide range of clients. She holds degrees in economics and international affairs from the Georgia Institute of Technology. Her blog on programming for data analysis can be found at http://blog.mollietaylor.com/.

George G. Vega Yon is currently a PhD student at the California Institute of Technology.
He holds a BA degree in business administration and an MA degree in economics and public policy from Adolfo Ibáñez School of Government (Chile). He is the author of several R and Stata modules, including ABCoptim: Implementation of Artificial Bee Colony (ABC) Optimization, rgexf: an R package to work with GEXF graph files, and Introducing PARALLEL: Stata module for parallel computing. He has shown a deep interest in statistical computing and data visualization; furthermore, he is the founder of the Chilean R-Users Group (useR). He is the cofounder of NodosChile.org Social Network Analysis, one of the first companies in Chile to focus on applied social network analysis (SNA). George's scholarly interests focus on policy analysis, complexity, and statistical computing, and he has served as a reviewer for the Journal of Computational Economics.

www.PacktPub.com

Support files, eBooks, discount offers, and more

For support files and downloads related to your book, please visit www.PacktPub.com.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and, as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.

https://www2.packtpub.com/books/subscription/packtlib

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.

Why subscribe?
• Fully searchable across every book published by Packt
• Copy and paste, print, and bookmark content
• On demand and accessible via a web browser

Free access for Packt account holders

If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view 9 entirely free books. Simply use your login credentials for immediate access.

Table of Contents

Preface
Chapter 1: Fundamentals of Complex Networks and Gephi
  Graph applications
    Collaboration graphs
    Who-talks-to-whom