Connected Components, and PageRank


COMP4650 - Connected Components, and PageRank
Lexing Xie, Research School of Computer Science, ANU
Lecture slides credit: Lada Adamic (U Michigan); Jure Leskovec (Stanford); Andreas Haeberlen (UPenn)

Connected components
• Strongly connected components
  – Each node within the component can be reached from every other node in the component by following directed links
  [Figure: an eight-node directed graph (A-H) whose strongly connected components are {B, C, D, E}, {A}, {G, H}, and {F}]
• Weakly connected components: every node can be reached from every other node by following links in either direction
  [Figure: the same graph's weakly connected components are {A, B, C, D, E}, {G, H}, and {F}]
• In undirected networks one talks simply about 'connected components'
by Lada Adamic, U Michigan

Strongly Connected Component (SCC) Proof (*)

Giant component
• If the largest component encompasses a significant fraction of the graph, it is called the giant component

Why not? (*)

Bow-Tie Structure of the Web
Andrei Broder, Ravi Kumar, Farzin Maghoul, Prabhakar Raghavan, Sridhar Rajagopalan, Raymie Stata, Andrew Tomkins, and Janet Wiener. 2000. Graph structure in the Web. Comput. Netw. 33, 1-6 (June 2000), 309-320.

Not Everyone Asks/Replies
• The Web is a bow tie; the Java Forum network is an uneven bow tie
  – Core: a strongly connected component, in which everyone asks and answers
  – IN: mostly askers
  – OUT: mostly helpers
Jun Zhang, Mark S. Ackerman, and Lada Adamic. 2007. Expertise networks in online communities: structure and algorithms. In Proceedings of the 16th International Conference on World Wide Web (WWW '07), 221-230.
Minkyoung Kim, Lexing Xie, Peter Christen. Event Diffusion Patterns in Social Media. Intl. Conf. on Weblogs and Social Media (ICWSM), 8 pages, Dublin, Ireland, June 2012.
Modern Information Retrieval: The Concepts and Technology behind Search. Ricardo Baeza-Yates, Berthier Ribeiro-Neto. Addison-Wesley Professional; 2nd edition (February 10, 2011).

Google's PageRank (Brin/Page 98)
• A technique for estimating page quality
  – Based on the web link graph
• Results are combined with the IR score
  – Think of it as: TotalScore = IR score * PageRank
  – In practice, search engines use many other factors (for example, Google says it uses more than 200)
Source: A. Haeberlen, Z. Ives, UPenn

PageRank: Intuition
[Figure: a small web graph of pages A-J, annotated "Shouldn't E's vote be worth more than F's?" and "How many levels should we consider?"]
• Imagine a contest for The Web's Best Page
  – Initially, each page has one vote
  – Each page votes for all the pages it has a link to
  – To ensure fairness, pages voting for more than one page must split their vote equally between them
  – Voting proceeds in rounds; in each round, each page has the number of votes it received in the previous round
  – In practice, it's a little more complicated - but not much!

PageRank
• Each page i is given a rank x_i
• Goal: assign the x_i such that the rank of each page is governed by the ranks of the pages linking to it:

      x_i = sum over j in B_i of x_j / N_j

  where x_i is the rank of page i, B_i is the set of pages j that link to i, and N_j is the number of links out from page j
• How do we compute the rank values?

Iterative PageRank (simplified)
• Initialize all ranks to be equal, e.g.: x_i^(0) = 1/n
• Iterate until convergence:

      x_i^(k+1) = sum over j in B_i of x_j^(k) / N_j

Example (three pages A, B, C with links A->B, B->A, B->C, C->A)
• Step 0: initialize all ranks to be equal: x_A = x_B = x_C = 0.33
• Step 1: propagate weights across out-edges: A sends 0.33 to B; B sends 0.17 each to A and C; C sends 0.33 to A
• Step 2: compute weights based on in-edges: x_A = 0.50, x_B = 0.33, x_C = 0.17
• Convergence: x_A = 0.4, x_B = 0.4, x_C = 0.2

Naïve PageRank Algorithm Restated
• Let
  – N(p) = number of outgoing links from page p
  – B(p) = set of back-links to page p (the pages linking to p)

      PageRank(p) = sum over b in B(p) of PageRank(b) / N(b)

  – Each page b distributes its importance to all of the pages it points to (so we scale by 1/N(b))
  – Page p's importance is increased by the importance of its back set

In Linear Algebra formulation
• Create an m x m matrix M to capture links:
  – M(i, j) = 1/n_j if page i is pointed to by page j and page j has n_j outgoing links
  – M(i, j) = 0 otherwise
• Initialize all PageRanks to 1, multiply by M repeatedly until all values converge:

      [PageRank(p_1'), PageRank(p_2'), ..., PageRank(p_m')]^T = M * [PageRank(p_1), PageRank(p_2), ..., PageRank(p_m)]^T

• This computes the principal eigenvector via power iteration

A Brief Example
Three pages - Google (g), Yahoo (y), Amazon (a):

      [g']   [0    0.5  0.5] [g]
      [y'] = [0    0    0.5] [y]
      [a']   [1    0.5  0  ] [a]

Running for multiple iterations:
      (g, y, a) = (1, 1, 1), (1, 0.5, 1.5), (1, 0.75, 1.25), ..., (1, 0.67, 1.33)
Total rank sums to the number of pages.

Oops #1 - PageRank Sinks
Now Yahoo has no outgoing links (a 'dead end'), so PageRank is lost after each round:

      [g']   [0    0  0.5] [g]
      [y'] = [0.5  0  0.5] [y]
      [a']   [0.5  0  0  ] [a]

Running for multiple iterations:
      (g, y, a) = (1, 1, 1), (0.5, 1, 0.5), (0.25, 0.5, 0.25), ..., (0, 0, 0)

Oops #2 - PageRank hogs
Now Yahoo keeps whatever rank it receives; PageRank cannot flow out and accumulates:

      [g']   [0    0  0.5] [g]
      [y'] = [0.5  1  0.5] [y]
      [a']   [0.5  0  0  ] [a]

Running for multiple iterations:
      (g, y, a) = (1, 1, 1), (0.5, 2, 0.5), (0.25, 2.5, 0.25), ..., (0, 3, 0)

Improved PageRank
• Remove out-degree 0 nodes (or consider them to refer back to the referrer)
• Add decay factor d to deal with sinks:

      PageRank(p) = (1 - d) + d * sum over b in B(p) of PageRank(b) / N(b)

• Typical value: d = 0.85

Stopping the Hog

      [g']          [0    0  0.5] [g]   [0.15]
      [y'] = 0.85 * [0.5  1  0.5] [y] + [0.15]
      [a']          [0.5  0  0  ] [a]   [0.15]

Running for multiple iterations:
      (g, y, a) = (0.57, 1.85, 0.57), (0.39, 2.21, 0.39), (0.32, 2.36, 0.32), ..., (0.26, 2.48, 0.26)
... though does this seem right?

Random Surfer Model
• PageRank has an intuitive basis in random walks on graphs
• Imagine a random surfer, who starts on a random page and, in each step,
  – with probability d, clicks on a random link on the page
  – with probability 1-d, jumps to a random page (bored?)
• The PageRank of a page can be interpreted as the fraction of steps the surfer spends on the corresponding page
  – The transition matrix can be interpreted as a Markov chain

PageRank, random walk on graphs
• W = d*G + (1-d)*U, where G is the link-transition matrix and U is the uniform-jump matrix
  – with probability d, the surfer clicks on a random link on the page
  – with probability 1-d, the surfer jumps to a random page (bored?)
  – The transition matrix W can be interpreted as a Markov chain
COMP4650, L Xie, ANU Computer Science

Search Engine Optimization (SEO)
• Has become a big business
• White-hat techniques
  – Google webmaster tools
  – Add meta tags to documents, etc.
• Black-hat techniques
  – Link farms
  – Keyword stuffing, hidden text, meta-tag stuffing, ...
  – Spamdexing
    • Initial solution: <a rel="nofollow" href="...">...</a>
    • Some people started to abuse this to improve their own rankings
  – Doorway pages / cloaking
    • Special pages just for search engines
    • BMW Germany and Ricoh Germany banned in February 2006
  – Link buying

Recap: PageRank
• Estimates absolute 'quality' or 'importance' of a given page based on inbound links
  – Query-independent
  – Can be computed via fixpoint iteration
  – Can be interpreted as the fraction of time a 'random surfer' would spend on the page
  – Several refinements, e.g., to deal with sinks
• Considered relatively stable
  – But vulnerable to black-hat SEO
• An important factor, but not the only one
  – Overall ranking is based on many factors (Google: >200)

What could be the other 200 factors?
• On-page, positive: keyword in title? in URL? keyword in domain name?; quality of HTML code; page freshness; rate of change; ...
• On-page, negative: links to a 'bad neighborhood'; keyword stuffing; over-optimization; hidden content (text has same color as background); automatic redirect/refresh; ...
• Off-page, positive: high PageRank; anchor text of inbound links; links from authority sites; links from well-known sites; domain expiration date; ...
• Off-page, negative: fast increase in number of inbound links (link buying?); link farming; different pages served to users and spiders; content duplication; ...
• Note: this is entirely speculative!
Source: Web Information Systems, Prof.
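The strongly connected components described earlier can be computed with two depth-first passes (Kosaraju's algorithm). A minimal Python sketch follows; the edge list at the bottom is a hypothetical example, not the graph from the slides:

```python
from collections import defaultdict

def strongly_connected_components(edges):
    """Kosaraju's algorithm: DFS finishing order on G, then DFS on G reversed."""
    graph, rev = defaultdict(list), defaultdict(list)
    nodes = set()
    for u, v in edges:
        graph[u].append(v)
        rev[v].append(u)
        nodes.update((u, v))

    # Pass 1: record nodes in order of DFS completion.
    order, seen = [], set()
    def dfs_order(u):
        seen.add(u)
        for v in graph[u]:
            if v not in seen:
                dfs_order(v)
        order.append(u)
    for u in nodes:
        if u not in seen:
            dfs_order(u)

    # Pass 2: DFS on the reversed graph, in reverse finishing order.
    comps, assigned = [], set()
    def dfs_collect(u, comp):
        assigned.add(u)
        comp.append(u)
        for v in rev[u]:
            if v not in assigned:
                dfs_collect(v, comp)
    for u in reversed(order):
        if u not in assigned:
            comp = []
            dfs_collect(u, comp)
            comps.append(comp)
    return comps

# Hypothetical directed graph: A and B reach each other; C and D do not.
edges = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "D")]
sccs = strongly_connected_components(edges)
print(sorted(sorted(c) for c in sccs))  # -> [['A', 'B'], ['C'], ['D']]
```

Weakly connected components fall out of the same code by treating every edge as bidirectional before the traversal.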
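The simplified iterative PageRank can be sketched in a few lines of Python. The three-node graph below (A->B, B->A, B->C, C->A) is an assumption that reproduces the numbers shown on the example slides (step values 0.50/0.33/0.17 and the 0.4/0.4/0.2 fixed point):

```python
def pagerank_simple(links, n_iter=50):
    """Simplified PageRank: x_i(k+1) = sum over in-neighbours j of x_j(k) / N_j."""
    nodes = sorted({u for u, v in links} | {v for u, v in links})
    out_deg = {u: sum(1 for a, _ in links if a == u) for u in nodes}
    x = {u: 1.0 / len(nodes) for u in nodes}  # Step 0: all ranks equal
    for _ in range(n_iter):
        new = {u: 0.0 for u in nodes}
        for u, v in links:
            new[v] += x[u] / out_deg[u]  # u splits its rank over its out-links
        x = new
    return x

x = pagerank_simple([("A", "B"), ("B", "A"), ("B", "C"), ("C", "A")])
print({k: round(v, 2) for k, v in x.items()})  # -> {'A': 0.4, 'B': 0.4, 'C': 0.2}
```

Note this simplified version assumes every node has at least one out-link; sinks and hogs are handled by the damped variant.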
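The linear-algebra formulation is just repeated matrix-vector multiplication (power iteration). A plain-Python sketch using the Google/Yahoo/Amazon link matrix from the brief example:

```python
def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

# M[i][j] = 1/n_j if page j links to page i; pages ordered (g, y, a).
M = [[0.0, 0.5, 0.5],
     [0.0, 0.0, 0.5],
     [1.0, 0.5, 0.0]]

v = [1.0, 1.0, 1.0]   # initialize all PageRanks to 1
for _ in range(100):  # power iteration toward the principal eigenvector
    v = matvec(M, v)

print([round(x, 2) for x in v])  # -> [1.0, 0.67, 1.33]
```

Because each column of M sums to 1, the total rank is conserved at every step; it stays equal to the number of pages, matching the slide.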
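The "Stopping the Hog" computation, v' = d*M*v + (1-d) with d = 0.85, can be checked directly against the iterates printed on the slide:

```python
d = 0.85
# The 'hog' matrix: Yahoo (middle row) keeps all rank it receives.
M = [[0.0, 0.0, 0.5],
     [0.5, 1.0, 0.5],
     [0.5, 0.0, 0.0]]

v = [1.0, 1.0, 1.0]
for _ in range(200):
    Mv = [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]
    v = [d * x + (1 - d) for x in Mv]  # decay plus constant injection

print([round(x, 2) for x in v])  # -> [0.26, 2.48, 0.26]
```

The first iterate is (0.57, 1.85, 0.57) and the fixed point is roughly (0.26, 2.48, 0.26), as on the slide: the damping bounds the hog's rank instead of letting it absorb everything.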
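The random surfer model can also be checked by direct simulation: the fraction of steps the surfer spends on each page approximates its (normalized) PageRank. A sketch, with a hypothetical three-page web and an arbitrary fixed seed for reproducibility:

```python
import random

def random_surfer(out_links, steps=200_000, d=0.85, seed=0):
    """Simulate the random surfer; returns the fraction of steps on each page."""
    rng = random.Random(seed)
    pages = list(out_links)
    page = rng.choice(pages)
    visits = {p: 0 for p in pages}
    for _ in range(steps):
        visits[page] += 1
        if out_links[page] and rng.random() < d:
            page = rng.choice(out_links[page])  # follow a random link
        else:
            page = rng.choice(pages)            # bored (or stuck): jump anywhere
    return {p: visits[p] / steps for p in pages}

# Hypothetical web: g -> a; y -> g, a; a -> g, y
freq = random_surfer({"g": ["a"], "y": ["g", "a"], "a": ["g", "y"]})
print({p: round(f, 2) for p, f in freq.items()})
```

For this graph the visit frequencies settle near (0.33, 0.23, 0.43) for (g, y, a), the stationary distribution of the Markov chain W = d*G + (1-d)*U.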