Utah State University DigitalCommons@USU
Physics Capstone Project Physics Student Research
5-1-2015
Centrality Measures of Graphs utilizing Continuous Walks in Hilbert Space
Jarod P. Benowitz Utah State University
Recommended Citation: Benowitz, Jarod P., "Centrality Measures of Graphs utilizing Continuous Walks in Hilbert Space" (2015). Physics Capstone Project. Paper 18. https://digitalcommons.usu.edu/phys_capstoneproject/18
Physics 4900 Research Project
Jarod P. Benowitz. Mentor: David Peak, PhD. Physics Department, Utah State University, Logan, UT 84322. [email protected], [email protected]
Abstract
Centrality is most commonly thought of as a measure in which we assign a ranking of the vertices from most important to least important. The importance of a vertex is relative to the underlying process being carried out on the network. This is why there are diverse centrality measures addressing many such processes. We propose a measure that assigns a ranking in which interference is a property of the underlying process being carried out on the network.
Introduction
Networks are perhaps one of the most ubiquitous structures in nature. They arise, for example, in cellular biology connecting genes and proteins [5, 1, 6], in neuroscience connecting neurological regions of the brain [1, 3, 7], in sociology connecting the interactions of people, and recently in quantum computing. The analysis of the underlying topology of these discrete structures has thus gained widespread attention. Likewise, there has been a significant focus on designing measures to assess certain topological features of a network by assigning quantitative values to the nodes. These quantitative values have a subtle interpretation insofar as they carry implicit assumptions about the underlying process being carried out on the network. Borgatti has identified a typology of flow processes with specific trajectories that use trails, geodesics, paths, or walks. In this framework the flow has a specific type of transmission corresponding to some concrete application; Borgatti gives examples such as used goods, currency, infections, and gossip. Suppose we want to model a flow process in which the flow may interfere with itself. This interference may be the result of collisions in the network where oppositely oriented flows may annihilate. How, then, can we model such a flow? Our proposition is to model continuous walks on the network, insofar as interference becomes an emergent property.
Definition: Centrality is a measure in which the nodes of a network are assigned a ranking with respect to an implicit assumption of the flow characteristics of the network.
The adjacency matrix of a simple graph is defined as follows,
\[
a_{ij} = \begin{cases} 1, & \text{if } i \text{ is adjacent to } j \\ 0, & \text{otherwise} \end{cases}
\]
where both i and j are nodes in the graph. One of the earliest and simplest centrality measures, Degree Centrality, was developed by Freeman [11]. It is defined as,
\[
\deg(i) = \sum_{j=1}^{N} a_{ij} = (A\mathbf{e})_{i}
\]
which simply counts all nodes adjacent to i. Here, e is the vector of all ones. In this manner degree centrality measures the local connectivity of the graph. To this end degree centrality can be intuitively thought of as the immediate risk of coming into contact with what is flowing through the network.
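As a quick illustration (not from the original report), degree centrality is just a matrix-vector product; the 4-node path graph below is a hypothetical example:

```python
import numpy as np

# Hypothetical 4-node path graph 0-1-2-3, used purely for illustration.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])

# Degree centrality: row sums of A, i.e. (A e)_i with e the all-ones vector.
e = np.ones(A.shape[0])
degree = A @ e
print(degree)  # [1. 2. 2. 1.]
```

The interior nodes of the path have degree 2, the endpoints degree 1, matching the local-connectivity interpretation above.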
The generalization of degree centrality is Katz centrality [4], where both the immediate neighbors and their connections are counted under the influence of an attenuation parameter α. Katz centrality is then defined as,
\[
C_{\mathrm{Katz}}(i) = \sum_{j=1}^{N} \sum_{k=1}^{\infty} \alpha^{k} \left( A^{k} \right)_{ji} = \left( \left( (I - \alpha A)^{-1} - I \right) \mathbf{e} \right)_{i}
\]
where I is the N-dimensional identity matrix and α is an arbitrary constant which must be smaller than the reciprocal of the largest eigenvalue of A. Katz centrality describes the long-term behavior of random walks on the network, where every additional step is attenuated by α. Katz centrality can then be thought of as the long-term risk of coming into contact with what is flowing through the network. Another important centrality measure is closeness centrality [4], defined as,
\[
C(i) = \left[ \sum_{j=1}^{N} d(i, j) \right]^{-1}
\]
where d(i, j) is the graph-theoretic distance from node i to node j. Closeness centrality is thus the inverse of the sum of the shortest-path distances between node i and every other node. This measure is intuitively thought of as the expected time it takes for something to flow through the network [9].
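A minimal sketch of both Katz and closeness centrality, assuming the same hypothetical 4-node path graph as before (the graph and labels are illustrative, not from the report):

```python
import numpy as np
from collections import deque

# Hypothetical 4-node path graph 0-1-2-3 for illustration.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
N = A.shape[0]

# Katz: alpha must be below 1/lambda_max for the walk series to converge.
alpha = 0.5 / max(abs(np.linalg.eigvalsh(A)))
katz = (np.linalg.inv(np.eye(N) - alpha * A) - np.eye(N)) @ np.ones(N)

# Closeness: inverse of the summed BFS distances from each node.
adj = {i: [j for j in range(N) if A[i, j]] for i in range(N)}

def closeness(v):
    dist, q = {v: 0}, deque([v])
    while q:
        u = q.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                q.append(w)
    return 1.0 / sum(dist.values())

print(katz)                               # interior nodes rank highest
print([closeness(v) for v in range(N)])  # approx [1/6, 1/4, 1/4, 1/6]
```

Both rankings agree here (interior nodes above endpoints), but on larger graphs the attenuated-walk and shortest-path assumptions can rank nodes quite differently.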
All centrality measures may be classified as either radial or medial. Radial measures count specific walks between nodes, whereas medial measures count specific walks that pass through some given node. A walk of length n is an ordered list of nodes
\[
i, k_1, k_2, \ldots, k_{n-1}, j
\]
such that each consecutive pair of nodes in the list is connected. Furthermore, the nodes need not be distinct and may include both i and j. A closed walk is a walk that both begins and ends at the same node. We may enumerate all walks of length n by taking successive powers of the adjacency matrix,
\[
(A^{n})_{ij} = \sum_{k_1=1}^{N} \sum_{k_2=1}^{N} \cdots \sum_{k_{n-2}=1}^{N} \sum_{k_{n-1}=1}^{N} a_{i,k_1} a_{k_1,k_2} \cdots a_{k_{n-2},k_{n-1}} a_{k_{n-1},j}
\]
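Walk counting via matrix powers can be checked directly; the path graph here is a hypothetical example, not one from the report:

```python
import numpy as np

# Hypothetical path graph 0-1-2-3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])

# (A^n)[i, j] counts walks of length n from i to j; nodes may repeat.
A3 = np.linalg.matrix_power(A, 3)
print(A3[0, 1])  # 2: the walks 0-1-0-1 and 0-1-2-1
```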
The rest of this paper focuses on lifting the restriction that n be an integer. Of course, the above equation is then useless, so we introduce the matrix function f(x) = A^x, where x ∈ ℝ, and apply it to the adjacency matrix.
Results
Lemma 1.1: The power-law function f(x) = A^x, x ∈ ℝ, of the adjacency matrix produces complex functions as its entries.
Proof: Let A be the adjacency matrix of a simple nonempty graph. A is a traceless symmetric matrix: Tr[A] = 0 and A = A^T. Since it is symmetric it is always diagonalizable; we then have A = UΛU^{-1}, where U is the matrix whose columns are the eigenvectors and Λ is the diagonal matrix consisting of the eigenvalues of A. We then have Tr[A] = Tr[UΛU^{-1}] = Tr[Λ(U^{-1}U)] = Tr[Λ] = Σ_i λ_i = 0. Since the graph is nonempty and the sum of the eigenvalues is zero, we are guaranteed at least one negative eigenvalue. The function of a matrix can be expressed as f(A) = U f(Λ) U^{-1}, where the spectral decomposition is,
\[
A^{x} = U \Lambda^{x} U^{-1} = \sum_{\lambda} \lambda^{x} u_{\lambda} u_{\lambda}^{T}
\]
and where, for λ < 0, λ^x = |λ|^x e^{iπx}. We can then express every entry of A^x as,
\[
A_{ij}(x) = \sum_{\lambda \in \sigma^{+}\setminus\{0\}} \lambda^{x} \beta_{\lambda} \; + \; e^{i\pi x} \sum_{\lambda' \in \sigma^{-}} |\lambda'|^{x} \beta_{\lambda'}
\]
with β_λ ≡ u_{iλ} u_{jλ},
where σ⁺\{0} is the multiset of all positive eigenvalues (zero excluded) and σ⁻ is the multiset of all negative eigenvalues. Since we are guaranteed at least one negative eigenvalue, A_{ij}(x) is always complex. ∎
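The lemma can be verified numerically. The sketch below, assuming a hypothetical path graph, builds A^x from the spectral decomposition and shows a non-integer power producing complex entries:

```python
import numpy as np

# Hypothetical path graph; any nonempty simple graph has a negative eigenvalue.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

lam, U = np.linalg.eigh(A)  # real spectrum, orthonormal eigenvectors

def A_power(x):
    # Spectral definition of A^x; numpy's principal branch gives
    # lam^x = |lam|^x * e^{i*pi*x} for lam < 0, so entries pick up an
    # imaginary part whenever x is not an integer.
    return U @ np.diag(lam.astype(complex) ** x) @ U.T

print(lam.sum())           # ~0: the trace (sum of eigenvalues) vanishes
print(A_power(0.5)[0, 1])  # a genuinely complex entry
```

At integer x the phase e^{iπx} is ±1 and the entries collapse back to real walk counts, consistent with the lemma.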
Definition: Hilbert space ℍ₁ is given by the set of functions defined on the interval 0 ≤ x ≤ L with finite norm, and Hilbert space ℍ₂ is given by the set of functions defined on the whole real line:
\[
\|\varphi\|_{1}^{2} = \int_{0}^{L} \varphi^{*} \varphi \, dx < \infty
\]
\[
\|\varphi\|_{2}^{2} = \int_{-\infty}^{\infty} \varphi^{*} \varphi \, dx < \infty
\]
Properties: i) Hilbert space is linear. A function space is linear when the two following conditions hold: 1) if c is a constant and φ is an element of the space, then cφ is also an element of the space; and 2) if φ and ψ are any two elements of the space, then φ + ψ is also an element of the space. ii) There exists an inner product, ⟨φ|ψ⟩, for any two elements of the space, defined over the appropriate interval. iii) Any element of the space has a norm (length) defined as ‖φ‖² = ⟨φ|φ⟩. iv) ℍ is complete: every Cauchy sequence of functions in ℍ converges to an element of ℍ. A Cauchy sequence {φ_n} is one such that ‖φ_n − φ_m‖ → 0 as n and m approach infinity [12].
Theorem 1.1: A_{ij}(x) is an element of Hilbert space ℍ₁.
Proof: Taking the complex conjugate and multiplying, we get,
\[
A_{ij}^{*} A_{ij} = \left( \sum_{\lambda \in \sigma^{+}\setminus\{0\}} \lambda^{x} \beta_{\lambda} + e^{-i\pi x} \sum_{\lambda' \in \sigma^{-}} |\lambda'|^{x} \beta_{\lambda'} \right) \left( \sum_{\lambda \in \sigma^{+}\setminus\{0\}} \lambda^{x} \beta_{\lambda} + e^{i\pi x} \sum_{\lambda' \in \sigma^{-}} |\lambda'|^{x} \beta_{\lambda'} \right)
\]
\[
= \left[ \sum_{\lambda \in \sigma^{+}\setminus\{0\}} \lambda^{x} \beta_{\lambda} \right]^{2} + \left[ \sum_{\lambda' \in \sigma^{-}} |\lambda'|^{x} \beta_{\lambda'} \right]^{2} + 2\cos(\pi x) \left[ \sum_{\lambda \in \sigma^{+}\setminus\{0\}} \lambda^{x} \beta_{\lambda} \right] \left[ \sum_{\lambda' \in \sigma^{-}} |\lambda'|^{x} \beta_{\lambda'} \right]
\]
\[
= \sum_{\lambda \in \sigma} \beta_{\lambda}^{2} |\lambda|^{2x} + 2 \sum_{\substack{i<j \\ \lambda_i \lambda_j > 0}} \beta_{\lambda_i} \beta_{\lambda_j} (|\lambda_i||\lambda_j|)^{x} + 2\cos(\pi x) \sum_{\lambda \in \sigma^{+}\setminus\{0\}} \sum_{\lambda' \in \sigma^{-}} \beta_{\lambda} \beta_{\lambda'} (\lambda |\lambda'|)^{x}
\]
where σ = σ⁺\{0} ∪ σ⁻, the squared brackets having been expanded into diagonal and like-signed cross terms. Finally, upon integration we have,
\[
\int_{0}^{L} A_{ij}^{*} A_{ij} \, dx = \sum_{\lambda \in \sigma} \beta_{\lambda}^{2} \int_{0}^{L} |\lambda|^{2x} dx + 2 \sum_{\substack{i<j \\ \lambda_i \lambda_j > 0}} \beta_{\lambda_i} \beta_{\lambda_j} \int_{0}^{L} (|\lambda_i||\lambda_j|)^{x} dx + 2 \sum_{\lambda \in \sigma^{+}\setminus\{0\}} \sum_{\lambda' \in \sigma^{-}} \beta_{\lambda} \beta_{\lambda'} \int_{0}^{L} \cos(\pi x) (\lambda |\lambda'|)^{x} dx
\]
\[
= \sum_{\lambda \in \sigma} \beta_{\lambda}^{2} \, \frac{|\lambda|^{2L} - 1}{2 \ln|\lambda|} + 2 \sum_{\substack{i<j \\ \lambda_i \lambda_j > 0}} \beta_{\lambda_i} \beta_{\lambda_j} \, \frac{(|\lambda_i||\lambda_j|)^{L} - 1}{\ln(|\lambda_i||\lambda_j|)}
\]
\[
+ \; 2 \sum_{\lambda \in \sigma^{+}\setminus\{0\}} \sum_{\lambda' \in \sigma^{-}} \beta_{\lambda} \beta_{\lambda'} \, \frac{(\lambda|\lambda'|)^{L} \left( \cos(\pi L) \ln(\lambda|\lambda'|) + \pi \sin(\pi L) \right) - \ln(\lambda|\lambda'|)}{\pi^{2} + \left( \ln(\lambda|\lambda'|) \right)^{2}}
\]
On the right-hand side we have indeterminates of the form 0/0 when |λ| → 1 and when |λ_i||λ_j| → 1. Upon a change of variable the limit is,
\[
\lim_{\lambda \to 1} \frac{\lambda^{L} - 1}{\ln(\lambda)} = \lim_{\lambda \to 1} \frac{\lambda^{2L} - 1}{2\ln(\lambda)} = L
\]
The integral then converges over the interval and we have the desired result, A_{ij}(x) ∈ ℍ₁. ∎ This is an interesting mathematical result that relates the combinatorial structure of graphs to elements of Hilbert space. Essentially, what we have done is allow continuous walks on a discrete structure, which is analogous to taking a vector space and completing it. Recall that a walk is an ordered list of nodes, the length of the walk being the number of steps taken. By allowing walks to be of any real length we lose the ability to list the nodes in any order. By labeling the nodes of a graph they can be considered discrete locations within the graph. When asking how many ways there are to go from node i to node j in n steps (n ∈ ℕ), we can explicitly list every location involved from i to j. However, when we ask how many ways there are to go from node i to node j in x steps (x ∈ ℝ), we cannot explicitly list any location involved from i to j.
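As a rough numerical check of the theorem (not part of the original proof), one can approximate the norm integral with a Riemann sum and observe that it is finite; the path graph, interval length L, and grid size are arbitrary illustrative choices:

```python
import numpy as np

# Hypothetical path graph 0-1-2-3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
lam, U = np.linalg.eigh(A)

def entry(i, j, x):
    """A_ij(x) from the spectral decomposition (complex for non-integer x)."""
    return (U[i] * U[j] * lam.astype(complex) ** x).sum()

# Midpoint Riemann sum of |A_01(x)|^2 over 0 <= x <= L.
L, n = 3.0, 3000
xs = np.linspace(0, L, n, endpoint=False) + L / (2 * n)
norm_sq = (np.abs([entry(0, 1, x) for x in xs]) ** 2).sum() * (L / n)
print(norm_sq)  # finite, as the theorem asserts
```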
Below we plot a few A_{ij}(x)'s for several graphs.
Figure 1A. Real and imaginary parts of A_{ij}(x) for P2 (path graph on two nodes).
Figure 1B. Real and imaginary parts of A_{ij}(x) for C3 (cyclical graph on three nodes).
Figure 1C. Real and imaginary parts of A_{ij}(x) for P4 (path graph on 4 nodes).
A natural question to ask is what structural features of the graph are expressed in A(x) and, more importantly, which of these features can be obtained from operations on A(x)? First and foremost, let us recover the adjacency relation. Since A(x) counts all walks of length x and the adjacency relation is given by all walks of length 1, we have A_{ij}(1) = a_{ij}; written in terms of an inner product, ⟨δ(x − 1)|A_{ij}(x)⟩ = a_{ij}. This raises the question: do there exist functions for which the inner product produces specific graph parameters?
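The recovery of the adjacency relation at x = 1 can be sketched numerically (the 3-node path graph is an illustrative choice, not one from the report):

```python
import numpy as np

# Hypothetical path graph P3.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)

lam, U = np.linalg.eigh(A)
A1 = U @ np.diag(lam.astype(complex) ** 1.0) @ U.T  # evaluate A^x at x = 1

# At integer x = 1 the phase e^{i*pi*x} is real, and the adjacency returns.
assert np.allclose(A1, A)
```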
Figure 1D. Real and imaginary parts of A_{ij}(x) for C4 (cycle graph on 4 nodes).
To assign quantitative values to the nodes we introduce the ensemble vector,