Hitting Set Enumeration with Partial Information for Unique Column Combination Discovery

Total Page:16

File Type:pdf, Size:1020Kb

Hitting Set Enumeration with Partial Information for Unique Column Combination Discovery Hitting Set Enumeration with Partial Information for Unique Column Combination Discovery Johann BirnickF, Thomas Bläsius, Tobias Friedrich, Felix Naumann, Thorsten Papenbrock, Martin Schirneck FETH Zurich, Zurich, Switzerland Hasso Plattner Institute, University of Potsdam, Germany fi[email protected] ABSTRACT that fulfill the necessary properties of a key, whereby null Unique column combinations (UCCs) are a fundamental con- values require special consideration, they serve to define key cept in relational databases. They identify entities in the constraints and are, hence, also called key candidates. data and support various data management activities. Still, In practice, key discovery is a recurring activity as keys are UCCs are usually not explicitly defined and need to be dis- often missing for various reasons: on freshly recorded data, covered. State-of-the-art data profiling algorithms are able keys may have never been defined, on archived and transmit- to efficiently discover UCCs in moderately sized datasets, ted data, keys sometimes are lost due to format restrictions, but they tend to fail on large and, in particular, on wide on evolving data, keys may be outdated if constraint en- datasets due to run time and memory limitations. forcement is lacking, and on fused and integrated data, keys In this paper, we introduce HPIValid, a novel UCC discov- often invalidate in the presence of key collisions. Because ery algorithm that implements a faster and more resource- UCCs are also important for many data management tasks saving search strategy. HPIValid models the metadata dis- other than key discovery, such as anomaly detection, data covery as a hitting set enumeration problem in hypergraphs. integration, data modeling, query optimization, and index- In this way, it combines efficient discovery techniques from ing, the ability to discover them efficiently is crucial. data profiling research with the most recent theoretical in- Data profiling algorithms automatically discover meta- sights into enumeration algorithms. Our evaluation shows data, like unique column combinations, from raw data and that HPIValid is not only orders of magnitude faster than provide this metadata to any downstream data management related work, it also has a much smaller memory footprint. task. Research in this area has led to various UCC discovery algorithms, such as GORDIAN [34], HCA [2], DUCC [22], Swan [3], PVLDB Reference Format: and HyUCC [33]. Out of these algorithms, only HyUCC can Johann Birnick, Thomas Bläsius, Tobias Friedrich, Felix Nau- process datasets of multiple gigabytes in size in reasonable mann, Thorsten Papenbrock, Martin Schirneck. Hitting Set Enu- time, because it combines the strengths of all previous algo- meration with Partial Information for Unique Column Combina- tion Discovery. PVLDB, 13(11): 2270-2283, 2020. rithms (lattice search, candidate inference, and paralleliza- DOI: https://doi.org/10.14778/3407790.3407824 tion). At some point, however, even HyUCC fails to process certain datasets, because the approach needs to maintain an exponentially growing search space in main memory, which 1. INTRODUCTION eventually either exhausts the available memory or, due to Keys are among the most fundamental type of constraint in search space maintenance, the execution time. relational database theory. A key is a set of attributes whose There is a close connection between the UCC discovery values uniquely identify every record in a given relational in- problem and the enumeration of hitting sets in hypergraphs. stance. This uniqueness property is necessary to determine Hitting set-based techniques were among the first tried for entities in the data. Keys serve to query, link, and merge UCC discovery [29] and still play a role today in the form of entities. A unique column combination (UCC) is the obser- row-based algorithms. To utilize the hitting set connection vation that in a given relational instance a certain set of at- directly, one needs to feed the algorithm with information on tributes S does not contain any duplicate entries. In other all pairs of records of the database [1, 35]. This is prohibitive words, the multiset projection of schema R on S contains for larger datasets due to the quadratic scaling. only unique entries. Because UCCs describe attribute sets In this paper, we propose a novel approach that automat- ically discovers all unique column combinations in any given relational dataset. The algorithm is based on the connec- This work is licensed under the Creative Commons Attribution- tion to hitting set enumeration, but additionally uses the NonCommercial-NoDerivatives 4.0 International License. To view a copy key insight that most of the information in the record pairs of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/. For is redundant and that the following strategy of utilizing the any use beyond those covered by this license, obtain permission by emailing state-of-the-art hitting set enumeration algorithm MMCS [31] [email protected]. Copyright is held by the owner/author(s). Publication rights finds the relevant information much more efficiently. We licensed to the VLDB Endowment. Proceedings of the VLDB Endowment, Vol. 13, No. 11 start the discovery with only partial (or even no) informa- ISSN 2150-8097. tion about the database. Due to this lack of information, the DOI: https://doi.org/10.14778/3407790.3407824 2270 resulting solution candidates might not actually be UCCs. A exists a key of cardinality less than a given value in a given validation then checks whether the hitting set is a true UCC. relational instance is an NP-complete problem [5]. More- If not, the validation additionally provides a reason for its over, the discovery of a key of minimum size in the database incorrectness, pointing directly to the part of the database is also likely not fixed-parameter tractable as it is W [2]- that MMCS needs to avoid the mistake. Moreover, we show complete when parameterized by the size [18]. Finding all that all previous decisions of MMCS would have been exactly keys or unique column combinations is computationally even the same, even if MMCS had full information about all row harder. For this reason, only few automatic data profiling pairs of the database. Thus, our new approach can include algorithms exist that can discover all unique column combi- the new information on the fly and simply resume the enu- nations. The discovery problem is closely related to the enu- meration where it was before. Instead of providing full and meration of hitting sets (a.k.a. the transversal hypergraph probably redundant information about the database to the problem, frequent itemset mining, or monotone dualization, enumeration subroutine, the enumeration decides for itself see [15]). Many data profiling algorithms, including the first which information is necessary to make the right decisions. UCC discovery algorithms, rely on the hitting sets of certain Due to its hitting set-based nature, we named our algo- hypergraphs; the process of deriving complete hypergraphs, rithm Hitting set enumeration with Partial Information and however, is a bottleneck in the computation [30]. Validation, HPIValid for short. Our approach adopts state- In modern data profiling, one usually distinguishes two of-the-art techniques from both data profiling and hitting types of profiling algorithms: row and column-based ap- set enumeration bringing together the two research commu- proaches. For a comprehensive overview and detailed dis- nities. We add the following contributions: cussions of the different approaches, we refer to [1] and [28]. (1) UCC discovery. We introduce HPIValid, a novel UCC Row-based algorithms, such as GORDIAN [34], advance the discovery algorithm that outperforms state-of-the-art al- initial hitting set idea. They compare all records in the gorithms in terms of run time and memory consumption. input dataset pair-wise to derive all valid, minimal UCCs. (2) Hitting set enumeration with partial information. We Column-based algorithms, such as HCA [2], DUCC [22], and prove that the hitting set enumeration algorithm MMCS Swan [3], in contrast, systematically enumerate and test in- remains correct when run on a partial hypergraph, pro- dividual UCC candidates while using intermediate results vided that there is a validation procedure for candidate to prune the search space. The algorithms vary mostly in solutions. We believe that this insight can be key to their traversal strategies, which are breadth-first bottom-up also solve similar task like the discovery of functional for HCA and a depth-first random-walk for DUCC. Both row dependencies or denial constraints. and column-based approaches have their strengths: record (3) Subset closedness. We introduce the concept of subset comparisons scale well with the number of attributes, and closedness of hitting set enumeration algorithms, and systematic candidate tests scale well with the number of prove that this property is sufficient for enumeration records. Hybrid algorithms aim to combine both aspects. with partial information to succeed. This makes it easy The one proposed in [25] exploits the duality between mini- to replace MMCS with a different enumeration procedure, mal difference sets and minimal UCCs to mutually grow as long as it is also subset-closed. the available information about the search and the solution (4) Sampling strategy. We propose a robust strategy how space. HyUCC [33] switches between column and row-based to extract the relevant information from the database parts heuristically whenever the progress of the current ap- efficiently in the presence of redundancies. proach is low. Hyb [35] is a hybrid algorithm for the discovery (5) Exhaustive evaluation. We evaluate our algorithm on of embedded uniqueness constraints (eUCs), an extension of dozens of real-world datasets and compare it to the cur- UCCs to incomplete data. It proposes special ideas tailored rent state-of-the-art UCC discovery algorithm HyUCC. to incomplete data, but is based on the same discovery ap- Besides being able to solve instances that were previously proach as HyUCC.
Recommended publications
  • The Matroid Theorem We First Review Our Definitions: a Subset System Is A
    CMPSCI611: The Matroid Theorem Lecture 5 We first review our definitions: A subset system is a set E together with a set of subsets of E, called I, such that I is closed under inclusion. This means that if X ⊆ Y and Y ∈ I, then X ∈ I. The optimization problem for a subset system (E, I) has as input a positive weight for each element of E. Its output is a set X ∈ I such that X has at least as much total weight as any other set in I. A subset system is a matroid if it satisfies the exchange property: If i and i0 are sets in I and i has fewer elements than i0, then there exists an element e ∈ i0 \ i such that i ∪ {e} ∈ I. 1 The Generic Greedy Algorithm Given any finite subset system (E, I), we find a set in I as follows: • Set X to ∅. • Sort the elements of E by weight, heaviest first. • For each element of E in this order, add it to X iff the result is in I. • Return X. Today we prove: Theorem: For any subset system (E, I), the greedy al- gorithm solves the optimization problem for (E, I) if and only if (E, I) is a matroid. 2 Theorem: For any subset system (E, I), the greedy al- gorithm solves the optimization problem for (E, I) if and only if (E, I) is a matroid. Proof: We will show first that if (E, I) is a matroid, then the greedy algorithm is correct. Assume that (E, I) satisfies the exchange property.
    [Show full text]
  • Useful Relations in Permutations and Combination 1. Useful Relations
    Useful Relations in Permutations and Combination 1. Useful Relations - Factorial n! = n.(n-1)! 2. n퐶푟= n푃푟/r! n n-1 3. Pr = n( Pr-1) 4. Useful Relations - Combinations n n 1. Cr = C(n - r) Example 8 8 C6 = C2 = 8×72×1 = 28 n 2. Cn = 1 n 3. C0 = 1 n n n n n 4. C0 + C1 + C2 + ... + Cn = 2 Example 4 4 4 4 4 4 C0 + C1 + C2 + C3+ C4 = (1 + 4 + 6 + 4 + 1) = 16 = 2 n n (n+1) Cr-1 + Cr = Cr (Pascal's Law) n퐶푟 =n/퐶푟−1=n-r+1/r n n If Cx = Cy then either x = y or (n-x) = y. 5. Selection from identical objects: Some Basic Facts The number of selections of r objects out of n identical objects is 1. Total number of selections of zero or more objects from n identical objects is n+1. 6. Permutations of Objects when All Objects are Not Distinct The number of ways in which n things can be arranged taking them all at a time, when st nd 푃1 of the things are exactly alike of 1 type, 푃2 of them are exactly alike of a 2 type, and th 푃푟of them are exactly alike of r type and the rest of all are distinct is n!/ 푃1! 푃2! ... 푃푟! 1 Example: how many ways can you arrange the letters in the word THESE? 5!/2!=120/2=60 Example: how many ways can you arrange the letters in the word REFERENCE? 9!/2!.4!=362880/2*24=7560 7.Circular Permutations: Case 1: when clockwise and anticlockwise arrangements are different Number of circular permutations (arrangements) of n different things is (n-1)! 1.
    [Show full text]
  • Primitive Recursive Functions Are Recursively Enumerable
    AN ENUMERATION OF THE PRIMITIVE RECURSIVE FUNCTIONS WITHOUT REPETITION SHIH-CHAO LIU (Received April 15,1900) In a theorem and its corollary [1] Friedberg gave an enumeration of all the recursively enumerable sets without repetition and an enumeration of all the partial recursive functions without repetition. This note is to prove a similar theorem for the primitive recursive functions. The proof is only a classical one. We shall show that the theorem is intuitionistically unprovable in the sense of Kleene [2]. For similar reason the theorem by Friedberg is also intuitionistical- ly unprovable, which is not stated in his paper. THEOREM. There is a general recursive function ψ(n, a) such that the sequence ψ(0, a), ψ(l, α), is an enumeration of all the primitive recursive functions of one variable without repetition. PROOF. Let φ(n9 a) be an enumerating function of all the primitive recursive functions of one variable, (See [3].) We define a general recursive function v(a) as follows. v(0) = 0, v(n + 1) = μy, where μy is the least y such that for each j < n + 1, φ(y, a) =[= φ(v(j), a) for some a < n + 1. It is noted that the value v(n + 1) can be found by a constructive method, for obviously there exists some number y such that the primitive recursive function <p(y> a) takes a value greater than all the numbers φ(v(0), 0), φ(y(Ϋ), 0), , φ(v(n\ 0) f or a = 0 Put ψ(n, a) = φ{v{n), a).
    [Show full text]
  • Polya Enumeration Theorem
    Polya Enumeration Theorem Sasha Patotski Cornell University [email protected] December 11, 2015 Sasha Patotski (Cornell University) Polya Enumeration Theorem December 11, 2015 1 / 10 Cosets A left coset of H in G is gH where g 2 G (H is on the right). A right coset of H in G is Hg where g 2 G (H is on the left). Theorem If two left cosets of H in G intersect, then they coincide, and similarly for right cosets. Thus, G is a disjoint union of left cosets of H and also a disjoint union of right cosets of H. Corollary(Lagrange's theorem) If G is a finite group and H is a subgroup of G, then the order of H divides the order of G. In particular, the order of every element of G divides the order of G. Sasha Patotski (Cornell University) Polya Enumeration Theorem December 11, 2015 2 / 10 Applications of Lagrange's Theorem Theorem n! For any integers n ≥ 0 and 0 ≤ m ≤ n, the number m!(n−m)! is an integer. Theorem (ab)! (ab)! For any positive integers a; b the ratios (a!)b and (a!)bb! are integers. Theorem For an integer m > 1 let '(m) be the number of invertible numbers modulo m. For m ≥ 3 the number '(m) is even. Sasha Patotski (Cornell University) Polya Enumeration Theorem December 11, 2015 3 / 10 Polya's Enumeration Theorem Theorem Suppose that a finite group G acts on a finite set X . Then the number of colorings of X in n colors inequivalent under the action of G is 1 X N(n) = nc(g) jGj g2G where c(g) is the number of cycles of g as a permutation of X .
    [Show full text]
  • Cardinality of Sets
    Cardinality of Sets MAT231 Transition to Higher Mathematics Fall 2014 MAT231 (Transition to Higher Math) Cardinality of Sets Fall 2014 1 / 15 Outline 1 Sets with Equal Cardinality 2 Countable and Uncountable Sets MAT231 (Transition to Higher Math) Cardinality of Sets Fall 2014 2 / 15 Sets with Equal Cardinality Definition Two sets A and B have the same cardinality, written jAj = jBj, if there exists a bijective function f : A ! B. If no such bijective function exists, then the sets have unequal cardinalities, that is, jAj 6= jBj. Another way to say this is that jAj = jBj if there is a one-to-one correspondence between the elements of A and the elements of B. For example, to show that the set A = f1; 2; 3; 4g and the set B = {♠; ~; }; |g have the same cardinality it is sufficient to construct a bijective function between them. 1 2 3 4 ♠ ~ } | MAT231 (Transition to Higher Math) Cardinality of Sets Fall 2014 3 / 15 Sets with Equal Cardinality Consider the following: This definition does not involve the number of elements in the sets. It works equally well for finite and infinite sets. Any bijection between the sets is sufficient. MAT231 (Transition to Higher Math) Cardinality of Sets Fall 2014 4 / 15 The set Z contains all the numbers in N as well as numbers not in N. So maybe Z is larger than N... On the other hand, both sets are infinite, so maybe Z is the same size as N... This is just the sort of ambiguity we want to avoid, so we appeal to the definition of \same cardinality." The answer to our question boils down to \Can we find a bijection between N and Z?" Does jNj = jZj? True or false: Z is larger than N.
    [Show full text]
  • Combinatorics
    Combinatorics Problem: How to count without counting. I How do you figure out how many things there are with a certain property without actually enumerating all of them. Sometimes this requires a lot of cleverness and deep mathematical insights. But there are some standard techniques. I That's what we'll be studying. We sometimes use the bijection rule without even realizing it: I count how many people voted are in favor of something by counting the number of hands raised: I I'm hoping that there's a bijection between the people in favor and the hands raised! Bijection Rule The Bijection Rule: If f : A ! B is a bijection, then jAj = jBj. I We used this rule in defining cardinality for infinite sets. I Now we'll focus on finite sets. Bijection Rule The Bijection Rule: If f : A ! B is a bijection, then jAj = jBj. I We used this rule in defining cardinality for infinite sets. I Now we'll focus on finite sets. We sometimes use the bijection rule without even realizing it: I count how many people voted are in favor of something by counting the number of hands raised: I I'm hoping that there's a bijection between the people in favor and the hands raised! Answer: 26 choices for the first letter, 26 for the second, 10 choices for the first number, the second number, and the third number: 262 × 103 = 676; 000 Example 2: A traveling salesman wants to do a tour of all 50 state capitals. How many ways can he do this? Answer: 50 choices for the first place to visit, 49 for the second, .
    [Show full text]
  • Inclusion‒Exclusion Principle
    Inclusionexclusion principle 1 Inclusion–exclusion principle In combinatorics, the inclusion–exclusion principle (also known as the sieve principle) is an equation relating the sizes of two sets and their union. It states that if A and B are two (finite) sets, then The meaning of the statement is that the number of elements in the union of the two sets is the sum of the elements in each set, respectively, minus the number of elements that are in both. Similarly, for three sets A, B and C, This can be seen by counting how many times each region in the figure to the right is included in the right hand side. More generally, for finite sets A , ..., A , one has the identity 1 n This can be compactly written as The name comes from the idea that the principle is based on over-generous inclusion, followed by compensating exclusion. When n > 2 the exclusion of the pairwise intersections is (possibly) too severe, and the correct formula is as shown with alternating signs. This formula is attributed to Abraham de Moivre; it is sometimes also named for Daniel da Silva, Joseph Sylvester or Henri Poincaré. Inclusion–exclusion illustrated for three sets For the case of three sets A, B, C the inclusion–exclusion principle is illustrated in the graphic on the right. Proof Let A denote the union of the sets A , ..., A . To prove the 1 n Each term of the inclusion-exclusion formula inclusion–exclusion principle in general, we first have to verify the gradually corrects the count until finally each identity portion of the Venn Diagram is counted exactly once.
    [Show full text]
  • Enumeration of Finite Automata 1 Z(A) = 1
    INFOI~MATION AND CONTROL 10, 499-508 (1967) Enumeration of Finite Automata 1 FRANK HARARY AND ED PALMER Department of Mathematics, University of Michigan, Ann Arbor, Michigan Harary ( 1960, 1964), in a survey of 27 unsolved problems in graphical enumeration, asked for the number of different finite automata. Re- cently, Harrison (1965) solved this problem, but without considering automata with initial and final states. With the aid of the Power Group Enumeration Theorem (Harary and Palmer, 1965, 1966) the entire problem can be handled routinely. The method involves a confrontation of several different operations on permutation groups. To set the stage, we enumerate ordered pairs of functions with respect to the product of two power groups. Finite automata are then concisely defined as certain ordered pah's of functions. We review the enumeration of automata in the natural setting of the power group, and then extend this result to enumerate automata with initial and terminal states. I. ENUMERATION THEOREM For completeness we require a number of definitions, which are now given. Let A be a permutation group of order m = ]A I and degree d acting on the set X = Ix1, x~, -.. , xa}. The cycle index of A, denoted Z(A), is defined as follows. Let jk(a) be the number of cycles of length k in the disjoint cycle decomposition of any permutation a in A. Let al, a2, ... , aa be variables. Then the cycle index, which is a poly- nomial in the variables a~, is given by d Z(A) = 1_ ~ H ~,~(°~ . (1) ~$ a EA k=l We sometimes write Z(A; al, as, ..
    [Show full text]
  • The Axiom of Choice and Its Implications
    THE AXIOM OF CHOICE AND ITS IMPLICATIONS KEVIN BARNUM Abstract. In this paper we will look at the Axiom of Choice and some of the various implications it has. These implications include a number of equivalent statements, and also some less accepted ideas. The proofs discussed will give us an idea of why the Axiom of Choice is so powerful, but also so controversial. Contents 1. Introduction 1 2. The Axiom of Choice and Its Equivalents 1 2.1. The Axiom of Choice and its Well-known Equivalents 1 2.2. Some Other Less Well-known Equivalents of the Axiom of Choice 3 3. Applications of the Axiom of Choice 5 3.1. Equivalence Between The Axiom of Choice and the Claim that Every Vector Space has a Basis 5 3.2. Some More Applications of the Axiom of Choice 6 4. Controversial Results 10 Acknowledgments 11 References 11 1. Introduction The Axiom of Choice states that for any family of nonempty disjoint sets, there exists a set that consists of exactly one element from each element of the family. It seems strange at first that such an innocuous sounding idea can be so powerful and controversial, but it certainly is both. To understand why, we will start by looking at some statements that are equivalent to the axiom of choice. Many of these equivalences are very useful, and we devote much time to one, namely, that every vector space has a basis. We go on from there to see a few more applications of the Axiom of Choice and its equivalents, and finish by looking at some of the reasons why the Axiom of Choice is so controversial.
    [Show full text]
  • 17 Axiom of Choice
    Math 361 Axiom of Choice 17 Axiom of Choice De¯nition 17.1. Let be a nonempty set of nonempty sets. Then a choice function for is a function f sucFh that f(S) S for all S . F 2 2 F Example 17.2. Let = (N)r . Then we can de¯ne a choice function f by F P f;g f(S) = the least element of S: Example 17.3. Let = (Z)r . Then we can de¯ne a choice function f by F P f;g f(S) = ²n where n = min z z S and, if n = 0, ² = min z= z z = n; z S . fj j j 2 g 6 f j j j j j 2 g Example 17.4. Let = (Q)r . Then we can de¯ne a choice function f as follows. F P f;g Let g : Q N be an injection. Then ! f(S) = q where g(q) = min g(r) r S . f j 2 g Example 17.5. Let = (R)r . Then it is impossible to explicitly de¯ne a choice function for . F P f;g F Axiom 17.6 (Axiom of Choice (AC)). For every set of nonempty sets, there exists a function f such that f(S) S for all S . F 2 2 F We say that f is a choice function for . F Theorem 17.7 (AC). If A; B are non-empty sets, then the following are equivalent: (a) A B ¹ (b) There exists a surjection g : B A. ! Proof. (a) (b) Suppose that A B.
    [Show full text]
  • STRUCTURE ENUMERATION and SAMPLING Chemical Structure Enumeration and Sampling Have Been Studied by Mathematicians, Computer
    STRUCTURE ENUMERATION AND SAMPLING MARKUS MERINGER To appear in Handbook of Chemoinformatics Algorithms Chemical structure enumeration and sampling have been studied by 5 mathematicians, computer scientists and chemists for quite a long time. Given a molecular formula plus, optionally, a list of structural con- straints, the typical questions are: (1) How many isomers exist? (2) Which are they? And, especially if (2) cannot be answered completely: (3) How to get a sample? 10 In this chapter we describe algorithms for solving these problems. The techniques are based on the representation of chemical compounds as molecular graphs (see Chapter 2), i.e. they are mainly applied to constitutional isomers. The major problem is that in silico molecular graphs have to be represented as labeled structures, while in chemical 15 compounds, the atoms are not labeled. The mathematical concept for approaching this problem is to consider orbits of labeled molecular graphs under the operation of the symmetric group. We have to solve the so–called isomorphism problem. According to our introductory questions, we distinguish several dis- 20 ciplines: counting, enumerating and sampling isomers. While counting only delivers the number of isomers, the remaining disciplines refer to constructive methods. Enumeration typically encompasses exhaustive and non–redundant methods, while sampling typically lacks these char- acteristics. However, sampling methods are sometimes better suited to 25 solve real–world problems. There is a wide range of applications where counting, enumeration and sampling techniques are helpful or even essential. Some of these applications are closely linked to other chapters of this book. Counting techniques deliver pure chemical information, they can help to estimate 30 or even determine sizes of chemical databases or compound libraries obtained from combinatorial chemistry.
    [Show full text]
  • Enumeration of Structure-Sensitive Graphical Subsets: Calculations (Graph Theory/Combinatorics) R
    Proc. Nati Acad. Sci. USA Vol. 78, No. 3, pp. 1329-1332, March 1981 Mathematics Enumeration of structure-sensitive graphical subsets: Calculations (graph theory/combinatorics) R. E. MERRIFIELD AND H. E. SIMMONS Central Research and Development Department, E. I. du Pont de Nemours and Company, Experimental Station, Wilmington, Delaware 19898 Contributed by H. E. Simmons, October 6, 1980 ABSTRACT Numerical calculations are presented, for all Table 2. Qualitative properties of subset counts connected graphs on six and fewer vertices, of the -lumbers of hidepndent sets, connected sets;point and line covers, externally Quantity Branching Cyclization stable sets, kernels, and irredundant sets. With the exception of v, xIncreases Decreases the number of kernels, the numbers of such sets are al highly p Increases Increases structure-sensitive quantities. E Variable Increases K Insensitive Insensitive The necessary mathematical machinery has recently been de- Variable Variable veloped (1) for enumeration of the following types of special cr Decreases Increases subsets of the vertices or edges of a general graph: (i) indepen- p Increases Increases dent sets, (ii) connected sets, (iii) point and line covers, (iv;' ex- X Decreases Increases ternally stable sets, (v) kernels, and (vi) irredundant sets. In e Increases Increases order to explore the sensitivity of these quantities to graphical K Insensitive Insensitive structure we have employed this machinery to calculate some Decreases Increases explicit subset counts. (Since no more efficient algorithm is known for irredundant sets, they have been enumerated by the brute-force method of testing every subset for irredundance.) 1. e and E are always odd. The 10 quantities considered in ref.
    [Show full text]