Open Main.Pdf

The Pennsylvania State University
The Graduate School

EFFICIENT COMBINATORIAL METHODS IN SPARSIFICATION, SUMMARIZATION AND TESTING OF LARGE DATASETS

A Dissertation in Computer Science and Engineering
by Grigory Yaroslavtsev

© 2014 Grigory Yaroslavtsev

Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
May 2014

The dissertation of Grigory Yaroslavtsev was reviewed and approved∗ by the following:
Sofya Raskhodnikova, Associate Professor of Computer Science and Engineering, Chair of Committee and Dissertation Advisor
Piotr Berman, Associate Professor of Computer Science and Engineering
Jason Morton, Assistant Professor of Mathematics and Statistics
Adam D. Smith, Associate Professor of Computer Science and Engineering
Raj Acharya, Head of Department of Computer Science and Engineering
∗Signatures are on file in the Graduate School.

Abstract

Increasingly large amounts of structured data are being collected by personal computers, mobile devices, personal gadgets, sensors, etc., and stored in data centers operated by the government and private companies. Processing of such data to extract key information is one of the main challenges faced by computer scientists. Developing methods for constructing compact representations of large data is a natural way to approach this challenge.
This thesis is focused on rigorous mathematical and algorithmic solutions for sparsification and summarization of large amounts of information using discrete combinatorial methods. Areas of mathematics most closely related to it are graph theory, information theory and analysis of real-valued functions over discrete domains. These areas, somewhat surprisingly, turn out to be related when viewed through the computational lens. In this thesis we illustrate the power and limitations of methods for constructing small representations of large data sets, such as graphs and databases, using a variety of methods drawn from these areas.
The primary goal of sparsification, summarization and sketching methods discussed here is to remove redundancy in distributed systems, reduce storage space and save other resources by compressing the representation, while preserving the key structural properties of the original data or system. For example, the size of the network can be reduced by removing redundant nodes and links, while (approximately) preserving the distances or connectivities between the nodes.
This thesis serves as an overview of results in [31, 30, 152, 38, 171, 33]. It also contains an extension of results in [152] obtained jointly with David Woodruff.

Table of Contents

List of Figures
List of Tables
Acknowledgments

Chapter 1 Introduction
  1.1 Organization
  1.2 Overview

I Approximate Sparse Network Design

Chapter 2 Approximate Sparsification of Directed Graphs
  2.1 Introduction
    2.1.1 Relation to Previous Work
    2.1.2 Our Techniques
    2.1.3 Directed Steiner Forest
    2.1.4 Organization
  2.2 An $\tilde{O}(\sqrt{n})$-Approximation for Directed k-Spanner
    2.2.1 Sampling
    2.2.2 Randomized Rounding
    2.2.3 Antispanners
    2.2.4 LP, Separation Oracle and Overall Algorithm
      2.2.4.1 Separation Oracle
      2.2.4.2 Overall Algorithm for Directed k-Spanner
  2.3 LP and Rounding for Graphs with Unit-Length Edges
  2.4 An $\tilde{O}(n^{1/3})$-Approximation for Directed 3-Spanner with Unit-Length Edges
  2.5 An $O(n^{2/3+\epsilon})$-Approximation for Directed Steiner Forest
  2.6 Conclusion
Chapter 3 Sparsification of Node-Weighted Planar Graphs
  3.1 Preliminaries
    3.1.1 Uncrossable families of cycles and proper functions
  3.2 Algorithm
    3.2.1 Generic local-ratio algorithm
    3.2.2 Minimal pocket violation oracles
  3.3 Proof of 18/7 approximation ratio with pocket oracle
    3.3.1 Complex witness cycles and decomposition of the debit graph
    3.3.2 Pruning
    3.3.3 Envelopes
    3.3.4 Tight examples

II Concise Representations of Real Functions in Property Testing

Chapter 4 Concise Representations of Submodular Functions
  4.1 Introduction
    4.1.1 Related work
  4.2 Structural result
  4.3 Generalized switching lemma for pseudo-Boolean DNFs
  4.4 Learning pseudo-Boolean DNFs

Chapter 5 Transitive-Closure Spanners and Testing Functions on Hypergrids
  5.1 Introduction
    5.1.1 Results
      5.1.1.1 Steiner 2-TC-spanners of Directed d-dimensional Grids
    5.1.2 Applications
  5.2 Definitions and Observations
  5.3 Lower Bound for 2-TC-spanners of the Hypergrid
  5.4 Our Lower Bound for k-TC-spanners of d-dimensional Posets for k > 2
    5.4.1 The Case of d = 2
    5.4.2 The Case of Constant d

III Communication Complexity Methods in Summarization

Chapter 6 Lower bounds for Testing of Functions on Hypergrids
  6.1 Introduction
  6.2 Preliminaries
  6.3 Lower bounds on the line
    6.3.1 Monotonicity
    6.3.2 Convexity
    6.3.3 The Lipschitz property
  6.4 Lower bounds on the hypergrid
    6.4.1 Monotonicity
    6.4.2 Convexity
    6.4.3 The Lipschitz property

Chapter 7 Beyond Direct-Sum with Application to Sketching
  7.1 Introduction
  7.2 The Direct Sum Theorem
  7.3 Lower Bounds for Protocols with Abortion
    7.3.1 Equality Problem
    7.3.2 Augmented Indexing
  7.4 Applications
    7.4.1 Hard Problem
    7.4.2 Estimating Multiple $\ell_p$ Distances
    7.4.3 Other Applications

Chapter 8 Optimal Round-Complexity of the Set Intersection Problem
  8.1 Definitions and preliminaries
  8.2 Upper bound
    8.2.1 Auxiliary protocols
    8.2.2 Main protocol

Bibliography

Chapter A Appendix
  A.1 Concentration results
  A.2 Omitted proofs
    A.2.1 Lemma A.2.1
    A.2.2 Analysis of the generic local-ratio algorithm
    A.2.3 Uncrossing proper sets (Lemma 3.1.2)
  A.3 Proof of 12/5 approximation ratio with triple pocket oracle
  A.4 Converting a learner into a proper learner
  A.5 Information Cost When Amplifying Success Probability
  A.6 Auxiliary Results for Lower Bounding Applications
    A.6.1 Generic Indexing problems
    A.6.2 Encoding of Indexing Over Augmented Set Indexing
  A.7 Proof for Other Applications
    A.7.1 Proof of Theorem 7.4.10
    A.7.2 Proof of Theorem 7.4.11
    A.7.3 Proof of Theorem 7.4.12
    A.7.4 Proof of Theorem 7.4.13
  A.8 Lower bound
  A.9 $O(\sqrt{k})$-round protocol with linear communication
  A.10 Communication in the Set Intersection protocol

List of Figures

  2.1 Linear program for the arbitrary-length case, LP-A. A is the set of all minimal antispanners for thin edges.
  2.2 Linear program for the unit-length case, LP-U.
  2.3 Linear program LP-DSF for the case $|D - C| > |D|/2$.
  3.1 Family of instances of Subset Feedback Vertex Set with approximation factor 18/7 for the primal-dual algorithm with oracle Minimal-Pocket-Violation.
  3.2 Family of instances of Subset Feedback Vertex Set with approximation factor 12/5 for primal-dual algorithm with oracle Minimal-3-Pocket-Violation.
  5.1 Box partition BP(2) and jumps it generates.

List of Tables

  3.1 General graphs.
  3.2 Planar graphs.
  5.1 The size of the sparsest Steiner k-TC-spanner for d-dimensional posets on n vertices for d ≥ 2.
  6.1 Query complexity bounds for testing properties of the function $f : [n]^d \to \mathbb{Z}$ (top) and of the function $f : [n] \to [r]$ (bottom). All the bounds are for nonadaptive tests with two-sided error unless marked otherwise.

Acknowledgments

First and
Recommended publications
  • Going Beyond Dual Execution: MPC for Functions with Efficient Verification
    Going Beyond Dual Execution: MPC for Functions with Efficient Verification. Carmit Hazay (Bar-Ilan University), abhi shelat (Northeastern University), Muthuramakrishnan Venkitasubramaniam (University of Rochester). Abstract: The dual execution paradigm of Mohassel and Franklin (PKC ’06) and Huang, Katz and Evans (IEEE ’12) shows how to achieve the notion of 1-bit leakage security at roughly twice the cost of semi-honest security for the special case of two-party secure computation. To date, there are no multi-party computation (MPC) protocols that offer such a strong trade-off between security and semi-honest performance. Our main result is to address this shortcoming by designing 1-bit leakage protocols for the multi-party setting, albeit for a special class of functions. We say that function f(x, y) is efficiently verifiable by g if the running time of g is always smaller than f and g(x, y, z) = 1 if and only if f(x, y) = z. In the two-party setting, we first improve dual execution by observing that the “second execution” can be an evaluation of g instead of f, and that by definition, the evaluation of g is asymptotically more efficient. Our main MPC result is to construct a 1-bit leakage protocol for such functions from any passive protocol for f that is secure up to additive errors and any active protocol for g. An important result by Genkin et al. (STOC ’14) shows how the classic protocols by Goldreich et al. (STOC ’87) and Ben-Or et al. (STOC ’88) naturally support this property, which allows us to instantiate our compiler with two-party and multi-party protocols.
  • 2020 SIGACT REPORT SIGACT EC – Eric Allender, Shuchi Chawla, Nicole Immorlica, Samir Khuller (Chair), Bobby Kleinberg September 14th, 2020
    2020 SIGACT REPORT SIGACT EC – Eric Allender, Shuchi Chawla, Nicole Immorlica, Samir Khuller (chair), Bobby Kleinberg September 14th, 2020 SIGACT Mission Statement: The primary mission of ACM SIGACT (Association for Computing Machinery Special Interest Group on Algorithms and Computation Theory) is to foster and promote the discovery and dissemination of high quality research in the domain of theoretical computer science. The field of theoretical computer science is the rigorous study of all computational phenomena - natural, artificial or man-made. This includes the diverse areas of algorithms, data structures, complexity theory, distributed computation, parallel computation, VLSI, machine learning, computational biology, computational geometry, information theory, cryptography, quantum computation, computational number theory and algebra, program semantics and verification, automata theory, and the study of randomness. Work in this field is often distinguished by its emphasis on mathematical technique and rigor. 1. Awards ▪ 2020 Gödel Prize: This was awarded to Robin A. Moser and Gábor Tardos for their paper “A constructive proof of the general Lovász Local Lemma”, Journal of the ACM, Vol 57 (2), 2010. The Lovász Local Lemma (LLL) is a fundamental tool of the probabilistic method. It enables one to show the existence of certain objects even though they occur with exponentially small probability. The original proof was not algorithmic, and subsequent algorithmic versions had significant losses in parameters. This paper provides a simple, powerful algorithmic paradigm that converts almost all known applications of the LLL into randomized algorithms matching the bounds of the existence proof. The paper further gives a derandomized algorithm, a parallel algorithm, and an extension to the “lopsided” LLL.
  • Program Sunday Evening: Welcome Reception from 7pm to 9pm at the Staff Lounge of the Department of Computer Science, Ny Munkegade, Building 540, 2nd floor
    Computational Complexity: Eighteenth Annual IEEE Conference. July 7–10, 2003, Århus, Denmark. Sponsored by The IEEE Computer Society Technical Committee on Mathematical Foundations of Computing, in cooperation with ACM-SIGACT and EATCS.

    ELECTRONIC REGISTRATION
    The registration for CCC’03 is web based. Please register at http://www.brics.dk/Complexity2003/.

    Registration Fees (in Danish Kroner)
                     Advance†    Late
      Members‡∗      1800 DKK    2200 DKK
      Nonmembers∗    2200 DKK    2800 DKK
      Students+       500 DKK     600 DKK
    ∗The registration fee includes a copy of the proceedings, receptions Sunday and Monday, the banquet Wednesday, and lunches Monday, Tuesday and Wednesday.
    +The registration fee includes a copy of the proceedings, receptions Sunday and Monday, and lunches Monday, Tuesday and Wednesday. The banquet Wednesday is not included.
    †The advance registration deadline is June 15.
    ‡ACM, EATCS, IEEE, or SIGACT members.

    Extra proceedings/banquet tickets
    Extra proceedings are 350 DKK. Extra banquet tickets are 300 DKK. Both can be purchased when registering and will also be available for sale on site.

    Alternative registration
    If electronic registration is not possible, please contact the organizers at one of the following:
      E-mail: [email protected]
      Mail: Complexity 2003, c/o Peter Bro Miltersen, Department of Computer Science, University of Aarhus, Ny Munkegade, Building 540, DK 8000 Aarhus C, Denmark
      Fax: (+45) 8942 3255

    Conference homepage
    Information about this year’s conference is available on the Web at http://www.brics.dk/Complexity2003/. Information about the Computational Complexity conference is available at

    Conference Information
    Location: All sessions of the conference and the Kolmogorov workshop will be held in Auditorium F of the Department of Mathematical Sciences at Aarhus University, Ny Munkegade, building 530, 1st floor.
  • László Lovász of Eötvös Loránd University in Budapest, Hungary, and Avi Wigderson of the Institute for Advanced Study in Princeton, United States
    2021. The Norwegian Academy of Science and Letters has decided to award the Abel Prize for 2021 to László Lovász of Eötvös Loránd University in Budapest, Hungary, and Avi Wigderson of the Institute for Advanced Study in Princeton, United States, “for their foundational contributions to theoretical computer science and discrete mathematics, and their leading role in shaping them into central fields of modern mathematics.”
    Theoretical computer science is the study of the power and limitations of computing. Its roots go back to the foundational works of Kurt Gödel, Alonzo Church, Alan Turing and John von Neumann, which led to the development of real physical computers. Theoretical computer science comprises two complementary sub-disciplines: algorithm design, which develops efficient methods for a multitude of computational problems; and computational complexity, which shows inherent limitations on the efficiency of algorithms. The notion of polynomial-time algorithms put forward in the 1960s by Alan Cobham, Jack Edmonds and others, and the famous P≠NP conjecture of Stephen Cook, Leonid Levin and Richard Karp had a strong impact on the field and on the work of Lovász and Wigderson.
    […] growing on several other sciences, which makes it possible to reach new discoveries by “putting on a computer scientist’s glasses.” Discrete structures such as graphs, strings and permutations are at the heart of theoretical computer science, and discrete mathematics and theoretical computer science have naturally been closely allied fields. While both fields have benefited immensely from more traditional areas of mathematics, there has been a growing influence in the reverse direction as well. Applications, concepts and techniques from theoretical computer science have generated new challenges, opened new directions of research
  • Chicago Journal of Theoretical Computer Science the MIT Press
    Chicago Journal of Theoretical Computer Science, The MIT Press. Volume 1997, Article 1, 12 March 1997. ISSN 1073–0486. MIT Press Journals, 55 Hayward St., Cambridge, MA 02142 USA; (617)253-2889; [email protected], [email protected]. Published one article at a time in LaTeX source form on the Internet. Pagination varies from copy to copy. For more information and other articles see:
      • http://www-mitpress.mit.edu/jrnls-catalog/chicago.html
      • http://www.cs.uchicago.edu/publications/cjtcs/
      • ftp://mitpress.mit.edu/pub/CJTCS
      • ftp://cs.uchicago.edu/pub/publications/cjtcs
    Feige and Kilian, Limited vs. Polynomial Nondeterminism (Info)
    The Chicago Journal of Theoretical Computer Science is abstracted or indexed in Research Alert®, SciSearch®, Current Contents®/Engineering Computing & Technology, and CompuMath Citation Index®. © 1997 The Massachusetts Institute of Technology. Subscribers are licensed to use journal articles in a variety of ways, limited only as required to insure fair attribution to authors and the journal, and to prohibit use in a competing commercial product. See the journal’s World Wide Web site for further details. Address inquiries to the Subsidiary Rights Manager, MIT Press Journals; (617)253-2864; [email protected]. The Chicago Journal of Theoretical Computer Science is a peer-reviewed scholarly journal in theoretical computer science. The journal is committed to providing a forum for significant results on theoretical aspects of all topics in computer science. Editor in chief: Janos Simon Consulting
  • The Flajolet-Martin Sketch Itself Preserves Differential Privacy: Private Counting with Minimal Space
    The Flajolet-Martin Sketch Itself Preserves Differential Privacy: Private Counting with Minimal Space. Adam Smith (Boston University), Shuang Song (Google Research, Brain Team), Abhradeep Thakurta (Google Research, Brain Team). Abstract: We revisit the problem of counting the number of distinct elements $F_0(D)$ in a data stream $D$, over a domain $[u]$. We propose an $(\epsilon, \delta)$-differentially private algorithm that approximates $F_0(D)$ within a factor of $(1 \pm \gamma)$, and with additive error of $O(\sqrt{\ln(1/\delta)}/\epsilon)$, using space $O(\ln(\ln(u)/\gamma)/\gamma^2)$. We improve on the prior work at least quadratically and up to exponentially, in terms of both space and additive error. Our additive error guarantee is optimal up to a factor of $O(\sqrt{\ln(1/\delta)})$, and the space bound is optimal up to a factor of $O\left(\min\left\{\ln\left(\frac{\ln(u)}{\gamma}\right), \frac{1}{\gamma^2}\right\}\right)$. We assume the existence of an ideal uniform random hash function, and ignore the space required to store it. We later relax this requirement by assuming pseudorandom functions and appealing to a computational variant of differential privacy, SIM-CDP. Our algorithm is built on top of the celebrated Flajolet-Martin (FM) sketch. We show that FM-sketch is differentially private as is, as long as there are $\approx \sqrt{\ln(1/\delta)}/(\epsilon\gamma)$ distinct elements in the data set. Along the way, we prove a structural result showing that the maximum of k i.i.d. random variables is statistically close (in the sense of $\epsilon$-differential privacy) to the maximum of (k + 1) i.i.d.
  • On Foundations of Public-Key Encryption and Secret Sharing
    On Foundations of Public-Key Encryption and Secret Sharing, by Akshay Dhananjai Degwekar. B.Tech., Indian Institute of Technology Madras (2014); S.M., Massachusetts Institute of Technology (2016). Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY, September 2019. © Massachusetts Institute of Technology 2019. All rights reserved. Author: Department of Electrical Engineering and Computer Science, June 28, 2019. Certified by: Vinod Vaikuntanathan, Associate Professor of Electrical Engineering and Computer Science, Thesis Supervisor. Accepted by: Leslie A. Kolodziejski, Professor of Electrical Engineering and Computer Science, Chair, Department Committee on Graduate Students. Abstract: Since the inception of Cryptography, Information theory and Coding theory have influenced cryptography in myriad ways including numerous information-theoretic notions of security in secret sharing, multiparty computation and statistical zero knowledge; and by providing a large toolbox used extensively in cryptography. This thesis addresses two questions in this realm: Leakage Resilience of Secret Sharing Schemes. We show that classical secret sharing schemes like Shamir secret sharing and additive secret sharing over prime order fields are leakage resilient. Leakage resilience of secret sharing schemes is closely related to locally repairable codes and our results can be viewed as impossibility results for local recovery over prime order fields.
  • Nearest Neighbor Search: the Old, the New, and the Impossible
    Nearest Neighbor Search: the Old, the New, and the Impossible, by Alexandr Andoni. Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY, September 2009. © Massachusetts Institute of Technology 2009. All rights reserved. Author: Department of Electrical Engineering and Computer Science, September 4, 2009. Certified by: Piotr Indyk, Associate Professor, Thesis Supervisor. Accepted by: Terry P. Orlando, Chairman, Department Committee on Graduate Students. Abstract: Over the last decade, an immense amount of data has become available. From collections of photos, to genetic data, and to network traffic statistics, modern technologies and cheap storage have made it possible to accumulate huge datasets. But how can we effectively use all this data? The ever growing sizes of the datasets make it imperative to design new algorithms capable of sifting through this data with extreme efficiency. A fundamental computational primitive for dealing with massive datasets is the Nearest Neighbor (NN) problem. In the NN problem, the goal is to preprocess a set of objects, so that later, given a query object, one can find efficiently the data object most similar to the query. This problem has a broad set of applications in data processing and analysis.
  • Are All Distributions Easy?
    Are all distributions easy? Emanuele Viola∗. November 5, 2009. Abstract: Complexity theory typically studies the complexity of computing a function $h(x) : \{0,1\}^n \to \{0,1\}^m$ of a given input $x$. We advocate the study of the complexity of generating the distribution $h(x)$ for uniform $x$, given random bits. Our main results are:
      • There are explicit AC$^0$ circuits of size poly(n) and depth O(1) whose output distribution has statistical distance $1/2^n$ from the distribution $(X, \sum_i X_i) \in \{0,1\}^n \times \{0,1,\ldots,n\}$ for uniform $X \in \{0,1\}^n$, despite the inability of these circuits to compute $\sum_i x_i$ given $x$. Previous examples of this phenomenon apply to different distributions such as $(X, \sum_i X_i \bmod 2) \in \{0,1\}^{n+1}$. We also prove a lower bound independent from n on the statistical distance between the output distribution of NC$^0$ circuits and the distribution (X, majority(X)). We show that 1 − o(1) lower bounds for related distributions yield lower bounds for succinct data structures.
      • Uniform randomized AC$^0$ circuits of poly(n) size and depth d = O(1) with error $\epsilon$ can be simulated by uniform randomized circuits of poly(n) size and depth d + 1 with error $\epsilon + o(1)$ using $\le (\log n)^{O(\log\log n)}$ random bits. Previous derandomizations [Ajtai and Wigderson ’85; Nisan ’91] increase the depth by a constant factor, or else have poor seed length.
    Given the right tools, the above results have technically simple proofs.
    ∗Supported by NSF grant CCF-0845003. Email: [email protected]
    1 Introduction
    Complexity theory, with some notable exceptions, typically studies the complexity of computing a function $h(x) : \{0,1\}^n \to \{0,1\}^m$ of a given input $x$.
  • A Decade of Lattice Cryptography
    Full text available at: http://dx.doi.org/10.1561/0400000074
    A Decade of Lattice Cryptography. Chris Peikert, Computer Science and Engineering, University of Michigan, United States. Boston – Delft. Foundations and Trends® in Theoretical Computer Science. Published, sold and distributed by: now Publishers Inc., PO Box 1024, Hanover, MA 02339, United States. Tel. +1-781-985-4510. www.nowpublishers.com, [email protected]. Outside North America: now Publishers Inc., PO Box 179, 2600 AD Delft, The Netherlands. Tel. +31-6-51115274. The preferred citation for this publication is C. Peikert. A Decade of Lattice Cryptography. Foundations and Trends® in Theoretical Computer Science, vol. 10, no. 4, pp. 283–424, 2014. This Foundations and Trends® issue was typeset in LaTeX using a class file designed by Neal Parikh. Printed on acid-free paper. ISBN: 978-1-68083-113-9. © 2016 C. Peikert. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, mechanical, photocopying, recording or otherwise, without prior written permission of the publishers. Photocopying. In the USA: This journal is registered at the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923. Authorization to photocopy items for internal or personal use, or the internal or personal use of specific clients, is granted by now Publishers Inc for users registered with the Copyright Clearance Center (CCC). The ‘services’ for users can be found on the internet at: www.copyright.com. For those organizations that have been granted a photocopy license, a separate system of payment has been arranged.
  • The Best Nurturers in Computer Science Research
    The Best Nurturers in Computer Science Research. Bharath Kumar M., Y. N. Srikant. IISc-CSA-TR-2004-10, http://archive.csa.iisc.ernet.in/TR/2004/10/. Computer Science and Automation, Indian Institute of Science, India. October 2004. Abstract: The paper presents a heuristic for mining nurturers in temporally organized collaboration networks: people who facilitate the growth and success of the young ones. Specifically, this heuristic is applied to the computer science bibliographic data to find the best nurturers in computer science research. The measure of success is parameterized, and the paper demonstrates experiments and results with publication count and citations as success metrics. Rather than just the nurturer’s success, the heuristic captures the influence he has had in the independent success of the relatively young in the network. These results can hence be a useful resource to graduate students and post-doctoral candidates. The heuristic is extended to accurately yield ranked nurturers inside a particular time period. Interestingly, there is a recognizable deviation between the rankings of the most successful researchers and the best nurturers, which, although obvious from a social perspective, has not been statistically demonstrated. Keywords: Social Network Analysis, Bibliometrics, Temporal Data Mining. 1 Introduction. Consider a student Arjun, who has finished his under-graduate degree in Computer Science, and is seeking a PhD degree followed by a successful career in Computer Science research. How does he choose his research advisor? He has the following options with him: 1. Look up the rankings of various universities [1], and apply to any “reasonably good” professor in any of the top universities.
  • Constraint Based Dimension Correlation and Distance
    Preface The papers in this volume were presented at the Fourteenth Annual IEEE Conference on Computational Complexity held from May 4-6, 1999 in Atlanta, Georgia, in conjunction with the Federated Computing Research Conference. This conference was sponsored by the IEEE Computer Society Technical Committee on Mathematical Foundations of Computing, in cooperation with the ACM SIGACT (The special interest group on Algorithms and Complexity Theory) and EATCS (The European Association for Theoretical Computer Science). The call for papers sought original research papers in all areas of computational complexity. A total of 70 papers were submitted for consideration of which 28 papers were accepted for the conference and for inclusion in these proceedings. Six of these papers were accepted to a joint STOC/Complexity session. For these papers the full conference paper appears in the STOC proceedings and a one-page summary appears in these proceedings. The program committee invited two distinguished researchers in computational complexity - Avi Wigderson and Jin-Yi Cai - to present invited talks. These proceedings contain survey articles based on their talks. The program committee thanks Pradyut Shah and Marcus Schaefer for their organizational and computer help, Steve Tate and the SIGACT Electronic Publishing Board for the use and help of the electronic submissions server, Peter Shor and Mike Saks for the electronic conference meeting software and Danielle Martin of the IEEE for editing this volume. The committee would also like to thank the following people for their help in reviewing the papers: E. Allender, V. Arvind, M. Ajtai, A. Ambainis, G. Barequet, S. Baumer, A. Berthiaume, S.
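The Flajolet-Martin entry in the list above concerns counting distinct elements in a data stream with very little space. As a rough illustration of the basic, non-private idea only, the following Python sketch tracks, for each of several hash functions, the largest number of trailing zero bits seen among the hashed items. It is not the algorithm of that paper: the class name DistinctCountSketch, the salted BLAKE2b hashing and the averaging step are assumptions made for this example, and the estimate is left with its constant-factor bias uncorrected.

```python
import hashlib


def trailing_zeros(value: int, width: int = 64) -> int:
    """Return the number of trailing zero bits of value (width if value is 0)."""
    if value == 0:
        return width
    return (value & -value).bit_length() - 1


class DistinctCountSketch:
    """Hypothetical Flajolet-Martin-style sketch: one small counter per hash."""

    def __init__(self, num_hashes: int = 64) -> None:
        # max_zeros[s] is the largest trailing-zero count seen under hash s.
        self.max_zeros = [0] * num_hashes

    def _hash(self, item: str, seed: int) -> int:
        # Salted BLAKE2b stands in for an ideal random hash (an assumption).
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=seed.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little")

    def add(self, item: str) -> None:
        # Repeated items hash identically, so they never change the state.
        for seed in range(len(self.max_zeros)):
            zeros = trailing_zeros(self._hash(item, seed))
            if zeros > self.max_zeros[seed]:
                self.max_zeros[seed] = zeros

    def estimate(self) -> float:
        # Rough estimate: 2 raised to the average maximum. An empty sketch
        # returns 1.0, and the constant-factor bias is not corrected here.
        mean = sum(self.max_zeros) / len(self.max_zeros)
        return 2.0 ** mean


if __name__ == "__main__":
    sketch = DistinctCountSketch()
    for i in range(1000):
        sketch.add(f"user-{i}")
        sketch.add(f"user-{i}")  # duplicate, ignored by construction
    print(round(sketch.estimate()))
```

The state is a fixed number of small integers regardless of stream length, which is the property that makes such sketches attractive for summarization; the differential privacy analysis discussed in the entry is outside the scope of this example.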