Similarity Search: Algorithms for Sets and Other High Dimensional Data


Thomas Dybdahl Ahle
Advisor: Rasmus Pagh
Submitted: March 2019

Abstract

We study five fundamental problems in the field of high dimensional algorithms, similarity search and machine learning. We obtain many new results, including:

• Data structures with Las Vegas guarantees for $\ell_1$-Approximate Near Neighbours and Braun-Blanquet Approximate Set Similarity Search with performance matching state of the art Monte Carlo algorithms up to factors $n^{o(1)}$.
• “Supermajorities:” The first space/time tradeoff for Approximate Set Similarity Search, along with a number of lower bounds suggesting that the algorithms are optimal among all Locality Sensitive Hashing (LSH) data structures.
• The first lower bounds on approximate data structures based on the Orthogonal Vectors Conjecture (OVC).
• Output Sensitive LSH data structures with query time near $t + n^\rho$ for outputting $t$ distinct elements, where $n^\rho$ is the state of the art time to output a single element and $t n^\rho$ was the previous best for $t$ items.
• A Johnson-Lindenstrauss embedding with fast multiplication on tensors (a Tensor Sketch), with high probability guarantees for subspace- and near-neighbour-embeddings.

Resumé

We touch on five fundamental problems in high dimensional algorithms, similarity search and machine learning. We discover many new results, among others:

• Data structures with Las Vegas guarantees for $\ell_1$-Approximate Near Neighbour and Braun-Blanquet Approximate Set Similarity Search that match state of the art Monte Carlo algorithms up to $n^{o(1)}$ factors in running time and space.
• “Supermajorities:” The first time/space trade-offs for Approximate Set Similarity Search, along with a number of lower bounds indicating that the algorithms are optimal among Locality Sensitive Hashing (LSH) data structures.
• The first lower bounds for approximate data structures based on the Orthogonal Vectors Conjecture (OVC).
• An Output Sensitive LSH data structure with query time near $t + n^\rho$ when it must return $t$ distinct points. Here $n^\rho$ is the time it takes the state of the art algorithm to return a single element, and $t n^\rho$ is the time it previously took to return $t$ points.
• A Johnson-Lindenstrauss embedding that allows fast multiplication on tensors (a Tensor Sketch). In contrast to earlier results, it is with high probability a subspace embedding.

Acknowledgements

Starting this thesis four years ago, I thought it would be all about coming up with cool new mathematics. It mostly was, but lately I have started to suspect that there may be more to it.

I want to start by thanking my superb Ph.D. advisor Rasmus Pagh. I always knew I wanted to learn from his conscientiousness and work-life balance. Working with him I have enjoyed his open office door, always ready to listen to ideas, be they good, bad or half baked. Even once he moved to California, he was still easy to reach and quick to give feedback on manuscripts or job search. I may not have been the easiest of students – always testing the limits – but when I decided to shorten my studies to go co-found a startup, Rasmus was still supportive all the way through.

I would like to thank all my co-authors and collaborators, past, present and future. The papers I have written with you have always been more fun, and better written, than the ones I have done on my own. Academia doesn’t always feel like a team sport, but perhaps it should.

The entire Copenhagen algorithms society has been an important part of my Ph.D. life.
Mikkel Thorup inspired us with memorable quotes such as “Only work on things that have 1% chance of success,” and “always celebrate breakthroughs immediately, tomorrow they may not be.” Thore Husfeldt made sure we never forgot the ethical ramifications of our work, and Troels Lund reminded us that not everyone understands infinities. At ITU I want to thank Johan Sivertsen, Tobias Christiani and Matteo Dusefante for our great office adventures and the other pagan tribes.

In the first year of my Ph.D. I would sometimes stay very long hours at the office and even slept there once. A year into my Ph.D. I bought a house with 9 other people, in one of the best decisions of my life. Throughout my work, the people at Gejst have been a constant source of support, discussions, sanity and introspection. With dinner waiting at home in the evening I finally had a good reason to leave the office.

In 2017 I visited Eric Price and the algorithms group in Austin, Texas. He and everyone there were extremely hospitable and friendly. I want to thank John Kallaugher, Adrian Trejo Nuñez and everyone else there for making my stay enjoyable and exciting. In general the international community around Theoretical Computer Science has been nothing but friendly and accepting. Particularly the Locality Sensitive Hashing group around MIT and Columbia University has been greatly inspiring, and I want to thank Ilya Razenshteyn for many illuminating discussions.

In Austin I also met Morgan Mingle, who became a beautiful influence on my life. So many of my greatest memories over the last two years focus around her. She is also the only person I have never been able to interest in my research. [1]

Finally I want to thank my parents, siblings and family for their unending support. Even as I have been absent minded and moved between countries, they have always been there, visiting me and inviting me to occasions small as well as big.

[1] She did help edit this thesis though, like the acknowledgements.

Contents

1 Introduction
  1.1 Overview of Problems and Contributions
  1.2 Similarity Search
  1.3 Approximate Similarity Search
  1.4 Locality Sensitive Hashing
  1.5 Las Vegas Similarity Search
  1.6 Output Sensitive Similarity Search
  1.7 Hardness of Similarity Search through Orthogonal Vectors
  1.8 Tensor Sketching
2 Small Sets Need Supermajorities: Towards Optimal Hashing-based Set Similarity
  2.1 Introduction
  2.2 Preliminaries
  2.3 Upper bounds
  2.4 Lower bounds
  2.5 Conclusion
  2.6 Appendix
3 Optimal Las Vegas Locality Sensitive Data Structures
  3.1 Introduction
  3.2 Overview
  3.3 Hamming Space Data Structure
  3.4 Set Similarity Data Structure
  3.5 Conclusion and Open Problems
  3.6 Appendix
4 Parameter-free Locality Sensitive Hashing for Spherical Range Reporting
  4.1 Introduction
  4.2 Preliminaries
  4.3 Data Structure
  4.4 Standard LSH, Local Expansion, and Probing the Right Level
  4.5 Adaptive Query Algorithms
  4.6 A Probing Sequence in Hamming Space
  4.7 Conclusion
  4.8 Appendix
5 On the Complexity of Inner Product Similarity Join
  5.1 Introduction
  5.2 Hardness of IPS join
  5.3 Limitations of LSH for IPS
  5.4 Upper bounds
  5.5 Conclusion
6 High Probability Tensor Sketch
  6.1 Introduction
  6.2 Preliminaries
  6.3 Technical Overview
  6.4 The High Probability Tensor Sketch
  6.5 Fast Constructions
  6.6 Applications
  6.7 Appendix
Bibliography

Chapter 1
Introduction

Right action is better than knowledge, but in order to do what is right, we must know what is right.
Charlemagne

As the power of computing has increased, so has the responsibility given to it by society. Machine Learning and algorithms are now asked to identify hate speech on Facebook, piracy on YouTube, and risky borrowers at banks. These tasks all require computations to be done on the meaning behind text, video and data, rather than simply the characters, pixels and bits. Oddly the philosophically complex idea of “meaning” is now commonly represented in Computer Science simply as a vector in $\mathbb{R}^d$. However, this still leaves the questions of what computations can be done, how, and using how many resources.

For example, say we want to find duplicate posts in Facebook’s reportedly 300 billion comments a year. A classic but efficient way to represent the meaning of documents such as a Facebook post is the “bag of words” approach.
We create a vector in $\{0, 1\}^d$, indicating for every word in the English Language whether it is present in the document or not. We would then want to find two vectors with a large overlap, however such
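
The bag of words representation described above can be illustrated with a minimal Python sketch. The function names (`bag_of_words`, `overlap`, `most_similar_pair`), the whitespace tokenization, and the tiny vocabulary are assumptions made for illustration and do not come from the thesis. The sketch compares every pair of posts directly, which takes time quadratic in the number of posts.

```python
from itertools import combinations


def bag_of_words(text, vocabulary):
    """Return a 0/1 vector marking which vocabulary words occur in text."""
    words = set(text.lower().split())  # naive whitespace tokenization (assumption)
    return [1 if w in words else 0 for w in vocabulary]


def overlap(x, y):
    """Count the coordinates where both vectors are 1, i.e. the shared words."""
    return sum(a & b for a, b in zip(x, y))


def most_similar_pair(posts, vocabulary):
    """Brute-force search: return the index pair with the largest overlap."""
    vectors = [bag_of_words(p, vocabulary) for p in posts]
    return max(
        combinations(range(len(posts)), 2),
        key=lambda ij: overlap(vectors[ij[0]], vectors[ij[1]]),
    )


# Hypothetical toy data standing in for real posts.
posts = [
    "the cat sat on the mat",
    "a dog ate my homework",
    "the cat sat on a mat",
]
vocabulary = sorted({w for p in posts for w in p.lower().split()})
best = most_similar_pair(posts, vocabulary)
```

In this toy example the first and third posts share the most words, so `best` identifies them as the likely duplicates.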