Practical Authenticated Pattern Matching with Optimal Proof Size
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Angela: a Sparse, Distributed, and Highly Concurrent Merkle Tree
Angela: A Sparse, Distributed, and Highly Concurrent Merkle Tree Janakirama Kalidhindi, Alex Kazorian, Aneesh Khera, Cibi Pari Abstract— Merkle trees allow for efficient and authen- with high query throughput. Any concurrency overhead ticated verification of data contents through hierarchical should not destroy the system’s performance when the cryptographic hashes. Because the contents of parent scale of the merkle tree is very large. nodes in the trees are the hashes of the children nodes, Outside of Key Transparency, this problem is gen- concurrency in merkle trees is a hard problem. The eralizable to other applications that have the need current standard for merkle tree updates is a naive locking of the entire tree as shown in Google’s merkle tree for efficient, authenticated data structures, such as the implementation, Trillian. We present Angela, a concur- Global Data Plane file system. rent and distributed sparse merkle tree implementation. As a result of these needs, we have developed a novel Angela is distributed using Ray onto Amazon EC2 algorithm for updating a batch of updates to a merkle clusters and uses Amazon Aurora to store and retrieve tree concurrently. Run locally on a MacBook Pro with state. Angela is motivated by Google Key Transparency a 2.2GHz Intel Core i7, 16GB of RAM, and an Intel and inspired by it’s underlying merkle tree, Trillian. Iris Pro1536 MB, we achieved a 2x speedup over the Like Trillian, Angela assumes that a large number of naive algorithm mentioned earlier. We expanded this its 2256 leaves are empty and publishes a new root after algorithm into a distributed merkle tree implementation, some amount of time. -
Pattern Matching Using Similarity Measures
Pattern matching using similarity measures Patroonvergelijking met behulp van gelijkenismaten (met een samenvatting in het Nederlands) PROEFSCHRIFT ter verkrijging van de graad van doctor aan de Universiteit Utrecht op gezag van Rector Magnificus, Prof. Dr. H. O. Voorma, ingevolge het besluit van het College voor Promoties in het openbaar te verdedigen op maandag 18 september 2000 des morgens te 10:30 uur door Michiel Hagedoorn geboren op 13 juli 1972, te Renkum promotor: Prof. Dr. M. H. Overmars Faculteit Wiskunde & Informatica co-promotor: Dr. R. C. Veltkamp Faculteit Wiskunde & Informatica ISBN 90-393-2460-3 PHILIPS '$ The&&% research% described in this thesis has been made possible by financial support from Philips Research Laboratories. The work in this thesis has been carried out in the graduate school ASCI. Contents 1 Introduction 1 1.1Patternmatching.......................... 1 1.2Applications............................. 4 1.3Obtaininggeometricpatterns................... 7 1.4 Paradigms in geometric pattern matching . 8 1.5Similaritymeasurebasedpatternmatching........... 11 1.6Overviewofthisthesis....................... 16 2 A theory of similarity measures 21 2.1Pseudometricspaces........................ 22 2.2Pseudometricpatternspaces................... 30 2.3Embeddingpatternsinafunctionspace............. 40 2.4TheHausdorffmetric........................ 46 2.5Thevolumeofsymmetricdifference............... 54 2.6 Reflection visibility based distances . 60 2.7Summary.............................. 71 2.8Experimentalresults....................... -
Enabling Authentication of Sliding Window Queries on Streams
ProofInfused Streams: Enabling Authentication of Sliding Window Queries On Streams Feifei Li† Ke Yi‡ Marios Hadjieleftheriou‡ George Kollios† †Computer Science Department Boston University ‡AT&T Labs Inc. lifeifei, gkollios @cs.bu.edu yike, marioh @research.att.com { } { } ABSTRACT As computer systems are essential components of many crit- ical commercial services, the need for secure online transac- . tions is now becoming evident. The demand for such appli- cations, as the market grows, exceeds the capacity of individ- 0110..011 ual businesses to provide fast and reliable services, making Provider Servers outsourcing technologies a key player in alleviating issues Clients of scale. Consider a stock broker that needs to provide a real-time stock trading monitoring service to clients. Since Figure 1: The outsourced stream model. the cost of multicasting this information to a large audi- ence might become prohibitive, the broker could outsource providers, who would be in turn responsible for forwarding the stock feed to third-party providers, who are in turn re- the appropriate sub-feed to clients (see Figure 1). Evidently, sponsible for forwarding the appropriate sub-feed to clients. especially in critical applications, the integrity of the third- Evidently, in critical applications the integrity of the third- party should not be taken for granted, as the latter might party should not be taken for granted. In this work we study have malicious intent or might be temporarily compromised a variety of authentication algorithms for selection and ag- by other malicious entities. Deceiving clients can be at- gregation queries over sliding windows. Our algorithms en- tempted for gaining business advantage over the original able the end-users to prove that the results provided by the service provider, for competition purposes against impor- third-party are correct, i.e., equal to the results that would tant clients, or for easing the workload on the third-party have been computed by the original provider. -
Providing Authentication and Integrity in Outsourced Databases Using Merkle Hash Tree’S
Providing Authentication and Integrity in Outsourced Databases using Merkle Hash Tree's Einar Mykletun Maithili Narasimha University of California Irvine University of California Irvine [email protected] [email protected] Gene Tsudik University of California Irvine [email protected] 1 Introduction In this paper we intend to describe and summarize results relating to the use of authenticated data structures, specifically Merkle Hash Tree's, in the context of the Outsourced Database Model (ODB). In ODB, organizations outsource their data management needs to an external untrusted service provider. The service provider hosts clients' databases and offers seamless mechanisms to create, store, update and access (query) their databases. Due to the service provider being untrusted, it becomes imperative to provide means to ensure authenticity and integrity in the query replies returned by the provider to clients. It is therefore the goal of this paper to outline an applicable data structure that can support authenticated query replies from the service provider to the clients (who issue the queries). Merkle Hash Tree's (introduced in section 3) is such a data structure. 1.1 Outsourced Database Model (ODB) The outsourced database model (ODB) is an example of the well-known Client-Server model. In ODB, the Database Service Provider (also referred to as the Server) has the infrastructure required to host outsourced databases and provides efficient mechanisms for clients to create, store, update and query the database. The owner of the data is identified as the entity who has to access to add, remove, and modify tuples in the database. The owner and the client may or may not be the same entity. -
Pattern Matching Using Fuzzy Methods David Bell and Lynn Palmer, State of California: Genetic Disease Branch
Pattern Matching Using Fuzzy Methods David Bell and Lynn Palmer, State of California: Genetic Disease Branch ABSTRACT The formula is : Two major methods of detecting similarities Euclidean Distance and 2 a "fuzzy Hamming distance" presented in a paper entitled "F %%y Dij : ; ( <3/i - yj 4 Hamming Distance: A New Dissimilarity Measure" by Bookstein, Klein, and Raita, will be compared using entropy calculations to Euclidean distance is especially useful for comparing determine their abilities to detect patterns in data and match data matching whole word object elements of 2 vectors. The records. .e find that both means of measuring distance are useful code in Java is: depending on the context. .hile fuzzy Hamming distance outperforms in supervised learning situations such as searches, the Listing 1: Euclidean Distance Euclidean distance measure is more useful in unsupervised pattern matching and clustering of data. /** * EuclideanDistance.java INTRODUCTION * * Pattern matching is becoming more of a necessity given the needs for * Created: Fri Oct 07 08:46:40 2002 such methods for detecting similarities in records, epidemiological * * @author David Bell: DHS-GENETICS patterns, genetics and even in the emerging fields of criminal * @version 1.0 behavioral pattern analysis and disease outbreak analysis due to */ possible terrorist activity. 0nfortunately, traditional methods for /** Abstact Distance class linking or matching records, data fields, etc. rely on exact data * @param None matches rather than looking for close matches or patterns. 1f course * @return Distance Template proximity pattern matches are often necessary when dealing with **/ messy data, data that has inexact values and/or data with missing key abstract class distance{ values. -
The SCADS Toolkit Nick Lanham, Jesse Trutna
The SCADS Toolkit Nick Lanham, Jesse Trutna (it’s catching…) SCADS Overview SCADS is a scalable, non-relational datastore for highly concurrent, interactive workloads. Scale Independence - as new users join No changes to application Cost per user doesn’t increase Request latency doesn’t change Key Innovations 1. Performance safe query language 2. Declarative performance/consistency tradeoffs 3. Automatic scale up and down using machine learning Toolkit Motivation Investigated other open-source distributed key-value stores Cassandra, Hypertable, CouchDB Monolithic, opaque point solutions Make many decisions about how to architect the system a -prori Want set of components to rapidly explore the space of systems’ design Extensible components communicate over established APIs Understand the implications and performance bottlenecks of different designs SCADS Components Component Responsibilities Storage Layer Persist and serve data for a specified key responsibility Copy and sync data between storage nodes Data Placement Layer Assign node key responsibilities Manage replication and partition factors Provide clients with key to node mapping Provides mechanism for machine learning policies Client Library Hides client interaction with distributed storage system Provides higher-level constructs like indexes and query language Storage Layer Key-value store that supports range queries built on BDB API Record get(NameSpace ns, RecordKey key) list<Record> get_set (NameSpace ns, RecordSet rs ) bool put(NameSpace ns, Record rec) i32 count_set(NameSpace ns, RecordSet rs) bool set_responsibility_policy(NameSpace ns,RecordSet policy) RecordSet get_responsibility_policy(NameSpace ns) bool sync_set(NameSpace ns, RecordSet rs, Host h, ConflictPolicy policy) bool copy_set(NameSpace ns, RecordSet rs, Host h) bool remove_set(NameSpace ns, RecordSet rs) Storage Layer Storage Layer Storage Layer: Synchronize Replicas may diverge during network partitions (in order to preserve availability). -
Authentication of Moving Range Queries
Authentication of Moving Range Queries Duncan Yung Eric Lo Man Lung Yiu ∗ Department of Computing Hong Kong Polytechnic University {cskwyung, ericlo, csmlyiu}@comp.polyu.edu.hk ABSTRACT the user leaves the safe region, Thus, safe region is a powerful op- A moving range query continuously reports the query result (e.g., timization for significantly reducing the communication frequency restaurants) that are within radius r from a moving query point between the user and the service provider. (e.g., moving tourist). To minimize the communication cost with The query results and safe regions returned by LBS, however, the mobile clients, a service provider that evaluates moving range may not always be accurate. For instance, a hacker may infiltrate queries also returns a safe region that bounds the validity of query the LBS’s servers [25] so that results of queries all include a partic- results. However, an untrustworthy service provider may report in- ular location (e.g., the White House). Furthermore, the LBS may be correct safe regions to mobile clients. In this paper, we present effi- self-compromised and thus ranks sponsored facilities higher than cient techniques for authenticating the safe regions of moving range actual query result. The LBS may also return an overly large safe queries. We theoretically proved that our methods for authenticat- region to the clients for the sake of saving computing resources and ing moving range queries can minimize the data sent between the communication bandwidth [22, 16, 30]. On the other hand, the LBS service provider and the mobile clients. Extensive experiments are may opt to return overly small safe regions so that the clients have carried out using both real and synthetic datasets and results show to request new safe regions more frequently, if the LBS charges fee that our methods incur small communication costs and overhead. -
CSCI 2041: Pattern Matching Basics
CSCI 2041: Pattern Matching Basics Chris Kauffman Last Updated: Fri Sep 28 08:52:58 CDT 2018 1 Logistics Reading Assignment 2 I OCaml System Manual: Ch I Demo in lecture 1.4 - 1.5 I Post today/tomorrow I Practical OCaml: Ch 4 Next Week Goals I Mon: Review I Code patterns I Wed: Exam 1 I Pattern Matching I Fri: Lecture 2 Consider: Summing Adjacent Elements 1 (* match_basics.ml: basic demo of pattern matching *) 2 3 (* Create a list comprised of the sum of adjacent pairs of 4 elements in list. The last element in an odd-length list is 5 part of the return as is. *) 6 let rec sum_adj_ie list = 7 if list = [] then (* CASE of empty list *) 8 [] (* base case *) 9 else 10 let a = List.hd list in (* DESTRUCTURE list *) 11 let atail = List.tl list in (* bind names *) 12 if atail = [] then (* CASE of 1 elem left *) 13 [a] (* base case *) 14 else (* CASE of 2 or more elems left *) 15 let b = List.hd atail in (* destructure list *) 16 let tail = List.tl atail in (* bind names *) 17 (a+b) :: (sum_adj_ie tail) (* recursive case *) The above function follows a common paradigm: I Select between Cases during a computation I Cases are based on structure of data I Data is Destructured to bind names to parts of it 3 Pattern Matching in Programming Languages I Pattern Matching as a programming language feature checks that data matches a certain structure the executes if so I Can take many forms such as processing lines of input files that match a regular expression I Pattern Matching in OCaml/ML combines I Case analysis: does the data match a certain structure I Destructure Binding: bind names to parts of the data I Pattern Matching gives OCaml/ML a certain "cool" factor I Associated with the match/with syntax as follows match something with | pattern1 -> result1 (* pattern1 gives result1 *) | pattern2 -> (* pattern 2.. -
Compiling Pattern Matching to Good Decision Trees
Submitted to ML’08 Compiling Pattern Matching to good Decision Trees Luc Maranget INRIA Luc.marangetinria.fr Abstract In this paper we study compilation to decision tree, whose We address the issue of compiling ML pattern matching to efficient primary advantage is never testing a given subterm of the subject decisions trees. Traditionally, compilation to decision trees is op- value more than once (and whose primary drawback is potential timized by (1) implementing decision trees as dags with maximal code size explosion). Our aim is to refine naive compilation to sharing; (2) guiding a simple compiler with heuristics. We first de- decision trees, and to compare the output of such an optimizing sign new heuristics that are inspired by necessity, a notion from compiler with optimized backtracking automata. lazy pattern matching that we rephrase in terms of decision tree se- Compilation to decision can be very sensitive to the testing mantics. Thereby, we simplify previous semantical frameworks and order of subject value subterms. The situation can be explained demonstrate a direct connection between necessity and decision by the example of an human programmer attempting to translate a ML program into a lower-level language without pattern matching. tree runtime efficiency. We complete our study by experiments, 1 showing that optimized compilation to decision trees is competi- Let f be the following function defined on triples of booleans : tive. We also suggest some heuristics precisely. l e t f x y z = match x,y,z with | _,F,T -> 1 Categories and Subject Descriptors D 3. 3 [Programming Lan- | F,T,_ -> 2 guages]: Language Constructs and Features—Patterns | _,_,F -> 3 | _,_,T -> 4 General Terms Design, Performance, Sequentiality. -
Executing Large Scale Processes in a Blockchain
The current issue and full text archive of this journal is available on Emerald Insight at: www.emeraldinsight.com/2514-4774.htm JCMS 2,2 Executing large-scale processes in a blockchain Mahalingam Ramkumar Department of Computer Science and Engineering, Mississippi State University, 106 Starkville, Mississippi, USA Received 16 May 2018 Revised 16 May 2018 Abstract Accepted 1 September 2018 Purpose – The purpose of this paper is to examine the blockchain as a trusted computing platform. Understanding the strengths and limitations of this platform is essential to execute large-scale real-world applications in blockchains. Design/methodology/approach – This paper proposes several modifications to conventional blockchain networks to improve the scale and scope of applications. Findings – Simple modifications to cryptographic protocols for constructing blockchain ledgers, and digital signatures for authentication of transactions, are sufficient to realize a scalable blockchain platform. Originality/value – The original contributions of this paper are concrete steps to overcome limitations of current blockchain networks. Keywords Cryptography, Blockchain, Scalability, Transactions, Blockchain networks Paper type Research paper 1. Introduction A blockchain broadcast network (Bozic et al., 2016; Croman et al., 2016; Nakamoto, 2008; Wood, 2014) is a mechanism for creating a distributed, tamper-proof, append-only ledger. Every participant in the broadcast network maintains a copy, or some representation, of the ledger. Ledger entries are made by consensus on the states of a process “executed” on the blockchain. As an example, in the Bitcoin (Nakamoto, 2008) process: (1) Bitcoin transactions transfer Bitcoins from a wallet to one or more other wallets. (2) Bitcoins created by “mining” are added to the miner’s wallet. -
IB Case Study Vocabulary a Local Economy Driven by Blockchain (2020) Websites
IB Case Study Vocabulary A local economy driven by blockchain (2020) Websites Merkle Tree: https://blockonomi.com/merkle-tree/ Blockchain: https://unwttng.com/what-is-a-blockchain Mining: https://www.buybitcoinworldwide.com/mining/ Attacks on Cryptocurrencies: https://blockgeeks.com/guides/hypothetical-attacks-on-cryptocurrencies/ Bitcoin Transaction Life Cycle: https://ducmanhphan.github.io/2018-12-18-Transaction-pool-in- blockchain/#transaction-pool 51 % attack - a potential attack on a blockchain network, where a single entity or organization can control the majority of the hash rate, potentially causing a network disruption. In such a scenario, the attacker would have enough mining power to intentionally exclude or modify the ordering of transactions. Block - records, which together form a blockchain. ... Blocks hold all the records of valid cryptocurrency transactions. They are hashed and encoded into a hash tree or Merkle tree. In the world of cryptocurrencies, blocks are like ledger pages while the whole record-keeping book is the blockchain. A block is a file that stores unalterable data related to the network. Blockchain - a data structure that holds transactional records and while ensuring security, transparency, and decentralization. You can also think of it as a chain or records stored in the forms of blocks which are controlled by no single authority. Block header – main way of identifying a block in a blockchain is via its block header hash. The block hash is responsible for block identification within a blockchain. In short, each block on the blockchain is identified by its block header hash. Each block is uniquely identified by a hash number that is obtained by double hashing the block header with the SHA256 algorithm. -
Combinatorial Pattern Matching
Combinatorial Pattern Matching 1 A Recurring Problem Finding patterns within sequences Variants on this idea Finding repeated motifs amoungst a set of strings What are the most frequent k-mers How many time does a specific k-mer appear Fundamental problem: Pattern Matching Find all positions of a particular substring in given sequence? 2 Pattern Matching Goal: Find all occurrences of a pattern in a text Input: Pattern p = p1, p2, … pn and text t = t1, t2, … tm Output: All positions 1 < i < (m – n + 1) such that the n-letter substring of t starting at i matches p def bruteForcePatternMatching(p, t): locations = [] for i in xrange(0, len(t)-len(p)+1): if t[i:i+len(p)] == p: locations.append(i) return locations print bruteForcePatternMatching("ssi", "imissmissmississippi") [11, 14] 3 Pattern Matching Performance Performance: m - length of the text t n - the length of the pattern p Search Loop - executed O(m) times Comparison - O(n) symbols compared Total cost - O(mn) per pattern In practice, most comparisons terminate early Worst-case: p = "AAAT" t = "AAAAAAAAAAAAAAAAAAAAAAAT" 4 We can do better! If we preprocess our pattern we can search more effciently (O(n)) Example: imissmissmississippi 1. s 2. s 3. s 4. SSi 5. s 6. SSi 7. s 8. SSI - match at 11 9. SSI - match at 14 10. s 11. s 12. s At steps 4 and 6 after finding the mismatch i ≠ m we can skip over all positions tested because we know that the suffix "sm" is not a prefix of our pattern "ssi" Even works for our worst-case example "AAAAT" in "AAAAAAAAAAAAAAT" by recognizing the shared prefixes ("AAA" in "AAAA").