Composite Subset Measures

Total Page:16

File Type:pdf, Size:1020Kb

Composite Subset Measures Composite Subset Measures Lei Chen1, Raghu Ramakrishnan1,2, Paul Barford1, Bee-Chung Chen1, Vinod Yegneswaran1 1 Computer Sciences Department, University of Wisconsin, Madison, WI, USA 2 Yahoo! Research, Santa Clara, CA, USA {chenl, pb, beechung, vinod}@cs.wisc.edu [email protected] ABSTRACT into a hierarchy, leading to a very natural notion of multidimen- Measures are numeric summaries of a collection of data records sional regions. Each region represents a data subset. Summarizing produced by applying aggregation functions. Summarizing a col- the records that belong to a region by applying an aggregate opera- lection of subsets of a large dataset, by computing a measure for tor, such as SUM, to a measure attribute (thereby computing a new each subset in the (typically, user-specified) collection is a funda- measure that is associated with the entire region) is a fundamental mental problem. The multidimensional data model, which treats operation in OLAP. records as points in a space defined by dimension attributes, offers Often, however, more sophisticated analysis can be carried out a natural space of data subsets to be considered as summarization based on the computed measures, e.g., identifying regions with candidates, and traditional SQL and OLAP constructs, such as abnormally high measures. For such analyses, it is necessary to GROUP BY and CUBE, allow us to compute measures for subsets compute the measure for a region by also considering other re- drawn from this space. However, GROUP BY only allows us to gions (which are, intuitively, “related” to the given region) and summarize a limited collection of subsets, and CUBE summarizes their measures. In this paper, we introduce composite subset meas- all subsets in this space. Further, they restrict the measure used to ures, which differ from the traditional GROUP BY and CUBE summarize a data subset to be a one-step aggregation, using func- approaches in three ways: tions such as SUM, of field-values in the data records. 1) The measures for a region can be computed as the summaries In this paper, we introduce composite subset measures, computed of other “related” regions in a compositional manner. The re- by aggregating not only data records but also the measures of other lationships capture various types of proximity in related subsets. We allow summarization of naturally related re- multidimensional space. gions in the multidimensional space, offering more flexibility than 2) In contrast to the CUBE construct, we do not offer a way to either GROUP BY or CUBE in the choice of what data subsets to compute the summary of every region; this is typically over- summarize. Thus, our framework allows more meaningful sum- kill for the kinds of complex measures we seek to compute. maries to be computed for a targeted collection of data subsets. 3) The language and algebra are carefully designed to enable We propose an algebra called AW-RA and an equivalent pictorial highly scalable, parallelizable, and distributed evaluation language called aggregation workflows. Aggregation workflows strategies based on streaming the data in one or more passes, allow for intuitive expression of composite measure queries, and possibly with interleaved sorts. the underlying algebra is designed to facilitate efficient multi-scan execution. We describe an evaluation framework based on multi- This study is motivated by our ongoing work in two different ap- ple passes of sorting and scanning over the original dataset. In plication domains: environmental monitoring [18] and analysis of each pass, several measures are evaluated simultaneously, and network traffic data [7]. Similar problems have been faced by dependencies between these measures and containment relation- researchers dealing with data at Internet companies such as Google ships between the underlying subsets of data are orchestrated to and Yahoo!, leading them to also develop systems for scalable reduce the memory footprint of the computation. We present a aggregation of tabular data [10,17,22]. In contrast to the proposal performance evaluation that demonstrates the benefits of our ap- in [22], we explore a declarative, query-style approach in the spirit proach. of OLAP constructs such as [4, 14, 19,20, 26]. Further, the focus of [10] is a highly parallelizable, distributed evaluation framework 1. INTRODUCTION for simple one-step aggregation queries. This is an issue that we do not tackle in this paper, but such highly distributed computation In the multidimensional model of data, records in a central fact has been a strong consideration in the design of our language, and table are viewed as points in a multidimensional space. Attributes we intend to address it in future work. In keeping with this goal, are divided into dimension attributes, which are the coordinates of we have avoided implementation choices that require us to assign the data point, and measure attributes, which are values associated unique ids to records or to maintain indexes over (potentially with points. The domain of values for each dimension is organized widely distributed) tables, and focused on evaluation strategies Permission to copy without fee all or part of this material is granted pro- that orchestrate aggregation steps across one or more scans of data vided that the copies are not made or distributed for direct commercial (partitions). advantage, the VLDB copyright notice and the title of the publication and Consider the following network traffic analysis example. Self- its date appear, and notice is given that copying is by permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, to propagating worm outbreaks have the potential to wreak havoc in post on servers or to redistribute to lists, requires a fee and/or special per- the Internet, and can take place on a variety of time scales. We can mission from the publisher, ACM. potentially identify new outbreaks based on the escalation of the VLDB ‘06, September 12–15, 2006, Seoul, Korea. traffic into a network from one time period to the next. This kind Copyright 2006 VLDB Endowment, ACM 1-59593-385-9/06/09 of escalation, which is defined on a per-time period and sub- 403 network basis, is a composite measure built on the traffic measures attack packets identified in each network. Table 1 lists the attrib- for two adjacent time periods. utes used in our examples. When used to compute composite measures, existing tools, such as The schema of a multidimensional dataset with d dimension relation algebra or multidimensional query languages, frequently attributes has a dimension vector X = (X1, X2,…, Xd), and possibly result in nested expressions that are hard for human analysts to additional measure attributes. While there are no explicit measure understand and for the processing system to optimize. Further, attributes in the Dshield dataset, they are typical in multidimen- their use requires us to import data into a DBMS, which can itself sional datasets. Each record r in is denoted as a tuple of be a challenge for very large datasets [10]. Our goal is to develop a dimension values followed by measure values, if any: r = standalone, lightweight yet highly scalable analysis system that (x ,x ,…, x ,m,… ), where x is the value for dimension attribute supports composite measure queries. 1 2 d 1 i Xi. In the Dshield dataset, X = (Time, Source, Target, TargetPort), Our contributions are as follows: which is abbreviated as X ={t,U,T,P}. 1) We propose a pictorial language (called aggregation work- 2.1 Domains and Domain Hierarchies flows) and algebra for expressing composite aggregation A base domain D (X ) is associated with each dimension attrib- queries and representing streaming plans for such queries. base i ute Xi. For example, the base domain for attributes Source and 2) We show that any query in our language can be expressed in Target is 4-byte integers, and the base domain for Time is the our algebra, and present a comprehensive framework for number of seconds since the first second of 1970 (UNIX time). highly scalable, scan-based evaluation in one or more passes. A base domain can be generalized. For example, we can We show how to exploit sorting between passes and orches- generalize the base domain of Source IP into the /24 subnet tration of dependencies between different aggregation steps. domain (256 contiguous IP addresses). Each value in this domain 3) We present an evaluation that demonstrates the potential of is a 3-byte integer representing one /24 subset. Given two domains our methods for significant performance gains over tradi- Di and Dj, Di <D Dj indicates that Dj is a domain generalization of tional relational approaches. Di; we also say that Di is more specific than Dj. This work is a first step in our broader research agenda to develop All the domains associated with a given dimension attribute form a efficient, streamlined tools for domain specialists to mine large, domain generalization hierarchy, which is a directed acyclic graph complex datasets. The complete, technical report version of this (DAG). Each node in this graph represents one domain. The rela- paper [8] discusses the optimization problem of finding good tionship <D defines a partial order in the graph. Di<DDj, if there is multi-pass streaming plans and describes a greedy optimizer. an arc chain from Dj to Di. A domain hierarchy is linear when the Beyond that, the approach offers potentially unlimited parallelism domain hierarchy graph forms a single path. For any dimension and ability to distribute computation, but our current implementa- attribute, there is a special domain called DALL with a single value tion does not take advantage of these opportunities. ALL, which is the generalization of all possible values for the The rest of the paper is organized as follows. In Section 2, we given dimension.
Recommended publications
  • 1 Elementary Set Theory
    1 Elementary Set Theory Notation: fg enclose a set. f1; 2; 3g = f3; 2; 2; 1; 3g because a set is not defined by order or multiplicity. f0; 2; 4;:::g = fxjx is an even natural numberg because two ways of writing a set are equivalent. ; is the empty set. x 2 A denotes x is an element of A. N = f0; 1; 2;:::g are the natural numbers. Z = f:::; −2; −1; 0; 1; 2;:::g are the integers. m Q = f n jm; n 2 Z and n 6= 0g are the rational numbers. R are the real numbers. Axiom 1.1. Axiom of Extensionality Let A; B be sets. If (8x)x 2 A iff x 2 B then A = B. Definition 1.1 (Subset). Let A; B be sets. Then A is a subset of B, written A ⊆ B iff (8x) if x 2 A then x 2 B. Theorem 1.1. If A ⊆ B and B ⊆ A then A = B. Proof. Let x be arbitrary. Because A ⊆ B if x 2 A then x 2 B Because B ⊆ A if x 2 B then x 2 A Hence, x 2 A iff x 2 B, thus A = B. Definition 1.2 (Union). Let A; B be sets. The Union A [ B of A and B is defined by x 2 A [ B if x 2 A or x 2 B. Theorem 1.2. A [ (B [ C) = (A [ B) [ C Proof. Let x be arbitrary. x 2 A [ (B [ C) iff x 2 A or x 2 B [ C iff x 2 A or (x 2 B or x 2 C) iff x 2 A or x 2 B or x 2 C iff (x 2 A or x 2 B) or x 2 C iff x 2 A [ B or x 2 C iff x 2 (A [ B) [ C Definition 1.3 (Intersection).
    [Show full text]
  • The Matroid Theorem We First Review Our Definitions: a Subset System Is A
    CMPSCI611: The Matroid Theorem Lecture 5 We first review our definitions: A subset system is a set E together with a set of subsets of E, called I, such that I is closed under inclusion. This means that if X ⊆ Y and Y ∈ I, then X ∈ I. The optimization problem for a subset system (E, I) has as input a positive weight for each element of E. Its output is a set X ∈ I such that X has at least as much total weight as any other set in I. A subset system is a matroid if it satisfies the exchange property: If i and i0 are sets in I and i has fewer elements than i0, then there exists an element e ∈ i0 \ i such that i ∪ {e} ∈ I. 1 The Generic Greedy Algorithm Given any finite subset system (E, I), we find a set in I as follows: • Set X to ∅. • Sort the elements of E by weight, heaviest first. • For each element of E in this order, add it to X iff the result is in I. • Return X. Today we prove: Theorem: For any subset system (E, I), the greedy al- gorithm solves the optimization problem for (E, I) if and only if (E, I) is a matroid. 2 Theorem: For any subset system (E, I), the greedy al- gorithm solves the optimization problem for (E, I) if and only if (E, I) is a matroid. Proof: We will show first that if (E, I) is a matroid, then the greedy algorithm is correct. Assume that (E, I) satisfies the exchange property.
    [Show full text]
  • Algebra I Chapter 1. Basic Facts from Set Theory 1.1 Glossary of Abbreviations
    Notes: c F.P. Greenleaf, 2000-2014 v43-s14sets.tex (version 1/1/2014) Algebra I Chapter 1. Basic Facts from Set Theory 1.1 Glossary of abbreviations. Below we list some standard math symbols that will be used as shorthand abbreviations throughout this course. means “for all; for every” • ∀ means “there exists (at least one)” • ∃ ! means “there exists exactly one” • ∃ s.t. means “such that” • = means “implies” • ⇒ means “if and only if” • ⇐⇒ x A means “the point x belongs to a set A;” x / A means “x is not in A” • ∈ ∈ N denotes the set of natural numbers (counting numbers) 1, 2, 3, • · · · Z denotes the set of all integers (positive, negative or zero) • Q denotes the set of rational numbers • R denotes the set of real numbers • C denotes the set of complex numbers • x A : P (x) If A is a set, this denotes the subset of elements x in A such that •statement { ∈ P (x)} is true. As examples of the last notation for specifying subsets: x R : x2 +1 2 = ( , 1] [1, ) { ∈ ≥ } −∞ − ∪ ∞ x R : x2 +1=0 = { ∈ } ∅ z C : z2 +1=0 = +i, i where i = √ 1 { ∈ } { − } − 1.2 Basic facts from set theory. Next we review the basic definitions and notations of set theory, which will be used throughout our discussions of algebra. denotes the empty set, the set with nothing in it • ∅ x A means that the point x belongs to a set A, or that x is an element of A. • ∈ A B means A is a subset of B – i.e.
    [Show full text]
  • Ergodicity and Metric Transitivity
    Chapter 25 Ergodicity and Metric Transitivity Section 25.1 explains the ideas of ergodicity (roughly, there is only one invariant set of positive measure) and metric transivity (roughly, the system has a positive probability of going from any- where to anywhere), and why they are (almost) the same. Section 25.2 gives some examples of ergodic systems. Section 25.3 deduces some consequences of ergodicity, most im- portantly that time averages have deterministic limits ( 25.3.1), and an asymptotic approach to independence between even§ts at widely separated times ( 25.3.2), admittedly in a very weak sense. § 25.1 Metric Transitivity Definition 341 (Ergodic Systems, Processes, Measures and Transfor- mations) A dynamical system Ξ, , µ, T is ergodic, or an ergodic system or an ergodic process when µ(C) = 0 orXµ(C) = 1 for every T -invariant set C. µ is called a T -ergodic measure, and T is called a µ-ergodic transformation, or just an ergodic measure and ergodic transformation, respectively. Remark: Most authorities require a µ-ergodic transformation to also be measure-preserving for µ. But (Corollary 54) measure-preserving transforma- tions are necessarily stationary, and we want to minimize our stationarity as- sumptions. So what most books call “ergodic”, we have to qualify as “stationary and ergodic”. (Conversely, when other people talk about processes being “sta- tionary and ergodic”, they mean “stationary with only one ergodic component”; but of that, more later. Definition 342 (Metric Transitivity) A dynamical system is metrically tran- sitive, metrically indecomposable, or irreducible when, for any two sets A, B n ∈ , if µ(A), µ(B) > 0, there exists an n such that µ(T − A B) > 0.
    [Show full text]
  • POINT-COFINITE COVERS in the LAVER MODEL 1. Background Let
    PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY Volume 138, Number 9, September 2010, Pages 3313–3321 S 0002-9939(10)10407-9 Article electronically published on April 30, 2010 POINT-COFINITE COVERS IN THE LAVER MODEL ARNOLD W. MILLER AND BOAZ TSABAN (Communicated by Julia Knight) Abstract. Let S1(Γ, Γ) be the statement: For each sequence of point-cofinite open covers, one can pick one element from each cover and obtain a point- cofinite cover. b is the minimal cardinality of a set of reals not satisfying S1(Γ, Γ). We prove the following assertions: (1) If there is an unbounded tower, then there are sets of reals of cardinality b satisfying S1(Γ, Γ). (2) It is consistent that all sets of reals satisfying S1(Γ, Γ) have cardinality smaller than b. These results can also be formulated as dealing with Arhangel’ski˘ı’s property α2 for spaces of continuous real-valued functions. The main technical result is that in Laver’s model, each set of reals of cardinality b has an unbounded Borel image in the Baire space ωω. 1. Background Let P be a nontrivial property of sets of reals. The critical cardinality of P , denoted non(P ), is the minimal cardinality of a set of reals not satisfying P .A natural question is whether there is a set of reals of cardinality at least non(P ), which satisfies P , i.e., a nontrivial example. We consider the following property. Let X be a set of reals. U is a point-cofinite cover of X if U is infinite, and for each x ∈ X, {U ∈U: x ∈ U} is a cofinite subset of U.1 Having X fixed in the background, let Γ be the family of all point- cofinite open covers of X.
    [Show full text]
  • A TEXTBOOK of TOPOLOGY Lltld
    SEIFERT AND THRELFALL: A TEXTBOOK OF TOPOLOGY lltld SEI FER T: 7'0PO 1.OG 1' 0 I.' 3- Dl M E N SI 0 N A I. FIRERED SPACES This is a volume in PURE AND APPLIED MATHEMATICS A Series of Monographs and Textbooks Editors: SAMUELEILENBERG AND HYMANBASS A list of recent titles in this series appears at the end of this volunie. SEIFERT AND THRELFALL: A TEXTBOOK OF TOPOLOGY H. SEIFERT and W. THRELFALL Translated by Michael A. Goldman und S E I FE R T: TOPOLOGY OF 3-DIMENSIONAL FIBERED SPACES H. SEIFERT Translated by Wolfgang Heil Edited by Joan S. Birman and Julian Eisner @ 1980 ACADEMIC PRESS A Subsidiary of Harcourr Brace Jovanovich, Publishers NEW YORK LONDON TORONTO SYDNEY SAN FRANCISCO COPYRIGHT@ 1980, BY ACADEMICPRESS, INC. ALL RIGHTS RESERVED. NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM THE PUBLISHER. ACADEMIC PRESS, INC. 11 1 Fifth Avenue, New York. New York 10003 United Kingdom Edition published by ACADEMIC PRESS, INC. (LONDON) LTD. 24/28 Oval Road, London NWI 7DX Mit Genehmigung des Verlager B. G. Teubner, Stuttgart, veranstaltete, akin autorisierte englische Ubersetzung, der deutschen Originalausgdbe. Library of Congress Cataloging in Publication Data Seifert, Herbert, 1897- Seifert and Threlfall: A textbook of topology. Seifert: Topology of 3-dimensional fibered spaces. (Pure and applied mathematics, a series of mono- graphs and textbooks ; ) Translation of Lehrbuch der Topologic. Bibliography: p. Includes index. 1.
    [Show full text]
  • The Hardness of the Hidden Subset Sum Problem and Its Cryptographic Implications
    The Hardness of the Hidden Subset Sum Problem and Its Cryptographic Implications Phong Nguyen and Jacques Stern Ecole´ Normale Sup´erieure Laboratoire d’Informatique 45 rue d’Ulm, 75230 Paris Cedex 05 France fPhong.Nguyen,[email protected] http://www.dmi.ens.fr/~fpnguyen,sterng/ Abstract. At Eurocrypt’98, Boyko, Peinado and Venkatesan presented simple and very fast methods for generating randomly distributed pairs of the form (x; gx mod p) using precomputation. The security of these methods relied on the potential hardness of a new problem, the so-called hidden subset sum problem. Surprisingly, apart from exhaustive search, no algorithm to solve this problem was known. In this paper, we exhibit a security criterion for the hidden subset sum problem, and discuss its implications on the practicability of the precomputation schemes. Our results are twofold. On the one hand, we present an efficient lattice-based attack which is expected to succeed if and only if the parameters satisfy a particular condition that we make explicit. Experiments have validated the theoretical analysis, and show the limitations of the precomputa- tion methods. For instance, any realistic smart-card implementation of Schnorr’s identification scheme using these precomputations methods is either vulnerable to the attack, or less efficient than with traditional pre- computation methods. On the other hand, we show that, when another condition is satisfied, the pseudo-random generator based on the hidden subset sum problem is strong in some precise sense which includes at- tacks via lattice reduction. Namely, using the discrete Fourier transform, we prove that the distribution of the generator’s output is indistinguish- able from the uniform distribution.
    [Show full text]
  • The Axiom of Choice and Its Implications
    THE AXIOM OF CHOICE AND ITS IMPLICATIONS KEVIN BARNUM Abstract. In this paper we will look at the Axiom of Choice and some of the various implications it has. These implications include a number of equivalent statements, and also some less accepted ideas. The proofs discussed will give us an idea of why the Axiom of Choice is so powerful, but also so controversial. Contents 1. Introduction 1 2. The Axiom of Choice and Its Equivalents 1 2.1. The Axiom of Choice and its Well-known Equivalents 1 2.2. Some Other Less Well-known Equivalents of the Axiom of Choice 3 3. Applications of the Axiom of Choice 5 3.1. Equivalence Between The Axiom of Choice and the Claim that Every Vector Space has a Basis 5 3.2. Some More Applications of the Axiom of Choice 6 4. Controversial Results 10 Acknowledgments 11 References 11 1. Introduction The Axiom of Choice states that for any family of nonempty disjoint sets, there exists a set that consists of exactly one element from each element of the family. It seems strange at first that such an innocuous sounding idea can be so powerful and controversial, but it certainly is both. To understand why, we will start by looking at some statements that are equivalent to the axiom of choice. Many of these equivalences are very useful, and we devote much time to one, namely, that every vector space has a basis. We go on from there to see a few more applications of the Axiom of Choice and its equivalents, and finish by looking at some of the reasons why the Axiom of Choice is so controversial.
    [Show full text]
  • Equivalents to the Axiom of Choice and Their Uses A
    EQUIVALENTS TO THE AXIOM OF CHOICE AND THEIR USES A Thesis Presented to The Faculty of the Department of Mathematics California State University, Los Angeles In Partial Fulfillment of the Requirements for the Degree Master of Science in Mathematics By James Szufu Yang c 2015 James Szufu Yang ALL RIGHTS RESERVED ii The thesis of James Szufu Yang is approved. Mike Krebs, Ph.D. Kristin Webster, Ph.D. Michael Hoffman, Ph.D., Committee Chair Grant Fraser, Ph.D., Department Chair California State University, Los Angeles June 2015 iii ABSTRACT Equivalents to the Axiom of Choice and Their Uses By James Szufu Yang In set theory, the Axiom of Choice (AC) was formulated in 1904 by Ernst Zermelo. It is an addition to the older Zermelo-Fraenkel (ZF) set theory. We call it Zermelo-Fraenkel set theory with the Axiom of Choice and abbreviate it as ZFC. This paper starts with an introduction to the foundations of ZFC set the- ory, which includes the Zermelo-Fraenkel axioms, partially ordered sets (posets), the Cartesian product, the Axiom of Choice, and their related proofs. It then intro- duces several equivalent forms of the Axiom of Choice and proves that they are all equivalent. In the end, equivalents to the Axiom of Choice are used to prove a few fundamental theorems in set theory, linear analysis, and abstract algebra. This paper is concluded by a brief review of the work in it, followed by a few points of interest for further study in mathematics and/or set theory. iv ACKNOWLEDGMENTS Between the two department requirements to complete a master's degree in mathematics − the comprehensive exams and a thesis, I really wanted to experience doing a research and writing a serious academic paper.
    [Show full text]
  • On the Cardinality of a Topological Space
    PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY Volume 66, Number 1, September 1977 ON THE CARDINALITY OF A TOPOLOGICAL SPACE ARTHUR CHARLESWORTH Abstract. In recent papers, B. Sapirovskiï, R. Pol, and R. E. Hodel have used a transfinite construction technique of Sapirovskil to provide a unified treatment of fundamental inequalities in the theory of cardinal functions. Sapirovskil's technique is used in this paper to establish an inequality whose countable version states that the continuum is an upper bound for the cardinality of any Lindelöf space having countable pseudocharacter and a point-continuum separating open cover. In addition, the unified treatment is extended to include a recent theorem of Sapirovskil concerning the cardinality of T3 spaces. 1. Introduction. Throughout this paper a and ß denote ordinals, m and n denote infinite cardinals, and \A\ denotes the cardinality of a set A. We let d, c, L, w, it, and \¡>denote the density, cellularity, Lindelöf degree, weight, 77-weight, and pseudocharacter. (For definitions see Juhász [5].) We also let nw, 77% and psw denote the net weight, vT-character, and point separating weight. A collection "31 of subsets of a topological space X is a network for X if whenever p E V, with V open in X, we have p E N C F for some N in CX. A collection % of nonempty open sets is a tr-local base at/7 if whenever p E V, with V open in X, we have U C V for some U in %. A separating open cover § for X is an open cover for X having the property that for each distinct x and y in X there is an S in S such that x E S and y G S.
    [Show full text]
  • Chapter I Sums of Independent Random Variables
    Chapter I Sums of Independent Random Variables In one way or another, most probabilistic analysis entails the study of large families of random variables. The key to such analysis is an understanding of the relations among the family members; and of all the possible ways in which members of a family can be related, by far the simplest is when there is no relationship at all! For this reason, I will begin by looking at families of independent random variables. x1.1 Independence In this section I will introduce Kolmogorov's way of describing independence and prove a few of its consequences. x1.1.1. Independent Sigma Algebras. Let (Ω; F; P) be a probability space (i.e., Ω is a nonempty set, F is a σ-algebra over Ω, and P is a non- negative measure on the measurable space (Ω; F) having total mass 1); and, for each i from the (nonempty) index set I, let Fi be a sub σ-algebra of F. I will say that the σ-algebras Fi; i 2 I, are mutually P-independent or, less precisely, P-independent, if, for every finite subset fi1; : : : ; ing of distinct elements of I and every choice of Aim 2 Fim ; 1 ≤ m ≤ n, (1.1.1) P Ai1 \···\ Ain = P Ai1 ··· P Ain : In particular, if fAi : i 2 Ig is a family of sets from F, I will say that Ai; i 2 I; are P-independent if the associated σ-algebras Fi = f;;Ai;Ai{; Ωg; i 2 I; are. To gain an appreciation for the intuition on which this definition is based, it is important to notice that independence of the pair A1 and A2 in the present sense is equivalent to P A1 \ A2 = P A1 P A2 ; the classical definition which one encounters in elementary treatments.
    [Show full text]
  • Dynamical Systems and Ergodic Theory
    MATH36206 - MATHM6206 Dynamical Systems and Ergodic Theory Teaching Block 1, 2017/18 Lecturer: Prof. Alexander Gorodnik PART III: LECTURES 16{30 course web site: people.maths.bris.ac.uk/∼mazag/ds17/ Copyright c University of Bristol 2010 & 2016. This material is copyright of the University. It is provided exclusively for educational purposes at the University and is to be downloaded or copied for your private study only. Chapter 3 Ergodic Theory In this last part of our course we will introduce the main ideas and concepts in ergodic theory. Ergodic theory is a branch of dynamical systems which has strict connections with analysis and probability theory. The discrete dynamical systems f : X X studied in topological dynamics were continuous maps f on metric spaces X (or more in general, topological→ spaces). In ergodic theory, f : X X will be a measure-preserving map on a measure space X (we will see the corresponding definitions below).→ While the focus in topological dynamics was to understand the qualitative behavior (for example, periodicity or density) of all orbits, in ergodic theory we will not study all orbits, but only typical1 orbits, but will investigate more quantitative dynamical properties, as frequencies of visits, equidistribution and mixing. An example of a basic question studied in ergodic theory is the following. Let A X be a subset of O+ ⊂ the space X. Consider the visits of an orbit f (x) to the set A. If we consider a finite orbit segment x, f(x),...,f n−1(x) , the number of visits to A up to time n is given by { } Card 0 k n 1, f k(x) A .
    [Show full text]