Dense set

In topology and related areas of mathematics, a subset A of a topological space X is called dense (in X) if any point x in X belongs to A or is a limit point of A.[1] Informally, for every point in X, the point is either in A or arbitrarily "close" to a member of A; for instance, every real number is either a rational number or has one arbitrarily close to it (see Diophantine approximation). Formally, a subset A of a topological space X is dense in X if for any point x in X, any neighborhood of x contains at least one point from A. Equivalently, A is dense in X if and only if the only closed subset of X containing A is X itself. This can also be expressed by saying that the closure of A is X, or that the interior of the complement of A is empty. The density of a topological space X is the least cardinality of a dense subset of X.
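As an illustrative sketch (not part of the original article), the density of the rationals in the reals can be observed numerically: for any real number and any tolerance, a rational within that tolerance can be found. The helper name `nearby_rational` and the tolerance values are arbitrary choices for this example.

```python
# The rationals are dense in R: every real can be approximated
# arbitrarily well by a rational number.
from fractions import Fraction
import math

def nearby_rational(x: float, tol: float) -> Fraction:
    """Return a rational within `tol` of x (possible because Q is dense in R)."""
    # Allowing larger denominators shrinks the approximation error.
    max_den = 1
    while True:
        q = Fraction(x).limit_denominator(max_den)
        if abs(float(q) - x) < tol:
            return q
        max_den *= 10

for x in (math.pi, math.sqrt(2), -0.3333):
    q = nearby_rational(x, 1e-6)
    assert abs(float(q) - x) < 1e-6
```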

Density in metric spaces

An alternative definition of dense set in the case of metric spaces is the following. When the topology of X is given by a metric, the closure of A in X is the union of A and the set of all limits of sequences of elements in A (its limit points):

cl(A) = A ∪ { limₙ aₙ : aₙ ∈ A for all n }.

Then A is dense in X if

cl(A) = X.

Note that A ⊆ cl(A). If {Uₙ} is a sequence of dense open sets in a complete metric space X, then ⋂ₙ Uₙ is also dense in X. This fact is one of the equivalent forms of the Baire category theorem.

Examples

The real numbers with the usual topology have the rational numbers as a countable dense subset, which shows that the cardinality of a dense subset of a topological space may be strictly smaller than the cardinality of the space itself. The irrational numbers are another dense subset, which shows that a topological space may have several disjoint dense subsets. By the Weierstrass approximation theorem, any given complex-valued continuous function defined on a closed interval [a,b] can be uniformly approximated as closely as desired by a polynomial function. In other words, the polynomial functions are dense in the space C[a,b] of continuous complex-valued functions on the interval [a,b], equipped with the supremum norm. Every metric space is dense in its completion.
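The Weierstrass approximation theorem mentioned above has a constructive proof via Bernstein polynomials. The sketch below (an illustration, not from the article) checks numerically that the supremum-norm error over a grid shrinks as the polynomial degree grows, for one arbitrarily chosen continuous function; the names `bernstein` and `sup_error` are invented for this example.

```python
# Bernstein polynomials B_n(f) converge uniformly to f on [0, 1],
# one standard constructive proof of the Weierstrass theorem.
from math import comb

def bernstein(f, n, x):
    """Evaluate the degree-n Bernstein polynomial of f at x in [0, 1]."""
    return sum(f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

f = lambda x: abs(x - 0.5)          # continuous but not smooth
grid = [i / 200 for i in range(201)]

def sup_error(n):
    """Approximate the supremum-norm distance between B_n(f) and f."""
    return max(abs(bernstein(f, n, x) - f(x)) for x in grid)

# The error shrinks as the degree grows.
assert sup_error(100) < sup_error(10)
```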

Properties

Every topological space is dense in itself. For a set X equipped with the discrete topology, the whole space is the only dense set. Every non-empty subset of a set X equipped with the trivial topology is dense, and every topology for which every non-empty subset is dense must be trivial. Denseness is transitive: given three subsets A, B and C of a topological space X with A ⊆ B ⊆ C such that A is dense in B and B is dense in C (in the respective subspace topologies), then A is also dense in C. The image of a dense subset under a surjective continuous function is again dense. The density of a topological space is a topological invariant. A topological space with a connected dense subset is necessarily connected itself.

Continuous functions into Hausdorff spaces are determined by their values on dense subsets: if two continuous functions f, g : X → Y into a Hausdorff space Y agree on a dense subset of X, then they agree on all of X.

Related notions

A point x of a subset A of a topological space X is called a limit point of A (in X) if every neighbourhood of x also contains a point of A other than x itself, and an isolated point of A otherwise. A subset without isolated points is said to be dense-in-itself.

A subset A of a topological space X is called nowhere dense (in X) if there is no neighborhood in X on which A is dense. Equivalently, a subset of a topological space is nowhere dense if and only if the interior of its closure is empty. The interior of the complement of a nowhere dense set is always dense. The complement of a closed nowhere dense set is a dense open set.

A topological space with a countable dense subset is called separable. A topological space is a Baire space if and only if the intersection of countably many dense open sets is always dense. A topological space is called resolvable if it is the union of two disjoint dense subsets. More generally, a topological space is called κ-resolvable if it contains κ pairwise disjoint dense sets.

An embedding of a topological space X as a dense subset of a compact space is called a compactification of X. A linear operator between topological vector spaces X and Y is said to be densely defined if its domain is a dense subset of X and if its range is contained within Y. See also continuous linear extension.

A topological space X is hyperconnected if and only if every nonempty open set is dense in X. A topological space is submaximal if and only if every dense subset is open.

References

Notes

[1] Steen, L. A.; Seebach, J. A. (1995), Counterexamples in Topology, Dover, ISBN 0-486-68735-X

General references

• Bourbaki, Nicolas (1989) [1971]. General Topology, Chapters 1–4. Elements of Mathematics. Springer-Verlag. ISBN 3-540-64241-2.
• Steen, Lynn Arthur; Seebach, J. Arthur Jr. (1995) [1978]. Counterexamples in Topology (Dover reprint of 1978 ed.). Berlin, New York: Springer-Verlag. ISBN 978-0-486-68735-3. MR507446.


License: Creative Commons Attribution-Share Alike 3.0 Unported, //creativecommons.org/licenses/by-sa/3.0/

Homeomorphism

In the mathematical field of topology, a homeomorphism or topological isomorphism or bicontinuous function is a continuous function between topological spaces that has a continuous inverse function. Homeomorphisms are the isomorphisms in the category of topological spaces; that is, they are the mappings that preserve all the topological properties of a given space. Two spaces with a homeomorphism between them are called homeomorphic, and from a topological viewpoint they are the same.

Roughly speaking, a topological space is a geometric object, and the homeomorphism is a continuous stretching and bending of the object into a new shape. Thus, a square and a circle are homeomorphic to each other, but a sphere and a donut are not. An often-repeated mathematical joke is that topologists can't tell their coffee cup from their donut,[1] since a sufficiently pliable donut could be reshaped to the form of a coffee cup by creating a dimple and progressively enlarging it, while shrinking the hole into a handle.

[Figure: a continuous deformation between a coffee mug and a donut, illustrating that they are homeomorphic. There need not be a continuous deformation for two spaces to be homeomorphic; only a continuous mapping with a continuous inverse is required.]

Topology is the study of those properties of objects that do not change when homeomorphisms are applied. As Henri Poincaré famously said, mathematics is not the study of objects, but instead, the relations (isomorphisms for instance) between them.

Definition

A function f: X → Y between two topological spaces (X, T_X) and (Y, T_Y) is called a homeomorphism if it has the following properties:
• f is a bijection (one-to-one and onto),
• f is continuous,
• the inverse function f⁻¹ is continuous (f is an open mapping).
A function with these three properties is sometimes called bicontinuous. If such a function exists, we say X and Y are homeomorphic. A self-homeomorphism is a homeomorphism of a topological space onto itself. The homeomorphisms form an equivalence relation on the class of all topological spaces. The resulting equivalence classes are called homeomorphism classes.
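The definition can be made concrete with a small numerical sketch (an illustration, not from the article): the tangent function restricted to the open interval (−π/2, π/2) is a continuous bijection onto R whose inverse, arctangent, is also continuous, so it is a homeomorphism. The round-trip identities can be checked at sample points.

```python
# tan is a homeomorphism from (-pi/2, pi/2) onto R, with
# continuous inverse arctan.
import math

f = math.tan          # (-pi/2, pi/2) -> R, continuous bijection
f_inv = math.atan     # R -> (-pi/2, pi/2), also continuous

# f_inv really inverts f (and vice versa) at sample points.
for x in (-1.5, -0.7, 0.0, 0.3, 1.2):
    assert abs(f_inv(f(x)) - x) < 1e-12
for y in (-100.0, -1.0, 0.0, 2.5, 1e6):
    assert abs(f(f_inv(y)) - y) < 1e-6 * max(1.0, abs(y))
```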

Examples

• The unit 2-disc D² and the unit square in R² are homeomorphic.
• The open interval (a, b) is homeomorphic to the real numbers R for any a < b.
• The product space S¹ × S¹ and the two-dimensional torus are homeomorphic.
• Every uniform isomorphism and isometric isomorphism is a homeomorphism.
• Any 2-sphere with a single point removed is homeomorphic to the set of all points in R² (a 2-dimensional plane). (Here we use 2-sphere in the sense of a physical beach ball, not a circle or 4-ball.)
• Let A be a commutative ring with unity and let S be a multiplicative subset of A. Then Spec(A_S) is homeomorphic to {p ∈ Spec(A) : p ∩ S = ∅}.
• R^m and R^n are not homeomorphic for m ≠ n.
• The Euclidean real line is not homeomorphic to the unit circle as a subspace of R², as the unit circle is compact as a subspace of R² but the real line is not compact.

[Figure: a trefoil knot is homeomorphic to a circle. Continuous mappings are not always realizable as deformations; here the knot has been thickened to make the image understandable.]

Notes

The third requirement, that f⁻¹ be continuous, is essential. Consider for instance the function f: [0, 2π) → S¹ defined by f(φ) = (cos(φ), sin(φ)). This function is bijective and continuous, but not a homeomorphism (S¹ is compact but [0, 2π) is not).

Homeomorphisms are the isomorphisms in the category of topological spaces. As such, the composition of two homeomorphisms is again a homeomorphism, and the set of all self-homeomorphisms X → X forms a group, called the homeomorphism group of X, often denoted Homeo(X); this group can be given a topology, such as the compact-open topology, making it a topological group. For some purposes, the homeomorphism group happens to be too big, but by means of the isotopy relation, one can reduce this group to the mapping class group.

Similarly, as usual in category theory, given two spaces that are homeomorphic, the space of homeomorphisms between them, Homeo(X, Y), is a torsor for the homeomorphism groups Homeo(X) and Homeo(Y), and given a specific homeomorphism between X and Y, all three sets are identified.
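The failure of the inverse's continuity in the example above can be observed numerically (an illustrative sketch, not from the article): two nearby points on the circle on either side of (1, 0) have preimages in [0, 2π) that are almost 2π apart.

```python
# f(phi) = (cos phi, sin phi) maps [0, 2*pi) bijectively and continuously
# onto the circle, but the inverse (recovering the angle) jumps near (1, 0).
import math

def f(phi):
    return (math.cos(phi), math.sin(phi))

def f_inv(p):
    x, y = p
    phi = math.atan2(y, x)                           # in (-pi, pi]
    return phi if phi >= 0 else phi + 2 * math.pi    # back into [0, 2*pi)

# Two points on the circle that are very close together...
a = f(0.001)                  # just "above" (1, 0)
b = f(2 * math.pi - 0.001)    # just "below" (1, 0)
# ...have preimages nearly 2*pi apart, so f_inv is not continuous at (1, 0).
assert abs(f_inv(a) - f_inv(b)) > 6.0
```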

Properties

• Two homeomorphic spaces share the same topological properties. For example, if one of them is compact, then the other is as well; if one of them is connected, then the other is as well; if one of them is Hausdorff, then the other is as well; their homotopy and homology groups will coincide. Note however that this does not extend to properties defined via a metric; there are metric spaces that are homeomorphic even though one of them is complete and the other is not.
• A homeomorphism is simultaneously an open mapping and a closed mapping; that is, it maps open sets to open sets and closed sets to closed sets.
• Every self-homeomorphism in S¹ can be extended to a self-homeomorphism of the whole disk D² (Alexander's trick).

Informal discussion

The intuitive criterion of stretching, bending, cutting and gluing back together takes a certain amount of practice to apply correctly—it may not be obvious from the description above that deforming a line segment to a point is impermissible, for instance. It is thus important to realize that it is the formal definition given above that counts. This characterization of a homeomorphism often leads to confusion with the concept of homotopy, which is actually defined as a continuous deformation, but from one function to another, rather than one space to another. In the case of a homeomorphism, envisioning a continuous deformation is a mental tool for keeping track of which points on space X correspond to which points on Y—one just follows them as X deforms. In the case of homotopy, the continuous deformation from one map to the other is of the essence, and it is also less restrictive, since none of the maps involved need to be one-to-one or onto. Homotopy does lead to a relation on spaces: homotopy equivalence.

There is a name for the kind of deformation involved in visualizing a homeomorphism. It is (except when cutting and regluing are required) an isotopy between the identity map on X and the homeomorphism from X to Y.

References

[1] Hubbard, John H.; West, Beverly H. (1995). Differential Equations: A Dynamical Systems Approach. Part II: Higher-Dimensional Systems (http://books.google.com/books?id=SHBj2oaSALoC&pg=PA204&dq="coffee+cup"+topologist+joke#v=onepage&q="coffee cup" topologist joke&f=false). Texts in Applied Mathematics 18. Springer. p. 204. ISBN 978-0387943770.

External links

• Homeomorphism (http://planetmath.org/?op=getobj&from=objects&id=912) on PlanetMath



Measure (mathematics)

In mathematical analysis, a measure on a set is a systematic way to assign to each suitable subset a number, intuitively interpreted as the size of the subset. In this sense, a measure is a generalization of the concepts of length, area, and volume. A particularly important example is the Lebesgue measure on a Euclidean space, which assigns the conventional length, area, and volume of Euclidean geometry to suitable subsets of the n-dimensional Euclidean space R^n. For instance, the Lebesgue measure of the interval [0, 1] in the real numbers is its length in the everyday sense of the word, specifically 1.

To qualify as a measure (see Definition below), a function that assigns a non-negative real number or +∞ to a set's subsets must satisfy a few conditions. One important condition is countable additivity. This condition states that the size of the union of a sequence of disjoint subsets is equal to the sum of the sizes of the subsets. However, it is in general impossible to associate a consistent size to each subset of a given set and also satisfy the other axioms of a measure. This problem was resolved by defining measure only on a sub-collection of all subsets; the subsets on which the measure is to be defined are called measurable, and they are required to form a σ-algebra, meaning that unions, intersections and complements of sequences of measurable subsets are measurable. Non-measurable sets in a Euclidean space, on which the Lebesgue measure cannot be defined consistently, are necessarily complicated, in the sense of being badly mixed up with their complements; indeed, their existence is a non-trivial consequence of the axiom of choice.

[Figure: informally, a measure has the property of being monotone in the sense that if A is a subset of B, the measure of A is less than or equal to the measure of B. Furthermore, the measure of the empty set is required to be 0.]

Measure theory was developed in successive stages during the late 19th and early 20th centuries by Émile Borel, Henri Lebesgue, Johann Radon and Maurice Fréchet, among others. The main applications of measures are in the foundations of the Lebesgue integral, in Andrey Kolmogorov's axiomatisation of probability theory and in ergodic theory. In integration theory, specifying a measure allows one to define integrals on spaces more general than subsets of Euclidean space; moreover, the integral with respect to the Lebesgue measure on Euclidean spaces is more general and has a richer theory than its predecessor, the Riemann integral. Probability theory considers measures that assign to the whole set the size 1, and considers measurable subsets to be events whose probability is given by the measure. Ergodic theory considers measures that are invariant under, or arise naturally from, a dynamical system.

Definition

Let Σ be a σ-algebra over a set X. A function μ from Σ to the extended real number line is called a measure if it satisfies the following properties:
• Non-negativity: μ(E) ≥ 0 for all E ∈ Σ.
• Countable additivity (or σ-additivity): for all countable collections {Eᵢ} of pairwise disjoint sets in Σ:

μ(⋃ᵢ Eᵢ) = Σᵢ μ(Eᵢ).

• Null empty set: μ(∅) = 0.

One may require that at least one set E has finite measure. Then the empty set automatically has measure zero because of countable additivity: μ(E) = μ(E ∪ ∅ ∪ ∅ ∪ ···) = μ(E) + Σᵢ μ(∅), and the sum Σᵢ μ(∅) is finite if and only if the empty set has measure zero.

The pair (X, Σ) is called a measurable space, the members of Σ are called measurable sets, and the triple (X, Σ, μ) is called a measure space. If only the second and third conditions of the definition of measure above are met, and μ takes on at most one of the values ±∞, then μ is called a signed measure. A probability measure is a measure with total measure one (i.e., μ(X) = 1); a probability space is a measure space with a probability measure.

For measure spaces that are also topological spaces, various compatibility conditions can be placed on the measure and the topology. Most measures met in practice in analysis (and in many cases also in probability theory) are Radon measures. Radon measures have an alternative definition in terms of linear functionals on the locally convex space of continuous functions with compact support. This approach is taken by Bourbaki (2004) and a number of other authors. For more details, see the article on Radon measures.
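The axioms in the definition above can be checked mechanically in a toy setting (an illustrative sketch, not from the article): the counting measure on the power set of a finite set X, where countable additivity reduces to finite additivity over disjoint families.

```python
# Counting measure on the power set of a finite X:
# non-negativity, null empty set, and additivity over disjoint sets.
from itertools import combinations

X = frozenset({1, 2, 3, 4})

def powerset(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1)
            for c in combinations(s, r)]

mu = {E: len(E) for E in powerset(X)}   # counting measure

assert mu[frozenset()] == 0              # null empty set
assert all(v >= 0 for v in mu.values())  # non-negativity
# additivity on a disjoint family
parts = [frozenset({1}), frozenset({2, 3}), frozenset({4})]
assert mu[frozenset().union(*parts)] == sum(mu[p] for p in parts)
```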

Properties

Several further properties can be derived from the definition of a countably additive measure.

Monotonicity

A measure μ is monotonic: if E₁ and E₂ are measurable sets with E₁ ⊆ E₂, then

μ(E₁) ≤ μ(E₂).

Measures of infinite unions of measurable sets

A measure μ is countably subadditive: if E₁, E₂, E₃, … is a countable sequence of sets in Σ, not necessarily disjoint, then

μ(⋃ₙ Eₙ) ≤ Σₙ μ(Eₙ).

A measure μ is continuous from below: if E₁, E₂, E₃, … are measurable sets and Eₙ is a subset of Eₙ₊₁ for all n, then the union of the sets Eₙ is measurable, and

μ(⋃ₙ Eₙ) = limₙ μ(Eₙ).
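Monotonicity and subadditivity can be illustrated numerically (a sketch, not from the article) using the length of finite unions of open intervals on the line, a simplified stand-in for Lebesgue measure; the helper `length` is invented for this example.

```python
# Length of a finite union of intervals, merging overlaps:
# a toy model for checking monotonicity and subadditivity.
def length(intervals):
    """Total length of a union of (a, b) intervals."""
    total, current_end = 0.0, float("-inf")
    for a, b in sorted(intervals):
        if a > current_end:        # disjoint from what came before
            total += b - a
            current_end = b
        elif b > current_end:      # partial overlap
            total += b - current_end
            current_end = b
    return total

E = [(0.0, 1.0), (0.5, 2.0), (3.0, 4.0)]     # not disjoint
# monotonicity: a sub-union has no larger measure
assert length(E[:1]) <= length(E)
# subadditivity: measure of the union <= sum of the measures
assert length(E) <= sum(b - a for a, b in E)
```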

Measures of infinite intersections of measurable sets

A measure μ is continuous from above: if E₁, E₂, E₃, … are measurable sets and Eₙ₊₁ is a subset of Eₙ for all n, then the intersection of the sets Eₙ is measurable; furthermore, if at least one of the Eₙ has finite measure, then

μ(⋂ₙ Eₙ) = limₙ μ(Eₙ).

This property is false without the assumption that at least one of the Eₙ has finite measure. For instance, for each n ∈ N, let

Eₙ = [n, ∞) ⊂ R,

which all have infinite Lebesgue measure, but the intersection is empty.

Sigma-finite measures

A measure space (X, Σ, μ) is called finite if μ(X) is a finite real number (rather than ∞). It is called σ-finite if X can be decomposed into a countable union of measurable sets of finite measure. A set in a measure space has σ-finite measure if it is a countable union of sets with finite measure.

For example, the real numbers with the standard Lebesgue measure are σ-finite but not finite. Consider the closed intervals [k, k+1] for all integers k; there are countably many such intervals, each has measure 1, and their union is the entire real line. Alternatively, consider the real numbers with the counting measure, which assigns to each finite set of reals the number of points in the set. This measure space is not σ-finite, because every set with finite measure contains only finitely many points, and it would take uncountably many such sets to cover the entire real line. The σ-finite measure spaces have some very convenient properties; σ-finiteness can be compared in this respect to the Lindelöf property of topological spaces. They can also be thought of as a vague generalization of the idea that a measure space may have 'uncountable measure'.

Completeness

A measurable set X is called a null set if μ(X) = 0. A subset of a null set is called a negligible set. A negligible set need not be measurable, but every measurable negligible set is automatically a null set. A measure is called complete if every negligible set is measurable. A measure can be extended to a complete one by considering the σ-algebra of subsets Y which differ by a negligible set from a measurable set X, that is, such that the symmetric difference of X and Y is contained in a null set. One defines μ(Y) to equal μ(X).

Additivity

Measures are required to be countably additive. However, the condition can be strengthened as follows. For any set I and any set of nonnegative reals rᵢ, i ∈ I, define:

Σ_{i∈I} rᵢ = sup { Σ_{i∈J} rᵢ : J finite, J ⊆ I }.

A measure μ on Σ is κ-additive if for any λ < κ and any family {X_α : α < λ} of disjoint sets in Σ, the following hold:

1. ⋃_{α<λ} X_α ∈ Σ

2. μ(⋃_{α<λ} X_α) = Σ_{α<λ} μ(X_α)

Note that the second condition is equivalent to the statement that the ideal of null sets is κ-complete.

Examples

Some important measures are listed here.
• The counting measure is defined by μ(S) = number of elements in S.
• The Lebesgue measure on R is a complete translation-invariant measure on a σ-algebra containing the intervals in R such that μ([0,1]) = 1; and every other measure with these properties extends Lebesgue measure.
• Circular angle measure is invariant under rotation, and hyperbolic angle measure is invariant under squeeze mapping.
• The Haar measure for a locally compact topological group is a generalization of the Lebesgue measure (and also of counting measure and circular angle measure) and has similar uniqueness properties.
• The Hausdorff measure is a generalization of the Lebesgue measure to sets with non-integer dimension, in particular, fractal sets.
• Every probability space gives rise to a measure which takes the value 1 on the whole space (and therefore takes all its values in the unit interval [0,1]). Such a measure is called a probability measure. See probability axioms.
• The Dirac measure δ_a (cf. Dirac delta function) is given by δ_a(S) = χ_S(a), where χ_S is the characteristic function of S. The measure of a set is 1 if it contains the point a and 0 otherwise.
Other 'named' measures used in various theories include: Borel measure, Jordan measure, ergodic measure, Euler measure, Gaussian measure, Baire measure, Radon measure and Young measure. In physics an example of a measure is the spatial distribution of mass (see e.g., gravity potential), or another non-negative extensive property, conserved (see conservation law for a list of these) or not. Negative values lead to signed measures; see "Generalizations" below. The Liouville measure, known also as the natural volume form on a symplectic manifold, is useful in classical statistical mechanics and Hamiltonian mechanics. The Gibbs measure is widely used in statistical mechanics, often under the name canonical ensemble.
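The Dirac measure from the list above is simple enough to sketch directly (an illustration, not from the article): δ_a(S) is 1 if a ∈ S and 0 otherwise, which makes it a probability measure.

```python
# The Dirac measure delta_a on subsets of a universe:
# delta_a(S) = 1 if a is in S, else 0.
def dirac(a):
    return lambda S: 1 if a in S else 0

delta = dirac(3)
assert delta({1, 2, 3}) == 1
assert delta({1, 2}) == 0
# It is a probability measure: the whole space has measure 1, and it is
# additive over disjoint sets (the point a lands in exactly one piece).
parts = [{1}, {2, 3}, {4}]
assert sum(delta(p) for p in parts) == delta({1, 2, 3, 4})
```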

Non-measurable sets

If the axiom of choice is assumed to be true, not all subsets of Euclidean space are Lebesgue measurable; examples of such sets include the Vitali sets, and the non-measurable sets postulated by the Hausdorff paradox and the Banach–Tarski paradox.

Generalizations

For certain purposes, it is useful to have a "measure" whose values are not restricted to the non-negative reals or infinity. For instance, a countably additive set function with values in the (signed) real numbers is called a signed measure, while such a function with values in the complex numbers is called a complex measure. Measures that take values in Banach spaces have been studied extensively. A measure that takes values in the set of self-adjoint projections on a Hilbert space is called a projection-valued measure; these are used in functional analysis for the spectral theorem. When it is necessary to distinguish the usual measures which take non-negative values from generalizations, the term positive measure is used. Positive measures are closed under conic combination but not general linear combination, while signed measures are the linear closure of positive measures.

Another generalization is the finitely additive measure, also sometimes called a content. This is the same as a measure except that instead of requiring countable additivity we require only finite additivity. Historically, this definition was used first, but proved to be not so useful. It turns out that in general, finitely additive measures are connected with notions such as Banach limits, the dual of L∞ and the Stone–Čech compactification. All these are linked in one way or another to the axiom of choice. A charge is a generalization in both directions: it is a finitely additive, signed measure.

References

• Robert G. Bartle (1995). The Elements of Integration and Lebesgue Measure. Wiley Interscience.
• Bourbaki, Nicolas (2004). Integration I. Springer Verlag. ISBN 3-540-41129-1. Chapter III.
• R. M. Dudley (2002). Real Analysis and Probability. Cambridge University Press.
• Folland, Gerald B. (1999). Real Analysis: Modern Techniques and Their Applications. Second edition. John Wiley and Sons. ISBN 0-471-31716-0.
• D. H. Fremlin (2000). Measure Theory [1]. Torres Fremlin.
• Paul Halmos (1950). Measure Theory. Van Nostrand and Co.
• R. Duncan Luce and Louis Narens (1987). "Measurement, theory of". The New Palgrave: A Dictionary of Economics, v. 3, pp. 428–32.
• M. E. Munroe (1953). Introduction to Measure and Integration. Addison Wesley.
• K. P. S. Bhaskara Rao and M. Bhaskara Rao (1983). Theory of Charges: A Study of Finitely Additive Measures. London: Academic Press. pp. x + 315. ISBN 0-12-095780-9.
• Shilov, G. E., and Gurevich, B. L. (1978). Integral, Measure, and Derivative: A Unified Approach. Richard A. Silverman, trans. Dover Publications. ISBN 0-486-63519-8. Emphasizes the Daniell integral.
• Jech, Thomas (2003). Set Theory: The Third Millennium Edition, Revised and Expanded. Springer Verlag. ISBN 3-540-44085-2.

External links

• Tutorial: Measure Theory for Dummies [2]
• Yet another, yet very reader-friendly, introduction to the measure theory (with application to quantitative finance in mind) [3]

References

[1] http://www.essex.ac.uk/maths/people/fremlin/mt.htm

[2] http://www.ee.washington.edu/techsite/papers/documents/UWEETR-2006-0008.pdf

[3] http://www.yetanotherquant.de



Measurable function

In mathematics, particularly in measure theory, measurable functions are structure-preserving functions between measurable spaces; as such, they form a natural context for the theory of integration. Specifically, a function between measurable spaces is said to be measurable if the preimage of each measurable set is measurable, analogous to the situation of continuous functions between topological spaces.

This definition can be deceptively simple, however, as special care must be taken regarding the σ-algebras involved. In particular, when a function f : R → R is said to be Lebesgue measurable, what is actually meant is that f : (R, L) → (R, B) is a measurable function; that is, the domain and range represent different σ-algebras on the same underlying set (here L is the σ-algebra of Lebesgue measurable sets, and B is the Borel algebra on R). As a result, the composition of Lebesgue-measurable functions need not be Lebesgue-measurable.

By convention a topological space is assumed to be equipped with the Borel algebra generated by its open subsets unless otherwise specified. Most commonly this space will be the real or complex numbers. For instance, a real-valued measurable function is a function for which the preimage of each Borel set is measurable. A complex-valued measurable function is defined analogously. In practice, some authors use measurable functions to refer only to real-valued measurable functions with respect to the Borel algebra.[1] If the values of the function lie in an infinite-dimensional vector space instead of R or C, usually other definitions of measurability are used, such as weak measurability and Bochner measurability.

In probability theory, the sigma algebra often represents the set of available information, and a function (in this context a random variable) is measurable if and only if it represents an outcome that is knowable based on the available information. In contrast, functions that are not Lebesgue measurable are generally considered pathological, at least in the field of analysis.

Formal definition

Let (X, Σ) and (Y, T) be measurable spaces, meaning that X and Y are sets equipped with respective sigma algebras Σ and T. A function

f : X → Y

is said to be measurable if f⁻¹(E) ∈ Σ for every E ∈ T. The notion of measurability depends on the sigma algebras Σ and T. To emphasize this dependency, if f is a measurable function, we will write

f : (X, Σ) → (Y, T).
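On finite spaces the formal definition can be checked directly (an illustrative sketch, not from the article): a function is measurable exactly when the preimage of every set in the target σ-algebra lies in the source σ-algebra. The helper names `preimage` and `is_measurable` are invented for this example.

```python
# Measurability on finite spaces: check preimages of all sets in the
# target sigma-algebra against the source sigma-algebra.
def preimage(f, X, E):
    return frozenset(x for x in X if f(x) in E)

def is_measurable(f, X, sigma_X, sigma_Y):
    return all(preimage(f, X, E) in sigma_X for E in sigma_Y)

X = {1, 2, 3, 4}
sigma_X = {frozenset(), frozenset({1, 2}), frozenset({3, 4}), frozenset(X)}
sigma_Y = {frozenset(), frozenset({"a"}), frozenset({"b"}),
           frozenset({"a", "b"})}

f = lambda x: "a" if x <= 2 else "b"   # constant on the atoms {1,2}, {3,4}
g = lambda x: "a" if x % 2 else "b"    # splits the atoms
assert is_measurable(f, X, sigma_X, sigma_Y)
assert not is_measurable(g, X, sigma_X, sigma_Y)
```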

Special measurable functions

• If (X, Σ) and (Y, T) are Borel spaces, a measurable function f : (X, Σ) → (Y, T) is also called a Borel function. Continuous functions are Borel functions but not all Borel functions are continuous. However, a measurable function is nearly a continuous function; see Luzin's theorem. If a Borel function happens to be a section of some map, it is called a Borel section.
• A Lebesgue measurable function is a measurable function f : (R, L) → (C, B_C), where L is the sigma algebra of Lebesgue measurable sets and B_C is the Borel algebra on the complex numbers C. Lebesgue measurable functions are of interest in mathematical analysis because they can be integrated.
• Random variables are by definition measurable functions defined on sample spaces.

Properties of measurable functions
• The sum and product of two complex-valued measurable functions are measurable.[2] So is the quotient, so long as there is no division by zero.[1]
• The composition of measurable functions is measurable; i.e., if f : (X, Σ₁) → (Y, Σ₂) and g : (Y, Σ₂) → (Z, Σ₃) are measurable functions, then so is g ∘ f : (X, Σ₁) → (Z, Σ₃).[1] But see the caveat regarding Lebesgue-measurable functions in the introduction.
• The (pointwise) supremum, infimum, limit superior, and limit inferior of a sequence (viz., countably many) of real-valued measurable functions are all measurable as well.[1][3]
• The pointwise limit of a sequence of measurable functions is measurable; note that the corresponding statement for continuous functions requires stronger conditions than pointwise convergence, such as uniform convergence. (This is correct when the codomain of the elements of the sequence is a metric space. It is false in general; see pages 125 and 126 of.[4])

Non-measurable functions Real-valued functions encountered in applications tend to be measurable; however, it is not difficult to find non-measurable functions.
• So long as there are non-measurable sets in a measure space, there are non-measurable functions from that space. If (X, Σ) is some measurable space and A ⊂ X is a non-measurable set, i.e. if A ∉ Σ, then the indicator function 1_A : (X, Σ) → R is non-measurable (where R is equipped with the Borel algebra as usual), since the preimage of the measurable set {1} is the non-measurable set A. Here 1_A is given by

  1_A(x) = 1 if x ∈ A, and 0 otherwise.

• Any non-constant function can be made non-measurable by equipping the domain and range with appropriate σ-algebras. If f : X → R is an arbitrary non-constant, real-valued function, then f is non-measurable if X is equipped with the indiscrete algebra Σ = {∅, X}, since the preimage of any point in the range is some proper, nonempty subset of X, and therefore does not lie in Σ.

Notes
[1] Strichartz, Robert (2000). The Way of Analysis. Jones and Bartlett. ISBN 0-7637-1497-6.
[2] Folland, Gerald B. (1999). Real Analysis: Modern Techniques and their Applications. Wiley. ISBN 0471317160.
[3] Royden, H. L. (1988). Real Analysis. Prentice Hall. ISBN 0-02-404151-3.
[4] Dudley, R. M. (2002). Real Analysis and Probability (2 ed.). Cambridge University Press. ISBN 0521007542.

Measurable function Source: http://en.wikipedia.org/w/index.php?oldid=446563435 Contributors: AxelBoldt, Babcockd, Bdmy, CSTAR, Calle, Charles Matthews, Chungc, CàlculIntegral, Dan131m, Dingenis, Dysprosia, Fibonacci, Fropuff, Gala.martin, GeeJo, GiM, Giftlite, Ht686rg90, Imran, Jahredtobin, Jka02, Linas, Loewepeter, Mandarax, Maneesh, MarSch, Miguel, Mike Segal, Msh210, OdedSchramm, Oleg Alexandrov, PV=nRT, Parodi, Pascal.Tesson, Paul August, Richard L. Peterson, Rinconsoleao, Ron asquith, Rs2, Salgueiro, Salix alba, Silverfish, Sławomir Biały, Takwan, Tosha, Vivacissamamente, Walterfm, Weialawaga, Zundark, Zvika, 37 anonymous edits License

Creative Commons Attribution-Share Alike 3.0 Unported //creativecommons.org/licenses/by-sa/3.0/

Operator (mathematics)

In basic mathematics, an operator is a symbol or function representing a mathematical operation. In terms of vector spaces, an operator is a mapping from one vector space or module to another. Operators are of critical importance to both linear algebra and functional analysis, and they find application in many other fields of pure and applied mathematics. For example, in classical mechanics, the derivative is used ubiquitously, and in quantum mechanics, observables are represented by linear operators. Important properties that various operators may exhibit include linearity, continuity, and boundedness.

Definitions Let U, V be two vector spaces. Any mapping from U to V is called an operator. Let V be a vector space over the field K. We can define the structure of a vector space on the set of all operators from U to V:

  (A + B)x = Ax + Bx,  (αA)x = α(Ax)

for all A, B : U → V, for all x in U and for all α in K. Additionally, operators from any vector space to itself form a unital associative algebra under composition:

  (AB)x = A(Bx),

with the identity mapping (usually denoted E, I or id) being the unit.

Bounded operators and operator norm Let U and V be two vector spaces over the same ordered field (for example, R), each equipped with a norm. Then a linear operator A from U to V is called bounded if there exists C > 0 such that

  ‖Ax‖_V ≤ C ‖x‖_U

for all x in U. It is trivial to show that bounded operators form a vector space. On this vector space we can introduce a norm that is compatible with the norms of U and V:

  ‖A‖ = sup { ‖Ax‖_V : ‖x‖_U ≤ 1 }.

In the case of operators from U to itself it can be shown that ‖A ∘ B‖ ≤ ‖A‖ · ‖B‖. Any unital normed algebra with this property is called a Banach algebra. It is possible to generalize spectral theory to such algebras. C*-algebras, which are Banach algebras with some additional structure, play an important role in quantum mechanics.
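As a numerical illustration, for a matrix operator on Rⁿ with Euclidean norms the operator norm defined above is the largest singular value, and the submultiplicative property can be checked directly. A NumPy sketch (the helper name `op_norm` is ours):

```python
import numpy as np

# For a matrix operator on R^n with Euclidean norms, the operator norm
# sup { |Ax| : |x| <= 1 } equals the largest singular value of A.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

def op_norm(M):
    return np.linalg.norm(M, 2)   # spectral norm = largest singular value

# Compare with a brute-force estimate of the supremum over the unit sphere.
xs = rng.standard_normal((5000, 3))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)
estimate = max(np.linalg.norm(A @ x) for x in xs)

assert estimate <= op_norm(A) + 1e-9             # sup dominates every sample
# The submultiplicative property ||AB|| <= ||A|| ||B||:
assert op_norm(A @ B) <= op_norm(A) * op_norm(B) + 1e-12
```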

Special cases

Functionals A functional is an operator that maps a vector space to its underlying field. Important applications of functionals are the theories of generalized functions and calculus of variations. Both are of great importance to theoretical physics.

Linear operators The most common kind of operator encountered are linear operators. Let U and V be vector spaces over a field K. An operator A : U → V is called linear if

  A(αx + βy) = αAx + βAy

for all x, y in U and for all α, β in K. The importance of linear operators is partially because they are morphisms between vector spaces. In the finite-dimensional case linear operators can be represented by matrices in the following way. Let K be a field, and U and V be finite-dimensional vector spaces over K. Select a basis u_1, …, u_n in U and v_1, …, v_m in V. Let x = x^i u_i be an arbitrary vector in U (assuming Einstein convention), and let A : U → V be a linear operator. Then

  Ax = x^i A(u_i) = x^i (Au_i)^j v_j.

Then a_i^j = (Au_i)^j is the matrix of the operator A in the fixed bases. The matrix a does not depend on the choice of x, and Ax = y iff a_i^j x^i = y^j. Thus in fixed bases n-by-m matrices are in bijective correspondence to linear operators from U to V. The important concepts directly related to operators between finite-dimensional vector spaces are the ones of rank, determinant, inverse operator, and eigenspace. Linear operators also play a great role in the infinite-dimensional case. The concepts of rank and determinant cannot be extended to the infinite-dimensional case, and although there are infinite matrices, they cease to be a useful tool. This is why very different techniques are employed when studying linear operators (and operators in general) in the infinite-dimensional case. They form the field of functional analysis (named such because various classes of functions form interesting examples of infinite-dimensional vector spaces). Bounded linear operators on a Banach space form a Banach algebra with respect to the standard operator norm. The theory of Banach algebras develops a very general concept of spectrum that elegantly generalizes the theory of eigenspaces.
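The matrix construction above can be sketched in NumPy: the j-th column of the matrix is the image of the j-th basis vector, here using the standard bases of R³ and R² (the helper `matrix_of` is an illustrative name):

```python
import numpy as np

# Columns of the matrix are the images of the basis vectors, expressed in
# the basis of the target space (here: standard bases of R^3 and R^2).
def matrix_of(op, n):
    """m-by-n matrix of a linear operator R^n -> R^m in standard bases."""
    return np.column_stack([op(e) for e in np.eye(n)])

op = lambda x: np.array([x[0] + 2 * x[1], 3 * x[2]])   # linear map R^3 -> R^2
M = matrix_of(op, 3)

x = np.array([1.0, -1.0, 2.0])
assert M.shape == (2, 3)
assert np.allclose(M @ x, op(x))   # matrix action agrees with the operator
```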

Examples

Geometry In geometry (particularly differential geometry), additional structures on vector spaces are sometimes studied. Operators that map such vector spaces to themselves bijectively are very useful in these studies; they naturally form groups under composition. For example, bijective operators preserving the structure of a linear space are precisely the invertible linear operators. They form the general linear group; notice that they do not form a vector space, e.g. both id and −id are invertible, but 0 is not. Operators preserving the Euclidean metric on such a space form the orthogonal group, and operators that also preserve the orientation of vector tuples form the special orthogonal group, or the group of rotations.

Probability theory Operators are also involved in probability theory, such as expectation, variance, covariance, factorials, etc.

Calculus From the point of view of functional analysis, calculus is the study of two linear operators: the differential operator d/dx, and the indefinite integral operator f ↦ ∫₀ˣ f(t) dt.
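On the finite-dimensional subspace of polynomials of degree < n, the differential operator is literally a matrix acting on coefficient vectors; a minimal sketch (coefficient ordering from the constant term upward is our convention):

```python
import numpy as np

# On polynomials of degree < n (coefficients listed from the constant term
# up), differentiation is the matrix D with D[k-1, k] = k, since
# d/dx (c_k x^k) = k c_k x^{k-1}.
n = 5
D = np.zeros((n, n))
for k in range(1, n):
    D[k - 1, k] = k

p = np.array([2.0, 0.0, 1.0, 0.0, 0.0])                # p(x) = 2 + x^2
assert np.allclose(D @ p, [0.0, 2.0, 0.0, 0.0, 0.0])   # p'(x) = 2x

# Linearity D(ap + bq) = a Dp + b Dq is automatic for a matrix:
q = np.array([0.0, 3.0, 0.0, 1.0, 0.0])
assert np.allclose(D @ (2 * p + 3 * q), 2 * (D @ p) + 3 * (D @ q))
```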

Fourier series and Fourier transform The Fourier transform is used in many areas, not only in mathematics, but in physics and in signal processing, to name a few. It is another integral operator; it is useful mainly because it converts a function on one (temporal) domain to a function on another (frequency) domain, in a way effectively invertible. Nothing significant is lost, because there is an inverse transform operator. In the simple case of periodic functions, this result is based on the theorem that any continuous periodic function can be represented as the sum of a series of sine waves and cosine waves:

  f(t) = a₀/2 + Σ_{n=1}^∞ (a_n cos nt + b_n sin nt).

The coefficients (a₀, a₁, b₁, a₂, b₂, ...) are in fact an element of an infinite-dimensional vector space ℓ², and thus the Fourier series is a linear operator. When dealing with a general function R → C, the transform takes on an integral form (in one common normalization):

  (𝓕f)(ω) = ∫_{−∞}^{∞} f(t) e^{−iωt} dt.
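The discrete analogue of this invertibility is easy to check with NumPy's FFT: transforming and inverse-transforming recovers the sampled signal, and the transform is linear.

```python
import numpy as np

# Discrete analogue of the transform's invertibility: ifft(fft(f))
# recovers the sampled signal, so no information is lost in passing to
# the frequency domain.
t = np.linspace(0, 2 * np.pi, 256, endpoint=False)
f = np.sin(3 * t) + 0.5 * np.cos(7 * t)

F = np.fft.fft(f)
assert np.allclose(np.fft.ifft(F).real, f)

# The transform is a linear operator:
g = np.cos(2 * t)
assert np.allclose(np.fft.fft(2 * f + 3 * g),
                   2 * np.fft.fft(f) + 3 * np.fft.fft(g))
```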

Laplace transform The Laplace transform is another integral operator and is involved in simplifying the process of solving differential equations. Given f = f(t), it is defined by:

  F(s) = (𝓛f)(s) = ∫₀^∞ e^{−st} f(t) dt.

Fundamental operators on scalar and vector fields Three operators are key to vector calculus:
• Grad (gradient), with operator symbol ∇, assigns a vector at every point in a scalar field that points in the direction of the greatest rate of change of that field and whose norm measures the absolute value of that greatest rate of change.
• Div (divergence), with operator symbol ∇·, is a vector operator that measures a vector field's divergence from or convergence towards a given point.
• Curl, with operator symbol ∇×, is a vector operator that measures a vector field's curling (winding around, rotating around) trend about a given point.
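For instance, the divergence can be checked by finite differences; for the radial field F(x, y) = (x, y) it equals 2 everywhere. A NumPy sketch:

```python
import numpy as np

# Finite-difference check of the divergence: for the radial field
# F(x, y) = (x, y) we have div F = dFx/dx + dFy/dy = 2 everywhere.
xs = np.linspace(-1.0, 1.0, 201)
X, Y = np.meshgrid(xs, xs, indexing="ij")
Fx, Fy = X, Y

div = np.gradient(Fx, xs, axis=0) + np.gradient(Fy, xs, axis=1)
assert np.allclose(div, 2.0)   # exact here, since the field is linear
```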



Linear functional

This article deals with linear maps from a vector space to its field of scalars. These maps may be functionals in the traditional sense of functions of functions, but this is not necessarily the case. In linear algebra, a linear functional or linear form (also called a one-form or covector) is a linear map from a vector space to its field of scalars. In Rⁿ, if vectors are represented as column vectors, then linear functionals are represented as row vectors, and their action on vectors is given by the dot product, or the matrix product with the row vector on the left and the column vector on the right. In general, if V is a vector space over a field k, then a linear functional ƒ is a function from V to k which is linear:

  ƒ(v + w) = ƒ(v) + ƒ(w) for all v, w ∈ V,
  ƒ(av) = aƒ(v) for all v ∈ V, a ∈ k.

The set of all linear functionals from V to k, Hom_k(V, k), is itself a vector space over k. This space is called the dual space of V, or sometimes the algebraic dual space, to distinguish it from the continuous dual space. It is often written V* or V′ when the field k is understood.

Continuous linear functionals If V is a topological vector space, the space of continuous linear functionals — the continuous dual — is often simply called the dual space. If V is a Banach space, then so is its (continuous) dual. To distinguish the ordinary dual space from the continuous dual space, the former is sometimes called the algebraic dual. In finite dimensions, every linear functional is continuous, so the continuous dual is the same as the algebraic dual, although this is not true in infinite dimensions.

Examples and applications

Linear functionals in Rn Suppose that vectors in the real coordinate space Rⁿ are represented as column vectors

  x = (x₁, …, xₙ)ᵀ.

Then any linear functional ƒ can be written in these coordinates as a sum of the form:

  ƒ(x) = a₁x₁ + ⋯ + aₙxₙ.

This is just the matrix product of the row vector [a₁ … aₙ] and the column vector x.
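In NumPy this row-on-the-left, column-on-the-right product is a one-liner (the particular values here are arbitrary):

```python
import numpy as np

# The functional f(x) = 2*x1 - x2 + 3*x3 as a row vector acting on a
# column vector by matrix product.
a = np.array([[2.0, -1.0, 3.0]])         # row vector representing f
x = np.array([[1.0], [4.0], [0.0]])      # column vector

fx = (a @ x).item()
assert fx == 2.0 * 1 - 1.0 * 4 + 3.0 * 0   # the sum a_1 x_1 + ... + a_n x_n
```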

Integration Linear functionals first appeared in functional analysis, the study of vector spaces of functions. A typical example of a linear functional is integration: the linear transformation defined by the Riemann integral

  I(ƒ) = ∫ₐᵇ ƒ(x) dx

is a linear functional from the vector space C[a,b] of continuous functions on the interval [a, b] to the real numbers. The linearity of I follows from the standard facts about the integral:

  I(ƒ + g) = ∫ₐᵇ (ƒ(x) + g(x)) dx = I(ƒ) + I(g),
  I(αƒ) = ∫ₐᵇ αƒ(x) dx = αI(ƒ).
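The same linearity holds for a discrete approximation of the integral, so it is easy to verify numerically; a sketch using the trapezoidal rule on [0, 1]:

```python
import numpy as np

# The trapezoidal rule (sum of averaged adjacent samples times step widths)
# is itself a linear functional on sampled function values.
x = np.linspace(0.0, 1.0, 1001)

def I(f):
    y = f(x)
    return np.sum((y[:-1] + y[1:]) / 2 * np.diff(x))

f, g = np.sin, np.exp
lhs = I(lambda t: 2 * f(t) + 3 * g(t))
rhs = 2 * I(f) + 3 * I(g)
assert np.isclose(lhs, rhs)          # I(2f + 3g) = 2 I(f) + 3 I(g)
```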

Evaluation Let Pₙ denote the vector space of real-valued polynomial functions of degree ≤ n defined on an interval [a,b]. If c ∈ [a, b], then let ev_c : Pₙ → R be the evaluation functional:

  ev_c(ƒ) = ƒ(c).

The mapping ƒ → ƒ(c) is linear since

  (ƒ + g)(c) = ƒ(c) + g(c) and (αƒ)(c) = αƒ(c).

If x₀, ..., xₙ are n+1 distinct points in [a,b], then the evaluation functionals ev_{x_i}, i = 0, 1, ..., n form a basis of the dual space of Pₙ. (Lax (1996) proves this last fact using Lagrange interpolation.)

Application to quadrature The integration functional I defined above defines a linear functional on the subspace Pₙ of polynomials of degree ≤ n. If x₀, …, xₙ are n+1 distinct points in [a, b], then there are coefficients a₀, …, aₙ for which

  I(ƒ) = a₀ƒ(x₀) + a₁ƒ(x₁) + ⋯ + aₙƒ(xₙ)

for all ƒ ∈ Pₙ. This forms the foundation of the theory of numerical quadrature. It follows from the fact that the linear functionals ev_{x_i} : ƒ → ƒ(x_i) defined above form a basis of the dual space of Pₙ (Lax 1996).
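Concretely, the weights can be found by requiring exactness on the monomials 1, x, …, xⁿ, which is a small Vandermonde system; with nodes 0, 1/2, 1 on [0, 1] this recovers Simpson's rule. A NumPy sketch:

```python
import numpy as np

# Weights a_i are fixed by requiring sum a_i f(x_i) = I(f) for the
# monomials 1, x, ..., x^n, i.e. by solving a Vandermonde system.
nodes = np.array([0.0, 0.5, 1.0])
V = np.vander(nodes, increasing=True).T                 # V[k, i] = x_i**k
moments = np.array([1.0 / (k + 1) for k in range(3)])   # integral of x^k on [0,1]
weights = np.linalg.solve(V, moments)

assert np.allclose(weights, [1 / 6, 4 / 6, 1 / 6])      # Simpson's rule weights

# Exact on every polynomial of degree <= 2:
p = lambda t: 3 * t**2 - t + 1                          # integral over [0,1] is 1.5
assert np.isclose(weights @ p(nodes), 1.5)
```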

Linear functionals in quantum mechanics Linear functionals are particularly important in quantum mechanics. Quantum mechanical systems are represented by Hilbert spaces, which are anti-isomorphic to their own dual spaces. A state of a quantum mechanical system can be identified with a linear functional. For more information see bra-ket notation.

Distributions In the theory of generalized functions, certain kinds of generalized functions called distributions can be realized as linear functionals on spaces of test functions.

Properties
• Any linear functional L is either trivial (equal to 0 everywhere) or surjective onto the scalar field. Indeed, this follows since the image of a vector subspace under a linear transformation is a subspace; in particular, the image of V under L is a subspace of k. But the only subspaces (i.e., k-subspaces) of k are {0} and k itself.
• A linear functional is continuous if and only if its kernel is closed (Rudin 1991, Theorem 1.18).
• Linear functionals with the same kernel are proportional.
• The absolute value of any linear functional is a seminorm on its vector space.

Dual vectors and bilinear forms Every non-degenerate bilinear form on a finite-dimensional vector space V gives rise to an isomorphism from V to V*. Specifically, denoting the bilinear form on V by ( , ) (for instance in Euclidean space (v,w) = v•w is the dot product of v and w), there is a natural isomorphism V → V* : v ↦ v* given by

  v*(w) = (v, w).

The inverse isomorphism is given by V* → V : ƒ ↦ ƒ*, where ƒ* is the unique element of V for which

  (ƒ*, w) = ƒ(w) for all w ∈ V.

The above defined vector v* ∈ V* is said to be the dual vector of v ∈ V. In an infinite dimensional Hilbert space, analogous results hold by the Riesz representation theorem. There is a mapping V → V* into the continuous dual space V*. However, this mapping is antilinear rather than linear.

Visualizing linear functionals In finite dimensions, a linear function can be visualized in terms of its level sets. In three dimensions, the level sets of a linear functional are a family of mutually parallel planes; in higher dimensions, they are parallel hyperplanes. This method of visualizing linear functionals is sometimes introduced in general relativity texts, such as Misner, Thorne & Wheeler (1973).

Bases in finite dimensions

Basis of the dual space in finite dimensions Let the vector space V have a basis e₁, e₂, …, eₙ, not necessarily orthogonal. Then the dual space V* has a basis ω¹, ω², …, ωⁿ called the dual basis defined by the special property that

  ωⁱ(e_j) = 1 if i = j, and 0 otherwise.

Or, more succinctly,

  ωⁱ(e_j) = δⁱ_j,

where δ is the Kronecker delta. Here the superscripts of the basis functionals are not exponents but are instead contravariant indices. A linear functional ũ belonging to the dual space can be expressed as a linear combination of basis functionals, with coefficients ("components") u_i,

  ũ = Σᵢ uᵢ ωⁱ.

Then, applying the functional ũ to a basis vector e_j yields

  ũ(e_j) = Σᵢ uᵢ ωⁱ(e_j) = Σᵢ uᵢ δⁱ_j

due to linearity of scalar multiples of functionals and pointwise linearity of sums of functionals. Then

that is

  ũ(e_j) = u_j.

This last equation shows that an individual component of a linear functional can be extracted by applying the functional to a corresponding basis vector.

The dual basis and inner product When the space V carries an inner product, then it is possible to write explicitly a formula for the dual basis of a given basis. Let V have (not necessarily orthogonal) basis e₁, e₂, e₃. In three dimensions (n = 3), the dual basis can be written explicitly

  eⁱ = ½ ⟨ εⁱʲᵏ (e_j × e_k) / (e₁ · e₂ × e₃), · ⟩

for i = 1, 2, 3, where ε is the Levi-Civita symbol and ⟨ , ⟩ denotes the inner product (or dot product) on V. In higher dimensions, this generalizes as follows

  eⁱ = ⟨ (−1)^{i−1} ⋆(e₁ ∧ ⋯ ∧ ê_i ∧ ⋯ ∧ eₙ) / ⋆(e₁ ∧ ⋯ ∧ eₙ), · ⟩,

where ⋆ is the Hodge star operator and ê_i indicates that e_i is omitted from the wedge product.
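The three-dimensional formula is easy to verify numerically: each dual vector is a scaled cross product of the other two basis vectors. A NumPy sketch with an arbitrary non-orthogonal basis:

```python
import numpy as np

# In R^3 with the dot product, e^1 = (e2 x e3) / (e1 . (e2 x e3)) and
# cyclically, which gives <e^i, e_j> = delta^i_j.
e = [np.array([1.0, 1.0, 0.0]),
     np.array([0.0, 1.0, 1.0]),
     np.array([1.0, 0.0, 1.0])]        # an arbitrary non-orthogonal basis

vol = e[0] @ np.cross(e[1], e[2])      # triple product, nonzero for a basis
dual = [np.cross(e[(i + 1) % 3], e[(i + 2) % 3]) / vol for i in range(3)]

for i in range(3):
    for j in range(3):
        assert np.isclose(dual[i] @ e[j], 1.0 if i == j else 0.0)
```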

References
• Bishop, Richard; Goldberg, Samuel (1980), "Chapter 4", Tensor Analysis on Manifolds, Dover Publications, ISBN 0-486-64039-6.
• Halmos, Paul (1974), Finite dimensional vector spaces, Springer, ISBN 0387900934.
• Lax, Peter (1996), Linear algebra, Wiley-Interscience, ISBN 978-0471111115.
• Misner, Charles W.; Thorne, Kip. S.; Wheeler, John A. (1973), Gravitation, W. H. Freeman, ISBN 0-7167-0344-0.
• Rudin, Walter (1991), Functional Analysis, McGraw-Hill Science/Engineering/Math, ISBN 978-0-07-054236-5.
• Schutz, Bernard (1985), "Chapter 3", A first course in general relativity, Cambridge University Press, ISBN 0-521-27703-5.


Weak convergence (Hilbert space)

In mathematics, weak convergence in a Hilbert space is convergence of a sequence of points in the weak topology.

Definition A sequence of points (xₙ) in a Hilbert space H is said to converge weakly to a point x in H if

  ⟨xₙ, y⟩ → ⟨x, y⟩

for all y in H. Here, ⟨·, ·⟩ is understood to be the inner product on the Hilbert space. The notation

  xₙ ⇀ x

is sometimes used to denote this kind of convergence.

Weak topology Weak convergence is in contrast to strong convergence or convergence in the norm, which is defined by

  ‖xₙ − x‖ → 0,

where ‖x‖ = √⟨x, x⟩ is the norm of x. The notion of weak convergence defines a topology on H and this is called the weak topology on H. In other words, the weak topology is the topology generated by the bounded functionals on H. It follows from the Schwarz inequality that the weak topology is weaker than the norm topology. Therefore convergence in norm implies weak convergence while the converse is not true in general. However, if ⟨xₙ, y⟩ → ⟨x, y⟩ for each y and ‖xₙ‖ → ‖x‖, then we have xₙ → x in norm as n → ∞. On the level of operators, a bounded operator T is also continuous in the weak topology: if xₙ → x weakly, then for all y

  ⟨Txₙ, y⟩ → ⟨Tx, y⟩.

Properties
• If a sequence converges strongly, then it converges weakly as well.
• Since every closed and bounded set is weakly relatively compact (its closure in the weak topology is compact), every bounded sequence in a Hilbert space H contains a weakly convergent subsequence. Note that closed and bounded sets are not in general weakly compact in Hilbert spaces (consider the set consisting of an orthonormal basis in an infinite-dimensional Hilbert space, which is closed and bounded but not weakly compact since it doesn't contain 0). However, bounded and weakly closed sets are weakly compact, so as a consequence every convex bounded closed set is weakly compact.
• As a consequence of the principle of uniform boundedness, every weakly convergent sequence is bounded.
• If xₙ converges weakly to x, then

  ‖x‖ ≤ lim inf ‖xₙ‖,

and this inequality is strict whenever the convergence is not strong. For example, infinite orthonormal sequences converge weakly to zero, as demonstrated below.
• If xₙ converges weakly to x and we have the additional assumption that lim ‖xₙ‖ = ‖x‖, then xₙ converges to x strongly:

  ‖x − xₙ‖² = ⟨x − xₙ, x − xₙ⟩ = ‖x‖² + ‖xₙ‖² − ⟨xₙ, x⟩ − ⟨x, xₙ⟩ → 0.

• If the Hilbert space is finite-dimensional, i.e. a Euclidean space, then the concepts of weak convergence and strong convergence are the same.

Example

[Figure: the first three functions in the sequence fₙ(x) = sin(nx) on [0, 2π]; fₙ converges weakly to f = 0.]

If fₙ(x) = sin(nx) on [0, 2π], then fₙ converges weakly to 0 in L²[0, 2π] as n → ∞. (In the illustration the n are simply integers.) It is clear that, while fₙ equals zero more and more frequently as n goes to infinity, it is not very similar to the zero function at all. This dissimilarity is one of the reasons why this type of convergence is considered to be "weak."
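This can be illustrated numerically: the L² norm of sin(nx) stays fixed near √π, while its inner product with any fixed test function decays. A sketch (the test function e^{−x} is an arbitrary choice, and the inner product is approximated by the trapezoidal rule):

```python
import numpy as np

# f_n(x) = sin(nx) on [0, 2*pi]: the L^2 norm stays ~ sqrt(pi) for every n,
# yet the pairing with a fixed test function tends to 0 (weak convergence).
x = np.linspace(0.0, 2 * np.pi, 20001)
g = np.exp(-x)                          # an arbitrary fixed test function

def inner(u, v):
    w = u * v
    return np.sum((w[:-1] + w[1:]) / 2 * np.diff(x))   # trapezoidal <u, v>

ns = (1, 10, 100)
norms = [np.sqrt(inner(np.sin(n * x), np.sin(n * x))) for n in ns]
pairings = [abs(inner(np.sin(n * x), g)) for n in ns]

assert max(norms) - min(norms) < 1e-2   # no strong convergence to 0
assert pairings[2] < pairings[0] / 10   # <f_n, g> decays toward 0
```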

Weak convergence of orthonormal sequences Consider a sequence (eₙ) which was constructed to be orthonormal, that is,

  ⟨eₙ, eₘ⟩ = δₙₘ,

where δₙₘ equals one if m = n and zero otherwise. We claim that if the sequence is infinite, then it converges weakly to zero. A simple proof is as follows. For x ∈ H, we have

  Σₙ |⟨eₙ, x⟩|² ≤ ‖x‖²   (Bessel's inequality),

where equality holds when {eₙ} is a Hilbert space basis. Therefore

  ⟨eₙ, x⟩ → 0 as n → ∞,

i.e.

  eₙ ⇀ 0.

Banach–Saks theorem The Banach–Saks theorem states that every bounded sequence (xₙ) contains a subsequence (x_{n_k}) and a point x such that

  (1/N) Σ_{k=1}^N x_{n_k}

converges strongly to x as N goes to infinity.
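For an orthonormal sequence the averages of the whole sequence already converge strongly to the weak limit 0, since ‖(e₁ + ⋯ + eₙ)/n‖ = 1/√n. A quick check, modeling the orthonormal sequence by the standard basis rows of Rᴺ:

```python
import numpy as np

# Model an orthonormal sequence by the standard basis rows of R^N. The
# sequence converges weakly to 0, and the Cesaro means converge strongly
# to that weak limit: ||(e_1 + ... + e_n)/n|| = 1/sqrt(n) -> 0.
N = 400
E = np.eye(N)                  # rows e_1, ..., e_N are orthonormal

for n in (4, 100, 400):
    mean = E[:n].sum(axis=0) / n
    assert np.isclose(np.linalg.norm(mean), 1 / np.sqrt(n))
```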

Generalizations The definition of weak convergence can be extended to Banach spaces. A sequence of points (xₙ) in a Banach space B is said to converge weakly to a point x in B if

  φ(xₙ) → φ(x)

for any bounded linear functional φ defined on B, that is, for any φ in the dual space B*. If B is a Hilbert space, then, by the Riesz representation theorem, any such φ has the form

  φ(x) = ⟨x, y⟩

for some y in B, so one obtains the Hilbert space definition of weak convergence.



Weak topology

In mathematics, weak topology is an alternative term for initial topology. The term is most commonly used for the initial topology of a topological vector space (such as a normed vector space) with respect to its continuous dual. The remainder of this article will deal with this case, which is one of the basic concepts of functional analysis. One may call subsets of a topological vector space weakly closed (respectively, compact, etc.) if they are closed (respectively, compact, etc.) with respect to the weak topology. Likewise, functions are sometimes called weakly continuous (respectively, differentiable, analytic, etc.) if they are continuous (respectively, differentiable, analytic, etc.) with respect to the weak topology.

The weak and strong topologies Let K be a topological field, namely a field with a topology such that addition, multiplication, and division are continuous. In most applications K will be either the field of complex numbers or the field of real numbers with the familiar topologies. Let X be a topological vector space over K. Namely, X is a K vector space equipped with a topology so that vector addition and scalar multiplication are continuous. We may define a possibly different topology on X using the continuous (or topological) dual space X*. The dual space consists of all linear functions from X into the field K which are continuous with respect to the given topology. The weak topology on X is the initial topology with respect to X*. In other words, it is the coarsest topology (the topology with the fewest open sets) such that each element of X* is a continuous function. In order to distinguish the weak topology from the original topology on X, the original topology is often called the strong topology. A subbase for the weak topology is the collection of sets of the form φ⁻¹(U) where φ ∈ X* and U is an open subset of the base field K. In other words, a subset of X is open in the weak topology if and only if it can be written as a union of (possibly infinitely many) sets, each of which is an intersection of finitely many sets of the form φ⁻¹(U). More generally, if F is a subset of the algebraic dual space, then the initial topology of X with respect to F, denoted by σ(X,F), is the weak topology with respect to F. If one takes F to be the whole continuous dual space of X, then the weak topology with respect to F coincides with the weak topology defined above. If the field K has an absolute value |⋅|, then the weak topology σ(X,F) is induced by the family of seminorms,

  p_f(x) = |f(x)|,

for all f ∈ F and x ∈ X. In particular, weak topologies are locally convex. From this point of view, the weak topology is the coarsest polar topology; see weak topology (polar topology) for details. Specifically, if F is a vector space of linear functionals on X which separates points of X, then the continuous dual of X with respect to the topology σ(X,F) is precisely equal to F (Rudin 1991, Theorem 3.10).

Weak convergence The weak topology is characterized by the following condition: a net (x_λ) in X converges in the weak topology to the element x of X if and only if φ(x_λ) converges to φ(x) in R or C for all φ in X*. In particular, if xₙ is a sequence in X, then xₙ converges weakly to x if

  φ(xₙ) → φ(x)

as n → ∞ for all φ ∈ X*. In this case, it is customary to write

  xₙ ⇀ x

or, sometimes,

  w-lim xₙ = x.

Other properties If X is equipped with the weak topology, then addition and scalar multiplication remain continuous operations, and X is a locally convex topological vector space. If X is a normed space with norm ‖ ⋅ ‖, then the dual space X* is itself a normed vector space by using the norm

  ‖φ‖ = sup_{‖x‖≤1} |φ(x)|.

This norm gives rise to a topology, called the strong topology, on X*. This is the topology of uniform convergence. The uniform and strong topologies are generally different for other spaces of linear maps; see below.

The weak-* topology A space X can be embedded into the double dual X** by

  x ↦ T_x,

where

  T_x(φ) = φ(x).

Thus T : X → X** is an injective linear mapping, though it is not surjective unless X is reflexive. The weak-* topology on X* is the weak topology induced by the image of T: T(X) ⊂ X**. In other words, it is the coarsest topology such that the maps T_x from X* to the base field R or C remain continuous.

Weak-* convergence A net φ_λ in X* is convergent to φ in the weak-* topology if it converges pointwise:

  φ_λ(x) → φ(x)

for all x in X. In particular, a sequence of φₙ ∈ X* converges to φ provided that

  φₙ(x) → φ(x)

for all x in X. In this case, one writes

  φₙ ⇀* φ

as n → ∞. Weak-* convergence is sometimes called the topology of simple convergence or the topology of pointwise convergence. Indeed, it coincides with the topology of pointwise convergence of linear functionals.

Other properties By definition, the weak-* topology is weaker than the weak topology on X*. An important fact about the weak-* topology is the Banach–Alaoglu theorem: if X is normed, then the closed unit ball in X* is weak*-compact (more generally, the polar in X* of a neighborhood of 0 in X is weak*-compact). Moreover, the closed unit ball in a normed space X is compact in the weak topology if and only if X is reflexive. In more generality, let F be a locally compact valued field (e.g., the reals, the complex numbers, or any of the p-adic number systems). Let X be a normed topological vector space over F, compatible with the absolute value in F. Then in X*, the topological dual space of continuous F-valued linear functionals on X, all norm-closed balls are compact in the weak-* topology. If a normed space X is separable, then the weak-* topology is separable. It is metrizable on the norm-bounded subsets of X*, although not metrizable on all of X* unless X is finite-dimensional.

Examples

Hilbert spaces Consider, for example, the difference between strong and weak convergence of functions in the Hilbert space L²(Rⁿ). Strong convergence of a sequence ψₖ ∈ L²(Rⁿ) to an element ψ means that

  ∫ |ψₖ − ψ|² dμ → 0

as k → ∞. Here the notion of convergence corresponds to the norm on L². In contrast weak convergence only demands that

  ⟨ψₖ, f⟩ → ⟨ψ, f⟩

for all functions f ∈ L² (or, more typically, all f in a dense subset of L² such as a space of test functions). For given test functions, the relevant notion of convergence only corresponds to the topology used in C. For example, in the Hilbert space L²(0, π), the sequence of functions

  ψₖ(x) = √(2/π) sin(kx)

form an orthonormal basis. In particular, the (strong) limit of ψₖ as k → ∞ does not exist. On the other hand, by the Riemann–Lebesgue lemma, the weak limit exists and is zero.

Distributions One normally obtains spaces of distributions by forming the strong dual of a space of test functions (such as the compactly supported smooth functions on Rn). In an alternative construction of such spaces, one can take the weak dual of a space of test functions inside a Hilbert space such as L2. Thus one is led to consider the idea of a rigged Hilbert space.

Operator topologies If X and Y are topological vector spaces, the space L(X,Y) of continuous linear operators f : X → Y may carry a variety of different possible topologies. The naming of such topologies depends on the kind of topology one is using on the target space Y to define operator convergence (Yosida 1980, IV.7 Topologies of linear maps). There are, in general, a vast array of possible operator topologies on L(X,Y), whose naming is not entirely intuitive. For example, the strong operator topology on L(X,Y) is the topology of pointwise convergence. For instance, if Y is a normed space, then this topology is defined by the seminorms indexed by x ∈ X:

  f ↦ ‖f(x)‖_Y.

More generally, if a family of seminorms Q defines the topology on Y, then the seminorms p_{q,x} on L(X,Y) defining the strong topology are given by

  p_{q,x} : f ↦ q(f(x)),

indexed by q ∈ Q and x ∈ X. In particular, see the weak operator topology and weak-* operator topology.

References
• Conway, John B. (1994), A Course in Functional Analysis (2nd ed.), Springer-Verlag, ISBN 0-387-97245-5.
• Pedersen, Gert (1989), Analysis Now, Springer, ISBN 0-387-96788-5.
• Rudin, Walter (1991), Functional Analysis, McGraw-Hill Science, ISBN 0-07-054236-8.
• Willard, Stephen (February 2004), General Topology, Courier Dover Publications, ISBN 0-486-43479-6.
• Yosida, Kosaku (1980), Functional analysis (6th ed.), Springer, ISBN 978-3540586548.


Creative Commons Attribution-Share Alike 3.0 Unported //creativecommons.org/licenses/by-sa/3.0/

Convergence of random variables

In probability theory, there exist several different notions of convergence of random variables. The convergence of sequences of random variables to some limit random variable is an important concept in probability theory, and in its applications to statistics and stochastic processes. The same concepts are known in more general mathematics as stochastic convergence and they formalize the idea that a sequence of essentially random or unpredictable events can sometimes be expected to settle down into a behaviour that is essentially unchanging when items far enough into the sequence are studied. The different possible notions of convergence relate to how such a behaviour can be characterised: two readily understood behaviours are that the sequence eventually takes a constant value, and that values in the sequence continue to change but can be described by an unchanging probability distribution.

Background
"Stochastic convergence" formalizes the idea that a sequence of essentially random or unpredictable events can sometimes be expected to settle into a pattern. The pattern may for instance be
• Convergence in the classical sense to a fixed value, perhaps itself coming from a random event
• An increasing similarity of outcomes to what a purely deterministic function would produce
• An increasing preference towards a certain outcome
• An increasing "aversion" against straying far away from a certain outcome
Some less obvious, more theoretical patterns could be
• That the probability distribution describing the next outcome may grow increasingly similar to a certain distribution
• That the series formed by calculating the expected value of the outcome's distance from a particular value may converge to 0
• That the variance of the random variable describing the next event grows smaller and smaller.
These other types of patterns that may arise are reflected in the different types of stochastic convergence that have been studied. While the above discussion has related to the convergence of a single series to a limiting value, the notion of the convergence of two series towards each other is also important, but this is easily handled by studying the sequence defined as either the difference or the ratio of the two series.
For example, if the average of n uncorrelated random variables Y_i, i = 1, ..., n, all having the same finite mean and variance, is given by

    X_n = (1/n) Σ_{i=1}^n Y_i,

then as n tends to infinity, X_n converges in probability (see below) to the common mean, μ, of the random variables Y_i. This result is known as the weak law of large numbers. Other forms of convergence are important in other useful theorems, including the central limit theorem.
Throughout the following, we assume that (X_n) is a sequence of random variables, and X is a random variable, and all of them are defined on the same probability space (Ω, F, P).
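The weak law above can be checked numerically. The following sketch (not part of the original article; the helper name and the choice of Uniform(0,1) variables Y_i are illustrative) estimates the tail probability Pr(|X_n − μ| ≥ ε) by Monte Carlo and shows it shrinking with n:

```python
import random
import statistics

def empirical_tail_prob(n, eps, trials=2000, seed=0):
    """Monte-Carlo estimate of Pr(|X_n - mu| >= eps), where X_n is the mean
    of n Uniform(0, 1) draws, so mu = 0.5 (an illustrative choice of Y_i)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x_n = statistics.fmean(rng.random() for _ in range(n))
        if abs(x_n - 0.5) >= eps:
            hits += 1
    return hits / trials

# The tail probability shrinks as n grows, as the weak law predicts.
p_small_n = empirical_tail_prob(n=10, eps=0.1)
p_large_n = empirical_tail_prob(n=1000, eps=0.1)
```

With n = 1000 the sample mean is within 0.1 of μ = 0.5 in essentially every trial, while with n = 10 it misses noticeably often.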

Convergence in distribution

Examples of convergence in distribution

Dice factory

Suppose a new dice factory has just been built. The first few dice come out quite biased, due to imperfections in the production process. The outcome from tossing any of them will follow a distribution markedly different from the desired uniform distribution. As the factory is improved, the dice become less and less loaded, and the outcomes from tossing a newly produced die will follow the uniform distribution more and more closely.

Tossing coins

Let X_n be the fraction of heads after tossing up an unbiased coin n times. Then X_1 has the Bernoulli distribution with expected value μ = 0.5 and variance σ² = 0.25. The subsequent random variables X_2, X_3, … will all be distributed binomially.
As n grows larger, this distribution will gradually start to take shape more and more similar to the bell curve of the normal distribution. If we shift and rescale X_n appropriately, then Z_n = √n (X_n − μ)/σ will be converging in distribution to the standard normal, a result that follows from the celebrated central limit theorem.

Graphic example

Suppose {X_i} is an iid sequence of uniform U(−1,1) random variables. Let Z_n = (1/√n) Σ_{i=1}^n X_i be their (normalized) sums. Then according to the central limit theorem, the distribution of Z_n approaches the normal N(0, 1/3) distribution. This convergence is shown in the picture: as n grows larger, the shape of the pdf function gets closer and closer to the Gaussian curve.

With this mode of convergence, we increasingly expect to see the next outcome in a sequence of random experiments becoming better and better modeled by a given probability distribution. Convergence in distribution is the weakest form of convergence, since it is implied by all other types of convergence mentioned in this article. However convergence in distribution is very frequently used in practice; most often it arises from application of the central limit theorem.
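The graphic example above can be reproduced in a few lines. This sketch (not part of the original article; the helper name is illustrative) simulates the normalized sums Z_n of Uniform(−1,1) variables and checks that their distribution looks like N(0, 1/3):

```python
import random

def normalized_sum(n, rng):
    """Z_n = (1/sqrt(n)) * (X_1 + ... + X_n) for X_i ~ Uniform(-1, 1);
    by the central limit theorem Z_n converges in distribution to N(0, 1/3)."""
    return sum(rng.uniform(-1.0, 1.0) for _ in range(n)) / n ** 0.5

rng = random.Random(42)
samples = [normalized_sum(200, rng) for _ in range(5000)]
mean = sum(samples) / len(samples)
var = sum((z - mean) ** 2 for z in samples) / len(samples)
# Var(Uniform(-1,1)) = 1/3, so Z_n has variance exactly 1/3 for every n;
# the CLT adds that the *shape* of the distribution becomes Gaussian.
sd = (1.0 / 3.0) ** 0.5
frac_one_sd = sum(abs(z) < sd for z in samples) / len(samples)
```

For a Gaussian, about 68% of the mass lies within one standard deviation, and the simulated fraction lands close to that.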

Definition
A sequence {X_1, X_2, …} of random variables is said to converge in distribution, or converge weakly, or converge in law to a random variable X if

    lim_{n→∞} F_n(x) = F(x)
for every number x ∈ R at which F is continuous. Here F_n and F are the cumulative distribution functions of the random variables X_n and X correspondingly.
The requirement that only the continuity points of F should be considered is essential. For example, if the X_n are distributed uniformly on the intervals (0, 1/n), then this sequence converges in distribution to a degenerate random variable X = 0. Indeed, F_n(x) = 0 for all n when x ≤ 0, and F_n(x) = 1 when x ≥ 1/n for all n > 0. However, for this limiting random variable F(0) = 1, even though F_n(0) = 0 for all n. Thus the convergence of cdfs fails at the point x = 0 where F is discontinuous.
Convergence in distribution may be denoted as

    X_n →d X,  or  L(X_n) → L(X),

where L(X) is the law (probability distribution) of X. For example, if X is standard normal we can write X_n →d N(0, 1).
For random vectors {X_1, X_2, …} ⊂ R^k the convergence in distribution is defined similarly. We say that this sequence converges in distribution to a random k-vector X if

    lim_{n→∞} Pr(X_n ∈ A) = Pr(X ∈ A)
for every A ⊂ R^k which is a continuity set of X.
The definition of convergence in distribution may be extended from random vectors to more complex random elements in arbitrary metric spaces, and even to the “random variables” which are not measurable — a situation which occurs for example in the study of empirical processes. This is the “weak convergence of laws without laws being defined” — except asymptotically.[1] In this case the term weak convergence is preferable (see weak convergence of measures), and we say that a sequence of random elements {X_n} converges weakly to X (denoted as X_n ⇒ X) if

    E*h(X_n) → Eh(X)
for all continuous bounded functions h.[2] Here E* denotes the outer expectation, that is, the expectation of a “smallest measurable function g that dominates h(X_n)”.

Properties
• Since F(a) = Pr(X ≤ a), convergence in distribution means that the probability for X_n to be in a given range is approximately equal to the probability that the value of X is in that range, provided n is sufficiently large.
• In general, convergence in distribution does not imply that the sequence of corresponding probability density functions will also converge. As an example one may consider random variables with densities ƒ_n(x) = (1 − cos(2πnx)) 1_{x∈(0,1)}. These random variables converge in distribution to a uniform U(0, 1), whereas their densities do not converge at all.[3]
• The portmanteau lemma provides several equivalent definitions of convergence in distribution. Although these definitions are less intuitive, they are used to prove a number of statistical theorems. The lemma states that {X_n} converges in distribution to X if and only if any of the following statements are true:
  • Eƒ(X_n) → Eƒ(X) for all bounded, continuous functions ƒ;
  • Eƒ(X_n) → Eƒ(X) for all bounded, Lipschitz functions ƒ;
  • limsup Eƒ(X_n) ≤ Eƒ(X) for every upper semi-continuous function ƒ bounded from above;
  • liminf Eƒ(X_n) ≥ Eƒ(X) for every lower semi-continuous function ƒ bounded from below;
  • limsup Pr(X_n ∈ C) ≤ Pr(X ∈ C) for all closed sets C;
  • liminf Pr(X_n ∈ U) ≥ Pr(X ∈ U) for all open sets U;
  • lim Pr(X_n ∈ A) = Pr(X ∈ A) for all continuity sets A of random variable X.
• The continuous mapping theorem states that for a continuous function g, if the sequence {X_n} converges in distribution to X, then {g(X_n)} converges in distribution to g(X).
• Lévy’s continuity theorem: the sequence {X_n} converges in distribution to X if and only if the sequence of corresponding characteristic functions {φ_n} converges pointwise to the characteristic function φ of X, and φ(t) is continuous at t = 0.
• Convergence in distribution is metrizable by the Lévy–Prokhorov metric.
• A natural link to convergence in distribution is Skorokhod's representation theorem.
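The density counterexample ƒ_n(x) = (1 − cos(2πnx)) 1_{x∈(0,1)} can be verified directly: integrating gives the cdf F_n(x) = x − sin(2πnx)/(2πn), which converges uniformly to the uniform cdf F(x) = x, while the densities never settle. A small sketch (not part of the original article; helper names are illustrative):

```python
import math

def F_n(n, x):
    """CDF of the density f_n(x) = 1 - cos(2*pi*n*x) on (0, 1); integrating
    gives F_n(x) = x - sin(2*pi*n*x) / (2*pi*n) for x in [0, 1]."""
    x = min(max(x, 0.0), 1.0)
    return x - math.sin(2 * math.pi * n * x) / (2 * math.pi * n)

def f_n(n, x):
    """The density itself, which keeps oscillating between 0 and 2."""
    return 1.0 - math.cos(2 * math.pi * n * x)

# The cdfs converge (uniformly) to the uniform cdf F(x) = x ...
cdf_gap = max(abs(F_n(1000, k / 100) - k / 100) for k in range(101))
# ... while the density values at a fixed point keep jumping around.
density_vals = [f_n(n, 0.3) for n in (10, 11, 12, 13)]
```

For n = 1000 the cdf gap is bounded by 1/(2000π), yet the density at x = 0.3 swings over a range wider than 1 as n steps through 10, 11, 12, 13.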

Convergence in probability

Examples of convergence in probability

Height of a person

Suppose we would like to conduct the following experiment. First, pick a random person in the street. Let X be his/her height, which is ex ante a random variable. Then you start asking other people to estimate this height by eye. Let X_n be the average of the first n responses. Then (provided there is no systematic error) by the law of large numbers, the sequence X_n will converge in probability to the random variable X.

Archer

Suppose a person takes a bow and starts shooting arrows at a target. Let X_n be his score in the n-th shot. Initially he will be very likely to score zeros, but as time goes on and his archery skill increases, he will become more and more likely to hit the bullseye and score 10 points. After years of practice the probability that he hits anything but 10 will get smaller and smaller. Thus, the sequence X_n converges in probability to X = 10.
Note that X_n does not converge almost surely, however. No matter how professional the archer becomes, there will always be a small probability of making an error. Thus the sequence {X_n} will never become stationary: there will always be non-perfect scores in it, even if they become increasingly less frequent.

The basic idea behind this type of convergence is that the probability of an “unusual” outcome becomes smaller and smaller as the sequence progresses. The concept of convergence in probability is used very often in statistics. For example, an estimator is called consistent if it converges in probability to the quantity being estimated. Convergence in probability is also the type of convergence established by the weak law of large numbers.
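The archer story can be made quantitative under a stylized assumption (not in the original article): suppose shot n misses with probability 1/n, independently of the other shots. Then Pr(X_n ≠ 10) → 0, so X_n →p 10, but since Σ 1/n diverges, the second Borel–Cantelli lemma says misses occur infinitely often almost surely. A simulation sketch:

```python
import random

def archer_path(n_max, rng):
    """One realization of a stylized archer: shot n misses the bullseye
    (scores 0 instead of 10) with probability 1/n, independently of the
    other shots. This miss-probability model is an assumption made here
    purely for illustration."""
    return [0 if rng.random() < 1.0 / n else 10 for n in range(1, n_max + 1)]

rng = random.Random(7)
paths = [archer_path(5000, rng) for _ in range(200)]

# Convergence in probability: the chance of a miss at a fixed late shot shrinks.
miss_at_10 = sum(p[9] != 10 for p in paths) / len(paths)
miss_at_5000 = sum(p[4999] != 10 for p in paths) / len(paths)

# No almost-sure convergence: most paths still contain a miss somewhere
# after shot 1000 (the exact probability is 1 - 1000/5000 = 0.8).
late_miss_frac = sum(any(x != 10 for x in p[1000:]) for p in paths) / len(paths)
```

The per-shot miss frequency drops from roughly 10% at shot 10 to essentially zero at shot 5000, yet about 80% of paths still contain at least one miss after shot 1000.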

Definition
A sequence {X_n} of random variables converges in probability towards X if for all ε > 0

    lim_{n→∞} Pr( |X_n − X| ≥ ε ) = 0.
Formally, pick any ε > 0 and any δ > 0. Let P_n be the probability that X_n is outside the ball of radius ε centered at X. Then for X_n to converge in probability to X there should exist a number N_δ such that for all n ≥ N_δ the probability P_n is less than δ.
Convergence in probability is denoted by adding the letter p over an arrow indicating convergence, or using the “plim” probability limit operator:

    X_n →p X,  or  plim X_n = X.
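The ε–δ formulation can be traced on a toy sequence where everything is exact (an assumption made here for illustration, not an example from the article): take X_n = X + U_n/n with U_n ~ Uniform(0, 1), so Pr(|X_n − X| ≥ ε) = max(0, 1 − nε). The sketch below (helper names are illustrative) finds the N_δ demanded by the definition, using exact rational arithmetic:

```python
from fractions import Fraction

def tail_prob(n, eps):
    """Exact Pr(|X_n - X| >= eps) for the toy sequence X_n = X + U_n / n
    with U_n ~ Uniform(0, 1): Pr(U_n >= n*eps) = max(0, 1 - n*eps)."""
    return max(Fraction(0), 1 - n * eps)

def find_N(eps, delta):
    """Smallest N with tail_prob(n, eps) < delta for all n >= N (the tail
    probability is non-increasing in n, so scanning upward suffices)."""
    n = 1
    while tail_prob(n, eps) >= delta:
        n += 1
    return n

eps, delta = Fraction(1, 100), Fraction(1, 20)   # eps = 0.01, delta = 0.05
N = find_N(eps, delta)                           # smallest n with 1 - n/100 < 1/20
ok = all(tail_prob(n, eps) < delta for n in range(N, N + 1000))
```

Here N works out to 96: for every n ≥ 96 the tail probability 1 − n/100 (or 0) stays below δ = 0.05.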

For random elements {X_n} on a separable metric space (S, d), convergence in probability is defined similarly by[4]

    Pr( d(X_n, X) ≥ ε ) → 0  for all ε > 0.
Properties
• Convergence in probability implies convergence in distribution.[proof]
• Convergence in probability implies almost sure convergence on discrete probability spaces. Since almost sure convergence always implies convergence in probability, in the discrete case strong convergence and convergence in probability mean the same thing.[proof]
• In the opposite direction, convergence in distribution implies convergence in probability only when the limiting random variable X is a constant.[proof]
• The continuous mapping theorem states that for every continuous function g, if X_n →p X, then also g(X_n) →p g(X).
• Convergence in probability defines a topology on the space of random variables over a fixed probability space. This topology is metrizable by the Ky Fan metric:[5]

    α(X, Y) = inf{ ε > 0 : Pr( d(X, Y) > ε ) ≤ ε }.

Almost sure convergence

Examples of almost sure convergence

Example 1

Consider an animal of some short-lived species. We note the exact amount of food that this animal consumes day by day. This sequence of numbers will be unpredictable in advance, but we may be quite certain that one day the number will become zero, and will stay zero forever after.

Example 2

Consider a man who starts tomorrow to toss seven coins once every morning. Each afternoon, he donates a random amount of money to a certain charity. The first time the result is all tails, however, he will stop permanently. Let X_1, X_2, … be the day-by-day amounts the charity receives from him.
We may be almost sure that one day this amount will be zero, and stay zero forever after that. However, when we consider any finite number of days, there is a nonzero probability that the terminating condition will not have occurred.

This is the type of stochastic convergence that is most similar to pointwise convergence known from elementary real analysis.

Definition
To say that the sequence X_n converges almost surely or almost everywhere or with probability 1 or strongly towards X means that

    Pr( lim_{n→∞} X_n = X ) = 1.
This means that the values of X_n approach the value of X, in the sense (see almost surely) that events for which X_n does not converge to X have probability 0. Using the probability space (Ω, F, P) and the concept of the random variable as a function from Ω to R, this is equivalent to the statement

    Pr( ω ∈ Ω : lim_{n→∞} X_n(ω) = X(ω) ) = 1.
Another, equivalent, way of defining almost sure convergence is as follows:

    Pr( liminf_{n→∞} { ω ∈ Ω : |X_n(ω) − X(ω)| < ε } ) = 1  for all ε > 0.
Almost sure convergence is often denoted by adding the letters a.s. over an arrow indicating convergence:

    X_n →a.s. X.
For generic random elements {X_n} on a metric space (S, d), convergence almost surely is defined similarly:

    Pr( ω ∈ Ω : d(X_n(ω), X(ω)) → 0 as n → ∞ ) = 1.
Properties
• Almost sure convergence implies convergence in probability, and hence implies convergence in distribution. It is the notion of convergence used in the strong law of large numbers.
• The concept of almost sure convergence does not come from a topology. This means there is no topology on the space of random variables such that the almost surely convergent sequences are exactly the converging sequences with respect to that topology. In particular, there is no metric of almost sure convergence.
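The strong law mentioned above concerns the whole tail of a single realization: along one path, sup_{m≥n} |X̄_m − μ| must shrink as n grows. A simulation sketch (not part of the original article; helper names are illustrative) for running means of Uniform(0,1) draws:

```python
import random

def running_means(n_max, rng):
    """Running sample means of Uniform(0, 1) draws along a single path;
    by the strong law of large numbers they converge almost surely to 0.5."""
    total, means = 0.0, []
    for n in range(1, n_max + 1):
        total += rng.random()
        means.append(total / n)
    return means

rng = random.Random(1)
means = running_means(20000, rng)
# Almost-sure convergence is a statement about the whole tail of one path:
# the supremum over m >= n of |mean_m - 0.5| shrinks as n grows.
tail_sup_early = max(abs(m - 0.5) for m in means[100:])
tail_sup_late = max(abs(m - 0.5) for m in means[10000:])
```

Contrast this with the archer example, where each fixed-n probability shrinks but the tail supremum along a path does not settle.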

Sure convergence
To say that the sequence of random variables (X_n) defined over the same probability space (i.e., a random process) converges surely or everywhere or pointwise towards X means

    lim_{n→∞} X_n(ω) = X(ω)  for all ω ∈ Ω,
where Ω is the sample space of the underlying probability space over which the random variables are defined. This is the notion of pointwise convergence of a sequence of functions extended to a sequence of random variables. (Note that random variables themselves are functions.)

Sure convergence of a random variable implies all the other kinds of convergence stated above, but there is no payoff in probability theory by using sure convergence compared to using almost sure convergence. The difference between the two only exists on sets with probability zero. This is why the concept of sure convergence of random variables is very rarely used.

Convergence in mean
We say that the sequence X_n converges in the r-th mean (or in the L^r-norm) towards X, for some r ≥ 1, if the r-th absolute moments of X_n and X exist, and

    lim_{n→∞} E( |X_n − X|^r ) = 0,
where the operator E denotes the expected value. Convergence in r-th mean tells us that the expectation of the r-th power of the difference between X_n and X converges to zero.
This type of convergence is often denoted by adding the letter L^r over an arrow indicating convergence:

    X_n →L^r X.
The most important cases of convergence in r-th mean are:
• When X_n converges in r-th mean to X for r = 1, we say that X_n converges in mean to X.
• When X_n converges in r-th mean to X for r = 2, we say that X_n converges in mean square to X. This is also sometimes referred to as convergence in mean, and is sometimes denoted[6]

    l.i.m._{n→∞} X_n = X.
Convergence in the r-th mean, for r > 0, implies convergence in probability (by Markov's inequality), while if r > s ≥ 1, convergence in r-th mean implies convergence in s-th mean. Hence, convergence in mean square implies convergence in mean.
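The Markov-inequality route from mean convergence to convergence in probability can be checked exactly on a toy sequence (an assumption made here for illustration, not an example from the article): X_n = U/n with U ~ Uniform(0, 1), so E|X_n|² = 1/(3n²) and Pr(|X_n| ≥ ε) = max(0, 1 − nε). Sketch (helper names are illustrative):

```python
def second_moment(n):
    """Exact E|X_n|^2 for the toy sequence X_n = U / n, U ~ Uniform(0, 1):
    E(U^2) / n^2 = 1 / (3 n^2)."""
    return 1.0 / (3.0 * n ** 2)

def exact_tail(n, eps):
    """Exact Pr(|X_n| >= eps) = Pr(U >= n*eps) = max(0, 1 - n*eps)."""
    return max(0.0, 1.0 - n * eps)

# Convergence in mean square: the second moments go to zero ...
moments = [second_moment(n) for n in (1, 10, 100)]
# ... and Markov's inequality, Pr(|X_n| >= eps) <= E|X_n|^2 / eps^2,
# turns that into convergence in probability.
eps = 0.1
markov_ok = all(exact_tail(n, eps) <= second_moment(n) / eps ** 2
                for n in range(1, 200))
```

The bound is loose for small n but forces the tail probabilities to zero along with the moments.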

Convergence in rth-order mean

Examples of convergence in rth-order mean
Basic example: A newly built factory produces cans of beer. The owners want each can to contain exactly a certain amount. Knowing the details of the current production process, engineers may compute the expected error in a newly produced can. They are continuously improving the production process, so as time goes by, the expected error in a newly produced can tends to zero. This example illustrates convergence in first-order mean.

This is a rather "technical" mode of convergence. We essentially compute a sequence of real numbers, one number for each random variable, and check if this sequence is convergent in the ordinary sense. Convergence of random variables 7

Formal definition
If lim_{n→∞} E( |X_n − a|^r ) = 0 for some real number a, then {X_n} converges in rth-order mean to a.
Commonly used notation: X_n →L^r a.

Properties
The chain of implications between the various notions of convergence are noted in their respective sections. They are, using the arrow notation:

    L^r  ⇒  L^s  ⇒  p  ⇒  d      (r ≥ s ≥ 1)
    a.s.  ⇒  p  ⇒  d
These properties, together with a number of other special cases, are summarized in the following list:
• Almost sure convergence implies convergence in probability:[7] [proof]

    X_n →a.s. X  ⇒  X_n →p X

• Convergence in probability implies there exists a sub-sequence (X_{k_n}) which almost surely converges:[8]

    X_n →p X  ⇒  X_{k_n} →a.s. X

• Convergence in probability implies convergence in distribution:[7] [proof]

    X_n →p X  ⇒  X_n →d X

• Convergence in r-th order mean implies convergence in probability:

    X_n →L^r X  ⇒  X_n →p X

• Convergence in r-th order mean implies convergence in lower order mean, provided r ≥ s ≥ 1:

    X_n →L^r X  ⇒  X_n →L^s X

• If X_n converges in distribution to a constant c, then X_n converges in probability to c:[7] [proof]

    X_n →d c  ⇒  X_n →p c,

provided c is a constant.
• If X_n converges in distribution to X and the difference between X_n and Y_n converges in probability to zero, then Y_n also converges in distribution to X:[7] [proof]

    X_n →d X,  |X_n − Y_n| →p 0  ⇒  Y_n →d X

• If X_n converges in distribution to X and Y_n converges in distribution to a constant c, then the joint vector (X_n, Y_n) converges in distribution to (X, c):[7] [proof]

    X_n →d X,  Y_n →d c  ⇒  (X_n, Y_n) →d (X, c),

provided c is a constant.
Note that the condition that Y_n converges to a constant is important; if it were to converge to a random variable Y then we wouldn’t be able to conclude that (X_n, Y_n) converges to (X, Y).
• If X_n converges in probability to X and Y_n converges in probability to Y, then the joint vector (X_n, Y_n) converges in probability to (X, Y):[7] [proof]

    X_n →p X,  Y_n →p Y  ⇒  (X_n, Y_n) →p (X, Y)
• If X_n converges in probability to X, and if P(|X_n| ≤ b) = 1 for all n and some b, then X_n converges in rth mean to X for all r ≥ 1. In other words, if X_n converges in probability to X and all random variables X_n are almost surely bounded above and below, then X_n converges to X also in any rth mean.
• Almost sure representation. Usually, convergence in distribution does not imply convergence almost surely. However, for a given sequence {X_n} which converges in distribution to X_0 it is always possible to find a new probability space (Ω, F, P) and random variables {Y_n, n = 0, 1, …} defined on it such that Y_n is equal in distribution to X_n for each n ≥ 0, and Y_n converges to Y_0 almost surely.[9]

• If for all ε > 0,

    Σ_n Pr( |X_n − X| ≥ ε ) < ∞,

then we say that X_n converges almost completely, or almost in probability, towards X. When X_n converges almost completely towards X then it also converges almost surely to X. In other words, if X_n converges in probability to X sufficiently quickly (i.e. the above sequence of tail probabilities is summable for all ε > 0), then X_n also converges almost surely to X. This is a direct implication from the Borel–Cantelli lemma.
• If S_n is a sum of n real independent random variables:

    S_n = X_1 + … + X_n,

then S_n converges almost surely if and only if S_n converges in probability.
• The dominated convergence theorem gives sufficient conditions for almost sure convergence to imply L1-convergence:

    X_n →a.s. X,  |X_n| ≤ Y,  E(Y) < ∞  ⇒  X_n →L1 X.
• A necessary and sufficient condition for L1 convergence is that X_n →p X and the sequence (X_n) is uniformly integrable.
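The summability criterion behind almost complete convergence is easy to see numerically. A sketch (not part of the original article; tail probabilities 1/n² and 1/n are illustrative choices) comparing a summable and a non-summable sequence of tail probabilities:

```python
def tail_sums(probs):
    """Running partial sums of a sequence of tail probabilities
    Pr(|X_n - X| >= eps)."""
    total, partials = 0.0, []
    for p in probs:
        total += p
        partials.append(total)
    return partials

N = 100_000
# Tails 1/n^2 are summable (the sum approaches pi^2/6), so the
# Borel-Cantelli lemma gives almost complete, hence almost sure, convergence.
summable = tail_sums(1.0 / n ** 2 for n in range(1, N + 1))
# Tails 1/n are not summable, so this criterion gives no conclusion.
divergent = tail_sums(1.0 / n for n in range(1, N + 1))
```

The first partial sums stay below π²/6 ≈ 1.645 forever, while the second grow without bound (roughly like ln n).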

Notes
[1] Bickel et al. 1998, A.8, page 475
[2] van der Vaart & Wellner 1996, p. 4
[3] Romano & Siegel 1985, Example 5.26
[4] Dudley 2002, Chapter 9.2, page 287
[5] Dudley 2002, p. 289
[6] Porat, B. (1994). Digital Processing of Random Signals: Theory & Methods. Prentice Hall. p. 19. ISBN 0130637513.
[7] van der Vaart 1998, Theorem 2.7
[8] Gut, Allan (2005). Probability: A graduate course. Springer. Theorem 3.4. ISBN 0387228330.
[9] van der Vaart 1998, Theorem 2.19

References
• Bickel, Peter J.; Klaassen, Chris A.J.; Ritov, Ya’acov; Wellner, Jon A. (1998). Efficient and adaptive estimation for semiparametric models. New York: Springer-Verlag. ISBN 0387984739.
• Billingsley, Patrick (1986). Probability and Measure. Wiley Series in Probability and Mathematical Statistics (2nd ed.). Wiley.
• Billingsley, Patrick (1999). Convergence of probability measures (2nd ed.). John Wiley & Sons. pp. 1–28. ISBN 0471197459.
• Dudley, R.M. (2002). Real analysis and probability. Cambridge, UK: Cambridge University Press. ISBN 052180972X.
• Grimmett, G.R.; Stirzaker, D.R. (1992). Probability and random processes (2nd ed.). Clarendon Press, Oxford. pp. 271–285. ISBN 0-19-853665-8.
• Jacobsen, M. (1992). Videregående Sandsynlighedsregning (Advanced Probability Theory) (3rd ed.). HCØ-tryk, Copenhagen. pp. 18–20. ISBN 87-91180-71-6.
• Ledoux, Michel; Talagrand, Michel (1991). Probability in Banach spaces. Berlin: Springer-Verlag. pp. xii+480. ISBN 3-540-52013-9. MR1102015.

• Romano, Joseph P.; Siegel, Andrew F. (1985). Counterexamples in probability and statistics. Great Britain: Chapman & Hall. ISBN 0412989018.
• van der Vaart, Aad W.; Wellner, Jon A. (1996). Weak convergence and empirical processes. New York: Springer-Verlag. ISBN 0387946403.
• van der Vaart, Aad W. (1998). Asymptotic statistics. New York: Cambridge University Press. ISBN 9780521496032.
• Williams, D. (1991). Probability with Martingales. Cambridge University Press. ISBN 0521406056.
• Wong, E.; Hájek, B. (1985). Stochastic Processes in Engineering Systems. New York: Springer-Verlag.
This article incorporates material from the Citizendium article "Stochastic convergence", which is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License but not under the GFDL.

Article Sources and Contributors

Convergence of random variables Source: http://en.wikipedia.org/w/index.php?oldid=457408351 Contributors: A. Pichler, Aastrup, Albmont, Amir Aliev, Andrea Ambrosio, Ardonik, AxelBoldt, Bjcairns, CALR, Cenarium, Chungc, Constructive editor, DHN, David Eppstein, Deepakr, Derveish, Ensign beedrill, Fpoursafaei, Fram, GaborPete, Giftlite, Headbomb, HyDeckar, Igny, J04n, JA(000)Davidson, JamesBWatson, Jmath666, Jncraton, Josh Guffin, Josh Parris, Kolesarm, Kri, Landroni, LoveOfFate, McKay, Melcombe, Mets501, Michael Hardy, Notedgrant, Oleg Alexandrov, Paul August, Philtime, PigFlu Oink, Prewitt81, Qwfp, Ricardogpn, Robin S, Robinh, Sam Hocevar, Sligocki, Spireguy, Steve Kroon, Stigin, Stpasha, Sullivan.t.j, Szepi, TedPavlic, The Mad Echidna, Voorlandt, Wartoilet, Weierstraß, Zvika, 107 anonymous edits

Image Sources, Licenses and Contributors

Image:Convergence in distribution (sum of uniform rvs).gif Source: http://en.wikipedia.org/w/index.php?title=File:Convergence_in_distribution_(sum_of_uniform_rvs).gif License: Creative Commons Zero Contributors: Stpasha

License

Creative Commons Attribution-Share Alike 3.0 Unported //creativecommons.org/licenses/by-sa/3.0/

Convergence of measures

In mathematics, more specifically measure theory, there are various notions of the convergence of measures. Three of the most common notions of convergence are described below.

Total variation convergence of measures
This is the strongest notion of convergence shown on this page and is defined as follows. Let (X, F) be a measurable space. The total variation distance between two (positive) measures μ and ν is then given by

    ‖μ − ν‖_TV = sup_f { ∫_X f dμ − ∫_X f dν },

where the supremum is taken over all measurable functions f from X to [−1, 1].
If μ and ν are both probability measures, then the total variation distance is also given by

    ‖μ − ν‖_TV = 2 · sup_{A ∈ F} | μ(A) − ν(A) |.
The equivalence between these two definitions can be seen as a particular case of the Monge–Kantorovich duality. From the two definitions above, it is clear that the total variation distance between probability measures is always between 0 and 2.
To illustrate the meaning of the total variation distance, consider the following thought experiment. Assume that we are given two probability measures μ and ν, as well as a random variable X. We know that X has law either μ or ν, but we do not know which one of the two. Assume now that we are given one single sample distributed according to the law of X and that we are then asked to guess which one of the two distributions describes that law. The quantity

    (2 + ‖μ − ν‖_TV) / 4
then provides a sharp upper bound on the probability that our guess is correct.
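For discrete measures the total variation distance and the guessing bound are a one-liner to compute. A sketch (not part of the original article; the helper name and the two coin distributions are illustrative):

```python
def tv_distance(mu, nu):
    """Total variation distance between two discrete probability measures
    given as dicts point -> mass; with the sup over functions |f| <= 1
    this equals the sum of |mu(x) - nu(x)|, so it lies in [0, 2]."""
    support = set(mu) | set(nu)
    return sum(abs(mu.get(x, 0.0) - nu.get(x, 0.0)) for x in support)

mu = {0: 0.5, 1: 0.5}   # a fair coin
nu = {0: 0.2, 1: 0.8}   # a biased coin
d = tv_distance(mu, nu)        # |0.5 - 0.2| + |0.5 - 0.8| = 0.6
guess_bound = (2 + d) / 4      # the sharp bound on Pr(our guess is correct)
```

With these two coins the bound is 0.65: seeing one sample lets us beat a blind 50/50 guess, but only by the amount the total variation distance allows.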

Strong convergence of measures
For (X, F) a measurable space, a sequence μ_n is said to converge strongly to a limit μ if

    lim_{n→∞} μ_n(A) = μ(A)
for every set A in F.
For example, as a consequence of the Riemann–Lebesgue lemma, the sequence μ_n of measures on the interval [−1, 1] given by μ_n(dx) = (1 + sin(nx)) dx converges strongly to Lebesgue measure, but it does not converge in total variation.
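The Riemann–Lebesgue example separates the two notions numerically. This sketch (not part of the original article; helper names are illustrative) evaluates the densities 1 + sin(nx) on [−1, 1]:

```python
import math

def mu_n_mass(n, a, b):
    """mu_n([a, b]) for the density 1 + sin(n x) on [-1, 1], via the exact
    antiderivative x - cos(n x)/n of the density."""
    def F(x):
        return x - math.cos(n * x) / n
    return F(b) - F(a)

def tv_from_lebesgue(n, steps=200_000):
    """Total variation distance between mu_n and Lebesgue measure on [-1, 1]:
    the integral of |sin(n x)|, approximated with the midpoint rule."""
    h = 2.0 / steps
    return sum(abs(math.sin(n * (-1.0 + (k + 0.5) * h))) * h
               for k in range(steps))

# Strong (set-wise) convergence: mu_n([0, 0.5]) -> 0.5 as n grows ...
gap = abs(mu_n_mass(1000, 0.0, 0.5) - 0.5)
# ... but the total variation distance tends to 4/pi, not to 0.
tv = tv_from_lebesgue(1000)
```

The mass of any fixed interval differs from its length by at most 2/n, while the integral of |sin(nx)| stays near 4/π ≈ 1.27.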

Weak convergence of measures
In mathematics and statistics, weak convergence (also known as narrow convergence or weak-* convergence, which is a more appropriate name from the point of view of functional analysis, but less frequently used) is one of many types of convergence relating to the convergence of measures. It depends on a topology on the underlying space and thus is not a purely measure-theoretic notion.
There are several equivalent definitions of weak convergence of a sequence of measures, some of which are (apparently) more general than others. The equivalence of these conditions is sometimes known as the portmanteau theorem.
Definition. Let S be a metric space with its Borel σ-algebra Σ. We say that a sequence of probability measures P_n, n = 1, 2, ..., on (S, Σ) converges weakly to the probability measure P, and write

    P_n ⇒ P,
if any of the following equivalent conditions is true (here E_n and E denote expectation with respect to P_n and P):
• E_n ƒ → Eƒ for all bounded, continuous functions ƒ;
• E_n ƒ → Eƒ for all bounded and Lipschitz functions ƒ;
• limsup E_n ƒ ≤ Eƒ for every upper semi-continuous function ƒ bounded from above;
• liminf E_n ƒ ≥ Eƒ for every lower semi-continuous function ƒ bounded from below;
• limsup P_n(C) ≤ P(C) for all closed sets C of space S;
• liminf P_n(U) ≥ P(U) for all open sets U of space S;
• lim P_n(A) = P(A) for all continuity sets A of measure P.
In the case S = R with its usual topology, if F_n, F denote the cumulative distribution functions of the measures P_n, P respectively, then P_n converges weakly to P if and only if lim_{n→∞} F_n(x) = F(x) for all points x ∈ R at which F is continuous.
For example, the sequence where P_n is the Dirac measure located at 1/n converges weakly to the Dirac measure located at 0 (if we view these as measures on R with the usual topology), but it does not converge strongly. This is intuitively clear: we only know that 1/n is "close" to 0 because of the topology of R.
This definition of weak convergence can be extended for S any metrizable topological space. It also defines a weak topology on P(S), the set of all probability measures defined on (S, Σ). The weak topology is generated by the following basis of open sets:

    U(μ; f_1, …, f_k; ε) = { ν ∈ P(S) : | ∫_S f_i dν − ∫_S f_i dμ | < ε for all i = 1, …, k },

where μ ∈ P(S), the f_i are bounded continuous real-valued functions on S, and ε > 0.
If S is also separable, then P(S) is metrizable and separable, for example by the Lévy–Prokhorov metric; if S is also compact or Polish, so is P(S). If S is separable, it naturally embeds into P(S) as the (closed) set of Dirac measures, and its convex hull is dense.
There are many "arrow notations" for this kind of convergence: the most frequently used are P_n ⇒ P, P_n ⇀ P, and P_n →w P.
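The Dirac example above is simple enough to check by hand with cumulative distribution functions. A sketch (not part of the original article; the helper name is illustrative):

```python
def dirac_cdf(point, x):
    """CDF of the Dirac measure at `point`: 0 below the point, 1 from it on."""
    return 1.0 if x >= point else 0.0

# P_n = Dirac measure at 1/n, P = Dirac measure at 0.
ns = (10, 100, 1000)
at_half = [dirac_cdf(1.0 / n, 0.5) for n in ns]    # a continuity point of F
at_minus = [dirac_cdf(1.0 / n, -0.5) for n in ns]  # a continuity point of F
at_zero = [dirac_cdf(1.0 / n, 0.0) for n in ns]    # the discontinuity of F
# F_n -> F at every x != 0, but F_n(0) = 0 for every n while F(0) = 1:
# exactly why the definition only demands convergence at continuity points.
```

So F_n matches the limit at x = ±0.5 for every n shown, and fails only at the single discontinuity x = 0, which the definition deliberately excludes.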

Weak convergence of random variables
Let (Ω, F, P) be a probability space and X be a metric space. If X_n, X: Ω → X is a sequence of random variables then X_n is said to converge weakly (or in distribution or in law) to X as n → ∞ if the sequence of pushforward measures (X_n)∗(P) converges weakly to X∗(P) in the sense of weak convergence of measures on X, as defined above.

References
• Ambrosio, L.; Gigli, N.; Savaré, G. (2005). Gradient Flows in Metric Spaces and in the Space of Probability Measures. Basel: ETH Zürich, Birkhäuser Verlag. ISBN 3-7643-2428-7.
• Billingsley, Patrick (1995). Probability and Measure. New York, NY: John Wiley & Sons, Inc. ISBN 0-471-00710-2.
• Billingsley, Patrick (1999). Convergence of Probability Measures. New York, NY: John Wiley & Sons, Inc. ISBN 0-471-19745-9.

Article Sources and Contributors

Convergence of measures Source: http://en.wikipedia.org/w/index.php?oldid=460030246 Contributors: Derveish, Diegotorquemada, Hairer, Igny, J04n, Melcombe, Michael Hardy, Paul August, Radagast83, Stpasha, Sullivan.t.j, Thenub314, 13 anonymous edits

License

Creative Commons Attribution-Share Alike 3.0 Unported //creativecommons.org/licenses/by-sa/3.0/