The Sample-Complexity of General Reinforcement Learning


Tor Lattimore ([email protected]), Marcus Hutter ([email protected]), Peter Sunehag ([email protected])
Australian National University

Proceedings of the 30th International Conference on Machine Learning, Atlanta, Georgia, USA, 2013. JMLR: W&CP volume 28. Copyright 2013 by the author(s).

Abstract

We present a new algorithm for general reinforcement learning where the true environment is known to belong to a finite class of N arbitrary models. The algorithm is shown to be near-optimal for all but O(N log² N) time-steps with high probability. Infinite classes are also considered, where we show that compactness is a key criterion for determining the existence of uniform sample-complexity bounds. A matching lower bound is given for the finite case.

1. Introduction

Reinforcement Learning (RL) is the task of learning policies that lead to nearly-optimal rewards when the environment is unknown. One metric of the efficiency of an RL algorithm is sample-complexity: a high-probability upper bound on the number of time-steps at which the algorithm is not nearly-optimal, holding for all environments in some class. Such bounds are typically shown for very specific classes of environments, such as (partially observable/factored) Markov Decision Processes (MDPs) and bandits. We consider more general classes of environments where at each time-step an agent takes an action a ∈ A, whereupon it receives a reward r ∈ [0, 1] and an observation o ∈ O, which are generated stochastically by the environment and may depend arbitrarily on the entire history sequence.

We present a new reinforcement learning algorithm, named Maximum Exploration Reinforcement Learning (MERL), that accepts as input a finite set M := {ν₁, ..., ν_N} of arbitrary environments, an accuracy ε, and a confidence δ. The main result is that MERL has a sample-complexity of

    Õ( (N / (ε²(1−γ)³)) · log²( N / (δε(1−γ)) ) ),

where 1/(1−γ) is the effective horizon determined by the discount rate γ. We also consider the case where M is infinite but compact with respect to a particular topology. In this case, a variant of MERL has the same sample-complexity as above, but with N replaced by the size of the smallest ε-cover. A lower bound is also given that matches the upper bound except for logarithmic factors. Finally, if M is non-compact then in general no finite sample-complexity bound exists.

1.1. Related Work

Many authors have worked on the sample-complexity of RL in various settings. The simplest case is the multiarmed bandit problem, which has been extensively studied under varying assumptions. The typical measure of efficiency in the bandit literature is regret, but sample-complexity bounds are also known and sometimes used. The next step from bandits is finite-state MDPs, of which bandits are an example with only a single state. There are two main settings in which MDPs are considered: the discounted case, where sample-complexity bounds are proven, and the undiscounted (average reward) case, where regret bounds are more typical. In the discounted setting the upper and lower bounds on sample-complexity are now extremely refined; see Strehl et al. (2009) for a detailed review of the popular algorithms and theorems. More recent work on closing the gap between upper and lower bounds is by Szita & Szepesvári (2010); Lattimore & Hutter (2012); Azar et al. (2012). In the undiscounted case it is necessary to make some form of ergodicity assumption, as without it regret bounds cannot be given. In this work we avoid ergodicity assumptions and discount future rewards. Nevertheless, our algorithm borrows some tricks used by UCRL2 (Auer et al., 2010).

Previous work for more general environment classes is somewhat limited. For factored MDPs there are known bounds; see (Chakraborty & Stone, 2011) and references therein. Even-Dar et al. (2005) give essentially unimprovable exponential bounds on the sample-complexity of learning in finite partially observable MDPs. Maillard et al. (2013) show regret bounds for undiscounted RL where the true environment is assumed to be finite, Markov and communicating, but where the state is not directly observable. As far as we know there has been no work on the sample-complexity of RL when environments are completely general, but asymptotic results have garnered some attention, with positive results by Hutter (2002); Ryabko & Hutter (2008); Sunehag & Hutter (2012) and (mostly) negative ones by Lattimore & Hutter (2011b). Perhaps the closest related work is (Diuk et al., 2009), which deals with a similar problem in the rather different setting of learning the optimal predictor from a class of N experts. They obtain an O(N log N) bound, which is applied to the problem of structure learning for discounted finite-state factored MDPs. Our work generalises this approach to the non-Markov case and compact model classes.

2. Notation

The definition of environments is borrowed from the work of ?, although the notation is slightly more formal to ease the application of martingale inequalities.

General. ℕ = {0, 1, 2, ...} is the set of natural numbers. For the indicator function we write [[x = y]] = 1 if x = y and 0 otherwise. We use ∧ and ∨ for logical and/or respectively. If A is a set then |A| is its size and A* is the set of all finite strings (sequences) over A. If x and y are sequences then x ⊏ y means that x is a prefix of y. Unless otherwise mentioned, log denotes the natural logarithm. For a random variable X we write EX for its expectation. For x ∈ ℝ, ⌈x⌉ is the ceiling function.

Environments and policies. Let A, O and R ⊂ ℝ be finite sets of actions, observations and rewards respectively, and let H := A × O × R. H^∞ is the set of infinite history sequences, while H* := (A × O × R)* is the set of finite history sequences. If h ∈ H* then ℓ(h) is the number of action/observation/reward tuples in h. We write a_t(h), o_t(h), r_t(h) for the t-th action/observation/reward of history sequence h. For h ∈ H*, Γ_h := {h′ ∈ H^∞ : h ⊏ h′} is the cylinder set. Let F := σ({Γ_h : h ∈ H*}) and F_t := σ({Γ_h : h ∈ H* ∧ ℓ(h) = t}) be σ-algebras. An environment µ is a set of conditional probability distributions over observation/reward pairs given the history so far. A policy π is a function π : H* → A. An environment and a policy interact sequentially to induce a measure P_{µ,π} on the filtered probability space (H^∞, F, {F_t}). For convenience, we abuse notation and write P_{µ,π}(h) := P_{µ,π}(Γ_h). If h ⊏ h′ then conditional probabilities are P_{µ,π}(h′|h) := P_{µ,π}(h′)/P_{µ,π}(h).

    R_t(h, d) := Σ_{k=t}^{t+d} γ^{k−t} r_k(h)

is the d-step return function, and R_t(h) := lim_{d→∞} R_t(h, d). Given a history h_t with ℓ(h_t) = t, the value function is defined by V^π_µ(h_t, d) := E[R_t(h, d) | h_t], where the expectation is taken with respect to P_{µ,π}(· | h_t), and V^π_µ(h_t) := lim_{d→∞} V^π_µ(h_t, d). The optimal policy for environment µ is π*_µ := arg max_π V^π_µ, which under our assumptions is known to exist (Lattimore & Hutter, 2011a). The value of the optimal policy is V*_µ := V^{π*_µ}_µ. In general, µ denotes the true environment while ν is a model; π will typically be the policy of the algorithm under consideration. Q*_µ(h, a) is the value in history h of following policy π*_µ except for the first time-step, when action a is taken. M is a set of environments (models).

Sample-complexity. A policy π is ε-optimal in history h and environment µ if V*_µ(h) − V^π_µ(h) ≤ ε. The sample-complexity of a policy π in environment class M is the smallest Λ such that, with high probability, π is ε-optimal for all but Λ time-steps, for all µ ∈ M. Define L_{µ,π} : H^∞ → ℕ ∪ {∞} to be the number of time-steps at which π is not ε-optimal:

    L_{µ,π}(h) := Σ_{t=1}^{∞} [[ V*_µ(h_t) − V^π_µ(h_t) > ε ]],

where h_t is the length-t prefix of h. The sample-complexity of policy π is Λ with respect to accuracy ε and confidence 1 − δ if P(L_{µ,π}(h) > Λ) < δ for all µ ∈ M.

3. Finite Case

We start with the finite case, where the true environment is known to belong to a finite set of models M. The Maximum Exploration Reinforcement Learning algorithm is model-based in the sense that it maintains a set M_t ⊆ M from which models are eliminated once they become implausible. The algorithm operates in phases of exploration and exploitation, choosing to exploit if it knows all plausible environments are reasonably close under all optimal policies, and to explore otherwise. This method of exploration essentially guarantees that MERL is nearly optimal whenever it is exploiting, and that the number of exploration phases is limited with high probability. The main difficulty is specifying what it means to be plausible. Previous authors working on finite environments, such as MDPs or bandits, have removed models for which the transition probabilities are not sufficiently close to their […] one of them has expectation larger than ε(1 − γ)/8.
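To get a feel for the magnitude of the MERL sample-complexity bound, the expression inside the Õ(·) can be evaluated directly. This is only an order-of-magnitude illustration: the constants and exact logarithmic factors hidden by the Õ-notation are not specified in the excerpt above, so the function below computes the bare expression N/(ε²(1−γ)³) · log²(N/(δε(1−γ))):

```python
import math

def merl_sample_complexity(N, eps, gamma, delta):
    """Evaluate N/(eps^2 (1-gamma)^3) * log^2(N/(delta*eps*(1-gamma))).

    Constants hidden by the O-tilde notation are omitted, so this is
    an order-of-magnitude sketch, not the exact bound from the paper.
    """
    leading = N / (eps ** 2 * (1.0 - gamma) ** 3)
    log_term = math.log(N / (delta * eps * (1.0 - gamma)))
    return leading * log_term ** 2

# Example: 10 models, accuracy 0.1, discount 0.9, confidence 0.95
print(f"{merl_sample_complexity(10, 0.1, 0.9, 0.05):.3e}")  # roughly 1e8 for these parameters
```

Note the cubic dependence on the effective horizon 1/(1−γ): shrinking 1−γ by a factor of 10 inflates the bound by a factor of 1000, before the logarithmic term is even considered.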
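The d-step return R_t(h, d) and the suboptimality counter L_{µ,π} translate directly into code. A minimal sketch, under simplifying assumptions not in the paper: the history is represented only by its reward sequence, and the value sequences V*_µ(h_t) and V^π_µ(h_t) are taken as given lists rather than computed from an environment model:

```python
def d_step_return(rewards, t, d, gamma):
    """R_t(h, d) = sum_{k=t}^{t+d} gamma^(k-t) * r_k(h), truncated at the end of the history."""
    return sum(gamma ** (k - t) * rewards[k]
               for k in range(t, min(t + d + 1, len(rewards))))

def count_suboptimal(v_opt, v_pi, eps):
    """L_{mu,pi}: number of time-steps t with V*_mu(h_t) - V^pi_mu(h_t) > eps."""
    return sum(1 for vs, vp in zip(v_opt, v_pi) if vs - vp > eps)

# Rewards 1,1,1 with gamma = 0.5: R_0(h, 2) = 1 + 0.5 + 0.25 = 1.75
print(d_step_return([1.0, 1.0, 1.0], 0, 2, 0.5))                  # 1.75
# Gaps 0.1, 0.5, 0.0 against eps = 0.05: two suboptimal time-steps
print(count_suboptimal([1.0, 1.0, 1.0], [0.9, 0.5, 1.0], 0.05))   # 2
```

A policy whose `count_suboptimal` value stays below Λ with probability at least 1 − δ, for every µ in the class, is exactly what the sample-complexity definition above demands.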
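The high-level control flow described in Section 3 (exploit when all plausible models nearly agree, explore otherwise, and eliminate models that become implausible) can be sketched on a toy one-step environment class. This is emphatically not MERL itself: the plausibility test and the exploration rule below are placeholders, since the paper's actual statistic is not contained in this excerpt, and the environments here are reduced to action-to-mean-reward tables:

```python
def merl_skeleton(models, true_rewards, eps, gamma, n_steps):
    """Toy explore/exploit loop in the spirit of MERL.

    models:       dict mapping model name -> {action: predicted mean reward}
    true_rewards: {action: reward} for the (deterministic) true environment
    Plausibility test and exploration rule are placeholders, not the paper's.
    """
    plausible = set(models)                                   # M_t, the surviving model set
    history = []
    for _ in range(n_steps):
        # best achievable value under each surviving model (sorted for determinism)
        values = {m: max(models[m].values()) for m in sorted(plausible)}
        if max(values.values()) - min(values.values()) <= eps * (1.0 - gamma):
            model = min(plausible)                            # exploit: models nearly agree
        else:
            model = max(values, key=values.get)               # explore: placeholder optimism
        action = max(models[model], key=models[model].get)
        reward = true_rewards[action]
        history.append((action, reward))
        # placeholder elimination: drop models whose prediction is far from the observation
        plausible = {m for m in plausible
                     if abs(models[m].get(action, reward) - reward) <= eps}
    return plausible, history

models = {"nu1": {"a": 1.0, "b": 0.0}, "nu2": {"a": 0.0, "b": 0.5}}
true_rewards = {"a": 1.0, "b": 0.0}
plausible, history = merl_skeleton(models, true_rewards, eps=0.1, gamma=0.5, n_steps=5)
print(plausible)  # {'nu1'}: nu2 is eliminated after the first observed reward
```

The guarantee sketched in the text corresponds to the two branches: the exploit branch is near-optimal by construction whenever all plausible models agree, and every visit to the explore branch shrinks (with high probability) the information gap, so only boundedly many exploration phases can occur.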