Discrete Categorical Distribution

Discrete Categorical Distribution

Discrete Categorical Distribution Carl Edward Rasmussen November 11th, 2016 Carl Edward Rasmussen Discrete Categorical Distribution November 11th, 2016 1 / 8 Key concepts We generalize the concepts from binary variables to multiple discrete outcomes. • discrete and multinomial distributions • the Dirichlet distribution Carl Edward Rasmussen Discrete Categorical Distribution November 11th, 2016 2 / 8 The multinomial distribution (1) Generalisation of the binomial distribution from 2 outcomes to m outcomes. Useful for random variables that take one of a finite set of possible outcomes. Throw a die n = 60 times, and count the observed (6 possible) outcomes. Outcome Count X = x1 = 1 k1 = 12 X = x2 = 2 k2 = 7 Note that we have one parameter too many. We X = x3 = 3 k3 = 11 don’t need to know all the ki and n, because 6 X = x4 = 4 k4 = 8 i=1 ki = n. X = x5 = 5 k5 = 9 P X = x6 = 6 k6 = 13 Carl Edward Rasmussen Discrete Categorical Distribution November 11th, 2016 3 / 8 The multinomial distribution (2) Consider a discrete random variable X that can take one of m values x1,..., xm. Out of n independent trials, let ki be the number of times X = xi was observed. m It follows that i=1 ki = n. m Denote by πi the probability that X = xi, with i=1 πi = 1. P > The probability of observing a vector of occurrences k = [k1,..., km] is given by P > the multinomial distribution parametrised by π = [π1,..., πm] : n! ki p(kjπ, n) = p(k1,..., kmjπ1,..., πm, n) = πi k !k !... km! 1 2 i= Y1 • Note that we can write p(kjπ) since n is redundant. n • The multinomial coefficient n! is a generalisation of . k1!k2!...km! k The discrete or categorical distribution is the generalisation of the Bernoulli to m outcomes, and the special case of the multinomial with one trial: p(X = xijπ) = πi. Carl Edward Rasmussen Discrete Categorical Distribution November 11th, 2016 4 / 8 Example: word counts in text Consider describing a text document by the frequency of occurrence of every distinct word. The UCI Bag of Words dataset from the University of California, Irvine. 1 1http://archive.ics.uci.edu/ml/machine-learning-databases/bag-of-words/ Carl Edward Rasmussen Discrete Categorical Distribution November 11th, 2016 5 / 8 Priors on multinomials: The Dirichlet distribution The Dirichlet distribution is to the categorical/multinomial what the Beta is to the Bernoulli/binomial. It is a generalisation of the Beta defined on the m - 1 dimensional simplex. > • Consider the vector π = [π1,..., πm] , with m i=1 πi = 1 and πi 2 (0, 1) 8i. • π m - PVector lives in the open standard 1 simplex. • π could for example be the parameter vector of a multinomial. [Figure on the right m = 3.] The Dirichlet distribution is given by m m m Γ( i=1 αi) αi-1 1 αi-1 Dir(πjα1,..., αm) = m πi = πi Γ(αi) B(α) i=1 i= i= P Y1 Y1 Q > • α = [α1,..., αm] are the shape parameters. • B(α) is the multivariate beta function. αj • E(πj) = m is the mean for the j-th element. i=1 αi Carl Edward RasmussenP Discrete Categorical Distribution November 11th, 2016 6 / 8 Dirichlet Distributions from Wikipedia Carl Edward Rasmussen Discrete Categorical Distribution November 11th, 2016 7 / 8 The symmetric Dirichlet distribution In the symmetric Dirichlet distribution all parameters are identical: αi = α, 8i. en.wikipedia.org/wiki/File:LogDirichletDensity-alpha_0.3_to_alpha_2.0.gif To sample from a symmetric Dirichlet in D dimensions with concentration α use: w = randg(alpha,D,1); bar(w/sum(w)); alpha = 0.1 alpha = 1 alpha = 10 0.7 0.7 0.7 0.6 0.6 0.6 0.5 0.5 0.5 0.4 0.4 0.4 0.3 0.3 0.3 0.2 0.2 0.2 0.1 0.1 0.1 0 0 0 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10 Distributions drawn at random from symmetric 10 dimensional Dirichlet distributions with various concentration parameters. Carl Edward Rasmussen Discrete Categorical Distribution November 11th, 2016 8 / 8.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    8 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us