Sampling Algorithms, from Survey Sampling to Monte Carlo Methods: Tutorial and Literature Review Benyamin Ghojogh*
[email protected] Department of Electrical and Computer Engineering, Machine Learning Laboratory, University of Waterloo, Waterloo, ON, Canada Hadi Nekoei*
[email protected] MILA (Montreal Institute for Learning Algorithms) – Quebec AI Institute, Montreal, Quebec, Canada Aydin Ghojogh*
[email protected] Fakhri Karray
[email protected] Department of Electrical and Computer Engineering, Centre for Pattern Analysis and Machine Intelligence, University of Waterloo, Waterloo, ON, Canada Mark Crowley
[email protected] Department of Electrical and Computer Engineering, Machine Learning Laboratory, University of Waterloo, Waterloo, ON, Canada * The first three authors contributed equally to this work. Abstract pendent, we cover importance sampling and re- This paper is a tutorial and literature review on jection sampling. For MCMC methods, we cover sampling algorithms. We have two main types Metropolis algorithm, Metropolis-Hastings al- of sampling in statistics. The first type is sur- gorithm, Gibbs sampling, and slice sampling. vey sampling which draws samples from a set or Then, we explain the random walk behaviour of population. The second type is sampling from Monte Carlo methods and more efficient Monte probability distribution where we have a proba- Carlo methods, including Hamiltonian (or hy- bility density or mass function. In this paper, we brid) Monte Carlo, Adler’s overrelaxation, and cover both types of sampling. First, we review ordered overrelaxation. Finally, we summarize some required background on mean squared er- the characteristics, pros, and cons of sampling ror, variance, bias, maximum likelihood estima- methods compared to each other. This paper can tion, Bernoulli, Binomial, and Hypergeometric be useful for different fields of statistics, machine distributions, the Horvitz–Thompson estimator, learning, reinforcement learning, and computa- and the Markov property.