On a class of random probability measures with general predictive structure

Stefano Favaro, Igor Prünster, Stephen G. Walker

No. 161, November 2010
www.carloalberto.org/working_papers

© 2010 by Stefano Favaro, Igor Prünster and Stephen G. Walker. Any opinions expressed here are those of the authors and not those of the Collegio Carlo Alberto.

S. Favaro¹, I. Prünster² and S.G. Walker³

¹ Università degli Studi di Torino and Collegio Carlo Alberto, Torino, Italy. E-mail:
[email protected]
² Università degli Studi di Torino, Collegio Carlo Alberto and ICER, Torino, Italy. E-mail:
[email protected]
³ Institute of Mathematics, Statistics and Actuarial Science, University of Kent. E-mail:
[email protected]

February 2010

Abstract

In this paper we investigate a recently introduced class of nonparametric priors, termed generalized Dirichlet process priors. Such priors induce exchangeable random partitions characterized by a more elaborate clustering structure than those arising from other widely used priors. A natural area of application of these random probability measures is species sampling problems and, in particular, prediction problems in genomics. To this end we study both the distribution of the number of distinct species present in a sample and the distribution of the number of new species conditional on an observed sample. We also provide the Bayesian nonparametric estimator for the number of new species in an additional sample of given size and for the discovery probability as a function of the size of the additional sample.