Name Clustering on the Basis of Parental Preferences
Gerrit Bloothooft and Loek Groot
Utrecht University, The Netherlands
Names, Vol. 56, No. 3, September 2008, 111–163

Parents do not choose first names for their children at random. Using two large datasets, one for the UK and one for the Netherlands, covering the names of children born in the same family over a period of two decades, this paper seeks to identify clusters of names inferred entirely from common parental naming preferences. These name groups can be considered coherent sets of names that have a high probability of being found in the same family. Operational measures for the statistical association between names and clusters are developed, as well as a two-stage clustering technique. The name groups are subsequently merged into a limited set of grand clusters. The results show that clusters emerge that reflect the cultural, linguistic, or ethnic backgrounds of parents, but also characteristics inherent in the names themselves, such as clusters of names after flowers and gems for girls, abbreviated names for boys, or names ending in -y or -ie.

Introduction

The variety in personal given names has increased enormously over the past century. In the Netherlands, the top 3, top 10, and top 100 names account, respectively, for 16%, 33%, and 70% of the first names of elderly people born between 1910 and 1930, while these figures are 3%, 8%, and 39% for babies born between 2000 and 2004. Comparable figures are presented by Galbi (2002, 4) for England and Wales. Along with the increase in the variety of names, the motives behind parents' choice of names for their children have changed from a more or less prescribed naming after relatives to a free decision, a process that was facilitated in the Netherlands by the tolerant name law of 1970.