Overview of the TREC-2007 Blog Track Craig Macdonald, Iadh Ounis Ian Soboroff University of Glasgow NIST Glasgow, UK Gaithersburg, MD, USA {craigm,ounis}@dcs.gla.ac.uk
[email protected] 1. INTRODUCTION The number of permalink documents in the collection is over The goal of the Blog track is to explore the information seeking 3.2 million, while the number of feeds is over 100,000 blogs. The behaviour in the blogosphere. It aims to create the required in- permalink documents are used as a retrieval unit for the opinion frastructure to facilitate research into the blogosphere and to study finding task and its associated polarity subtask. For the blog distil- retrieval from blogs and other related applied tasks. The track was lation task, the feed documents are used as the retrieval unit. The introduced in 2006 with a main opinion finding task and an open collection has been distributed by the University of Glasgow since task, which allowed participants the opportunity to influence the March 2006. Further information on the collection and how it was determination of a suitable second task for 2007 on other aspects created can be found in [1]. of blogs besides their opinionated nature. As a result, we have created the first blog test collection, namely the TREC Blog06 3. OPINION FINDING TASK collection, for adhoc retrieval and opinion finding. Further back- Many blogs are created by their authors as a mechanism for self- ground information on the Blog track can be found in the 2006 expression. Extremely-accessible blog software has facilitated the track overview [2].