Ricardo Baeza-Yates Berthier Ribeiro-Neto
Total Page:16
File Type:pdf, Size:1020Kb
Modern Information Retrieval The Concepts and Technology behind Search Ricardo Baeza-Yates Berthier Ribeiro-Neto Second edition Addison-Wesley Harlow, England Reading, Massachusetts Menlo Park, California• New York Don Mills, Ontario Amsterdam• Bonn Sydney Singapore• Tokyo Madrid• San Juan• Milan •Mexico City• Seoul Taipei • • • • References [1] I. Aalbersberg. Incremental relevance feedback. In Proc of the Fifteenth Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval, pages 11–22, Denmark, 1992. [2] H. Abdi. Kendall rank correlation. In N. Salkind, editor, Encyclopedia of Measure- ment and Statistics, Thousand Oaks, CA, 2007. Sage. [3] H. Abdi. The Kendall rank correlation coefficient. Technical report, Univ. of Texas at Dallas, 2007. [4] K. Aberer, F. Klemm, M. Rajman, and J. Wu. An architecture for peer-to-peer information retrieval. In Workshop on Peer-to-Peer Information Retrieval,Sheffield, UK, July 2004. [5] S. Abiteboul. Querying semi-structured data. In F. N. Afrati and P. Kolaitis, editors, Int. Conf. on Database Theory (ICDT), number 1186 in LNCS, pages 1–18, Delphi, Greece, 1997. Springer-Verlag. [6] S. Abiteboul, M. Preda, and G. Cobena. Adaptive on-line page importance compu- tation. In Proceedings of the twelfth international conference on World Wide Web, pages 280–290, Budapest, Hungary, 2003. ACM Press. [7] M. Abolhassani and N. Fuhr. Applying the divergence from randomness approach for content-only search in xml documents. Lecture Notes in Computer Science, 2997:409– 420, 2004. [8] M. Abrams, editor. World Wide Web: Beyond the Basics. Prentice Hall, 1998. [9] M. Abrol, N. Latarche, U. Mahadevan, J. Mao, R. Mukherjee, P. Raghavan, M. Tourn, J. Wang, and G. Zhang. Navigating large-scale semi-structured data in business portals. In Proceedings of the 27th VLDB Conference, pages 663–666, Roma, Italy, 2001. http://www.vldb.org/conf/2001/P663.pdf. [10] A. Adams and A. Blandford. Digital libraries’ support for the user’s ’information jour- ney’. In Proceedings of the 5th ACM/IEEE-CS Joint Conference on Digital Libraries, pages 160–169, Denver, Colorado, 2005. [11] E. Adar. User 4xxxxx9: Anonymizing query logs. In Query Log Analysis: Social and Technological Challenges, Workshop in WWW’07, 2007. [12] J. Adiego, G. Navarro, and P. de la Fuente. Scm: Structural contexts model for improving compression in semistructured text databases. In Proc. 10th International Symposium on String Processing and Information Retrieval (SPIRE 2003),LNCS 767 768 REFERENCES 2857, pages 153–167. Springer, 2003. Extended version appeared in Information Processing and Management 43(3), May 2007, pp. 769–790. [13] J. Adiego, G. Navarro, and P. de la Fuente. Lempel-Ziv compression of structured text. In Proc. 14th IEEE Data Compression Conference (DCC’04), pages 112–121, 2004. Extended version appeared in JASIST 58(4), 2007, pp. 461–478. [14] M. Adler and M. Mitzenmacher. Towards compressing Web graphs. In Data Com- pression Conference, pages 203–212, 2001. [15] G. Adomavicius and A. Tuzhilin. Toward the next generation of recommender sys- tems: A survey of the state-of-the-art and possible extensions. IEEE Trans. on Knowl. and Data Eng., 17(6):734–749, 2005. [16] D. Agarwal and S. Merugu. Predictive discrete latent-factor models for large-scale dyadic data. In KDD ’07: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 26–35, New York, NY, USA, 2007. ACM. [17] E. Agichtein, E. Brill, and S. Dumais. Improving Web search ranking by incorporating user behavior information. In SIGIR ’06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, pages 19–26, New York, NY, USA, 2006. ACM Press. [18] E. Agichtein, E. Brill, S. Dumais, and R. Ragno. Learning user interaction models for predicting Web search result preferences. In Proceedings of the 29th annual interna- tional ACM SIGIR conference on Research and development in information retrieval, pages 3–10. ACM Press, 2006. [19] A. Agogino and J. Ghosh. Increasing pagerank through reinforcement learning. In Proceedings of Intelligent Engineering Systems Through Artificial Neural Networks, volume 12, pages 27–32, St. Louis, Missouri, USA, November 2002. ASME Press. [20] M. Agosti and A. F. Smeaton, editors. Information retrieval and hypertext.Kluwer Academic Publishers, Boston/London/Dordrecht, 1996. [21] R. Agrawal and R. Srikant. Fast algorithms for mining association rules. In J. Bocca, M. Jarke, and C. Zaniolo, editors, 20th Int Conference on Very Large Databases, pages 487–499. Morgan Kaufmann Publishers, 1994. [22] A. Aho and M. Corasick. Efficient string matching: an aid to bibliographic search. Communications of the ACM, 18(6):333–340, June 1975. [23] AIR Workshops: Adversarial Web Retrieval. http://airweb.cse.lehigh.edu/, 2005. [24] A. Al-Maskari, M. Sanderson, and P. Clough. The relationship betweenIReffective- ness measures and user satisfaction. In SIGIR ’07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pages 773–774, 2007. [25] S. M. Alessi and S. R. Trollip. Multimedia for learning: methods and development. Allyn and Bacon, 2001. 580 pages. [26] S. Ali, M. Consens, G. Kazai, and M. Lalmas. A common basis for the evaluation of structured document retrieval. In ACM CIKM International Conference on Infor- mation and Knowledge Management, Bremen, Germany, 2008. In Press. [27] W. Alink, R. Bhoedjang, P. Boncz, and A. de Vries. XIRAF - XML-based indexing and querying for digital forensics. Digital Investigation, 3(Supplement-1):50–58, 2006. [28] Alis Technologies. Web languages hit parade, 1997. REFERENCES 769 [29] J. Allan. Incremental relevance feedback for information filtering. In Proc of the 19th Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval, pages 270–278, Zurich, Switzerland, 1996. [30] J. Allan. Perspectives on information retrieval and speech. In Information Retrieval Techniques for Speech Applications, from the workshop “Information Retrieval Tech- niques for Speech Applications,” held as part of the 24th Annual International ACM SIGIR Conference, pages 1–10, London, UK, 2002. Springer-Verlag. [31] J. Allan. HARD Track overview in TREC 2004 high accuracy retrieval from docu- ments. Proceedings of the Thirteenth Text REtrieval Conference (TREC’04), 2005. [32] B. L. Allen. Information Tasks: Toward a User-Centered Approach to Information Systems. Academic Press, San Diego, CA, 1996. [33] E. L. Allwein, R. E. Schapire, and Y. Singer. Reducing multiclass to binary: A unify- ing approach for margin classifiers. Journal of Machine Learning Research, 1:113–141, 2000. [34] O. Alonso and R. Baeza-Yates. A model for visualizing large answers in WWW. In XVIII Int. Conf. of the Chilean CS Society, pages 2–7, Antofagasta, Chile, 1998. IEEE CS Press. [35] O. Alonso, R. Baeza-Yates, and M. Gertz. Effectiveness of temporal snippets. In WSSP Workshop at the World Wide Web Conference—WWW’09, 2009. [36] O. Alonso and S. Mizzaro. Can we get rid of TREC assessors? Using Mechanical Turk for relevance assessment. In SIGIR Evaluation Workshop, 2009. [37] O. Alonso, D. E. Rose, and B. Stewart. Crowdsourcing for relevance evaluation. SIGIR Forum, 42(2):9–15, 2008. [38] G. Amati. Probability Models for Information Retrieval based on Divergence from Randomness. PhD thesis, Department of Computing Science University of Glasgow, 2003. http://www.dcs.gla.ac.uk/∼gianni/selectedPapers.html. [39] G. Amati and C. van Rijsbergen. Probabilistic models of information retrieval based on measuring divergence from randomness. ACM Transaction on Office and Infor- mation Systems (TOIS), 20(4), 2002. [40] Amazon. Mechanical Turk, 2009. http://www.mturk.com. [41] G. M. Amdahl. Validity of the single-processor approach to achieving large scale com- puting capabilities. In Proc. AFIPS 1967 Spring Joint Computer Conf., volume 30, pages 483–485, Atlantic City, N.J., Apr. 1967. [42] S. Amer-Yahia, C. Botev, J. D¨orre, and J. Shanmugasundaram. Full-Text extensions explained. IBM Systems Journal, 45(2):335–352, 2006. [43] S. Amer-Yahia, C. Botev, and J. Shanmugasundaram. TexQuery: a full-text search extension to XQuery. In 13th international conference on World Wide Web, New York, NY, USA, pages 583–594, 2004. [44] S. Amer-Yahia, P. Case, T. Roelleke, J. Shanmugasundaram, and G. Weikum. Report on the DB/IR panel at SIGMOD 2005. SIGMOD Record, 34(4):71–74, 2005. [45] S. Amer-Yahia, S. Cho, and D. Srivastava. Tree Pattern Relaxation. In Advances in Database Technology - EDBT 2002, 8th International Conference on Extending Database Technology, Prague, Czech Republic, pages 496–513, 2002. [46] S. Amer-Yahia, D. Hiemstra, T. Roelleke, D. Srivastava, and G. Weikum. Ranked XML Querying. In S. Amer-Yahia, D. Srivastava, and G. Weikum, editors, Work- shop on Ranked XML Querying, number 08111 in Dagstuhl Seminar Proceedings, Dagstuhl, Germany, 2008. 770 REFERENCES [47] S. Amer-Yahia and M. Lalmas. XML search: languages, INEX and scoring. SIGMOD Record, 35(4):16–23, 2006. [48] A. Amir, S. Srinivasan, and A. Efrat. Search the audio, browse the video: A generic paradigm for video collections. EURASIP Journal on Applied Signal Processing, 2003(2):209–222, 2003. doi:10.1155/S111086570321012X. [49] E. Amitay and C. Paris. Automatically summarising Web sites - is there a way around it? In Proceedings of the 2000 ACM CIKM International Conference on Information and Knowledge Management, pages 173–179, McLean, VA, USA, November 2000. [50] C. Anderson. The Long Tail: Why the Future of Business Is Selling Less of More. Hyperion, New York, revised edition, 2008. [51] T. Anderson, A. Hussam, B. Plummer, and N. Jacobs. Pie charts for visualizing query term frequency in search results. Proceedings of the Fifth International Conference on Asian Digital Library, pages 440–451, 2002. [52] K. Andrews. Visualising cyberspace: Information visualization in the Harmony In- ternet browser. In Proceedings ’95 Information Visualization, pages 97–104, Atlanta, Oct. 1995. [53] K. Andrews, C. G¨utl, J.