ASIST Word Template For Conference Proceedings

Global vs. Localized Search: A Comparison of Database Selection Methods in a Hierarchical Environment

Jack G. Conrad R&D, Thomson Legal & Regulatory, St. Paul, MN 55123 USA Email: [email protected]

Joanne Smestad Claussen TPD, West Group, St. Paul, MN 55123 USA Email: [email protected]

Changwen Yang Oracle Corporation, Oracle Pkwy, Redwood Shores, CA 94065 Email: [email protected]

In this work, we compare standard global IR directory. To test the effectiveness of browse plus search searching with more localized techniques to functionality, we designed and conducted a series of address the database selection problem. We experiments on three search-modes, using the same set of conduct a series of experiments to compare the real user queries. These search-modes include (1) a global retrieval effectiveness of three separate search search of the directory from the root node, (2) a localized modes using a hierarchically structured data search of the relevant sub-directories using global idfs,1 environment of textual database representations. and (3) a localized search of the relevant sub-directories The data environment is represented as a tree- using the appropriate local idfs. In the next section we like structure containing over 15,000 unique review related work. Section 3 briefly describes our databases and approximately 100,000 total leaf operational environment while section 4 discusses the nodes. The search modes consist of varying underlying data. Section 5 describes the user queries degrees of browse and search, from a global harnessed for this investigation. Section 6 addresses the search at the root node to a refined search at a particular tf*idf scoring algorithm used. Our experiments sub-node using dynamically-calculated inverse are outlined in section 7 and our results are presented in document frequencies (idfs) to score the section 8. In section 9 we draw our conclusions and in candidate databases for probable relevance. Our section 10 we mention future applications of this browse findings indicate that a browse plus search and search technology. approach that relies upon localized searching from sub-nodes in this environment produces 2. Previous Work the most effective results. An appreciable body of work has focused on searching distributed databases of textual documents for relevant 1. Introduction information in response to user queries (Gravano 1994; The continued growth of online databases has made the Callan, 1995; Yuwono, 1997; French 1999). Yet such work of finding the most relevant databases increasingly fully automated retrieval and the corpus of related challenging. Until recently, the ability to search a research which followed have been performed metadata repository as well as 'drill down' into its independent of additional user involvement. For this hierarchical structure, e.g., as in a data directory, have reason, the IR community has been periodically reproved largely remained separate activities. That is, browse and for not incorporating added user preferences into the IR search tasks in the same repository have often been process (Saracevic, 1997; Stellin, 2000; Kramer, 2000; presented as mutually exclusive. As a result, large Belkin, 2000). numbers of users exit online systems with unmet Other research has investigated mechanisms for information needs when failing to find relevant sources of improving query-side matching with metadata associated interest. This was the case with the Westlaw (Database) with documents (or databases). In particular, the Metadata Directory. We hypothesized that if users could first Research Program at U.C.–Berkeley has focused on browse to a potentially relevant sub-directory in the large “Entry Vocabulary Indexes” as well as other means to directory, results from a search in the sub-directory would be more precise than results from a search on the entire 1 idf signifies the inverse document frequency of a term. accomplish this task (Buckland, 2001). One of the chief distinct database characterization is important in notions of this work is that statistical association supporting user choice for integration, (3) some users techniques can be efficiently utilized to map words found prefer database selection control together with merged in queries to intermediary metadata vocabulary which in results, and (4) the assumption that integrated interaction turn will facilitate matches with document-side terms. is best may invite revisiting (Park, 2000). Some of Park's One of the drawbacks of this approach, however, is that if findings actually support a number of our related a query is submitted to an inappropriate sub-domain discoveries, especially those involving user preference for index, results can deteriorate relative to baseline greater control in database selection and interaction. performance. Some researchers have examined information filtering 3. Operational Environment and retrieval using a heightened role for users, but these The Westlaw system actually permits both an integrated studies have generally been smaller scale and addressed and dedicated searching environment within a single additional issues such as negative as well as positive cohesive interface. Users can run their searches against a relevance criteria (Fidel, 1998; Wu, 2000; Bopp, 2000). single database or multiple databases while receiving Other approaches have asked users to provide metadata merged document result lists. The findings mentioned concepts or applied thesauri with semantic links to a above, particularly those of Park, are instructive. Our query, either before or after examining highly-ranked users are given such flexibility in a robust system. They source documents (Chakravarthy, 1995; Dolan, 1996; have the choice between desired level of control (variable Hearst, 1994). A number of librarians, computer search granularity) and simplicity. Yet we wanted to scientists, and other professionals have made dedicated reexamine the issue of flexibility in other important ways. efforts to simplify and economize the steps necessary for Beyond integrated versus dedicated database search users to access relevant information. To design useful capability, and notions of simplicity plus control, we digital libraries, a number of studies have examined users' wanted to investigate another kind of hybrid system, one actual work environments and behavioral patterns which offers users both browse and search functionality. (Bishop, 1995; Lloyd, 1996; Van House 1996). In earlier systems, these features—browsing a collection via navigation and searching it with user-specified terms Payette and Rieger performed a study on user —were often mutually exclusive.2 We wanted to inspect, perspectives when searching a common versus a in particular, whether results from suitably restricted database-specific interface, where the latter enabled fine- searching (following a user's navigation into a tuned searches. Only two of forty-four participants hierarchically arranged directory structure) would be preferred the database-specific interface and only four measurably better than that from a simple global search at preferred to have both interfaces. The limitation of this the root level. Secondly, we wanted to determine whether survey, however, was that the authors did not assess the restricted search using standard collection-wide idf values actual uses and effectiveness of the searches under or using localized idf values (defined by the sub- investigation. Furthermore, results were merely ordered collection being searched) provided better performance. chronologically and by author (Payete, 1997). In both of these investigations, we wanted to determine Using a technique well-suited for tree-like database whether higher precision could be achieved without a entries, Meuss and Schulz developed a novel approach significant loss of recall. which combines querying and navigation (Meuss, 2001). They extended Kilpeläinen's Tree Matching formalism by 4. Data introducing a concept called "complete answer West Publishing, a West Group entity, first began aggregates." It relies on a new data structure applicable publishing judicial opinions in the 1870s. Its National for queries that can be represented as partial descriptions Reporter System (NRS) now contains approximately six of trees. They assert that it produces efficient results and million published case law opinions from virtually every works well for data encoded with markup languages like state and federal jurisdiction. Yet in the past decade, the SGML, HTML, or XML. How practical this approach is amount of non-legal data available on Westnews, the and to what extent it would scale to operational news-based data within Westlaw, has been growing at applications remains an open question. nearly an exponential rate. The Westlaw Directory Park examined user-system interaction and database organizes over 15,000 unique databases in a tree-like selection in the TREC environment, investigating whether structure; it lists approximately 100,000 databases at its users prefer and perform better when interacting with leaf nodes. In some regions, the Directory is 12 or more multiple databases separately with a common interface or levels deep, but it averages 4-6 levels. So on the average, interacting with the databases as if they were one integrated interaction. Her findings suggest that (1) more 2 In this paper, we use collection to refer to a database of user control is important in a distributed environment, (2) textual documents. each one of the databases is represented in the Directory were generalized, a conflation of queries resulted. This set between 6 and 7 times. For example, a database dedicated ultimately produced 30 queries of medium-range to Nevada Environmental Regulations might be listed granularity. The average length of the queries was slightly under “Nevada Materials,” under “Environmental Laws,” over three terms before stop words were removed and as well as under “Regulatory Materials.” In our slightly under three terms after. These queries were experiments, each of the 100,000 database representations subsequently divided into two sets, with set A containing was included. labor and tax queries and set B containing more general The database representations or profiles consist of legal queries (See Tables 1 & 3). Such a division, in metadata documents containing the titles, topical principle, would permit us to determine whether results coverage, and other significant content information from specific legal practice areas would be measurably associated with the databases. A Westlaw source and different from results originating from more general legal publication “authority file” is leveraged for this purpose; queries. it contains one profile for each database. These profile “documents” are in turn linked and indexed under the 6. System Environment Westlaw Directory hierarchy. A facsimile of one of these Our experiments were conducted on a Sun Enterprise database profiles is shown in Figure 1 below. 4500 which also executed an Apache web server application. Our system harnessed the WIN search engine3 (Thompson, 1995; Turtle, 1991; Turtle, 1994), a cousin to the INQUERY engine developed at the Center 5593 for Intelligent Information Retrieval at the University of GLBLGOVERN Massachusetts (Allan, 1997; Broglio, 1993). We used Global Governance WIN to run against all or a portion of our Westlaw A Review of Multilateralism and Directory data. INQUERY's and WIN's algorithms for International Organizations ranking documents have been previously reported Dow Jones Interactive (above), although we modified WIN's scoring formula for ALLNEWS, …, ENVNEWS, INTNEWS our last search mode which incorporated the use of dynamically calculated sub-directory specific local idf Economic Development values. Definitions for WIN’s probability of relevance Human Rights (belief score) and its term frequency (tf) and inverse Environmental Preservation document frequency (idf) components are presented ...... below. Non-Governmental Bodies (NGOs)

pbelief (wi | dj) = db + (1 – db) · tfb · idfb, (1) INT, ASA, AUS, CAN, EUR, NZ, US, UK Figure 0 where, ..

log(tf i  0.5) Figure 1: Facsimile Database Profile tfb = dt + (1 – dt) · (2) log(tf max 1.0) 5. Queries We began with approximately 50 real user queries N  0.5 originating from our existing database selection system's log( ) idf b = n (3) query logs. In order to carefully test the browse and search approach, we required queries that were not log(N  1.0) excessively specific (which would be indicative of leaf nodes in the Directory) and not excessively general n represents the number of database profile (which would be indicative of high-level nodes in the “documents” in which the query term wi appears while N Directory, rather than intermediate ones). In cases where is the total number of database profile documents. db is the the queries were not of a sufficiently useful level of minimum belief component and dt is the minimum term granularity, they were either discarded or refined to focus frequency component when term wi is present in a on more than simply, for instance, a legal topic. Similarly, document, dj. The variable tfmax is the frequency of the they were not left so specific that they would retrieve most frequently occurring term in the collection. In the only one or two particular databases. This task was three cases described below, N will vary depending on performed by a legal domain expert from our sponsoring department. In a few instances, where specific queries 3 WIN stands for Westlaw is Natural. whether one is calculating idf by means of the global or local document set. Using a humanIssues to select the intermediate nodes to (1) Global Search 7. Experiments search does introduceDirectory some subjectivity… to the study. We ran two sets of queries in each of three search However, one of the purposes of the experiment was to modes. Our sets of queries included: (a) labor and tax; and measure the results obtained from increased human (b) general legal, while the three search modes consisted interaction with a database selection system. The baseline of: measure of such interaction would logically be whether searchingNews individualInt’l human-selected Topical nodes would 1. global search of the entire data hierarchy improve precision overContent searching the entireMaterials hierarchy of … using global idfs; nodes. Furthermore, the nodes were selected not by an 2. local search of the relevant sub-directories “expert” who would know all locations, or even the best using global idfs; location where the source materials would be found, but by a person with knowledge of the sources requested and 3. local search of the relevant sub-directories therefore a reasonable idea of where, given a list of top- using the appropriate dynamically calculated local level categories, the material could be found. idfs; where the idf parameter (Section 6) was modified 8. Results depending on the currently defined scope of the document Tables 1 & 2 and their labor and tax queries (set A, in set, either global or local. Appendix) present a detailed overview of the complete It should be noted that the "relevant sub-directories" performance we observed as measured by micro-averaged were determined by a domain expert, an attorney with a precision. The performance improvements that occurred background in library science. They were selected in a when shifting to localized searches and again when using principled manner, focusing on the first sub-directory or local idfs were statistically significant (33.9%  50.4% sub-directories under which a user would reasonably  77.0% for Good Matches and 60.6%  83.3%  expect to find the initial relevant materials. (See Figure 2) 86.7% for Good and Close Matches). The emphasis was placed on the top 20 results, for both recall and precision, since users generally do not tend to examine candidate databases beyond this initial grouping.

Westlaw Database Directory 15,000+ Databases 100,000+ Leaf Nodes Science & Technolog … y

(2) and (3) Federal State Localized Searches Materials Materials

Case Law Statutes Law Reviews

Figure 2: Sample Search / Browse Scenario for Search-types: “Publications of the Federal Government on Bioengineering & Cloning” By contrast, Tables 3 & 4 and their more general set of presented in Figure 10. A graph of recall results for the legal queries (set B, in Appendix) show some of the two sets is shown in Figure 11. These figures appear in nuances that can be associated with certain queries, the Appendix, Section 13. regardless of the type or level of search. Consequently, the differences in performance shown between search- modes are not as pronounced, as seen in Table 4 (40.9%  48.5%  47.1% for Good Matches and 61.6%  75.4%  76.2% for Good and Close Matches). Part of the reason for this mitigated performance can be explained by some of the esoteric or difficult to field queries contained in this set (e.g., "federalist papers", "florida cle [continuing legal education] materials", or other database identifier abbreviations such as "cftc"). Yet even query set B demonstrates that browse and localized search produces results that are superior to global search alone, though in this case the differences between use of global and local idfs are statistically insignificant. Along with increased precision, we would expect some degradation in recall as we proceed from searching the entire Directory to searching a subsection of it. We measured recall for these queries by assuming that the Figure 4: Westlaw Directory – Upper Level highest total of good and close matches for each query represented 100% recall for this collection. The corresponding graphical interfaces that were developed to present candidate database results to the Using this definition, the right-hand columns of Tables project assessors are presented in Figures 3 through 7. 2 and 4 demonstrate that recall, micro-averaged, does suffer appreciably as we move from searching at the root node to searching sub-branches with global idfs (92.2%  69.8% in Table 2 and 88.1%  79.0% in Table 4). However, recall made a slight recovery when we used dynamic idfs to generate our search results (69.8%  73.2% in Table 2 and 79.0%  79.9% in Table 4).

Figure 5: Westlaw Directory – Intermediate Level

The initial prototype screens appear in Figures 6 and 7. The project they were associated with was entitled Database Navigation or DBN for short. These were early Figure 3: Westlaw Directory – Root Level proof of concept screens intended to give the assessors a qualitative as well as quantitative notion of how effective browse and search techniques could be, and how they A graphical summary of micro-averaged precision might contribute to Westlaw. results for the labor & employment query set (A) is presented in Figure 8, for the general query set (B) is presented in Figure 9, and for the combined set (A + B) is 9. Conclusions The evidence resulting from our experiments suggests that browse and localized search can improve precision in comparison to conventional global searches in environments where the data is organized in a hierarchical fashion such as the Westlaw Directory. We have further determined that restricted searches using localized idfs defined by the sub-branches inspected provide a more effective performance than restricted searches using the often more easily obtainable global idfs. Further, our exploration suggests that use of local idfs provides the best results in terms of improved precision with a minimal reduction in recall. Although the results presented in Table 2 best support these claims, Table 4 delivers the same behavior, though in a less dramatic manner. Nonetheless, both Tables 2 and 4, together with Figures Figure 6: Intermediate Level DBN Query Box 8-10, show that localized search produces higher precision results than global search (i.e., searches undertaken from the root node). As a consequence of this potential performance improvement, time previously spent on sometimes agonizing database selection searches can now be more productively spent on actual document retrieval and analysis. Moreover, an effective browse and search paradigm provides users, especially professional users, the ability to exert more control over their searches (Park 2000). Our performance evaluation is based largely on conventional notions of precision and recall. The data sets we have judged relevant to the user's information need relied upon a broad sense of relevance, since our queries tended to be short in nature. However, it is likely that our users would apply a narrower concept of relevance to their queries, given their contextual understanding of their specific information need. Hence, some of the data sets from alternative paths (i.e., paths not taken by the users) Figure 7: DBN Results Listings that we have judged relevant, even though simply potentially relevant, may in fact be regarded as less relevant once our users descend along other richer paths in our hierarchical tree. The significance of this 11. Acknowledgements observation is that our performance figures for the We wish to thank West Online Research for their legal localized search query modes would be best viewed as domain expertise and for the many hours spent reviewing lower bounds with respect to recall [since the number of and judging the results of our test queries. relevant databases–in the denominator–would decrease]. Thus, the perceived loss of recall upon invoking browse 12. References and search might be less significant than it initially Allan, J., Callan, J., Croft, W.B., Ballesteros, L., Byrd, J., appears. Swann, R., & Xu, J. (1997). Inquery does battle with TREC-6. In Proceedings of the Sixth Text REtrieval Conference 10. Future Work (TREC-6) (pp.169-201). Washington, D.C.: TREC. As a result of our findings from our experiments, Belkin, N.J. (2000). Helping people find what they don’t know. digging down plus restricted search functionality---a.k.a. ACM Communications, 43(8) (pp. 58-61). New York: ACM Database Navigation---will be incorporated into the next Press. version of the Westlaw Directory. We thus plan further Bishop, A. P. (1995). Working toward an understanding of evaluation of users’ research results in a “live” production digital library use: a report on the user research efforts of the environment. nsf/arpa/nasa dli projects. In D-Lib Magazine. http://www.dlib.org/dlib/october95/10bishop.html Bopp, R.E., & Smith, L.C. (2000). Chapter 3: Bibliographic Information Systems, 19(2) (pp. 161-215). New York: ACM Control, Organization of Information, and Search Strategies. Press. rd In Reference and Information Services, An Introduction (3 Park, S. (2000). Usability, user preferences, effectiveness, and ed.) (pp. 88-93). Englewood, CA: Libraries Unlimited. user behaviors when searching individual and integrated full- Broglio, J., Callan, J., & Croft, W.B. (1993). Inquery system text databases: Implications for digital libraries. In Journal of overview. In Proceedings of the TIPSTER Text Program the American Society for Information Science, 51(5) (pp.456- (Phase I) (pp. 47-67). Washington, D.C.: TIPSTER. 468). Washington, D.C.: JASIS. Buckland, M. (2001). Entry vocabularies, intermediaries, and Paytette, S. D., & Rieger, O.Y. (1997). Z39.50: The user’s retrieval performance. In Proceedings of the 64th Annual perspective. In D-Lib Magazine. Meeting of the American Society for Information Science and http://www.dlib.org/dlib/april97/cornell/04payette.html Technology (Washington, D.C.) (pp.112-117). Medford, NJ: Saracevic, T. (1997). Users lost: Reflections on the past, present, Information Today, Inc. future, and limits of information science. In Proceedings of Callan, J. & Croft, W.B. (1995). Searching distributed the 20th Annual Int’l ACM SIGIR Conference on Research and collections with inference networks. In Proceedings of the 18th Development in Information Retrieval (Philadelphia, PA) (pp. Annual Int’l ACM SIGIR Conference on Research and 1-2). New York: ACM Press. Development in Information Retrieval (Seattle, WA) (pp. 21- Stellin, S. (2000). E-commerce report: Internet companies learn 29). New York: ACM Press. how to personalize. In New York Times (pp. C8). New York: Chakravarthy, A. S. & Haase, K. B. (1995). Netserf: Using The Times Company. semantic knowledge to find Internet information archives. In th Thompson, P., Turtle, H., Yang, B., & Flood, J. (1995). TREC-3 Proceedings of the 18 Annual Int’l ACM SIGIR Conference ad hoc retrieval and routing experiments using the WIN on Research and Development in Information Retrieval system. In Proceedings of the Third Text REtrieval (Seattle, WA) (pp. 4-11). New York: ACM Press. Conference (TREC-3) In (pp. 211-217). Washington, D.C.: Dolan, R. Agrawal, D., Dillon, L. & Abbadi, A. E. (1996). NIST. Pharos: A scalable distributed architecture for locating Turtle, H. (1994). Natural language vs. Boolean query heterogeneous information sources. In Tech Report TRCS95- evaluation: A comparison of retrieval performance. In 05, Santa Barbara: University of California, S.B., Department Proceedings of the 17th Annual Int’l ACM SIGIR Conference of Computer Science. on Research and Development in Information Retrieval Fidel, R. & Crandall, M. (1998). The role of subject access in (Dublin, Ireland) (pp. 212-221). London: Springer-Verlag. information filtering. In P. A. Cochran and E. H. Johnson st Turtle, H.R. (1991). Inference Networks for Document (Eds.), Visualizing Subject Access for 21 Century Retrieval. Information Resources (pp.16-27). Urbana-Champaign: PhD. Dissertation. Amherst, MA: University of University of Illinois. Massachusetts. Gravano, L., Garcia-Molina, H. & Tomasic, A. (1994). The Van House, N. A., Butler, M. H., Ogle, V. & Schiff, L. (1996). effectiveness of GlOSS for the text database discovery User centered iterative design for digital libraries: the cypress Problem. In Proceedings of the 1994 ACM SIGMOD Int’l experience. In D-Lb Magazine. Conference of Management of Data (Minneapolis, MN) http://www.dlib.org/dlib/february96/02vanhouse.html (pp.126-137). New York: ACM Press. Yuwono, B. & Lee D.L. (1997). Server ranking for distributed French, J.C., Powell, A.L., Callan, J., Viles, C.L., Emmitt, T., text retrieval systems on the Internet. In Proceedings of the 5th Prey, K.J. & Mou, Y. (1999). Comparing the performance of st Annual Int’l Conference on Database Systems for Advanced database selection algorithms. In Proceedings of the 21 Applications (Melbourne, Australia) (pp.41-49). Sydney: Annual Int’l ACM SIGIR Conference on Research and World Scientific Press. Development in Information Retrieval (Berkeley, CA) (pp. 238-245). New York: ACM Press. Wu, M., Fuller, M., & Wilkenson, R. (2000). The role of a judge in a user based retrieval experiment. In Proceedings of Hearst, M. A. (1994). Using categories to provide context for the 23rd Annual Int’l ACM SIGIR Conference on Research and full-text retrieval results. In Proceedings of the RIAO Development in Information Retrieval (Athens, Greece) (pp. (Computer Assisted Information Retrieval) Conference (New 331-333). New York: ACM Press. York, NY) (pp. 115-130). Paris: CID. Kramer, J. Noronha, S. & Vergo, J. (2000). A user-centered design approach to personalization. In ACM Communications, 43(8) (pp. 45-48). New York: ACM Press. Lloyd, C. (1996). A new digital library project on delivery of copyright materials in electronic format: The decomate user study. In London School of Economics and Political Science. London: LSEPS. http://www.lse.ac.uk/decomate/docs/71.htm Meuss, H., & Schultz, K. U. (2001). Complete answer aggregates for tree-like databases: A novel approach to combine querying and navigation. In ACM Transactions on 13. Appendix:

Performance Evaluation ― Precision & Recall

Figures 8, 9, 10: Comparison of Precision (Sets A, B, A + B) Figure 11: Comparison of Recall (Sets A + B)

Tables 1, 2, 3, 4: Query Sets A + B (Query-by-Query Performance)

Precision of 3 Methods - Set A

Root Node; Global 33.89% 26.67% 39.44% IDFs

Relevant Intermediate Node; 50.36% 32.97% 16.67% Partially Relevant Global IDFs Not Relevant

Intermediate Node; 77.03% 9.64% 13.33% Local, Dynamic IDFs

0% 20% 40% 60% 80% 100%

Figure 8: Comparison of Precision (Set A)

Precision of 3 Methods - Set B

Root Node; Global 40.93% 20.65% 38.42% IDFs

Relevant Intermediate Node; Partially Relevant Global IDFs 48.53% 26.86% 24.62% Not Relevant

0% 20% 40% 60% 80% 100%

Figure 9: Comparison of Precision (Set B) Precision of 3 Methods - Sets A and B

Root Node; Global IDFs 38.67% 22.58% 38.75%

Relevant Intermediate Node; 47.38% 27.86% 21.18% Partially Relevant Global IDFs Not Relevant

0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%

Figure 10: Comparison of Precision (Sets A + B)

Recall of 3 Methods

100.00% 92.22% 90.00% 79.10% 79.90% 80.00% 88.10% 70.00% 73.15% 60.00% 69.81% Set A 50.00% Set B 40.00% 30.00% 20.00% 10.00% 0.00% Root Node; Intermediate Intermediate Global IDFs Node; Global Node; Local, IDFs Dynamic IDFs

Figure 11: Comparison of Recall (Sets A + B) Table 1: Query Set A – Sample Labor and Tax Queries No. Text of Query 1. employment and labor relations 2. federal congressional legislative materials from Washington 3. list of reporters 4. OSHA publications 5. labor relations materials 6. administrative materials relating to specific occupations 7. tax materials from Texas 8. tax court materials from Arizona 9. Commerce Clearing House materials

Table 2: Performance Evaluation I (Precision & Recall) Labor and Tax Queries No. Total Good % Good Close % Bad % Full Query % Results Match Match Close Match Bad Recall Recall Recall Root Avgs 180 61 33.9% 48 26.7% 71 39.4% 122 109 92.2% Global idf 113 45 50.4% 38 33.0% 30 16.6% 122 83 69.8% Totals / Avgs Dynamic idf 113 77 77.0% 12 9.6% 24 13.4% 122 89 73.2% Totals / Avgs

Table 3: Query Set B – Sample General Legal Queries No. Text of Query 1. biographies of us senators 2. judicial disciplinary materials 3. rules for judicial conduct 4. federalist papers 5. state legislative news 6. federal reserve board issuances 7. comptroller of the currency materials 8. social security reporting service 9. food and drug administration 10. board of immigration 11. cftc 12. voting rights 13. interstate commerce commission 14. superfund 15. pbgc 16. compensation planning 17. delaware corporate law 18. florida cle materials 19. rutter group

Table 4: Performance Evaluation II (Precision & Recall) – General Legal Queries No. Total Good % Good Close % Bad % Full Query % Results Match Match Close Match Bad Recall Recall Recall Root Avgs 333 111 40.9% 76 20.7% 146 38.4% 215 187 88.1% Global idf 237 95 48.5% 72 26.9% 70 24.6% 215 167 79.0% Totals / Avgs Dynamic idf 237 90 47.1% 79 29.1% 67 23.8% 215 170 79.9% Totals / Avgs