Medical Text Indexer (MTI)
MTI Processing Flow Explained
(Last Updated: Monday, March 13, 2006)

[Cover figure: MTI processing-flow diagram. Box labels: Title + Abstract; Phrasex; Noun Phrases; Trigram; PubMed Related Citations; MetaMap; UMLS concepts; Rel. Citations; Extract MeSH descr.; Restrict to MeSH; MeSH Main Headings; Clustering & Ranking; Ordered list of MeSH Main Headings]

Index

1. Introduction
2. Exclusions
3. Clustering and Ranking
    3.1. Overview of Clustering and Ranking (from BoSC99 report)
    3.2. UMLS Metathesaurus Files
        3.2.1. Related Concepts (File = MRREL)
        3.2.2. Co-occurring Concepts (File = MRCOC)
    3.3. Creating the Normalized Frequency Scores for the Co-Occurring Concepts
        3.3.1. Overview
        3.3.2. Detailed Explanation and Example
    3.4. Calculating TermWeight
        3.4.1. Tunable and System Parameters
        3.4.2. Steps Followed in Calculating the TermWeight
    3.5. Clustering
        3.5.1. Overview of Steps for Clustering
        3.5.2. Example of Clustering
    3.6. Calculating the RankScore
        3.6.1. Summary of Steps for Calculating the RankScore
        3.6.2. Example of Calculating the RankScore
4. Determining Heading Mapped to (Optional)
5. Boosting New Terms
6. Emphasize Titles
7. Emphasize HSTAR (Optional)
8. Float Chemicals
9. Determine TopN Terms List
10. Senile Plaque/Dental Plaque Disambiguation
11. Medium Filtering (Optional)
12. Validate TopN Terms
13. MH/SH Substitution
14. Add drug therapy SH
15. Drop physiology & analysis SHs
16. Add CheckTags from Text and doAgedReview (Optional)
17. Add Geographics from Text (Optional)
18. Strict Filtering (Optional)
19. Update Chemicals (Optional)
20. Add USA CheckTag from Text (Optional)
21. Display Results
    21.1. showHMs Display Option
    21.2. limitTitleOnly Display Option
    21.3. RSfilterTO Display Option
    21.4. limitPTs Display Option
    21.5. showETs Display Option

Appendix A  MTI Exceptions for Medium Filtering
Appendix B  MTI Heuristics for Medium Filtering
Appendix C  CheckTags (2006)
Appendix D  Bad CheckTags List (BadCTs)
Appendix E  SubHeadings (2006)
Appendix F  Geographics Lookup & Substitution List (GEOs)
Appendix G  MH/SH Lookup & Substitution List (MHSHs)
Appendix H  MH Exclusion List (MH_Excludes)
Appendix I  UMLS Concepts to Exclude List (CUI_Excludes)
Appendix J  New MeSH Headings List (NewTerms)
Appendix K  Additional US Triggers List (USAtriggers)
Appendix L  Lookup Lists
Appendix M  MeSH SubHeading Treecode Triggers List
Appendix N  Special Publication Type List

List of Equations

Equation 1 - TermWeight Formula
Equation 2 - RankScore Formula

List of Figures

Figure 1: Detailed Medical Text Indexer Process Flow Diagram
Figure 2: Picture of how we traverse the item list for clustering