What Is Pairs Trading
Searching for a better search…
LYNC Search

Radhika Gupta, Nalin Moniz, Sudipto Guha
CSE 401 Senior Design. April 11th, 2005.

"for" is a very common word and was not included in your search.

Table of Contents for "Searching for a better search" (2 Semesters)

Abstract
    A summary of the topics covered in our paper.

Introduction and Definitions
    An introduction to web searching algorithms and the Link Analysis Rank Algorithms space, as well as a detailed list of key definitions used in the paper.

Survey of the Literature
    A detailed survey of the different classes of Link Analysis Rank algorithms, including PageRank based algorithms, local interconnectivity algorithms, and HITS and the affiliated family of algorithms. This section also includes a detailed discussion of the theoretical drawbacks and benefits of each algorithm.

    Page Ranking Algorithms
        An examination of the idea of a simple page rank algorithm and some of the theoretical difficulties with page ranking, as well as a discussion and analysis of Google's PageRank algorithm.

    Introducing Local Inter-Connectivity
        A discussion of the motivation for using local connectivity in page rank algorithms, as well as a detailed discussion and analysis of both the Hilltop and the LocalRank algorithms.

    Hub and Authority Based Ranking Algorithms
        A discussion of the HITS algorithm and the ideas of hubs and authorities, as well as an examination of variations of the HITS algorithm including PHITS, Hub Averaging HITS, SALSA, BFS, and the non-linear dynamic algorithms, AT(k) and NORM(p).

Survey of Rank Merging and Aggregation
    A summary of our research on rank merging and aggregation, including a description of key distance measures as well as rank aggregation algorithms that we use in our implementation.

Hybrid Algorithm Proposals
    A discussion of the technical details and motivations behind six of our own hybrid algorithms, developed by analyzing the flaws and strengths of the different classes of Link Analysis Rank algorithms.

Technical Approach
    A discussion of our approach to this project, including the research done in algorithms, the implementation of the algorithms, and the analysis of the results. Includes system architecture diagrams.

Technical Approach Appendix
    Samples of the XML schema of our software, sample output, and sample surveys and results.

Analysis of Survey Results
    A detailed statistical analysis of the results of our survey, including a discussion of the performance of different hybrid algorithms as well as possible explanations.

Future Improvements
    A discussion of possible future improvements that could be made to the hybrid algorithms.

Milestones and Timeline
    The milestones and timeline for this project.

Conclusion and Reflections
    Our thoughts, reflections, and observations on this year long project.

References
    A list of all our references.

Sponsored Links
    PROBLEM Solved: Beta Power! (www.PROBLEM.com)
    Free CANDDE: Come get it. Don't be left dangling! (www.CANDDE.gov)
    A PAT on the Back: The Best Authorities on Every Subject (www.PATK.edu)
    Paging PAGE: The shortest path to perfect search. (www.PAGE.net)
    Simple Borda: Bored of Being Alone? Let's Merge ☺ (www.sborda.com)
    Geometric Borda: Really Bored? Like Really Bored? (www.gborda.gov)
    Markov Merge: Mark My Words… This is really good ☺ (www.markov.uk)
    CSE 400: 004 Sleepless Nights (www.hell.com)
    CSE 401: 104 Sleepless Nights (www.hellreloaded.com)
    CSE 401 is Over! No More Sleepless Nights (www.sleep.com)

Coming Soon! LYNC Goes Public: LYNC Gets Bigger and Better!!!
©2005 LYNC Nalin Moniz and Radhika Gupta

ABSTRACT

Efficient and accurate ranking of web pages in response to a query is at the core of information retrieval on the Internet. The problem of web search has been most commonly approached through Link Analysis Rank algorithms. Google's PageRank is one of the better-known algorithms in this class, and was reasonably successful until the growth of blogs and link exchanges allowed for the manipulation of the system. These developments gave rise to the idea that rankings should focus on links coming from the relatively more important sources, the notion at the heart of the LocalRank and Hilltop algorithms. At the same time, researchers began to investigate algorithms that used a dual ranking system instead of a single ranking system. Kleinberg popularized the idea of ranking pages separately for outgoing and incoming hyperlinks in his seminal paper on Hyperlink-Induced Topic Search (HITS). HITS and related algorithms have met with some success, yet they have fundamental symmetry flaws that mandate a shift away from the linear to the non-linear system paradigm if they are to be solved. The early classes of non-linear algorithms, NORM(p) and AT(k), have performed well on small samples of queries and moderate-sized systems despite their computational limitations, but have yet to gain widespread popularity.
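To make the dual ranking idea concrete, the following is a minimal sketch of the standard HITS iteration, in which each page receives a hub score (from its outgoing links) and an authority score (from its incoming links). It is an illustration only, not the implementation used in this project; the function names `hits` and `normalize`, the dictionary-based graph format, and the three-page example graph are all assumptions made for the sketch.

```python
import math


def normalize(scores):
    """Scale a score dictionary to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {page: v / norm for page, v in scores.items()}


def hits(links, iterations=50):
    """Run the basic HITS iteration.

    links: dict mapping each page to the list of pages it links to
    (a hypothetical input format chosen for this sketch).
    Returns (hub, authority) score dictionaries.
    """
    pages = set(links) | {q for targets in links.values() for q in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority of q: sum of hub scores of the pages that link to q.
        new_auth = {p: 0.0 for p in pages}
        for p, targets in links.items():
            for q in targets:
                new_auth[q] += hub[p]
        # Hub of p: sum of the updated authority scores of the pages p links to.
        new_hub = {p: sum(new_auth[q] for q in links.get(p, [])) for p in pages}
        auth = normalize(new_auth)
        hub = normalize(new_hub)
    return hub, auth


# Toy graph: pages "a" and "b" both point to "c".
hub, auth = hits({"a": ["c"], "b": ["c"], "c": []})
```

In this toy graph, "c" collects all of the authority weight while "a" and "b" share the hub weight equally, which is the separation between incoming and outgoing links that the single PageRank score does not make.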
Our project looks at the work done in non-linear dynamical systems for ranking web pages, and develops hybrid algorithms that bring together the simplicity and local focus of linear algorithms like LocalRank, while exploiting the benefits of the non-linear and other interesting paradigms. The models we propose draw from Hilltop, LocalRank, HITS, and the AT(k) classes of algorithms. Our first algorithm, PROBLEM, uses the concept of beta distributed user web surfing instead of the random surfer model of PageRank, while our second algorithm, PAGE, is based on percentage-of-shortest-path ideas. Our third algorithm, PAT(k), is a modification of the non-linear dynamic algorithm AT(k), while our fourth algorithm, CANDDE, takes a new approach to dangling links. We test our algorithms in practice by implementing a demonstrative system that crawls the web on the upenn.edu domain and retrieves the top ranked pages for a particular query on a particular algorithm. We then measure our results by asking users on the Penn campus to participate in a survey, which asks them to rate the different rankings. The experimental approach we adopt is drawn from Tsaparas (Tsaparas 67) and Kleinberg and involves comparing the performance of our algorithms with PageRank, LocalRank, Hilltop, HITS, and AT(k). To explore the newer field of rank aggregation, we also include as benchmarks three algorithms that essentially merge PageRank and HITS using different merging schemes (Borda and Markov Chain ideas). We perform a detailed statistical analysis of our results, from which we gather that, for these particular queries, the algorithms tested fall into four different performance buckets. CANDDE outperformed all the other algorithms, largely because of the influence of dangling links on a small graph. PROBLEM and PAT(k) fell into the second performance bucket, which included the HITS algorithm and a Borda merge.
PROBLEM performed relatively well because the Beta distribution model was effectively able to capture the fact that upenn.edu has a few central pages that contain most of the information and to which users are most likely to jump. PAT(k), on the other hand, performed well because it captured the essence of the non-linear dynamical system filtering. In the third bucket, we saw PageRank, AT(k), and the Markov algorithms, and in the fourth and worst-performing bucket, we saw PAGE and LocalRank.
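As an illustration of the Borda-style benchmarks mentioned above, here is a minimal sketch of a simple positional Borda count that merges two rankings of the same pages. It is not necessarily the exact variant used in this project; the function name `borda_merge`, the point scheme (n points for first place down to 1 for last), the alphabetical tie-break, and the example orderings are assumptions made for the sketch.

```python
def borda_merge(rankings):
    """Merge several best-first rankings with a simple Borda count.

    rankings: list of lists, each an ordering of the same pages, best first.
    A page in position i (0-based) of a ranking of length n earns n - i points.
    Returns the pages sorted by total points, ties broken alphabetically.
    """
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, page in enumerate(ranking):
            scores[page] = scores.get(page, 0) + (n - position)
    return sorted(scores, key=lambda page: (-scores[page], page))


# Hypothetical orderings, standing in for a PageRank list and a HITS list.
pagerank_order = ["a", "b", "c", "d"]
hits_order = ["b", "c", "a", "d"]
merged = borda_merge([pagerank_order, hits_order])
print(merged)  # ['b', 'a', 'c', 'd']
```

Here "b" scores 3 + 4 = 7, "a" scores 4 + 2 = 6, "c" scores 2 + 3 = 5, and "d" scores 1 + 1 = 2, so the merged list rewards pages that both input rankings place reasonably high.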