Fractal Prefetching B+-Trees: Optimizing Both Cache and Disk Performance

Shimin Chen, Phillip B. Gibbons†, Todd C. Mowry, and Gary Valentin§

School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213
†Information Sciences Research Center, Bell Laboratories, Murray Hill, NJ 07974*
§DB2 UDB Development Team, IBM Toronto Lab, Markham, Ontario, Canada L6G 1C7

{chensm,tcm}@cs.cmu.edu [email protected]

*Current affiliation: Intel Research Pittsburgh, 417 South Craig St., Pittsburgh, PA 15213. phillip.b.gibbons@intel.com

ABSTRACT

B+-Trees have been traditionally optimized for I/O performance with disk pages as tree nodes. Recently, researchers have proposed new types of B+-Trees optimized for CPU cache performance in main memory environments, where the tree node sizes are one or a few cache lines. Unfortunately, due primarily to this large discrepancy in optimal node sizes, existing disk-optimized B+-Trees suffer from poor cache performance while cache-optimized B+-Trees exhibit poor disk performance. In this paper, we propose fractal prefetching B+-Trees (fpB+-Trees), which embed "cache-optimized" trees within "disk-optimized" trees, in order to optimize both cache and I/O performance. We design and evaluate two approaches to breaking disk pages into cache-optimized nodes: disk-first and cache-first. These approaches are somewhat biased in favor of maximizing disk and cache performance, respectively, as demonstrated by our results. Both implementations of fpB+-Trees achieve dramatically better cache performance than disk-optimized B+-Trees: a factor of 1.1-1.8 improvement for search, up to a factor of 4.2 improvement for range scans, and up to a 20-fold improvement for updates, all without significant degradation of I/O performance. In addition, fpB+-Trees accelerate I/O performance for range scans by using jump-pointer arrays to prefetch leaf pages, thereby achieving a speed-up of 2.5-5 on IBM's DB2 Universal Database.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ACM SIGMOD '2002 June 4-6, Madison, Wisconsin, USA
Copyright 2002 ACM 1-58113-497-5/02/06 ...$5.00.

1. INTRODUCTION

The B+-Tree is a ubiquitous structure for indexing disk-resident data. It provides basic index operations such as search, range scan, insertion and deletion, while minimizing the number of disk accesses. To optimize I/O performance, traditional "disk-optimized" B+-Trees are composed of nodes the size of a disk page, i.e., the natural transfer size for reading or writing to disk.

Recently, several studies [5, 6, 19] have considered B+-Tree variants for indexing memory-resident data. These studies present new types of B+-Trees (cache-sensitive B+-Trees [19], partial-key B+-Trees [5], and prefetching B+-Trees [6]) that optimize for CPU cache performance by minimizing the impact of cache misses. These "cache-optimized" B+-Trees are composed of nodes the size of a cache line¹, i.e., the natural transfer size for reading or writing to main memory.

¹ In the case of prefetching B+-Trees [6], the nodes are several cache lines wide.

Unfortunately, B+-Trees optimized for disk suffer from poor CPU cache performance, and B+-Trees optimized for cache suffer from poor I/O performance. This is primarily because of the large discrepancy in node sizes: disk pages are typically 4KB-64KB while cache lines are often 32B-128B, depending on the system. Thus existing disk-optimized B+-Trees suffer an excessive number of cache misses to search in a (large) node, wasting time and forcing the eviction of useful data from the cache. Likewise, existing cache-optimized B+-Trees, in searching from the root to the desired leaf, may fetch a distinct page for each node on this path. This is a significant performance penalty, for the smaller nodes of cache-optimized B+-Trees imply much deeper trees than in the disk-optimized cases (e.g., twice as deep). The I/O penalty for range scans on nonclustered indexes of cache-optimized trees is even worse: a distinct page may be fetched for each leaf node in the range, increasing the number of disk accesses by the ratio of the node sizes (e.g., a factor of 500).

1.1 Our Approach: Fractal Prefetching B+-Trees

In this paper, we propose and evaluate Fractal Prefetching B+-Trees (fpB+-Trees), which are a new type of B+-Tree that optimizes both cache and I/O performance. In a nutshell, an fpB+-Tree is a single index structure that can be viewed at two different granularities: at a coarse granularity, it contains disk-optimized nodes that are roughly the size of a disk page, and at a fine granularity, it contains cache-optimized nodes that are roughly the size of a cache line. We refer to a fpB+-Tree as being "fractal" because of its self-similar "tree within a tree" structure, as illustrated in Figure 1. The cache-optimized aspect is modeled after the prefetching B+-Trees that we proposed earlier [6], which were shown to have the best main memory performance for fixed-size keys. (We note, however, that this general approach can be applied to any cache-optimized B+-Tree.) In a prefetching B+-Tree, nodes are several cache lines wide (e.g., 8; the exact number is tuned according to various memory system parameters), and prefetching is used so that the time to fetch a node is not much longer than the delay for a single cache miss.

Figure 1: Self-similar "tree within a tree" structure

We design and evaluate two approaches to implementing fpB+-Trees: (i) disk-first and (ii) cache-first. In the disk-first approach, we start with a disk-optimized B+-Tree, but then organize the keys and pointers within each page-sized node as a small tree. This in-page tree is a variant of the prefetching B+-Tree. To pack more keys and pointers into an in-page tree, we use short in-page offsets rather than full pointers in all but the leaf nodes of an in-page tree. We also show the advantages of using different sizes for leaf versus non-leaf nodes in an in-page tree. In contrast, the cache-first approach starts with a cache-optimized prefetching B+-Tree (ignoring disk page boundaries), and then attempts to group together these smaller nodes into page-sized nodes to optimize disk performance. Specifically, the cache-first approach seeks to place a parent and its children on the same page, and to place adjacent leaf nodes on the same page.

Maintaining both structures as new keys are added and nodes split poses particular challenges. We will show how to process insertions and deletions efficiently in both disk-first and cache-first fpB+-Trees. We select the optimal node sizes in both approaches to maximize the number of entry slots in a leaf page while analytically achieving search cache performance within 10% of the best.

Ideally, both the disk-first and the cache-first approaches would achieve identical data layouts, and hence equivalent cache and I/O performance. In practice, however, the mismatch that almost always occurs between the size of a cache-optimized subtree and the size of a disk page (in addition ...

... size when considering buffer cache performance, for disk-resident data. Lomet's recent survey of B+-Tree techniques [16] mentioned the idea of intra-node micro-indexing: i.e., placing a small array in a few cache lines of the page that indexes the remaining keys in the page. While it appears that this idea had not been pursued in any detail before, we compare its performance against fpB+-Trees later in our experimental results. We observe that while micro-indexing achieves good search performance (often comparable to fpB+-Trees), it suffers from poor update performance. As part of future directions, Lomet [16] has independently advocated breaking up B+-Tree disk pages into cache-friendly units, pointing out the challenges of finding an organization that strikes a good balance between search and insertion performance, storage utilization, and simplicity. We believe that fpB+-Trees achieve this balance.

Graefe and Larson [11] presented a survey of techniques for improving the CPU cache performance of B+-Tree indexes. They discussed a number of techniques, such as key compression, that are complementary to our study, and could be incorporated into fpB+-Trees. Bender et al. [4] present a recursive B+-Tree structure that is asymptotically optimal, regardless of the cache line sizes and disk page sizes, but assuming no prefetching.

1.3 Contributions of This Paper

This paper makes the following contributions. First, we propose and evaluate Fractal Prefetching B+-Trees (fpB+-Trees) as a novel index structure that optimizes both cache and disk performance simultaneously. Second, we present detailed analysis of the fundamental tradeoffs between the disk-first and the cache-first implementations of fpB+-Trees. While the performance of each of these implementations remains slightly biased toward its original goal, both versions of fpB+-Trees improve upon the cache performance of disk-optimized B+-Trees (without significantly degrading I/O performance)
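
As an illustration of the disk-first organization described in Section 1.1, the following is a minimal sketch of a small tree stored inside a single page, in which non-leaf nodes refer to their children by 2-byte in-page offsets while leaf nodes keep 4-byte references. It is not the implementation evaluated in this paper. The page size, the 4-byte keys, the node layout (a leaf flag, a key count, the keys, then the references), and the names write_node, read_node, search_page, and build_demo_page are all assumptions made for the example, and the sketch models neither cache-line alignment nor prefetching.

```python
import struct
from bisect import bisect_right

PAGE_SIZE = 4096  # assumed disk page size in bytes (hypothetical choice)


def write_node(page, pos, is_leaf, keys, refs):
    """Serialize one in-page node at byte offset pos; return the next free offset.

    Assumed layout: 1-byte leaf flag, 1-byte key count, 4-byte keys, refs.
    Non-leaf refs are 2-byte offsets into the same page; leaf refs are
    4-byte values standing in for tuple IDs or child page IDs.
    """
    ref_code = "I" if is_leaf else "H"
    fmt = f"<BB{len(keys)}I{len(refs)}{ref_code}"
    struct.pack_into(fmt, page, pos, int(is_leaf), len(keys), *keys, *refs)
    return pos + struct.calcsize(fmt)


def read_node(page, pos):
    """Decode the node stored at byte offset pos."""
    is_leaf, n = struct.unpack_from("<BB", page, pos)
    keys = struct.unpack_from(f"<{n}I", page, pos + 2)
    n_refs = n if is_leaf else n + 1  # a non-leaf has one more child than keys
    ref_code = "I" if is_leaf else "H"
    refs = struct.unpack_from(f"<{n_refs}{ref_code}", page, pos + 2 + 4 * n)
    return bool(is_leaf), keys, refs


def search_page(page, key, root=0):
    """Walk the in-page tree from the root node; return the leaf ref or None."""
    pos = root
    while True:
        is_leaf, keys, refs = read_node(page, pos)
        if is_leaf:
            i = bisect_right(keys, key) - 1
            return refs[i] if i >= 0 and keys[i] == key else None
        # Child i covers keys in [keys[i-1], keys[i]); follow its in-page offset.
        pos = refs[bisect_right(keys, key)]


def build_demo_page():
    """Build one page holding a root node and two leaf nodes."""
    page = bytearray(PAGE_SIZE)
    root_size = struct.calcsize("<BB1I2H")  # root: 1 key, 2 child offsets
    left = root_size
    right = write_node(page, left, True, [10, 20], [100, 200])
    write_node(page, right, True, [30, 40], [300, 400])
    write_node(page, 0, False, [30], [left, right])
    return page
```

With page = build_demo_page(), a call such as search_page(page, 20) follows the root's first 2-byte offset to the left leaf and returns that leaf's reference for key 20. Because an offset only has to address bytes within one page, two bytes are enough for pages of up to 64KB, which is what lets a non-leaf node hold more entries than it would with full pointers.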