The Journal of the Institute of Image Electronics Engineers of Japan Vol.47 No.4 (2018)
IIEEJ ☠ᄚ -- VC2018特集 --
อकతγϟυϚοϓΛ༻͍ͨ Frustum Traced Shadows ͷߴԽ
† † ٢༤հ ʢਖ਼ձһʣ℄ ஐതޱߔ
ࣜձࣾεΫΣΞɾΤχοΫεג †
Accelerating Frustum Traced Shadows Using Conservative Shadow Maps
Tomohiro MIZOKUCHI†, Yusuke TOKUYOSHI† (Member)
† SQUARE ENIX CO., LTD.
ʪ͋Β·͠ʫ ຊจ Frustum Traced Shadows ๏ʹΑΔϋʔυγϟυͷܭࢉΛߴԽ͢ΔඳըύΠϓϥΠϯΛ ఏҊ͢Δɽ͜ͷ Frustum Traced Shadows ๏ۙήʔϜͷΑ͏ͳϦΞϧλΠϜΞϓϦέʔγϣϯͰ༻͍ΒΕ͍ͯ Δ͕ɼγϟυϚοϓ๏ͱൺΔͱܭࢉίετ͕ߴ͍ͱ͍͏͕͋Δɽͦ͜ͰຊจͰ͜ͷ Frustum Traced Shadows ๏ͷύΠϓϥΠϯʹอकతγϟυϚοϓΛ༻͍ͨӨͷఆॲཧΛΈࠐΈɼೋஈ֊ͷӨͷఆΛߦ͏͜ ଘख๏ΑΓਫ਼ͷߴ͍อकతγϟυϚοϓͷ࣮ʹ͍ͭͯड़Δɽ4K ͷεطͱͰߴԽ͢Δɽ͞Βʹຊจ Δͱ 2.4 ഒӨ͢ۉΫϦʔϯղ૾ʹ͓͚Δ࣮ݧͷ݁ՌɼߴԽͷ߹͍γʔϯʹΑͬͯΒ͖͕ͭ͋Δͷͷฏ ͷඳըΛ্Ͱ͖Δ͜ͱ͕͔֬ΊΒΕͨɽ Ωʔϫʔυɿϋʔυγϟυ, irregular z-buffers, อकతγϟυϚοϓ
͏ɽ͔͠͠ͳ͕Β͜ͷϑϥελϜτϨʔγϯάʹΑΔӨͷ 1. ͡Ίʹ ࢉܭ͕ଟ͍ͱܗఆγΣʔσΟϯάͱγʔϯͷࡾ֯ Wyman Β1)͕ఏҊͨ͠ Frustum Traced Shadows ๏Ϩ ίετ͕ߴ͘ͳͬͯ͠·͏ͱ͍͏͕͋Δɽͦ͜Ͱຊ ΠτϨʔγϯάͱಉ࣭ͷϋʔυγϟυΛϦΞϧλΠϜ จͰ Frustum Traced Shadows ๏ͷύΠϓϥΠϯʹอक 6) Ͱඳը͢Δٕज़Ͱ͋ΓɼۙͰ NVIDIA Corporation ͕ తγϟυϚοϓ ΛΈࠐΉ͜ͱͰਤ 1 ʹࣔ͢ೋஈ֊ͷ Δ NVIDIA GameWorks Ͱར༻͢Δ͜ͱ͕Ͱ͖Δ2)ɽ ӨͷఆΛߦ͏ɽอकతγϟυϚοϓඞͣӨͱͳΔอ͢ڙఏ ͜ͷख๏ Ubisoft Entertainment S.A. ͷ Tom Clancy’s कతͳਂΛอଘͨ͠γϟυϚοϓͰ͋Δɽ௨ৗͷγϟ The DivisionTM ͳͲͷήʔϜͰ༻͞Ε͍ͯΔ3)͕ɼ௨ৗ υϚοϓ͕Ө͔൱͔ͷఆʹޡࠩΛ࣋ͭͷʹର͠ɼอक ͷγϟυϚοϓ๏4)ͱൺΔͱॲཧ͕࣌ؒେ͖͍ͱ͍͏ తγϟυϚοϓඞͣӨͰ͋Δ͜ͱ͕อূ͞ΕͨྖҬΛ ͕͋Δɽͦ͜Ͱຊจ͜ͷ Frustum Traced Shadows ෦తʹݕग़͢Δ͜ͱ͕Ͱ͖Δɽ͜ͷอकతγϟυϚο ๏ΛߴԽ͢ΔӨͷඳըύΠϓϥΠϯΛఏҊ͢Δɽ ϓݩʑ੩తͳγʔϯʹ͓͍ͯγϟυϨΠͷΛݮ͢ Frustum Traced Shadows ๏Ͱ Irregular Z-Buffer ΔͨΊʹ։ൃ͞ΕͨͷͰ͋Δ͕ɼຊจͰಈతͳγʔ IZB)1)ʹޫݯ͔ΒݟͨγΣʔσΟϯάΛ Per-Pixel ϯʹ͓͍ͯ IZB ͷςΫηϧʹ֨ೲ͞ΕͨγΣʔσΟϯά) Linked Lists5)Λ༻͍ͯอଘ͠ɼ֤γΣʔσΟϯάʹର ΛΧϦϯά͢ΔͨΊʹ༻͍Δɽͭ·Γ·ͣ͜ͷอकతγϟ ϑϥελϜτϨʔγϯάʹΑΔݫີͳӨͷఆΛߦ υϚοϓΛ༻͍ͨૈ͍ӨͷఆͰγΣʔσΟϯάΛΧͯ͠
418 The Journal of the Institute of Image Electronics Engineers of Japan Vol.47 No.4 (2018)
(a) Ұஈ֊ͷӨͷఆ (b) ೋஈ֊ͷӨͷఆ (c) ϨϯμϦϯά݁Ռ ਤ 1 ຊจͷϋʔυγϟυͷඳըύΠϓϥΠϯͰग़ྗͨ͠ visibility mask buffer (aɼb) ͱ࠷ऴϨϯμϦϯά݁Ռ (c) Fig. 1 Visibility mask buffer (aɼb) and final result (c) rendered using our hard shadow rendering pipeline
ڥද 1 ਤ 1 ͷ࣮ߦ Table 1 Execution environment of Fig. 1 279kܗࡾ֯ IZB ͷղ૾ 2048×2048 εΫϦʔϯղ૾ 3840×2160 GPU NVIDIA GeForce GTX 1070
Ϧϯά͠ʢਤ 1(a)ʣɼ࣍ʹΧϦϯά͞Εͳ͔ͬͨγΣʔσΟ ϯάʹͷΈϑϥελϜτϨʔγϯάʹΑΔӨͷఆΛߦ ࢉίετΛݮ͢Δʢਤ 1(b)ʣɽද 1 ʹ࣮ࣔ͢ߦܭ͜ͱͰ͏ ԼͰਤ 1 ͷγʔϯΛϨϯμϦϯά͢Δ߹ɼ͜ͷೋஈڥ Ͱશܗࢉ࣌ؒΛ 7.4 ms ͔Β 3.7 ms ·Ͱ ਤ 2 ඞͣӨͱͳΔγΣʔσΟϯάʢࠨʣͱࡾ֯ܭͷӨͷఆʹΑͬͯ֊ ʹ෴ΘΕͨςΫηϧʢӈʣ ݮΒ͢͜ͱ͕Ͱ͖ΔɽWyman Βͷ Frustum Traced Shad- Fig. 2 Fully shadowed shading points (left) and a texel (ows ๏Ͱ IZB ʹ֨ೲ͞ΕΔγΣʔσΟϯάΛ༻͍ͯࡾ fully covered by a triangle (right ΛΧϦϯά͢ΔͷͰɼຊจ͕ఏҊ͢ΔγΣʔσΟϯܗ֯ ΑΓଟ͘ΧϦϯάܗάͷΧϦϯάʹΑͬͯ͞Βʹࡾ֯ 2. อकతγϟυϚοϓ ͢Δ͜ͱ͕Ͱ͖ΔΑ͏ʹͳΔɽ·ͨຊจͰγΣʔσΟ ϯάͷݮΛΑΓ্ͤ͞ΔͨΊɼਫ਼ͷߴ͍อकత ຊύΠϓϥΠϯͷྲྀΕΛઆ໌͢Δલʹ·ͣอकతγϟυ ͰશܗγϟυϚοϓͷ࣮ʹ͍ͭͯड़Δɽ͜ͷߴਫ਼ͳอ Ϛοϓʹ͍ͭͯड़Δɽਤ 2 Ͱࣔ͢Α͏ʹࡾ֯ Ͱ ʹःณ͞ΕͨྖҬʢϐϯΫͷྖҬʣʹؚ·ΕΔγΣʔσΟ(7ڀकతγϟυϚοϓͷ࣮ʹ͍ͭͯզʑͷઌߦݚ ެ։͞Ε͍ͯΔ͕ɼຊจͰΑΓৄࡉʹ࣮ʹ͍ͭͯड़ ϯάʢͷʣඞͣӨͱͳΔɽ͜ͷྖҬอकతͳਂ Δɽ·ͨຊจͰղ૾ͱߴղ૾ͷอकతγϟυ ʢ͍ͷਂʣΛ༻͍ͯදݱ͞ΕΔͨΊɼอकతγϟ Ϛοϓʹ͓͚Δ࣮ݧ݁Ռͷҧ͍Λࣔ͢͜ͱͰख๏ͷ༗ޮ υϚοϓʹ͜ͷਂΑΓԕ͍͕֨ೲ͞Εͳ͚Ε ੑʹ͍ͭͯٞ͢Δɽ ͳΒͳ͍ɽอकతγϟυϚοϓʹ͜ͷอकతͳਂΛग़ Ͱશʹ෴ΘΕͯܗWyman Βͷ࣮ͰϑϥελϜτϨʔγϯάΛߦ͏த ྗ͢Δʹ·ͣςΫηϧ͕୯Ұͷࡾ֯ 6) Δ͔൱͔Λఆ͢Δඞཁ͕͋ΔɽHertel Β ਤ 3(a) Ͱ͍ ͠ڈͰӨͱఆ͞ΕͨγΣʔσΟϯάΛஞ࣍ IZB ͔Βআ ɼຊύΠϓϥΠϯϑϥελϜτϨʔγϯάΛߦ ࣔ͢Α͏ʹอकతγϟυϚοϓͷը૾ۭؒʹӨͨ͠ࡾ͕͍ͨͯ ࢉ͢Δ͜ܭΛڑͷ֤ล͔ΒςΫηϧͷ֎ԁ·Ͱͷܗ֯ લʹγΣʔσΟϯάΛΧϦϯά͢ΔͷͰޮ͕ྑ͍ɽ͏ ܝʹ ͜ͷΧϦϯάͰ༻͍ΔอकతγϟυϚοϓ୯Ұͷࡾ֯ ͱʹΑͬͯ͜ͷఆΛߦͬͨʢHLSL ίʔυΛ A Ͱશʹ෴ΘΕͨςΫܗͰશʹ෴ΘΕͨςΫηϧʹอकతͳਂΛอଘ͢Δ͜ ࡌ͢Δʣɽ͔͜͠͠ͷख๏ࡾ֯ܗ ͱʹΑͬͯੜ͞ΕΔͨΊɼຊύΠϓϥΠϯอकతγϟ ηϧΛඞཁҎ্ʹอकతʹݕग़ͯ͠͠·͏ͱ͍͏͕͋ ଘࡏ͢Δγʔ Δɽͦ͜ͰຊจͰอकతγϟυϚοϓͷਫ਼Λ্͕ܗυϚοϓͷςΫηϧΑΓେ͖ͳࡾ֯ ܭଘख๏ͱ΄΅ಉͷطϯʹରͯ͠ಛʹޮՌతͰ͋Δɽ ͤ͞ΔͨΊʹɼΑΓݫີͳఆΛ Ҏ্Λ·ͱΊΔͱຊจͷߩݙ࣍ͷͱ͓ΓͰ͋Δɽ ࢉ࣌ؒͰߦ͏࣮ΛఏҊ͢Δɽ
Ͱશʹ෴ΘΕͨςΫηϧܗFrustum Traced Shadows ๏ͷύΠϓϥΠϯʹอकత 2.1 ࡾ֯ • γϟυϚοϓΛΈࠐΉ͜ͱͰӨͷඳըΛߴԽ͢Δɽ ਤ 4 ʹຊจͷอकతγϟυϚοϓͷϐΫηϧγΣʔ Ͱશʹ෴ΘΕͨςΫηϧΛݫີʹܗμʔΛࣔ͢ɽࡾ֯ • อकతγϟυϚοϓͷਫ਼Λ্ͤ͞Δ͜ͱͰ IZB ʹ ఆ͢ΔͨΊʹɼຊจͰਤ 3(b) Ͱࣔ͢Α͏ͳςΫηϧ ೲ͞ΕΔγΣʔσΟϯάΛΑΓଟ͘ΧϦϯά͢Δɽ֨ u, v, w) ͷ֤࣠ͷ࠷খʢ͢ͳΘͪࡾ) ܥͷ࢛۱ͷॏ৺࠲ඪ
419 The Journal of the Institute of Image Electronics Engineers of Japan Vol.47 No.4 (2018)
༻͍ͯੜ͍ͯ͠Δ͕ɼHLSL ͷγΣʔμʔϞσϧ 6.1 ͔Β ༻ՄೳͱͳΔ༧ఆͷ SV Barycentrics8)Λ༻͍Δ͜ͱͰ͜ ΕΒͷΛδΦϝτϦγΣʔμʔͰੜͤͣͱϐΫηϧ γΣʔμʔͰ༻͢Δ͜ͱ͕Ͱ͖ΔΑ͏ʹͳΔɽͨ͠ ͕ͬͯ͜ͷ࣮কདྷߴԽͰ͖ΔՄೳੑΛ͍࣋ͬͯΔɽ
2.2 อकతͳਂ (a) Hertel Βͷख๏ (b) ຊख๏ Өͨ͠ʹܗอकతγϟυϚοϓʹςΫηϧΛࡾ֯ Ͱશʹ෴ΘΕͨςΫηϧͷݕग़ํ๏ͷҧ͍ܗਤ 3 ࡾ֯ ͷਂΑΓԕ͍Λอଘ͢Δඞཁ͕͋Δʢਤ 2ʣɽ্ܗFig. 3 Difference between two detection techniques for ࢛֯ Ͱ͋Δͨܗฏߦ࢛ลܗa fully covered texel ແݶԕޫݯΛԾఆ͢Δͱ͜ͷ࢛֯ Ίɼର֯ઢϕΫτϧΑΓɼอଘ͖͢อकతͳਂͷԼݶ void main(float4 p : SV_Position, float2 s : BARYCENTRICS) { ΊΔ͜ͱ͕Ͱ͖Δɽٻfloat2 dx = ddx(s) * 0.5; zmin Λ float2 dy = ddy(s) * 0.5; ( ) ( ) ( ) ( ) float2a=dx+dy; max ∂z x,y + ∂z x,y , ∂z x,y − ∂z x,y float2b=dx-dy; ∂x ∂y ∂x ∂y zmin = z(x, y)+ (9) if(s.x < max(abs(a.x),abs(b.x)) || s.y < max(abs(a.y),abs(b.y)) 2 || 1.0 - s.x - s.y < max(abs(a.x + a.y),abs(b.x + b.y))) { ͷਂͰ͋ܗdiscard; ͜͜Ͱ z(x, y) ςΫηϧத৺ʹ͓͚Δࡾ֯ } } Δɽ͜ͷ zmin ϐΫηϧγΣʔμʔͰܭࢉ͢Δ͜ͱͰ ͖Δ͕ɼϐΫηϧγΣʔμʔͰܭࢉͨ͠ਂΛग़ྗ͢Δͱ ਤ 4 อकతγϟυϚοϓͷϐΫηϧγΣʔμʔʢHLSLʣ GPU ϥελϥΠβʔͷਂΧϦϯάΛ׆༻Ͱ͖ͳ͍ͱ͍ Fig. 4 Our pixel shader for conservative shadow maps (HLSL) ͏͕͋Δɽ͜ͷ Conservative Depth Output Λ Δ͜ͱͰܰݮ͢Δ͜ͱͰ͖Δ͕ຊจͰ࣮ͷ؆͍༻ ʣΛ༻͍Δɽແڑͷ֤ล͔ΒςΫηϧ·Ͱͷූ߸ܗ֯ ୯͞Λ༏ઌ͠ɼHertel Β6)ͱಉ༷ʹ GPU ϥελϥΠβʔ ݶԕޫݯΛԾఆ͢Δͱ͜ΕΒͷ࣍ͷࣜ (1)ʙ(3) Ͱ༩͑ ͕αϙʔτ͢Δ Slope Scaled Depth Bias Λ༻͢Δ͜ͱ ΒΕΔɽ Ͱ zmin Λۙࣅ͢ΔɽSlope Scaled Depth Bias Λ༻͍Δͱɼ
GPU ϥελϥΠβʔ͕ग़ྗ͢Δ zout ࣍ͷࣜ (10) Ͱ༩ u(x, y) − max(|au|, |bu|) (1) − | | | | ͑ΒΕΔɽ v(x, y) max( av , bv ) (2) − | | | | ∂z(x, y) ∂z(x, y) w(x, y) max( aw , bw ) (3) zout = z(x, y)+d max , (10) ∂x ∂y ͜͜Ͱ x, y อकతγϟυϚοϓͷςΫηϧத৺ͷҐ ͜͜Ͱ d Slope Scaled Depth Bias Ͱ͋Δɽd =1ʹઃ ஔɼͦͯ͠ [au,av,aw], [bu,bv,bw] ॏ৺࠲ඪܥʹ͓͚Δς ఆ͢Δ͜ͱͰ GPU ϥελϥΠβʔ zout ≥ zmin Λຬͨ Ϋηϧͷର֯ઢϕΫτϧͷ͞Λʹͨ͠ͷͰ͋Γɼ ͢อकతͳਂΛࣗಈతʹग़ྗ͢Δɽ s(x, y)=[u(x, y),v(x, y),w(x, y)] ͱ͢Δͱ࣍ͷࣜ (4)(5) Ͱ༩͑ΒΕΔɽ 3. ӨͷඳըύΠϓϥΠϯ
∂s(x, y) ∂s(x, y) ຊจͰ Frustum Traced Shadows ๏ͷύΠϓϥΠϯ [au,av,aw]= + (4) 2∂x 2∂y ʹ 2 ষͰड़ͨอकతγϟυϚοϓΛੜ͢Δॲཧͱɼ ∂s(x, y) ∂s(x, y) [b ,b ,b ]= − (5) ͦͷอकతγϟυϚοϓΛ༻͍ͨૈ͍ӨͷఆॲཧͱΛ u v w 2∂x 2∂y ৽ͨʹՃ͢Δ͜ͱͰ IZB ʹ֨ೲ͞ΕΔγΣʔσΟϯά ΛΧϦϯά͢Δɽ͜ͷΧϦϯάʹΑͬͯγΣʔσΟϯά ͔ܗࣜ (1)ʙ(3) ͕ͻͱͭͰ 0 ະຬͷͱ͖ςΫηϧࡾ֯ ϑϥܗͰશʹ෴Θ ͚ͩͰͳ͘ɼϑϥελϜτϨʔγϯάʹ͓͚Δࡾ֯ܗΒΈग़Δɽ͕ͨͬͯ͠ɼςΫηϧ͕ࡾ֯ Ε͍ͯΔ͔൱͔࣍ͷ݅ࣜ (6)ʙ(8) Ͱ༩͑ΒΕΔɽ άϝϯτͷݮ্ͤ͞Δ͜ͱ͕Ͱ͖Δɽ
u(x, y) ͍͓ͯʹ͜͜Ͱ w(x, y)=1− u(x, y) − v(x, y)ɼ|aw| = |au + ͨΊɼ྆ऀΛಉ͡ղ૾ʹઃఆ͢Δɽ͜ͷӨۭؒ av|, |bw| = |bu + bv| Ͱ͋ΔͷͰɼw ࣠ʹؔ͢Δఆ γΣʔσΟϯά͕Ө͔൱͔ΛอकతγϟυϚοϓͷਂ ֨ʹ ड़͢Δ͜ͱͰ͖Δɽݱࡏͷզʑ Λ༻͍ͯఆ͠ɼӨͱͳΔγΣʔσΟϯάΛ IZBه͍ͯ༻u(x, y),v(x, y) Λ ͷ࣮Ͱ͜ͷ u(x, y),v(x, y) ΛδΦϝτϦγΣʔμʔΛ ೲ͢ΔલʹΧϦϯά͢Δɽ͠γΣʔσΟϯά͕Өͱ 420 The Journal of the Institute of Image Electronics Engineers of Japan Vol.47 No.4 (2018) 4K ͷεΫϦʔϯղ૾ʢ3840×2160ʣʹରͯ͠ 2048×2048 ԼͰ࠷ྑ͍ڥղ૾ͷ IZB Λ༻͍ͨɽ͜ΕΒຊ࣮ݧ ύϑΥʔϚϯε͕ಘΒΕͨղ૾ͷΈ߹ΘͤͰ͋Δɽຊ ଘख๏1),3)ͱҧͬͯ IZB ͷׂۭؒΛߦ͍ͬͯطจͰ ଘख๏ͱಉطͳ͍͕ɼޫݯͷϏϡʔϑϥελϜͷେ͖͞ ༷ʹγΣʔσΟϯάͷόϯσΟϯάϘοΫεΛ༻͍ͯ ଘख๏ (b) ຊख๏ ܾఆͨ͠ɽط (a) ϑϥάϝϯτͷΧϦϯάͷҧ͍ܗਤ 5 ࡾ֯ ࢉ࣌ؒܭ Fig. 5 Difference of triangle fragment culling techniques 4.1 ଘͷطຊ࣮ݧͰ༻ͨ͠γʔϯΛਤ 6 ʹࣔ͢ɽͦͯ͠ ࢉ࣌ؒܭఆ͞Εͨ߹ɼ͜ͷγΣʔσΟϯάʹରԠ͢Δ visibility Frustum Traced Shadows ๏ͱຊख๏ͷ֤ύεͷ ʹࢉ࣌ؒΛൺֱ͢ΔͱߴԽͷ߹͍ܭmask buffer ͷϐΫηϧʹӨΛग़ྗ͠ʢਤ 1(a)ʣɼͦ͏Ͱͳ Λද 2 ʹࣔ͢ɽ Δͱ 2K ղ૾Ͱ 1.9 ഒɼ4K͢ۉ͚Ε IZB ͷϦετʹ͜ͷγΣʔσΟϯάΛՃ͢Δɽ ҧ͍͋Δͷͷɼฏ ͜ͷૈ͍Өͷఆ IZB ͷੜॲཧʹΦʔόʔϔουΛ ղ૾Ͱ 2.4 ഒӨͷඳը্͕͢Δ͜ͱ͕֬ೝ͞Ε ੜͤ͞Δ͕ɼγΣʔσΟϯά͕ݮ͞ΕΔϦετͷ ͨɽ͜ͷߴԽͷ߹͍ʹৼΕ෯͕͋ΔͷΧϦϯά͞Εൃ ͷେ͖͞ʹґଘ͢ΔͨΊܗߏஙίετ͕ܰݮ͞ΕΔͨΊɼ࣮ࡍʹੜ࣌ؒॖ͞ ΔγΣʔσΟϯάͷ͕ࡾ֯ ΕΔɽ Ͱ͋ΓɼอकతγϟυϚοϓͷςΫηϧΑΓେ͖ͳࡾ֯ 3.2 ϑϥελϜτϨʔγϯάʹΑΔӨͷఆ ܗʹΑͬͯःณ͞ΕΔγΣʔσΟϯά͕ଟ͍΄Ͳޮ͕ ϑϥελϜτϨʔγϯάʹΑΔӨͷఆͰγΣʔσΟ ྑ͘ͳΔ͔ΒͰ͋Δɽਤ 6(f) ͷγʔϯʹ͓͍ͯɼ2K ղ૾ ଘख๏ΑΓ͘ͳͬͯطࢉ͕࣌ؒܭܭΑͬͯःณ͞ΕΔ͔൱͔Λਖ਼֬ Ͱຊख๏ͷ߹ʹܗϯά͕γʔϯͷࡾ֯ ఆ͢Δɽ͜ͷϑϥελϜτϨʔγϯάޫݯ͔Βݟͨ ͍Δ͕ɼ4K ղ૾ͰΧϦϯάʹΑΔߴԽ͕อकతγϟʹ ΛϥελϥΠζ͠ɼϐΫηϧγΣʔμʔͰ IZB ʹ υϚοϓΛੜ͢ΔΦʔόʔϔουΛ্ճ͍ͬͯΔɽ3.1ܗࡾ֯ ೲ͞ΕͨγΣʔσΟϯάΛॱ࣍ࢀর͢Δ͜ͱͰߦΘ અͰड़ͨΑ͏ʹຊจͰ IZB ͱಉ͡ղ૾ʢ2K εΫ֨ × × ΕΔɽ୯७ͳ࣮ͩͱ͜ͷӨͷఆޫݯ͔Βݟ͑Δγʔ ϦʔϯͰ 1024 1024ɼ4K εΫϦʔϯͰ 2048 2048ʣͷอ ϑϥάϝϯτશͯʹର࣮ͯ͠ߦ͞Εͯ͠·͏ͨ कతγϟυϚοϓΛ༻͍ΔɽอकతγϟυϚοϓͷղܗϯͷࡾ֯ Ͱશʹ෴ΘΕͨςΫηϧ͕ଟ͘ͳܗΊɼWyman Β1) GPU ϥελϥΠβʔͰਂόοϑΝΛ ૾͕ߴ͍ͱɼࡾ֯ ϑϥάϝϯτΛΧϦϯά͠ɼߴԽΛߦͬͯ ΔͨΊγΣʔσΟϯάͷݮΛ্ͤ͞Δ͜ͱ͕Ͱ͖ܗࡾ͍֯ͯ༻ ଘखطࢉ͕࣌ؒܭΔɽ͜ͷਂόοϑΝਤ 5(a) Ͱࣔ͢Α͏ʹ IZB ʹ֨ Δɽ͕ͨͬͯ͠ 4K εΫϦʔϯʹ͓͚Δ͍ ೲ͞ΕΔγΣʔσΟϯάͷϦετͷதͰޫݯ͔Β࠷ԕ ๏ΑΓߴԽ͞Εͨͱߟ͑ΒΕΔɽ ͍ͷਂΛอଘ͢Δ͜ͱʹΑͬͯ࡞͞ΕΔɽຊύΠϓ 4.2 อकతγϟυϚοϓͷ࣮ߦ݁Ռ ϥΠϯͰ IZB ʹ֨ೲ͞ΕΔγΣʔσΟϯά͕ݮ͞ ද 3 ʹ 2048×2048 ղ૾ͷอकతγϟυϚοϓΛੜ ଘख๏ΑΓΑΓޫݯ ͨ͠ࡍͷ࣮ߦ݁ՌΛࣔ͢ɽγΣʔσΟϯάͷݮطʹΕΔͨΊɼਤ 5(b) Ͱࣔ͢Α͏ ܗେ͖ͳγʔϯʢਤ 6(b)ʣͰ 0.5%ɼࡾ͕֯ܗ͕ਂόοϑΝʹอଘ͞ΕΔɽ͕ͨͬͯ͠ຊ ൺֱతࡾ֯ਂ͍ۙʹ จ͕ఏҊ͢ΔγΣʔσΟϯάͷΧϦϯάΛ༻͍Δ͜ͱʹ ͕খ͞ͳγʔϯʢਤ 6(e)ʣͰ 1.6%ྑ͍݁ՌΛಘΔ͜ͱ͕ ϑϥάϝϯτ Ͱ͖ͨɽ͜ΕอकతγϟυϚοϓͷੜʹ͓͍ͯɼࡾܗΑͬͯγΣʔσΟϯά͚ͩͰͳ͘ɼࡾ֯ ʹଘख๏ΑΓݫີطͰશʹ෴ΘΕͨςΫηϧΛܗ֯ Wyman ΒΑΓଟ͘ΧϦϯά͢Δ͜ͱ͕Ͱ͖ΔΑ͏ʹͳ ΔɽຊύΠϓϥΠϯͰ͜ͷਂόοϑΝͷ࡞ IZB ͷ ఆͨͨ͠ΊͰ͋Δɽ·ͨݮ্͕͍ͯ͠ΔʹؔΘ ࢉ࣌ؒͰ࣮ߦ͢Δܭଘख๏ͱ΄΅ಉͷطੜ࣌ʹߦ͏ɽਂόοϑΝͷਂͷॻ͖ࠐΈ GPU Βͣɼຊख๏ ϥελϥΠβʔΛ༻͍ͯߦ͏ඞཁ͕͋ΔͷͰɼຊจͰ ͜ͱ͕Ͱ͖ͨɽݮͷ૿ՃʹΑΔશମͷॲཧͷߴԽ εΫϦʔϯͷ֤ϐΫηϧΛϙΠϯτϓϦϛςΟϒͱͯ͠ѻ Θ͔ͣͰ͋Δ͕࣮֬ʹ্͢Δ͜ͱ͕Ͱ͖ΔͷͰຊख ͏͜ͱͰγΣʔμʔΛ༻͍ͨ୯ҰͷύεͰ IZB ͱਂ ๏༗ޮͰ͋Δͱߟ͑ΒΕΔɽਤ 6(b) ͱਤ 6(e) ͷγʔϯ όοϑΝΛಉ࣌ʹ࡞͢Δɽ Ͱͷ࣮ߦ࣌ؒΛൺֱ͢Δͱ࣮ߦ࣌ؒʹ 4.3 ms ҧ͍͕͋ Δ͕ɼ֤γʔϯͷ 2K ղ૾ͱ 4K ղ૾ʹ͓͚Δอकత ࣮ݧ݁Ռ .4 ms ఔ͔͠ҧ͍͕ݟ 0.25 ۉγϟυϚοϓͷੜ࣌ؒฏ ఏҊͨ͠ύΠϓϥΠϯʹΑΔ࣮ݧ݁ՌΛࣔ͢ɽຊ࣮ݧͰ ΒΕͳ͍ɽ͜ΕอकతγϟυϚοϓͷੜίετ͕ϐ NVIDIA GeForce GTX 1070 GPU Λ༻͍ͯϋʔυγϟ ΫηϧγΣʔμʔͰͳ͘ɼόʔςοΫεγΣʔμʔɼδ ଌͨ͠ɽ࣮ݧʹ 2K ͷεΫϦʔϯղ ΦϝτϦγΣʔμʔɼͦͯ͠ϥελϥΠβʔʹେ͖͘ґଘܭࢉ࣌ؒΛܭυͷ ࢉίܭ͕ଟ͘ͳΔͱܗ૾ʢ1920×1080ʣʹରͯ͠ 1024×1024 ղ૾ͷ IZB Λɼ ͍ͯ͠Δ͜ͱΛ͓ࣔͯ͠Γɼࡾ֯ 421 The Journal of the Institute of Image Electronics Engineers of Japan Vol.47 No.4 (2018) (a) (b) (c) (d) (e) (f) ਤ 6 ຊ࣮ݧͰ༻ͨ͠γʔϯ Fig. 6 Scenes used in this experiment ࢉ࣌ؒʢmsʣܭද 2 ϋʔυγϟυͷ Table 2 Computation time of hard shadows (ms) : 12.8Mʣܗ:5.3Mʣ Power Plantʢࡾ֯ܗ: 279kʣ San Miguelʢࡾ֯ܗSponzaʢࡾ֯ (ຊ࣮ݧͰ༻ͨ͠γʔϯ ਤ 6(a) ਤ 6(b) ਤ 6(c) ਤ 6(d) ਤ 6(e) ਤ 6(f ଘख๏ ຊख๏ط ଘख๏ ຊख๏ط ଘख๏ ຊख๏ط ଘख๏ ຊख๏ط ଘख๏ ຊख๏ط ଘख๏ ຊख๏ط 2K ղ૾ ݯͷϏϡʔϑϥελϜߏங 0.03 0.03 0.03 0.03 0.03 0.03 0.03 0.03 0.03 0.03 0.03 0.03ޫ อकతγϟυϚοϓੜ – 0.17 – 0.20 – 1.89 – 1.90 – 4.34 – 4.30 IZB ߏங 0.74 0.43 0.38 0.28 0.63 0.32 0.70 0.42 0.30 0.26 0.30 0.27 ϑϥελϜτϨʔγϯά 1.11 0.54 3.51 0.29 10.67 8.18 21.15 8.74 21.68 9.96 12.62 10.24 ߹ܭ 1.88 1.17 3.92 0.80 11.33 10.42 21.88 11.09 22.01 14.59 12.95 14.84 ଘख๏ ຊख๏ط ଘख๏ ຊख๏ط ଘख๏ ຊख๏ط ଘख๏ ຊख๏ط ଘख๏ ຊख๏ط ଘख๏ ຊख๏ط 4K ղ૾ ݯͷϏϡʔϑϥελϜߏங 0.10 0.10 0.10 0.10 0.13 0.12 0.11 0.11 0.12 0.12 0.10 0.10ޫ อकతγϟυϚοϓੜ – 0.35 – 0.47 – 2.10 – 2.08 – 4.72 – 4.59 IZB ߏங 3.03 1.67 1.66 1.16 2.58 1.18 2.82 1.65 1.24 1.07 1.22 1.10 ϑϥελϜτϨʔγϯά 3.99 1.23 8.63 0.45 15.73 8.83 35.49 9.46 29.74 10.29 15.33 10.79 ߹ܭ 7.12 3.35 10.39 2.18 18.44 12.23 38.42 13.30 31.10 16.20 16.65 16.58 ද 3 อकతγϟυϚοϓͷ࣮ߦ݁Ռ 21.5 ms ⇒ 10.2 ms 8.9 ms ⇒ 10.9 ms Table 3 Rendering results of conservative shadow maps ਤ 6(b) ͷγʔϯ ਤ 6(e) ͷγʔϯ ੜ࣌ؒ ݮ ੜ࣌ؒ ݮ Hertel Βͷख๏ 0.48 ms 55.5% 4.75 ms 35.8% ຊख๏ 0.47 ms 56.0% 4.72 ms 37.4% (a) ޮతͳέʔε (b) ޮతͰͳ͍έʔε :6.7Mʣܗετ͕େ͖͘ͳΔɽ͜ͷΦʔόʔϔου 2.1 અͰड़ͨ ਤ 7 4K ղ૾ʹ͓͚Δ Rungholt Ϟσϧʢࡾ֯ ͷϨϯμϦϯά݁Ռ -Fig. 7 Rendering results of the Rungholt model (trian ͢ڈSV Barycentrics Λ༻͍ͯδΦϝτϦγΣʔμʔΛআ Δ͜ͱͰɼকདྷখ͘͞Ͱ͖Δͱߟ͑ΒΕΔɽ gle count: 6.7M) for 4K resolution 4.3 ࣦഊྫ ϓͷ྆ํͷղ૾͕ߴ͘ͳΔͱɼΑΓޮతʹϋʔυγϟ 7 ఏҊͨ͠ύΠϓϥΠϯਤ (a) ͷ߹ޮతͰ͋Δ͕ɼ υΛඳը͢Δ͜ͱ͕Ͱ͖Δɽैͬͯɼຊख๏ඞͣӨͱ ਤ 7(b) ͷ߹ैདྷͷύΠϓϥΠϯΑΓॲཧ͕࣌ؒେ͖ ͳΔྖҬ͕ଟ͍ͷΑ͏ͳγʔϯʹ͓͍ͯಛʹޮՌతͰ ͘ͳΔɽ͜Εਤ 7(b) ͷΑ͏ͳޫݯͷϏϡʔϑϥελϜ ͋Δͱߟ͑ΒΕΔɽຊύΠϓϥΠϯͰݱঢ়ɼอकతγϟ ༻υϚοϓΛੜ͢ΔͨΊʹδΦϝτϦγΣʔμʔΛ ͕ܗͷେ͖ͳγʔϯͩͱ IZB ͷςΫηϧΑΓখ͞ͳࡾ֯ ଟ͘ඳը͞ΕɼอकతγϟυϚοϓͷΦʔόʔϔου͕ ͍ͯ͠Δ͕ɼকདྷ HLSL ͷ SV Barycentrics ΛΘΓʹ༻ ظΧϦϯάʹΑΔߴԽΛ্ճͬͯ͠·͏͔ΒͰ͋Δɽ͜ͷ ͍Δ͜ͱͰੜΦʔόʔϔουΛখ͘͞Ͱ͖Δ͜ͱ͕ ʹର͢ΔղܾҊͱͯ͠ɼIZB ͷςΫηϧΑΓେ͖ͳ ͞ΕΔɽ ͛ΒڍͷΈอकతγϟυϚοϓʹඳը͢Δ͜ͱ͕ܗࡾ֯ 5.2 ࠓޙͷ՝ ΕΔɽ͜͏ͨ͠อकతγϟυϚοϓ༻ͷඳըΦϒδΣΫ • ιϑτγϟυͷԠ༻ɿຊจͰϋʔυγϟυ τͷࣗಈతͳબࠓޙͷ՝Ͱ͋Δɽ ͷ࣮ݧ݁ՌͷΈΛ͕ࣔͨ͠ɼఏҊͨ͠ύΠϓϥΠϯ 5. ݁ Frustum Traced Shadows ๏ʹΑΔίϯλΫτγϟυ 5.1 ·ͱΊ ͱγϟυϚοϐϯά๏ʹΑΔιϑτγϟυΛࠞ 3) ຊจͰɼFrustum Traced Shadows ๏ʹΑΔϋʔυ ߹ͨ͠ Hybrid Frustum Traced Shadows ๏ ʹద γϟυͷܭࢉΛߴԽ͢ΔඳըύΠϓϥΠϯΛఏҊ͠ ༻͢Δ͜ͱ͕Ͱ͖ΔɽͦͷͨΊιϑτγϟυͷ߹ ɽ4K ͷεΫϦʔϯղ૾Ͱ͜ͷύΠϓϥΠϯΛ༻͍Δ ͰίϯλΫτγϟυͷ෦ʹؔͯ͠ຊύΠϓϥͨ 2.4 ഒߴԽ͞ ΠϯΛ༻͍Δ͜ͱͰߴԽͰ͖Δͱߟ͑ΒΕΔɽۉࢉ͕࣌ؒฏܭ͜ͱͰϋʔυγϟυͷ ɿอकతγϟυϚοϓͷςΫηϧΑΓܗΕͨɽຊύΠϓϥΠϯεΫϦʔϯͱอकతγϟυϚο • খ͞ͳࡾ֯ 422 The Journal of the Institute of Image Electronics Engineers of Japan Vol.47 No.4 (2018) ͳ 5) J. C. Yang, J. Hensley, H. Gr¨un,N. Thibieroz: “Real-Time͞΅ٴΛڹอकతͳਂʹશ͘Өܗখ͞ͳࡾ֯ ΛอकతγϟυϚοϓʹඳը͢ Concurrent Linked List Construction on The GPU”, ComputerܗΊɼ͜ͷࡾ͍֯ͨ Graphics Forum, Vol. 29, Issue 4, pp.1297–1304 (2010). Δͱܭࢉίετ͕ߴ͘ͳͬͯ͠·͏ͱ͍͏͕͋Δɽ 6) S. Hertel, K. Hormann, R. Westermann: “A Hybrid GPU Ren- -͜ͷͷղܾํ๏ͱͯ͠ςΫηϧΑΓେ͖ͳࡾ֯ dering Pipeline for Alias-Free Hard Shadows”, Proc. of Euro ܗͷΈΛࣗಈతʹબͯ͠อकతγϟυϚοϓʹඳ graphics 2009 Areas Papers (2009). 7) Y. Tokuyoshi, T. Mizokuchi: “Conservative Z-Prepass for -ݕূ͍͖͍ͯͨ͠ɽ Frustum-Traced Irregular Z-Buffers”, Proc. of ACM SIGޙ͛ΒΕΔͷͰࠓڍը͢Δ͜ͱ͕ .(ଘख๏ͱҧ͍ɼຊจͰ GRAPH 2018 Posters, pp.34:1–34:2 (2018طIZB ͷӨۭؒͷׂɿ • 8) DirectX Shader Compiler, Shader Model 6.1, https://github.c ͷӨۭؒΛׂ͢Δख๏1),3)Λ༻͍ͯ͠ͳ͍ɽ IZB om/Microsoft/DirectXShaderCompiler/wiki/Shader-Model-6.1 .(࣮͍͖ͯ͠ (2018ޙࢉίετΛ͞Βʹݮ͢ΔͨΊʹࠓܭ ͱߟ͍͑ͯΔɽ͜ͷׂʹγΣʔσΟϯάͷ 9) Microsoft Corporation, SV InnerCoverage, https://docs.micro͍ͨ soft.com/en-us/windows/desktop/direct3dhlsl/sv-innercovera Λ༻͢Δ͕3)ɼอकతγϟυϚοϓʹΑΔΧ ge (2018). /ϦϯάΛ༻͍ͯແବͳγΣʔσΟϯάΛݮ͢Δ͜ 10) M. Dabrovic, Sponza Atrium, http://hdri.cgtechniques.com .(ͱ͕Ͱ͖ΔͷͰɼΑΓޮͷྑׂ͍Λߦ͑ΔͷͰ ∼sponza/files/ (2002 ͳ͍͔ͱߟ͑ΒΕΔɽ 11) F. Mainl, Crytek Sponza, https://www.crytek.com/cryengine /cryengine3/downloads (2010). Ͱશʹ෴Θ 12) University of North Carolina, Power Plant, http://gamma.csܗอकతγϟυϚοϓͷߴԽɿࡾ֯ • .(ΕͨςΫηϧͷݕग़ఏҊख๏Ҏ֎ʹ Conservative .unc.edu/Powerplant/ (1999 13) M. McGuire, Computer Graphics Archive, https://casual-effe Rasterization Tier 3 ରԠ GPUʢIntel HD 530 AMD cts.com/data (2017). RadeonTM RX Vega γϦʔζʣͰ͋ΕɼHLSL ͷ SV InnerCoverage9)Λ༻͍Δ͜ͱͰݕग़͢Δ͜ͱͰ A อकతγϟυϚοϓͷγΣʔμʔ Δɽ͜ͷػೳ୯७Ͱ͋Δ͕ɼҰํͰ Conservative Hertel Β6)ͷอकతγϟυϚοϓͷγΣʔμʔΛਤ͖ Rasterization Λ༻͠ͳͯ͘ͳΒͳ͍ͨΊɼϐΫη A. 1, A. 2 ʹࣔ͢ɽ͜ͷ࣮Ͱ·ͣอकతγϟυϚο ڑͷ͔Βରล·ͰͷܗϧγΣʔμʔͷ࣮ߦճ͕ଟ͘ͳͬͯ͠·͏ͱ͍͏ ϓʹӨͨ͠ը૾ۭؒͰࡾ֯ TM ࢉ͠ɼܭδΦϝτϦγΣʔμʔͰڑࢉ͢Δɽ͜ͷܭ͕͋ΔɽஶऀΒͷ࣋ͭ AMD Radeon RX Vega Λ ͕ϐΫηϧγΣʔμʔʹ͞ΕΔɽϐΫηͨؒ͠ิܗGPU ͰυϥΠόʔ͕·ͩରԠ͍ͯ͠ͳ͍ͨΊ͔ ઢ 56 ΑΓܘ͕ςΫηϧͷ֎ԁͷڑSV InnerCoverage ͕ਖ਼͘͠ಈ࡞͠ͳ͔ͬͨͷͰɼক ϧγΣʔμʔͰ͜ͷ Ͱશʹ෴ܗདྷ SV Barycentrics Λ༻͍ͨຊจͷ࣮ͱͲͪΒ͕ େ͖͍͔൱͔Λఆ͢Δ͜ͱʹΑͬͯࡾ֯ ߴʹಈ࡞͢Δ͔ݕূ͍͖͍ͯͨ͠ͱߟ͍͑ͯΔɽ ΘΕͨςΫηϧΛݕग़͢Δɽ struct Output{ ँࣙ float4 p : SV_Position; float3 d : DISTANCE; ຊจͷϙϦΰϯϞσϧʹ Marko Dabrovic ࢯͱ Frank }; [maxvertexcount(3)] 10),11) Meinl ࢯʹΑΔ Crytek Sponza Ϟσϧ ɼϊʔεΧϩϥ void main(triangle float4 inPos[3] : SV_Position, 12) inout TriangleStream 423 The Journal of the Institute of Image Electronics Engineers of Japan Vol.47 No.4 (2018) void main(float4 p : SV_Position, float3 d : DISTANCE) { float DIAG = 1.414213562373095; if (d.x < DIAG || d.y < DIAG || d.z < DIAG) { discard; } } ਤ A. 2 Hertel ΒͷอकతγϟυϚοϓͷϐΫηϧ γΣʔμʔʢHLSLʣ app.Fig. 2 Pixel shader for Hertel et al.’s conservative shadow mapsʢHLSLʣ (2018 7 ݄ 12 ड) ߔޱஐത 3 ݄ ՎࢁେֶେֶӃγεςϜֶݚ 2017 ࣜձࣾεΫΣΞɾΤג ݄ Պमྃɽಉ 4ڀ χοΫεʹೖࣾʢάϥϑΟοΫεΤϯδχΞʣɽ ʹ։ൃڀϦΞϧλΠϜϨϯμϦϯάʹؔ͢Δݚ ैࣄɽ ٢༤հʢਖ਼ձһʣ℄ Պγεڀݚܥ 3 ݄ ৴भେֶେֶӃֶ 2007 ՝ఔमྃɽത࢜ʢظޙςϜ։ൃֶઐ߈ത࢜ ࣜձཱࣾ࡞ॴγεςϜ։ג ݄ ʣɽಉ 4ֶ һʣɽ࠷దԽίϯύΠϥͷݚڀॴೖࣾʢݚڀݚൃ ࣜձࣾεΫΣג ݄ ։ൃʹैࣄɽ2010 7ڀ ΞɾΤχοΫεೖࣾʢγχΞϦαʔνϟʔʣɽά ϩʔόϧΠϧϛωʔγϣϯΛத৺ʹϨϯμϦϯ ຯΛ࣋ͭɽڵʹάٕज़ 424