
Complex Text on Simple Devices
Pedro Navarro
Sr. Software Engineer / Streaming Client Technologies

Background

Gibbon
● Codename for our JavaScriptCore-based application framework. It provides objects to create UI elements, access the device and perform video playback.
● Written in portable C++ targeted at the lowest common denominator.
● Runs on consumer electronics devices (TVs, Blu-ray players, Roku devices), Android TV and game consoles (from the Wii to the PS4 Pro).
Netflix UI: powered by Gibbon since 2013

Constraints
● We are targeting devices with very low capabilities (128 MB total RAM) and, in many cases, with a read-only file system.
● We don’t control the host we are running on, so we have to be ready to work with old compilers and with different versions, and combinations, of all the third-party libraries we use.
● Very long release cycles: except for game consoles, it takes 12 months from the time we provide our SDK until there are devices on the market with it.
● No upgrades! We are part of the device’s firmware.
● Small footprint. Our binaries and fonts have to be as small as possible, as flash storage is scarce.
● Different graphics platforms. We need to run on DirectFB, OpenGL and game console graphics APIs.
OLD DEVICE: Luckily, we didn’t have to support this one.

2013’s Text Engine
● Our first iteration of Gibbon’s Text Layout Engine was very simple and provided just a 1:1 mapping between characters in a text string and glyph indices in a font.
● The character set was WGL-4, which we extended later to add additional Latin glyphs.
● Support for CJK was added by introducing font fallbacks.
2013 Supported Writing Systems
■ Latin (425 languages)
■ Cyrillic (106 languages)
■ Greek (1 language)
■ Chinese (5 languages)
■ Japanese (1 language)
■ Hangul (1 language)

Global Launch
● For the global launch we had to be ready well in advance because of the long release times.
● We defined our own Character Set (NGL-2), to standardize our fonts and our content.
● Supporting Complex Scripts meant that we had to integrate text shaping and BiDi processing.
● Research showed that the vast majority of devices we needed to support, besides game consoles, would be low-performance set-top boxes.
● It’s important for us to get pixel fidelity between platforms, so the UI doesn’t have to account for differences.
Global launch candidates:
■ Arabic (38 languages)
■ Hebrew (9 languages)
■ Ge’ez
■ Georgian
■ Armenian
■ Tibetan (4 languages)
■ Khmer
■ Lao
■ Thai
■ Burmese
Indic writing systems:
■ Bengali (5 languages)
■ Devanagari (19 languages)
■ Gurmukhi
■ Gujarati (2 languages)
■ Kannada (4 languages)
■ Malayalam (2 languages)
■ Oriya (2 languages)
■ Tamil
■ Telugu

Global Text Layout Engine Features
● Font Handling
○ Modular fonts
○ Aliasing, fallbacks and substitution
○ Synthetic bold and italic support
● Text Shaping
○ Context-based glyphs
○ Ligatures (substitution)
○ Positioning
○ Reordering
● Text Layout
○ Rich text
○ Bidirectional support
○ Line breaking (word wrapping)

Font fallbacks

Font fallbacks / Font linking
● When a glyph is not present in the active font, font linking automatically picks it from another font that covers the Unicode range where the missing glyph is.
● Font linking lets us ship fonts for each script as needed, making the deployments modular.
● Design points:
○ A writable file system is not guaranteed, so Fontconfig’s cache solution would not work for us. Fontconfig might also not be available on the system, so we would have to supply our own.
○ We are not generic: we control the fonts that are in our system and the content we are going to display.
○ We know the font we want to use for every writing system.

Font fallbacks
● Build time:
○ At font licensing time, subset the font to leave only the glyphs for that particular writing system (no Latin in CJK fonts, for example) plus the space (U+0020).
○ Scan the fonts at build time and write to a configuration file which Unicode Blocks the font has glyphs in.
● Run time:
○ When a glyph is not found in the current font, search for fonts that can supply the needed Unicode Block, sorted by language and priority. Keep going down the list until a match is found.
○ Once a match is found, keep using the same font until a new Unicode Block is needed.
○ Spaces are always considered to be part of the current Unicode Block, so we keep spacing consistent by using the space glyph for each script’s font.

Font definition file: excerpt of our fonts.xml configuration file

    <settings>
        <aliases>Helvetica, Sans, serif</aliases>
    </settings>
    <!--
        Font file: Arial_for_Netflix-R.ttf
        Family: Arial_for_Netflix
        Style: Regular
        Glyph count: 919
    -->
    <regular>
        <file>fonts/Arial_for_Netflix-R.ttf</file>
        <settings>
            <bbox>-136,-621+143x1864</bbox>
            <default_bbox>-1006,-665+2222x1864</default_bbox>
        </settings>
        <blocks>
            <block1>000000-00007F</block1> <!-- Basic Latin (95 characters) -->
            <block2>000080-0000FF</block2> <!-- Latin-1 Supplement (96 characters) -->
            <block3>000100-00017F</block3> <!-- Latin Extended-A (128 characters) -->
            <block4>000180-00024F</block4> <!-- Latin Extended-B (24 characters) -->
            <block5>000250-0002AF</block5> <!-- IPA Extensions (9 characters) -->
            <block6>0002B0-0002FF</block6> <!-- Spacing Modifier Letters (9 characters) -->
            <block7>000300-00036F</block7> <!-- Combining Diacritical Marks (10 characters) -->
            <block8>000370-0003FF</block8> <!-- Greek and Coptic (73 characters) -->
            <block9>000400-0004FF</block9> <!-- Cyrillic (122 characters) -->
            …
            <block26>00FE70-00FEFF</block26> <!-- Arabic Presentation Forms-B (1 character) -->
        </blocks>
        <languages>*-Latn,*-Grek,*-Cyrl</languages>
        <priority>200</priority>
    </regular>
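
The run-time lookup described on this slide can be sketched roughly as follows. This is a minimal illustration, not Gibbon's actual code: `BlockRange`, `FontEntry` and `resolveFonts` are hypothetical names, coverage is approximated by the Unicode Block ranges from the configuration file rather than by a real glyph lookup, a higher `priority` is assumed to win, and sorting by language is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical in-memory form of one <blockN> entry: an inclusive code point range.
struct BlockRange {
    std::uint32_t first;
    std::uint32_t last;
};

// Hypothetical in-memory form of one font entry from the configuration file.
struct FontEntry {
    std::string file;
    int priority;                    // assumption: higher values are tried first
    std::vector<BlockRange> blocks;  // Unicode Blocks the font has glyphs in

    bool covers(std::uint32_t cp) const {
        return std::any_of(blocks.begin(), blocks.end(),
                           [cp](const BlockRange& b) { return cp >= b.first && cp <= b.last; });
    }
};

// Returns, for every code point, the index of the font chosen for it (-1 if none).
std::vector<int> resolveFonts(const std::vector<FontEntry>& fonts, const std::u32string& text) {
    // Candidate order: descending priority, stable for equal priorities.
    std::vector<int> order(fonts.size());
    std::iota(order.begin(), order.end(), 0);
    std::stable_sort(order.begin(), order.end(),
                     [&fonts](int a, int b) { return fonts[a].priority > fonts[b].priority; });

    std::vector<int> result;
    result.reserve(text.size());
    int current = -1;  // font currently in use
    for (char32_t cp : text) {
        // Spaces stay with the current font, so spacing matches the surrounding script.
        const bool isSpace = (cp == U' ');
        const bool needLookup = current < 0 || (!isSpace && !fonts[current].covers(cp));
        if (needLookup) {
            int match = -1;
            for (int idx : order) {
                if (fonts[idx].covers(cp)) { match = idx; break; }
            }
            if (match < 0) { result.push_back(-1); continue; }  // no font supplies this block
            current = match;
        }
        assert(current >= 0);
        result.push_back(current);  // keep using this font until a new block is needed
    }
    return result;
}
```

With a Latin font at priority 200 and a CJK font at priority 100, the string "A 中 文B" resolves to the Latin font for "A" and the space after it, the CJK font for both ideographs and the space between them, and the Latin font again for "B".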

Text Layout
● Infrastructure:
○ ICU for BiDi, script categorization and line breaking.
○ FreeType for rasterization.
○ HarfBuzz for text shaping.
● Itemization:
○ White space is collapsed according to the HTML5 rules.
○ Fonts are resolved before shaping, so we shape the longest possible run of the same font. We don’t fall back to the base font for spaces.
○ We add synthesized bold and oblique styles to the list of available fonts.
○ We try to find locales, specified in a <span> or by inferring them from the script, to use ICU’s dictionary-based line breaking when available.

Sample text layout

    驩檤 <span color='yellow'>サ捯ひろ驚</span>

Debug information our itemizer provides about a text object

    Attributes
    [00] 0 - 0: [00:000-000] Japanese 20 [LTR]
    [01] 1 - 2: [00:000-000] Traditional_Chinese 20 [LTR]
    [02] 3 - 3: [00:000-000] Japanese 20 [LTR]
    [03] 4 - 4: [00:000-000] Traditional_Chinese 20 [LTR]
    [04] 5 - 7: [00:000-000] Japanese 20 [LTR]
    Text direction runs
    [00] 0 - 7: LTR (0:0-0)
    Embedding levels: 0 0 0 0 0 0 0 0
    Visual map: 0 1 2 3 4 5 6 7
    Visual embeddings: 0 0 0 0 0 0 0 0
    Text script runs
    [00] 0 - 2: Hani
    [01] 3 - 3: Kana
    [02] 4 - 4: Hani
    [03] 5 - 6: Hira
    [04] 7 - 7: Hani
    Text locale runs
    [00] 0 - 7: ja
    Word breaks
    [00] 0 - 0: |驩|
    [01] 1 - 2: |檤 |
    [02] 3 - 3: |サ|
    [03] 4 - 4: |捯|
    [04] 5 - 5: |ひ|
    [05] 6 - 6: |ろ|
    [06] 7 - 7: |驚|
    Line breaks
    Attributes
    [00] 0 - 0: [00:000-000] Japanese 20 [LTR] Hani
    [01] 1 - 2: [00:000-000] Traditional_Chinese 20 [LTR] Hani
    [02] 3 - 3: [00:000-000] Japanese 20 [LTR] Kana
    [03] 4 - 4: [00:000-000] Traditional_Chinese 20 [LTR] Hani
    [04] 5 - 6: [00:000-000] Japanese 20 [LTR] Hira
    [05] 7 - 7: [00:000-000] Japanese 20 [LTR] Hani
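
In the debug information for the sample text, the final attribute list looks like the intersection of the font attribute runs with the script runs. The sketch below shows one way such an intersection could be computed. It is an assumption about the approach, not Gibbon's implementation: `Run`, `Item` and `intersectRuns` are hypothetical names, and offsets are inclusive to match the dump.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// A run of characters [start, end] (inclusive, as in the debug dump) sharing one tag.
struct Run {
    std::size_t start;
    std::size_t end;
    std::string tag;  // a font attribute name or a script code
};

// One itemized span: same font attributes and same script throughout.
struct Item {
    std::size_t start;
    std::size_t end;
    std::string font;
    std::string script;
};

// Splits the text wherever either the font run or the script run changes.
// Both inputs must be sorted, contiguous, and cover the same range.
std::vector<Item> intersectRuns(const std::vector<Run>& fonts, const std::vector<Run>& scripts) {
    std::vector<Item> items;
    std::size_t f = 0;
    std::size_t s = 0;
    while (f < fonts.size() && s < scripts.size()) {
        const std::size_t start = std::max(fonts[f].start, scripts[s].start);
        const std::size_t end = std::min(fonts[f].end, scripts[s].end);
        assert(start <= end);  // violated only if the inputs are not contiguous
        items.push_back({start, end, fonts[f].tag, scripts[s].tag});
        // Advance whichever run (or both) finished at this boundary.
        const bool fontDone = (fonts[f].end == end);
        const bool scriptDone = (scripts[s].end == end);
        if (fontDone) ++f;
        if (scriptDone) ++s;
    }
    return items;
}
```

Feeding it the five attribute runs and five script runs from the dump yields six items, with the last Japanese run (5 - 7) split into a Hira item (5 - 6) and a Hani item (7 - 7), which matches the final attribute list.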

Text Layout
● Text layout:
○ Emphasis on being one-pass. We forget the text string as soon as we have itemized it.
○ HarfBuzz buffers are referenced by multiple “items”. Each item has a HarfBuzz buffer starting and ending offset.
○ We never shape text again. If we need to left/right trim, we operate directly on the items by modifying the offsets. For each font, we keep an in-memory codepoint-to-glyph index for all spacing characters.
○ We don’t support hyphenation or justification.
○ For BiDi reordering we operate directly on the runs, as each run has an embedding level property.
○ A run can have any number of sub-runs associated with it, for emphasis marks or rubies.

Sample text layout

    驩檤 <span color='yellow'>サ捯ひろ驚</span>

Debug information our itemizer provides about a text object

    Itemizer layout - Bounds: [0,0+146x22] - Desired: [0,0+300x200]
    Padding: [0x0] - Indent: 0
    Mirror: false
    [00] Line: [0,0+146x22] | Dir: RTL | Padding: 0+0
      [00] Text run: Bounds: [0,0+20x21] | Ascent: -18
           Direction: LTR: [00:000-000]
           Buffer offsets: 0 - 0
           Buffer contents: gid6521=0
      [01] Text run: Bounds: [20,1+26x21] | Ascent: -17
           Direction: LTR: [00:000-000]
           Buffer offsets: 0 - 1
           Buffer contents: gid6606=1|gid3=2
      [02] Text run: Bounds: [46,0+20x21] | Ascent: -18
           Direction: LTR: [00:000-000]
           Buffer offsets: 0 - 0
           Buffer contents: gid134=3
      [03] Text run: Bounds: [66,1+20x21] | Ascent: -17
           Direction: LTR: [00:000-000]
           Buffer offsets: 0 - 0
           Buffer contents: gid5179=4
      [04] Text run: Bounds: [86,0+40x21] | Ascent: -18
           Direction: LTR: [00:000-000]
           Buffer offsets: 0 - 1
           Buffer contents: gid77=5|gid104=6
      [05] Text run: Bounds: [126,0+20x21] | Ascent: -18
           Last word mark present: 0
           Direction: LTR: [00:000-000]
           Buffer offsets: 0 - 0
           Buffer contents: gid6515=7
    Cache Reuse: 1[1]/0 (0/0)
    DisplayList(0xdcdb2ee0) pixels=2,780 size=300x200:
      Text: txt:'驩檤 <span color='yellow'>サ捯ひろ驚</span>'
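
The idea of trimming without shaping again can be sketched as follows. This is only an illustration under stated assumptions: `ShapedGlyph`, `ShapedBuffer`, `Item`, `width`, `trimLeft` and `trimRight` are hypothetical names, a plain vector stands in for a HarfBuzz buffer, the offsets are half-open so that an item can become empty, and the set of spacing glyph ids plays the role of the per-font index for spacing characters.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_set>
#include <vector>

// Stand-in for one shaped glyph; a real engine would read this from the shaper's output.
struct ShapedGlyph {
    std::uint32_t glyphId;
    int advance;  // horizontal advance in pixels
};

// Shaped once, then shared read-only by every item that points into it.
struct ShapedBuffer {
    std::vector<ShapedGlyph> glyphs;
    std::unordered_set<std::uint32_t> spacingGlyphs;  // glyph ids of spacing characters in this font
};

// An item is a window [begin, end) into a shared buffer (half-open by assumption).
struct Item {
    std::shared_ptr<const ShapedBuffer> buffer;
    std::size_t begin;
    std::size_t end;
};

// Sums the advances of the glyphs the item currently refers to.
int width(const Item& item) {
    assert(item.begin <= item.end && item.end <= item.buffer->glyphs.size());
    int total = 0;
    for (std::size_t i = item.begin; i < item.end; ++i) {
        total += item.buffer->glyphs[i].advance;
    }
    return total;
}

// Drops trailing spacing glyphs by moving the end offset; the buffer is untouched.
void trimRight(Item& item) {
    const ShapedBuffer& buf = *item.buffer;
    while (item.end > item.begin && buf.spacingGlyphs.count(buf.glyphs[item.end - 1].glyphId) != 0) {
        --item.end;
    }
}

// Drops leading spacing glyphs by moving the begin offset; the buffer is untouched.
void trimLeft(Item& item) {
    const ShapedBuffer& buf = *item.buffer;
    while (item.begin < item.end && buf.spacingGlyphs.count(buf.glyphs[item.begin].glyphId) != 0) {
        ++item.begin;
    }
}
```

For example, if run [01] in the dump (gid6606 followed by gid3) ends in a space glyph with a 6 pixel advance (an assumption read off the 26 pixel run width), right-trimming shrinks the item to one glyph and its width to 20 pixels while the shaped buffer keeps both glyphs.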
● Layouts are cached, as they are expensive, and a change in container attributes can trigger a relayout or reitemize.

Text Layout Facts
● We were able to fit in 128 MB devices, where we have only 20-30 MB available for our app.
● Text layout is, by far, the most expensive operation. Smart caching of text layouts helped us reach 30-45 fps when scrolling movie titles:
○ Try to never itemize text a second time.
○ When changing the container dimensions or alignment, adjust the existing layout lines instead of itemizing again.
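
The caching advice above can be illustrated with a small sketch in which itemization happens once per text and later changes to the container width only redo line filling. All names here (`Itemized`, `ItemizationCache`, `countLines`) are hypothetical, the cache key is simplified to the markup string alone (a real key would also need font and style attributes), and the greedy line filler is a stand-in for the real layout step.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Stand-in for the expensive itemization result: one width per unbreakable run.
struct Itemized {
    std::vector<int> runWidths;
};

// Remembers itemization results so the same text is never itemized twice.
class ItemizationCache {
public:
    using Itemizer = std::function<Itemized(const std::string&)>;

    explicit ItemizationCache(Itemizer itemizer) : itemizer_(std::move(itemizer)) {}

    std::shared_ptr<const Itemized> get(const std::string& markup) {
        auto it = cache_.find(markup);
        if (it != cache_.end()) return it->second;  // reuse: no itemization, no shaping
        ++misses_;
        auto items = std::make_shared<const Itemized>(itemizer_(markup));
        cache_.emplace(markup, items);
        return items;
    }

    int misses() const { return misses_; }

private:
    Itemizer itemizer_;
    std::unordered_map<std::string, std::shared_ptr<const Itemized>> cache_;
    int misses_ = 0;
};

// Cheap relayout: greedily fills lines from the cached run widths.
int countLines(const Itemized& items, int maxWidth) {
    assert(maxWidth > 0);
    if (items.runWidths.empty()) return 0;
    int lines = 1;
    int x = 0;
    for (int w : items.runWidths) {
        if (x > 0 && x + w > maxWidth) {  // run does not fit: start a new line
            ++lines;
            x = 0;
        }
        x += w;
    }
    return lines;
}
```

Using the run widths from the sample dump (20, 26, 20, 20, 40, 20), a 300 pixel container needs one line and a 100 pixel container needs two, and both answers come from a single itemization.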