Hyphenation for HTML

Mathias Nater [email protected] http://mnn.ch/

The TeX hyphenation applied to HTML
About Frank M. Liang's hyphenation algorithm and its port to JavaScript

BachoTeX 2010

Organisation

Motivation
- Text layout without hyphenation
- Text layout with hyphenation

The TeX hyphenation algorithm
- The original TeX hyphenation algorithm (1977)
- The current TeX hyphenation algorithm (1983)
  - Creating the patterns (patgen)
  - Using the patterns (hyphenation)

HTML and the soft hyphen

The Port to JavaScript
- Server side or Client side?
- How it works
- Differences and Improvements
- Back to the Future

[Diagram with the labels: need, word list, patgen, hyphenation patterns, hyphenation algorithm, soft hyphen, hyphenator]

Text layout without hyphenation

Current browsers
- MS IE 6/7/8 (~44%)
- Firefox 3.5 (~42%)
- Safari 4 (~4%)
- Opera 10 (~3%)
do not hyphenate text automatically!

This leads to poor typography:
- align left: overfull boxes and unbalanced line endings
- justified: big word spaces and rivers

[Figure: text layout without hyphenation, shown with text-align: left; and with text-align: justify;]

[Figure: text layout with hyphenation, shown with text-align: left; and with text-align: justify;]

We need (automatic) hyphenation in HTML!

The original TeX hyphenation algorithm

1977 by Donald E. Knuth and Franklin M. Liang
- for English only
- suffix and prefix removal
- vowel-consonant-consonant-vowel breaking
- special case rules (e.g. “break after ck!”)
- small exception dictionary

Found ~40% of the allowable hyphen points with 1% error.

The current TeX hyphenation algorithm

1983 PhD thesis by Franklin M. Liang
- use of hyphenation patterns
- two algorithms:
  - pattern creation (patgen)
  - applying the patterns (TeX)
- support for a wide range of languages
- small, easy, fast

Creating patterns with patgen

- INPUT: a list of hyphenated words [, precomputed pattern, translate file]
- takes up to 9 runs (asking for many settings, adding a new level in each run)
- OUTPUT: pattern file, statistics (a lot!)
- old code
- no UTF-8
- refactored by David Antoš (OPatGen), but doesn't compile

Applying the patterns

.in1 b2l2 4edi b4le.

- patterns: short strings with integer values
- odd values: valid breakpoints
- even values: forbidden breakpoints
- lower values are overwritten by higher values
- dots (.) mark the beginning and end of the word
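The rules on this slide can be sketched in JavaScript. This is a minimal illustration, not the code of the port discussed in the talk: the helper names parsePattern, mergeValues and isBreak are assumptions made up for this example.

```javascript
// Split a pattern such as "b2l2" into its letters and the integer value
// attached to each gap around those letters. A gap without a digit is 0.
// parsePattern is a hypothetical helper name, not taken from the talk.
function parsePattern(pattern) {
  const letters = [];
  const values = [0]; // values[i] is the value in front of letters[i]
  for (const ch of pattern) {
    if (ch >= "0" && ch <= "9") {
      values[values.length - 1] = Number(ch);
    } else {
      letters.push(ch);
      values.push(0);
    }
  }
  return { letters: letters.join(""), values };
}

// Merge one pattern's values into the running values of a word, starting
// at the given offset. Lower values are overwritten by higher values.
function mergeValues(wordValues, patternValues, offset) {
  patternValues.forEach((v, i) => {
    if (v > wordValues[offset + i]) wordValues[offset + i] = v;
  });
  return wordValues;
}

// Odd values are valid breakpoints, even values (including 0) are not.
const isBreak = (value) => value % 2 === 1;

console.log(parsePattern("b2l2")); // letters "bl", values [0, 2, 2]
```

Keeping one value per gap (one more value than there are letters) makes the merge step a plain element-wise maximum.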

Applying the patterns (example)

incredible
.incredible.

. i n1
                b2l2
         4e d i
              i1b l
    n1c r
                b4l e .
     5c r e d
          e d3i b
       2r2e d
-----------------------
. i n5c2r4e d3i1b4l2e .

in-cred-i-ble
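The worked example can be reproduced with a short JavaScript sketch. It uses only the nine patterns listed on the slide, and the function name hyphenate is an assumption for illustration, not the API of the port. It also leaves out the minimum fragment lengths that TeX enforces at the start and end of a word.

```javascript
// The nine patterns from the slide that match ".incredible.".
const PATTERNS = [
  ".in1", "b2l2", "4edi", "i1bl", "n1cr", "b4le.", "5cred", "ed3ib", "2r2ed",
];

// Hypothetical helper: hyphenate one word with a given list of patterns.
function hyphenate(word, patterns, separator = "-") {
  const dotted = "." + word.toLowerCase() + ".";
  // values[i] belongs to the gap in front of dotted[i].
  const values = new Array(dotted.length + 1).fill(0);

  for (const pattern of patterns) {
    const letters = pattern.replace(/[0-9]/g, "");
    // Collect the pattern's values, one per gap around its letters.
    const patternValues = [0];
    for (const ch of pattern) {
      if (/[0-9]/.test(ch)) patternValues[patternValues.length - 1] = Number(ch);
      else patternValues.push(0);
    }
    // Apply the pattern at every position where its letters occur.
    let at = dotted.indexOf(letters);
    while (at !== -1) {
      patternValues.forEach((v, i) => {
        values[at + i] = Math.max(values[at + i], v); // higher value wins
      });
      at = dotted.indexOf(letters, at + 1);
    }
  }

  // Rebuild the word, inserting the separator at gaps with an odd value.
  let result = "";
  for (let i = 0; i < word.length; i++) {
    // The gap in front of word[i] is the gap in front of dotted[i + 1].
    if (i > 0 && values[i + 1] % 2 === 1) result += separator;
    result += word[i];
  }
  return result;
}

console.log(hyphenate("incredible", PATTERNS)); // in-cred-i-ble
```

The scan with indexOf is quadratic in the worst case; it keeps the sketch short, while a real pattern file would normally be stored in a trie or a hash map.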

HTML and the Soft Hyphen

- limited control over text layout (text-align: left | right | justify)
- manual line breaks (<br>)
- manually inserted soft hyphens (&shy;, a discretionary hyphen)
- some more controls are upcoming with CSS3
- laying out text is up to the browser
- the developer has no control over how text is displayed
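In script code the soft hyphen is the single character U+00AD, the same character the HTML entity &shy; stands for. The sketch below shows how break positions could be turned into soft hyphens and how they can be removed again; the function names and the hard-coded break positions are assumptions for illustration only.

```javascript
const SOFT_HYPHEN = "\u00AD"; // the character behind the HTML entity &shy;

// Insert a soft hyphen before every index listed in breakpoints.
function insertSoftHyphens(word, breakpoints) {
  let out = "";
  for (let i = 0; i < word.length; i++) {
    if (breakpoints.includes(i)) out += SOFT_HYPHEN;
    out += word[i];
  }
  return out;
}

// Remove all soft hyphens again, e.g. before comparing or searching text.
function stripSoftHyphens(text) {
  return text.split(SOFT_HYPHEN).join("");
}

const shy = insertSoftHyphens("incredible", [2, 6, 7]); // in|cred|i|ble
console.log(shy.length);                                // 13
console.log(stripSoftHyphens(shy) === "incredible");    // true
```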

Text layout with hyphenation (text-align: justify;)

[Figure: text laid out with text-align: justify; and hyphenation]

Putting all together

[Diagram with the labels: need, word list, patgen, hyphenation patterns, hyphenation algorithm, soft hyphen, hyphenator]

Server side or Client side hyphenation?

Pro server side:
- lower bandwidth usage
- faster
- only hyphenate once, store the result

Pro client side:
- cleaner HTML (search engines!)
- takes client oddities into account
- can be switched on/off
- hyphenation is part of CSS3, so even the W3C believes that hyphenation belongs to the client
- user-generated text can be hyphenated on the fly

My Decision

- server-side solutions already existed: PHP, Perl, Java, Python
- I believe that hyphenation has to be done in the client
- Javascript is a very interesting language
- the acceptance of Javascript is growing
- Firefox 2 didn't support &shy;
- I like bookmarklets

hyphenator.js: client-side hyphenation

- it's proving to be a good decision:
  - other (WebKit-based) programs are using hyphenator
  - it's easy to use
  - there's a big effort on making Javascript faster

How it works

1. register all elements that need hyphenation
2. if the language is not set, ask for it
3. download the patterns, if not already done
4. split the paragraphs into words (and URLs)
5. process each word, put &shy; at every valid breakpoint
6. the browser re-renders the text automatically, taking the soft hyphens into account

- execution is fast
- downloading the script and the patterns takes time
  (script: 25 KB, en: 25 KB, pl: 37 KB, de: 74 KB)
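Steps 4 and 5 can be sketched as follows. This is an illustration under stated assumptions, not the hyphenator.js source: `hyphenateText`, the `findBreaks` callback and the `minWordLength` threshold are hypothetical, and a tiny lookup table stands in for the pattern-based algorithm. In a browser the input would come from the text nodes of the registered elements.

```javascript
const SOFT_HYPHEN = "\u00AD";

// findBreaks(word) is a hypothetical callback returning the indices before
// which a break is allowed; minWordLength is an illustrative threshold.
function hyphenateText(text, findBreaks, minWordLength = 6) {
  // Runs of letters count as words; everything else passes through unchanged.
  return text.replace(/\p{L}+/gu, (word) => {
    if (word.length < minWordLength) return word;
    const breaks = new Set(findBreaks(word));
    let out = "";
    for (let i = 0; i < word.length; i++) {
      if (breaks.has(i)) out += SOFT_HYPHEN;
      out += word[i];
    }
    return out;
  });
}

// Toy stand-in for the pattern lookup: fixed break positions for two words.
const KNOWN_BREAKS = new Map([
  ["incredible", [2, 6, 7]],
  ["hyphenation", [2, 6]],
]);
const findBreaks = (word) => KNOWN_BREAKS.get(word.toLowerCase()) || [];

const result = hyphenateText("An incredible word", findBreaks);
console.log(result.split(SOFT_HYPHEN).join("|")); // An in|cred|i|ble word
```

Once the processed text is written back into the page, the browser decides at layout time which of the soft hyphens it actually uses.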

Main Differences

- don't care about space in RAM, care about program size
- no trie (retrieval tree)
  - no special data structures in Javascript
  - using a trie is faster in execution (10 ms)
  - but: building the trie from the patterns takes time
  - but: a trie needs extra code (uses bandwidth)
  - but: transferring the hardcoded trie is no solution either (overhead: 50%)
- using a hash table (Javascript: object) instead
- UTF-8 (thanks to Arthur and Mojca)
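The hash-table approach might look like the sketch below: a plain object maps the letters of each pattern to its values, and every substring of the word is looked up in it. The names `buildTable` and `scoreWord` are assumptions for illustration, the pattern list is the small example set used earlier in the talk, and the sketch assumes that no two patterns share the same letters.

```javascript
const PATTERNS = [".in1", "b2l2", "4edi", "i1bl", "n1cr", "b4le.", "5cred", "ed3ib", "2r2ed"];

// Build the lookup table once: pattern letters -> inter-letter values.
function buildTable(patterns) {
  const table = Object.create(null); // no inherited keys such as "constructor"
  let maxLength = 0;
  for (const pattern of patterns) {
    const letters = pattern.replace(/[0-9]/g, "");
    const values = [];
    let gap = 0;
    // A digit fills the current gap; a letter closes it and opens the next.
    for (const ch of pattern) {
      if (ch >= "0" && ch <= "9") {
        gap = Number(ch);
      } else {
        values.push(gap);
        gap = 0;
      }
    }
    values.push(gap);
    table[letters] = values;
    maxLength = Math.max(maxLength, letters.length);
  }
  return { table, maxLength };
}

// Look up every substring of ".word." and merge the values by maximum.
function scoreWord(word, { table, maxLength }) {
  const dotted = "." + word.toLowerCase() + ".";
  const scores = new Array(dotted.length + 1).fill(0);
  for (let start = 0; start < dotted.length; start++) {
    const limit = Math.min(maxLength, dotted.length - start);
    for (let len = 1; len <= limit; len++) {
      const values = table[dotted.slice(start, start + len)];
      if (!values) continue;
      values.forEach((v, k) => {
        scores[start + k] = Math.max(scores[start + k], v);
      });
    }
  }
  return scores;
}

const lookup = buildTable(PATTERNS);
console.log(scoreWord("incredible", lookup).join("")); // 0005240314200
```

The odd digits in the output (5, 3, 1) sit at the gaps n|c, d|i and i|b, which gives in-cred-i-ble. No tree has to be built or shipped; the price is one object lookup per substring.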

Packing the patterns (helper: compressor):

- The size of the pattern file does matter.
- No whitespace (> 12% saved!):

  a1 ą1 e1 ę1 i1 o1 ó1 u1 y1 _a1 _b8 _c8 _ć8 _d8

  becomes

  2:'a1ą1e1ę1i1o1ó1u1y1',
  3:'_a1_b8_c8_ć8_d8_e1_f8

Merging script and patterns in one file (helper: merge+pack):

- HTTP requests take time.
- Merge the script and the necessary patterns (usually just one) in one file.
- Saves 2 requests per pattern file.
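
The packed format can be turned back into a lookup table with a few lines of code. The following is a minimal sketch, not the code of Hyphenator.js or of the compressor helper. It assumes that each numeric key is the length in characters of every pattern in its string, which is what the two sample entries suggest. The names `packed` and `unpackPatterns` are made up for illustration, and the 3-character string is only the truncated sample from the slide.

```javascript
// Packed sample modelled on the slide (the second string is truncated there).
// Assumption: key = number of characters of each pattern in the string.
var packed = {
    2: 'a1ą1e1ę1i1o1ó1u1y1',
    3: '_a1_b8_c8_ć8_d8_e1_f8'
};

// Unpack into a plain object used as a hash table:
// key = the pattern without its digits, value = the full pattern.
function unpackPatterns(packedSet) {
    var table = {};
    Object.keys(packedSet).forEach(function (key) {
        var size = parseInt(key, 10);
        // Normalize so that accented letters count as one character each.
        var chars = Array.from(packedSet[key].normalize('NFC'));
        for (var i = 0; i + size <= chars.length; i += size) {
            var pattern = chars.slice(i, i + size).join('');
            table[pattern.replace(/\d/g, '')] = pattern;
        }
    });
    return table;
}

var unpacked = unpackPatterns(packed);
console.log(unpacked['ą']);                // ą1
console.log(unpacked['_c']);               // _c8
console.log(Object.keys(unpacked).length); // 16
```

Because the table is an ordinary object, no trie code has to be shipped; the price is one hash lookup per candidate substring at hyphenation time.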

Improvements II

Using reduced pattern sets for static sites (helper: reducePatternSet):

- Most patterns are not used.
- If the text will not change, use a precomputed subset.
- Savings vary.

(Recompute the patterns)

- Only take into account the breakpoints of compound words: Zeilen-ende instead of Zei-len-en-de.
  The de patterns are now 37 KB instead of 74 KB:

  265683 good, 22837 bad, 995752 missed
  21.06 %, 1.81 %, 78.94 %

- Or use different settings for patgen!
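
The idea of a precomputed subset can be sketched as follows. This is not the reducePatternSet helper itself, only an illustration of the principle: a pattern is kept if its letters occur in at least one word of the static text. The function name `reducePatterns`, the TeX-style dot as word boundary and the sample pattern list are assumptions made for this example.

```javascript
// Sample patterns in TeX notation (digits mark the gaps between letters).
var samplePatterns = ['2io', '1na', 'o2n', 'he2n', 'n2at', '1tio',
    'hena4', 'hy3ph', 'hen5at'];

// Keep only the patterns whose letters occur in one of the given words.
// Assumption: '.' marks the word boundary, as in TeX pattern files.
function reducePatterns(patterns, words) {
    var used = {};
    words.forEach(function (word) {
        var dotted = '.' + word.toLowerCase() + '.';
        patterns.forEach(function (pattern) {
            var letters = pattern.replace(/\d/g, '');
            if (dotted.indexOf(letters) !== -1) {
                used[pattern] = true;
            }
        });
    });
    return patterns.filter(function (pattern) {
        return used[pattern] === true;
    });
}

// A page that only contains the word "nation" needs 5 of the 9 patterns.
console.log(reducePatterns(samplePatterns, ['nation']));
// [ '2io', '1na', 'o2n', 'n2at', '1tio' ]
```

The loop compares every word with every pattern, which is slow for a full pattern file, but it only has to run once, before the static site is published.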

Hyphenator.js Problems and Oddities

- Problems upon copy/paste of hyphenated text
- Problems with loaded fonts (@font-face)
- Patterns are very different in size
- Some rare misplaced hyphenation breaks may happen
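
The copy/paste problem presumably comes from the invisible soft hyphens that travel with the copied text. The following short sketch shows the effect and one possible cleanup. The function name `stripSoftHyphens` is made up for this example and is not taken from Hyphenator.js.

```javascript
var SOFT_HYPHEN = '\u00AD';

// Remove every soft hyphen from a piece of text, e.g. before it is
// put on the clipboard or compared with a search term.
function stripSoftHyphens(text) {
    return text.split(SOFT_HYPHEN).join('');
}

var hyphenated = 'hy' + SOFT_HYPHEN + 'phen' + SOFT_HYPHEN + 'ation';

console.log(hyphenated === 'hyphenation');                   // false
console.log(hyphenated.length);                              // 13
console.log(stripSoftHyphens(hyphenated) === 'hyphenation'); // true
```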

What the future shall/may bring

- CSS3: browsers do hyphenation (w/o hyphenator.js)
- TUG: maintained hyphenation patterns (beware of size!)
- Wish: better typography in web sites
- Me: try to rewrite PatGen for UTF-8 support

Appendix: Soft hyphen

In chapter 9.3.3, the HTML 4.01 Specification tells us the following about hyphenation in HTML:

[...] The soft hyphen tells the user agent where a line break can occur. [...] If a line is broken at a soft hyphen, a hyphen character must be displayed at the end of the first line. If a line is not broken at a soft hyphen, the user agent must not display a hyphen character. [...] The soft hyphen is represented by the character entity reference &shy; (&#173; or &#xAD;)
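
The three spellings quoted from the specification all denote the same character, U+00AD. The small sketch below checks the arithmetic and shows how break points could be marked in HTML source. The function name `markBreaks` is made up for this example.

```javascript
// Decimal 173 and hexadecimal AD are the same code point, U+00AD.
var softHyphen = String.fromCharCode(173);
console.log(softHyphen === '\u00AD');    // true
console.log(parseInt('AD', 16) === 173); // true

// Join the parts of a word with the entity reference, for use in HTML source.
function markBreaks(parts) {
    return parts.join('&shy;');
}

console.log(markBreaks(['hy', 'phen', 'ation'])); // hy&shy;phen&shy;ation
```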

Applying the patterns, example 2

hyphenation

. h y p h e n a t i o n .
                 2i o
           1n a
                    o2n
        h e2n
            n2a t
               1t i o
        h e n a4
  h y3p h
        h e n5a t
-------------------------
. h y3p h e2n5a4t2i o2n .

hy-phen-ation
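
The example can be reproduced in code. The following is a minimal sketch of the matching step, using a plain object as hash table and only the nine patterns from the slide. It is not the source of Hyphenator.js, and the names `buildTable` and `hyphenate` are made up. It also leaves out the minimum fragment lengths that TeX enforces at both ends of a word.

```javascript
// The nine patterns from the example (TeX notation).
var PATTERNS = ['2io', '1na', 'o2n', 'he2n', 'n2at', '1tio',
    'hena4', 'hy3ph', 'hen5at'];

// Hash table: letters of a pattern -> one number per gap
// (before the first letter, between letters, after the last letter).
function buildTable(patterns) {
    var table = {};
    patterns.forEach(function (pattern) {
        var letters = '';
        var points = [0];
        for (var i = 0; i < pattern.length; i += 1) {
            var c = pattern.charAt(i);
            if (c >= '0' && c <= '9') {
                points[points.length - 1] = Number(c);
            } else {
                letters += c;
                points.push(0);
            }
        }
        table[letters] = points;
    });
    return table;
}

function hyphenate(word, table, separator) {
    var dotted = '.' + word.toLowerCase() + '.';
    var values = [];
    var i, j, k, key, points;
    for (i = 0; i <= dotted.length; i += 1) {
        values.push(0);
    }
    // Look up every substring; per gap, the highest number wins.
    for (i = 0; i < dotted.length; i += 1) {
        for (j = i + 1; j <= dotted.length; j += 1) {
            key = dotted.slice(i, j);
            if (Object.prototype.hasOwnProperty.call(table, key)) {
                points = table[key];
                for (k = 0; k < points.length; k += 1) {
                    values[i + k] = Math.max(values[i + k], points[k]);
                }
            }
        }
    }
    // values[i + 1] is the gap before word[i]; an odd number allows a break.
    var out = '';
    for (i = 0; i < word.length; i += 1) {
        if (i > 0 && values[i + 1] % 2 === 1) {
            out += separator;
        }
        out += word.charAt(i);
    }
    return out;
}

var table = buildTable(PATTERNS);
console.log(hyphenate('hyphenation', table, '-')); // hy-phen-ation
```

Passing '\u00AD' as separator yields the word with soft hyphens instead of visible ones.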

For Further Reading

- David Antoš (2001): PatLib, Pattern Manipulating Library. Master thesis, Masaryk University Brno, Faculty of Informatics.
- Donald E. Knuth (1999): Digital Typography. Stanford, California: Center for the Study of Language and Information. ISBN 1-57586-010-4.
- Franklin Mark Liang (1983): Word Hy-phen-a-tion by Com-put-er. PhD thesis, Department of Computer Science, Stanford University, Stanford, CA 94305. http://www.tug.org/docs/liang/liang-thesis.pdf
- Christine Römer, Herbert Voß (2008): Deutsche Silbentrennmuster – aus linguistischer und TeXnischer Sicht. PDF, Jena, 06.03.2008. http://www.personal.uni-jena.de/~xcr/v2/Dateien/File/Jena2008.pdf
- Dave Raggett, Arnaud Le Hors, Ian Jacobs (1999): HTML 4.01 Specification. W3C Recommendation, 24 December 1999. http://www.w3.org/TR/html401/