Background the Subject of Braille Translation Is Not a Widely Familiar
Total Page:16
File Type:pdf, Size:1020Kb
DOTSYS II1: A Portable Braille Translator J. E. Sullivan, The Mitre Corporation and then the right. Each position may have Background a dot or not; thus there are 26=64 possible combinations. In a given piece of text, there The subject of braille translation will typically be a number of cells that is not a widely familiar one, so that a stand one-for-one for a character in the brief introduction to that topic would inkprint counterpart. However, the general seem to be in order before taking up rule is that a group of one or more braille DOTSYS III. Most people know what braille cells will correspond to a group of one or is in general--a coding system employing more inkprint characters. In order to re- raised dots so that the sense of touch duce the space required by braille and alone suffices to read. However, contrary correspondingly improve the reading rate, to the impression one gets from those the rules are so contrived that the braille little cards, the most widely used codes group is almost always shorter. The chief are not "substitution ciphers"--that is, device for doing this is the contraction. the braille equivalent of a given "ink- One-hundred-elghty-nine commonly-occurring print" text is generally not a simple letter groups have a short braille form transliteration but rather a kind of that substitutes for full spelling in some translation. This is because the rules circumstances. Generally, these circum- for transcription involve not only the stances are dictated by the fact that a spelling of words, but also their syllabi- given braille sign may have different mean- fication and pronounciation. And since ings depending on context--e.g., the sign these occasionally vary with the meaning for a period and the letter-group "dis" are of the word (e.g., the verb "do" and the one and the same, so that a contraction may musical note "do") the transcriber must be made when the letters appear at the be- understand what he is transcribing, at ginning of a word (e.g., "dislike") but for least at a superficial semantic level. the obvious reason full spelling must be This is characteristics of translation, used when they occur at the end ("Chary- although of course the process is not bdis"). Braille is riddled with such con- nearly as difficult as the translation of text dependencies, including "case" and one natural language to another. "escape" codes to permit representation of inkprint effects such as italics and to The actual rules of braille will not shift interpretation modes--even the digits be treated in any great depth here; the use the same sign as the letters a - j. various codes differ from language to lan- However intricate rules covering these guage and, even within a language, from cases may be, they are still mechanical and situation to situation. In this country thus not the worst complication in braille there are three standard codes: one for translation. The difficulty lles in a general, "literary" use [i], one for sci- group of rules that forbid use of contrac- entific notation [2], and another for mu- tions where syllabification or pronuncia- sical notation [3]. A fourth standard [4] tion would likely be thrown off; for ex- covers formatting rules. In addition to ample, the "the" contraction should not be these, a number of more-or-less widely used in "sweetheart" because the letter- used codes exist, ranging from simple group straddles a strong syllable boundary. tranliteration codes such as "grade I" and And, as previously noted, syllabification "computer braille" to a braille shorthand. and pronunciation cannot always be deter- Unless otherwise indicated, we shall be mined without ascending to the level of referring to the literary code; sometimes meaning. called grade II braille. This code re- quires some sixty pages of rules to define. General Function of DOTSYS III The basic unit of braille is a "cell" comprising six dot positions, three posi- DOTSYS III is a computer program to tions high and two wide. These are label- translate inkprint into braille as auto- ed i-6, counting downward first on the left matically as possible. But since the 398 determination and representation of deep DIS I L 6 3 50 semantics is beyond the current state of the programming art--at least for natural In order for an entry to qualify, three languages used in unrestricted context--it distinct tests must be passed. First, the is clear that a program to produce perfect left end of the window must match the en- braille automatically is not technically try as far as the "I" • Secondly, the feasible. DOTSYS IIl represents a compro- next character beyond this point must fall mise between the perfect and the possible. into the "right-context class" designated The occasional flaws in its translation by the second field of the entry. In this can be removed by human intervention or, case, "L" has been defined (by another in some situations, simply left in as an table-) to mean "letter." Thus it is clear acceptable level of "misprints." at a glance that the first two qualifica- tions are met. The third qualification is The program proper does not embody determined by the "input class"--in this the rules of English Braille, but rather a case, class 6. This class selects an en- generalized process appropriate to the try in a decision table and in turn this larger inkprint-to-braille problem. A set entry imposes a boolean test on a set of of tables specify the process for a parti- state variables. Roughly speaking, the cular code or related group of codes. A test is a disjunction of conjunctions. The single set of tables (with many versions~) state variables themselves are simply true- presently implements English Braille, with false indicators deriving "meaning" only intermixed Grade I and computer braille from the logic of a particular set of permitted• It appears that a table for tables. For the literary tables, the set Grade II Spanish Braille can be implemented of state variables and their approximate quite easily. This flexibility of changing meaning is as follows: translation rules by altering tables is motivated not only by the possibility of I. after the start of a number switching codes but also by the expectation 2. after the start of a word that, even for a particular code, a long process of refinement (with occasional 3. grade i translation special-purpose temporary modifications) 4. inside a quotation will take place. 5. inside italicized text Thus the program was designed with 6. not at the start of a prefix two primary goals: fidelity to the standard or stem braille rules and flexibility to adapt to the various codes, their interpretations, 7. part-way through a word or and application contexts. Two additional phrase too long for one entry goals were: "portability" of the system 8. just after a space or A-J, fol- from environment to environment, and com- lowing a number prehensibility of the system so that it would be a known quantity to its users, Getting back to our example, the decision and to facilitate local adaptation. It was table entry for class 6 will demand that chiefly with portability in mind that COBOL states 2, 3, and 7 all be "off"--i.e., we was chosen as the system programming lan- must be at the beginning of a word, in guage [ii], even though languages more grade 2 translation, and not looking for capable in other respects were available. the second part of a double entry• All these will apply in this case, so the entry qualifies. Had it not qualified for any The Alsorithm reason, then the translation table search would have continued; and if no entry The basic algorithm is a modified qualified, the single leftmost character finite-state process devised by Dr. Jonathan would be translated according to an assum- Millen [5]. It is perhaps easiest to ed (default) entry. understand the algorithm by observing it in action on a particular case, using the Once an entry is accepted, three English Braille tables. actions take place. First up to four braille signs and/or program control codes First, let us assume that some text are issued. In this case "50" selects the has already been translated. The input, dots-2-5-6 cell representing the contrac- seen through a 10-character "sliding win- tion "dis" Second, the window is shifted dow" (underlined), looks like a certain number of places, in this case 3: • THIS DISEASE IS NOT SERIOUS • THIS DISEASE IS NOT SERIOUS . The first step is to locate a qualifying Finally the input class selects an entry entry in the main translation table, which in the transition table. This entry deter- in this case contains about i000 entries. mines the new setting of each state vari- In principle, though not in actuality, the able: set, reset, changed or unaltered. table is searched serially• The entry in In this case state variable i, 7, and 8 question is: will be left unchanged. 399 The input class is thus the key to much of the program was running to consider the state setting and testing logic. It a strict restructuring along these lines. tends to be the same for most contractions Nevertheless, these structural modularity of a given type--class 6, for example, is concepts are present in sufficient degree characteristic of those contractions used to form a model for explanation. only at the beginning of a word. The first module is the primary input It may be noted at this point that processor.