Turn Yourself Into a SAS® Internationalization Detective in One Day: a Macro for Analyzing SAS Data of Any Encoding Houliang Li, HL SASBIPROS INC

Total Page:16

File Type:pdf, Size:1020Kb

Turn Yourself Into a SAS® Internationalization Detective in One Day: a Macro for Analyzing SAS Data of Any Encoding Houliang Li, HL SASBIPROS INC Paper 4645-2020 Turn Yourself into a SAS® Internationalization Detective in One Day: A Macro for Analyzing SAS Data of Any Encoding Houliang Li, HL SASBIPROS INC ABSTRACT With the growing popularity of using Unicode in SAS®, it is more and more likely that you will be handed SAS data sets encoded in UTF-8. Or at least they claim those are genuine UTF-8 SAS data. How can you tell? What happens if you are actually required to generate output in plain old ASCII encoding from data sources of whatever encoding? This paper provides a solution that gives you all the details about the source data regarding potential multibyte Unicode and other non-ASCII characters in addition to ASCII control characters. Armed with the report, you can go back to your data provider and ask them to fix the problems that they did not realize they have been causing you. If that is not feasible, you can at least run your production jobs successfully against cleaned data after removing all the exotic characters. INTRODUCTION Unicode is a computing industry standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems, according to Wikipedia. It concerns character variables and their values only. UTF-8, or utf-8, is the most popular implementation of Unicode. When you install SAS here in North America, whether on a PC or server or in a multitier deployment, UTF-8 has been one of the three encodings installed by default for many years, with the other two being Wlatin1 / Latin1 (English) and Shift-JIS (DBCS). Of course, you are free to install additional encodings for other languages such as French, German, Russian, and Chinese, etc. if you feel you may need to process data or generate output encoded in those languages. With the introduction of SAS Viya in recent years, UTF-8 became the default encoding, period! Although you do get the option to change to one of several encodings at the system level with more current versions of SAS Viya, being the default encoding says a lot about the importance of UTF-8. Don’t be surprised if all SAS data sets from SAS Viya come in UTF-8 encoding. Say goodbye to your favorite Wlatin1 encoding on Windows or Latin1 encoding on Linux / Unix. In reality, however, not many places are actively using SAS Viya at the moment, even though it is trending up. If you are somehow worried about this SAS Internationalization thing (I18N for short because there are 18 letters between the first letter I and last letter N), it is probably premature for the foreseeable future. For most companies and government agencies in North America, their data sources and output will still be in English, and Wlatin1 or Latin1 encoding should continue to serve them well. If you are new, or relatively new, to the I18N band wagon or the concept of character encoding in general, encoding is literally mapping characters such as letters and symbols to numbers (called code points) on a character map (called a code page). The ASCII table is perhaps the most famous character map. Computers look smart and scary, but they are actually dumb. All they know and can handle is only bits, i.e., binary digits, or 0s and 1s. It is just that they can process those bits incredibly fast. ASCII, or American Standard Code for Information Interchange, is one of the earliest computer encoding schemes. It initially used only 7 bits, i.e., from 000 0000 to 111 1111, or 0 to 127 in decimal form, which means it could only represent a total of 128 unique characters. The first 32 (values 0 to 31) are control characters 1 and usually unprintable or invisible, and the last one is the “Delete” control character, also unprintable. The rest include uppercase and lowercase letters, decimal digits, some common symbols and punctuation marks, and the space. For example, the uppercase letter “S” is represented by the decimal number 83 (or 101 0011 in binary), which is its code point in ASCII. Similarly, the lowercase letter “s” has code point 115 (or 111 0011 in binary), and the space character, arguably the first printing or visible character in ASCII, occupies code point 32 (or 010 0000 in binary). Just Google “ASCII table” to see how your favorite letter or lucky number is represented in ASCII. For the purpose of understanding this paper, we only need to know about three more encodings really: Wlatin1, Latin1, and UTF-8. ISO 8859-1, also known in SAS as Latin1, adds one more bit to ASCII, so it ranges from 0000 0000 to 1111 1111 in binary, or 0 to 255 in decimal. That means it can represent 256 unique characters. The first 128 are the same ASCII characters, and the added 96 contain characters sufficient for the most common Western European languages. Wlatin1, or Windows Latin1, is officially known as Windows 1252, or Code Page 1252 or cp1252. It is very similar to Latin1, except it has some additional symbols in the range 128 to 159 whereas Latin1 has nothing for those code points (hence only 96 additional characters out of 128 possible spots). Latin1 is the default encoding for SAS on Linux / Unix, while Wlatin1 is the default for Windows SAS. They are both exactly one byte in size. Table 1 below shows all the printable characters that are available in WLATIN1. LATIN1 includes all of the same characters except those enclosed in the blue rectangle: Table 1: The Wlatin1 Character Set Just like Wlatin1 and Latin1, UTF-8 also has its first 128 characters identical to ASCII. However, that is where the similarity ends. While both Wlatin1 and Latin1 use 8 bits (or one 2 byte) and can represent up to 256 characters (remember Latin1 does not use all the available code points), UTF-8 uses 8 bits / one byte for the 128 ASCII characters only. It does not utilize the remaining 128 code points for anything at all! What makes UTF-8 so powerful is that it can use 16, 24, even 32 bits (or two, three, even four bytes!) to represent all the other characters in the world. It is an MBCS (Multi Byte Character Set). All those characters in Wlatin1 and Latin1 beyond ASCII have their corresponding representations in UTF-8 as either a two-byte or three-byte code point. In addition, UTF-8 has a unique bit pattern for each of those multibyte representations. For example, the first byte will always begin with bits 110, 1110, or 11110, indicating either two, three, or four bytes in total for the character. The remaining bytes always start with bits 10 to indicate they are continuing with the same character. After removing those indicator bits, a 16-bit / two-byte UTF-8 representation has 11 bits left (110X XXXX 10XX XXXX) to represent 2**11 = 2048 characters. The first 128 code points overlap ASCII as represented by the one-byte UTF-8, so the net increase is actually 2048 – 128 = 1920 characters. Similarly, a 24-bit / three-byte UTF-8 encoding can represent a total of 2**16 = 65536 characters (again notice the overlap with 2-byte UTF-8), and a 32-bit / four-byte UTF-8 representation can encode 2**21, or 2,097,152 characters. Clearly, two million+ is sufficient to represent all written human languages, dead or alive, plus a lot of other symbols such as emojis. So far, Unicode has actually defined less than 1.2 million code points. There are simply not enough alphabets and logograms and other symbols in the world to encode! I know you all like this emoji: 11110000:10011111:10011000:10000011 (hexadecimal equivalent: F0 9F 98 83), which is the smiley face. But do you also like this emoji? 11110000:10011111:10010010:10101001 (hexadecimal equivalent: F0 9F 92 A9). Maybe you do and have even used it too, but you need to figure it out first! Just in case you are not so familiar with hexadecimal representations, it is a more efficient (read: shorter) way of representing binary values. It ranges from 0 to 15, with each of the 16 values denoted by decimal digits 0 through 9 and continuing from letters A through F (or a through f, since case does not matter), i.e. A = 10, B = 11, and F = 15, etc. Each hexadecimal (hex for short) symbol / digit corresponds to exactly four binary digits. For example, the letter “S” is 0101 0011 in binary, which adds up to 2**6 + 2**4 + 2**1 + 1 = 83 (in decimal). Its hex equivalent is 53, which adds up to 5 * 16**1 + 3 = 83 (in decimal). We know its code point in Wlatin1 / Latin1 / UTF-8 is 83. Everything matches perfectly. To really bring the concept home, let’s examine a multibyte UTF-8 character. Given the following binary representation in UTF-8 (Please note: colons between bytes are for easy identification purposes only and not required): 11100010:10001001:10100100, or its hex equivalent E2 89 A4, our task is to figure out its code point in Unicode. For the binary representation, we need to first remove the italicized bits because they are multibyte indicators. Next we assemble ALL the remaining bits (depicted in Bold) in the same order as listed and treat the resultant binary string as a single number, which should reveal the code point for the character: 0010 001001 100100 = 10 0010 0110 0100 = 2**13 + 2**9 + 2**6 + 2**5 + 2**2 = 8192 + 512 + 64 + 32 + 4 = 8804.
Recommended publications
  • IPDS Technical Reference 1
    IPDS Technical Reference 1 TABLE OF CONTENTS Manuals for the IPDS card.................................................................................................................................4 Notice..................................................................................................................................................................5 Important.........................................................................................................................................................5 How to Read This Manual................................................................................................................................. 6 Symbols...........................................................................................................................................................6 About This Book..................................................................................................................................................7 Audience.........................................................................................................................................................7 Terminology.................................................................................................................................................... 7 About IPDS.......................................................................................................................................................... 8 Capabilities of IPDS............................................................................................................................................9
    [Show full text]
  • Proposal for Generation Panel for Sinhala Script Label
    Proposal for the Generation Panel for the Sinhala Script Label Generation Ruleset for the Root Zone 1 General information Sinhala belongs to the Indo-European language family with its roots deeply associated with Indo-Aryan sub family to which the languages such as Persian and Hindi belong. Although it is not very clear whether people in Sri Lanka spoke a dialect of Prakrit at the time of arrival of Buddhism in the island, there is enough evidence that Sinhala evolved from mixing of Sanskrit, Magadi (the language which was spoken in Magada Province of India where Lord Buddha was born) and local language which was spoken by people of Sri Lanka prior to the arrival of Vijaya, the founder of Sinhala Kingdom. It is also surmised that Sinhala had evolved from an ancient variant of Apabramsa (middle Indic) which is known as ‘Elu’. When tracing history of Elu, it was preceded by Hela or Pali Sihala. Sinhala though has close relationships with Indo Aryan languages which are spoken primarily in the north, north eastern and central India, was very much influenced by Tamil which belongs to the Dravidian family of languages. Though Sinhala is related closely to Indic languages, it also has its own unique characteristics: Sinhala has symbols for two vowels which are not found in any other Indic languages in India: ‘æ’ (ඇ) and ‘æ:’ (ඈ). Sinhala script had evolved from Southern Brahmi script from which almost all the Southern Indic Scripts such as Telugu and Oriya had evolved. Later Sinhala was influenced by Pallava Grantha writing of Southern India.
    [Show full text]
  • Database Globalization Support Guide
    Oracle® Database Database Globalization Support Guide 19c E96349-05 May 2021 Oracle Database Database Globalization Support Guide, 19c E96349-05 Copyright © 2007, 2021, Oracle and/or its affiliates. Primary Author: Rajesh Bhatiya Contributors: Dan Chiba, Winson Chu, Claire Ho, Gary Hua, Simon Law, Geoff Lee, Peter Linsley, Qianrong Ma, Keni Matsuda, Meghna Mehta, Valarie Moore, Cathy Shea, Shige Takeda, Linus Tanaka, Makoto Tozawa, Barry Trute, Ying Wu, Peter Wallack, Chao Wang, Huaqing Wang, Sergiusz Wolicki, Simon Wong, Michael Yau, Jianping Yang, Qin Yu, Tim Yu, Weiran Zhang, Yan Zhu This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited. The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing. If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, then the following notice is applicable: U.S. GOVERNMENT END USERS: Oracle programs (including any operating system, integrated software, any programs embedded, installed or activated on delivered hardware, and modifications of such programs) and Oracle computer documentation or other Oracle data delivered to or accessed by U.S.
    [Show full text]
  • Onetouch 4.0 Sanned Documents
    TO: MSPM Distribution FROM: J. H. Saltzer SUBJECT: 88.3.02 DATE: 02/05/68 This revision of BB.3.02 is because 1. The ASCII standard character set has been approved. References are altered accordingly. 2. The latest proposed ASCII standard card code has been revised slightly. Since the Multics standard card code matches the ASCII standard wherever convenient# 88.3.02 is changed. Codes for the grave accent# left and right brace, and tilde are affected. 3. One misprint has been corrected; the code for capita 1 11 S" is changed. MULTICS SYSTEM-PROGRAMMERS' MANUAL SECTION BB.3.02 PAGE 1 Published: 02/05/68; (Supersedes: BB.3.02; 03/30/67; BC.2.06; 11/10/66) Identification Multics standard card punch codes and Relation between ASCII and EBCDIC J • H • Sa 1 tze r Purpose This section defines standard card punch codes to be used in representing ASCII characters for use with Multics. Since the card punch codes are based on the punch codes defined for the IBM EBCDIC standard, automatically a correspondence between the EBCDIC and ASCII character sets is also defined. Note The Multics standard card punch codes described in this section are DQ! identical to the currently proposed ASCII punched card code. The proposed ASCII standard code is not supported by any currently available punched card equipment; until such support exists it is not a practical standard for Multics work. The Multics standard card punch code described here is based on widely available card handling equipment used with IBM System/360 computers. The six characters for which the Multics standard card code differs with the ASCII card code are noted in the table below.
    [Show full text]
  • Unicode and Code Page Support
    Natural Unicode and Code Page Support Version 8.2.4 November 2016 This document applies to Natural Version 8.2.4. Specifications contained herein are subject to change and these changes will be reported in subsequent release notes or new editions. Copyright © 1979-2016 Software AG, Darmstadt, Germany and/or Software AG USA, Inc., Reston, VA, USA, and/or its subsidiaries and/or its affiliates and/or their licensors. The name Software AG and all Software AG product names are either trademarks or registered trademarks of Software AG and/or Software AG USA, Inc. and/or its subsidiaries and/or its affiliates and/or their licensors. Other company and product names mentioned herein may be trademarks of their respective owners. Detailed information on trademarks and patents owned by Software AG and/or its subsidiaries is located at http://softwareag.com/licenses. Use of this software is subject to adherence to Software AG's licensing conditions and terms. These terms are part of the product documentation, located at http://softwareag.com/licenses/ and/or in the root installation directory of the licensed product(s). This software may include portions of third-party products. For third-party copyright notices, license terms, additional rights or re- strictions, please refer to "License Texts, Copyright Notices and Disclaimers of Third-Party Products". For certain specific third-party license restrictions, please refer to section E of the Legal Notices available under "License Terms and Conditions for Use of Software AG Products / Copyright and Trademark Notices of Software AG Products". These documents are part of the product documentation, located at http://softwareag.com/licenses and/or in the root installation directory of the licensed product(s).
    [Show full text]
  • LG Programmer’S Reference Manual
    LG Programmer’s Reference Manual Line Matrix Series Printers Trademark Acknowledgements ANSI is a registered trademark of American National Standards Institute, Inc. Code V is a trademark of Quality Micro Systems. Chatillon is a trademark of John Chatillon & Sons, Inc. Ethernet is a trademark of Xerox Corporation. IBM is a registered trademark of International Business Machines Corporation. IGP is a registered trademark of Printronix, LLC. Intelligent Printer Data Stream and IPDS are trademarks of International Business Machines Corporation. LinePrinter Plus is a registered trademark of Printronix, LLC. MS-DOS is a registered trademark of Microsoft Corporation. PC-DOS is a trademark of International Business Machines Corporation. PGL is a registered trademark of Printronix, LLC. PrintNet is a registered trademark of Printronix, LLC. Printronix is a registered trademark of Printronix, LLC. PSA is a trademark of Printronix, LLC. QMS is a registered trademark of Quality Micro Systems. RibbonMinder is a trademark of Printronix, LLC. Torx is a registered trademark of Camcar/Textron Inc. Utica is a registered trademark of Cooper Power Tools. Printronix, LLC. makes no representations or warranties of any kind regarding this material, including, but not limited to, implied warranties of merchantability and fitness for a particular purpose. Printronix, LLC. shall not be held responsible for errors contained herein or any omissions from this material or for any damages, whether direct, indirect, incidental or consequential, in connection with the furnishing, distribution, performance or use of this material. The information in this manual is subject to change without notice. This document contains proprietary information protected by copyright. No part of this document may be reproduced, copied, translated or incorporated in any other material in any form or by any means, whether manual, graphic, electronic, mechanical or otherwise, without the prior written consent of Printronix, LLC.
    [Show full text]
  • IPDS and SCS Technical Reference
    IBMNetworkPrinters12,17,24 IBMInfoprint20,21,32,40,45 IBM Infoprint 70 IBM IPDS and SCS Technical Reference S544-5312-07 IBMNetworkPrinters12,17,24 IBMInfoprint20,21,32,40,45 IBM Infoprint 70 IBM IPDS and SCS Technical Reference S544-5312-07 Note Before using this information and the product it supports, be sure to read the general information under “Notices” on page xiii. Eighth Edition (April 2000) This version obsoletes S544-5312-06. Requests for IBM publications should be made to your IBM representative or to the IBM branch office serving your locality. If you request publications from the address given below, your order will be delayed because publications are not stocked there. Many of the IBM Printing Systems Company publications are available from the web page listed below. Internet Visit our home page at: http://www.ibm.com/printers A Reader’s Comment Form is provided at the back of this publication. You may also send comments by fax to 1-800-524-1519, by e-mail to [email protected], or by regular mail to: IBM Printing Systems Department H7FE Building 003G Information Development PO Box 1900 Boulder CO USA 80301-9191 IBM may use or distribute whatever information you supply in any way it believes appropriate without incurring any obligation to you. © Copyright International Business Machines Corporation 1996, 2000. All rights reserved. US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp. Contents Tables ...............ix Chapter 4. Device Control Command Set................23 Notices ..............xiii Acknowledgement Reply ..........23 Trademarks ..............xiii Activate Resource ............25 Resource ID example with RIDF = GRID .
    [Show full text]
  • Control Characters in ASCII and Unicode
    Control characters in ASCII and Unicode Tens of odd control characters appear in ASCII charts. The same characters have found their way to Unicode as well. CR, LF, ESC, CAN... what are all these codes for? Should I care about them? This is an in-depth look into control characters in ASCII and its descendants, including Unicode, ANSI and ISO standards. When ASCII first appeared in the 1960s, control characters were an essential part of the new character set. Since then, many new character sets and standards have been published. Computing is not the same either. What happened to the control characters? Are they still used and if yes, for what? This article looks back at the history of character sets while keeping an eye on modern use. The information is based on a number of standards released by ANSI, ISO, ECMA and The Unicode Consortium, as well as industry practice. In many cases, the standards define one use for a character, but common practice is different. Some characters are used contrary to the standards. In addition, certain characters were originally defined in an ambiguous or loose way, which has resulted in confusion in their use. Contents Groups of control characters Control characters in standards o ASCII control characters o C1 control characters o ISO 8859 special characters NBSP and SHY o Control characters in Unicode Control characters in modern applications Character list o ASCII o C1 o ISO 8859 Categories Translations Character index Sources This article starts by looking at the history of control characters in standards. We then move to modern times.
    [Show full text]
  • International Character Code Standard for the BE2
    °, , CMU-ITC-87-091 International Character Code Standard for the BE2 June 18, 1987 Tomas Centerlind Information Technology Center (ITC) Camegie Mellon University 1. Major problems with foreign languages All European languages have a set of unique characters, even Great Britain with their Pound sign. Most of these characters are static and do not change if they are in the end or in the middle of the word. The Greek sigma sign however is an example of a character that changes look depending on the position. If we move on to the non-Roman alphabets like Arabic, they have a more complex structure. A basic rule is that certain of the characters are written together if they follow each other but not otherwise. This complicates representation and requires look ahead. In addition to this many of the languages have leftwards or downwards writing. All together these properties makes it very difficult to integrate them with Roman languages. Lots of guidelines have to be established to solve these problems, and before any coding can be done a set of standards must be selected or defined. In this paper I intend to gather all that I can think of that must be considered before selecting standards. Only the basic level of the implementation will be designed, so therefore routines that for example display complex languages like Arabic must be written by the user. A basic method that dislpays Roman script correctly will be supported. 1. Standards 1.1 Existing standards Following is a list of currently existing and used standards. 1.1.1 ASCII, ISO646 The ASCII standard, that is in use ahnost anywhere, will probably have to rcmain as a basic part of the system, and without any doubt it must be possible to read and write ASCII coded documents in the foreseeable future.
    [Show full text]
  • International Register of Coded Character Sets to Be Used with Escape Sequences for Information Interchange in Data Processing
    INTERNATIONAL REGISTER OF CODED CHARACTER SETS TO BE USED WITH ESCAPE SEQUENCES 1 Introduction 1.1 General This document is the ISO International Register of Coded Character Sets To Be Used With Escape Sequences for information interchange in data processing. It is compiled in accordance with the provisions of ISO/IEC 2022, "Code Extension Technique" and of ISO 2375 "Procedure for Registration of Escape Sequences". This International Register contains coded character sets which have been registered in accordance with procedures given in ISO 2375. Its purpose is to identify widely used coded character sets and associate with each a unique escape sequence by means of which it can be designated according to ISO/IEC 2022 and ISO/IEC 4873. The publication of this International Register should promote compatibility in international information interchange and avoid duplication of effort in developing application-oriented coded character sets. Registration provides an identification for a coded character set but implies nothing about its status; it may or may not be part of a standard of an international, national or a corporate body. However, if such a standard is published subsequently to the registration, it would be appropriate for the escape sequence identifying the character set to be specified in the standard. If it is desired to register a set, application should be made to the Registration Authority through an appropriate Sponsoring Authority as specified in ISO 2375. Any character set can be a candidate for registration if it meets the requirements of ISO 2375. The Registration Authority ascertains that the proposals received are formally in accordance with this International Standard, technically in accordance with ISO/IEC 2022, and, where applicable, with ISO/IEC 646 and ISO/IEC 4873, and meet the presentation practice of the Registration Authority.
    [Show full text]
  • L2/98-354 Terminal Graphics for Unicode
    L2/98-354 TERMINAL GRAPHICS FOR UNICODE Frank da Cruz The Kermit Project Columbia University New York City USA [email protected] http://www.columbia.edu/kermit/ D R A F T # 4 Tue Nov 3 19:08:58 1998 THIS IS A PREFORMATTED PLAIN-TEXT ASCII DOCUMENT. IT IS DESIGNED TO BE VIEWED AS-IS IN A FIXED-PITCH FONT. ITS WIDEST LINE IS 79 COLUMNS. IT CONTAINS NO TABS. IF IT LOOKS MESSY TO YOU, PLEASE FEEL FREE TO PICK UP A CLEAN COPY OF THIS OR THE RELATED PROPOSALS BY ANONYMOUS FTP: HEX BYTE PICTURES FOR UNICODE (plain text) ftp://kermit.columbia.edu/kermit/ucsterminal/hex.txt ADDITIONAL CONTROL PICTURES FOR UNICODE (plain text) ftp://kermit.columbia.edu/kermit/ucsterminal/control.txt TERMINAL GRAPHICS FOR UNICODE (plain text) ftp://kermit.columbia.edu/kermit/ucsterminal/ucsterminal.txt Glyph Map (PDF, contributed by Michael Everson) ftp://kermit.columbia.edu/kermit/ucsterminal/terminal-emulation.pdf Clarification of SNI Glyphs (Microsoft Word 7.0) ftp://kermit.columbia.edu/kermit/ucsterminal/sni-charsets.doc Discussion (plain text) ftp://kermit.columbia.edu/kermit/ucsterminal/mail.txt (Note, the Exhibits are on paper and not available at the FTP site.) ABSTRACT A selection of terminal graphics characters is proposed for Unicode [24] and ISO 10646 [19] to allow Unicode-based terminal emulation software to display glyphs that are found on popular types of terminals but currently are not available in Unicode, and to exchange these characters with other Unicode-based applications. 1 CONTENTS 1. Introduction 2. Scope 3. Organization 4. (deleted) 5. 3270 Terminal Operator Status Indicators 6.
    [Show full text]
  • Bbedit 10.5.5 User Manual
    User Manual BBEdit™ Professional HTML and Text Editor for the Macintosh Bare Bones Software, Inc. ™ BBEdit 10.5.5 Product Design Jim Correia, Rich Siegel, Steve Kalkwarf, Patrick Woolsey Product Engineering Jim Correia, Seth Dillingham, Jon Hueras, Steve Kalkwarf, Rich Siegel, Steve Sisak Engineers Emeritus Chris Borton, Tom Emerson, Pete Gontier, Jamie McCarthy, John Norstad, Jon Pugh, Mark Romano, Eric Slosser, Rob Vaterlaus Documentation Philip Borenstein, Stephen Chernicoff, John Gruber, Simon Jester, Jeff Mattson, Jerry Kindall, Caroline Rose, Rich Siegel, Patrick Woolsey Additional Engineering Polaschek Computing Icon Design Byran Bell Packaging Design Ultra Maroon Design Consolas for BBEdit included under license from Ascender Corp. PHP keyword lists contributed by Ted Stresen-Reuter http://www.tedmasterweb.com/ Exuberant ctags ©1996-2004 Darren Hiebert http://ctags.sourceforge.net/ Info-ZIP ©1990-2009 Info-ZIP. Used under license. HTML Tidy Technology ©1998-2006 World Wide Web Consortium http://tidy.sourceforge.net/ LibNcFTP ©1996-2010 Mike Gleason & NcFTP Software NSTimer+Blocks ©2011 Random Ideas, LLC. Used under license. PCRE Library Package written by Philip Hazel and ©1997-2004 University of Cambridge, England Quicksilver string ranking Adapted from available sources and used under Apache License 2.0 terms. BBEdit and the BBEdit User Manual are copyright ©1992-2013 Bare Bones Software, Inc. All rights reserved. Produced/published in USA. Bare Bones Software, Inc. 73 Princeton Street, Suite 206 North Chelmsford, MA 01863 USA (978) 251-0500 main (978) 251-0525 fax http://www.barebones.com/ Sales & customer service: [email protected] Technical support: [email protected] BBEdit and “It Doesn’t Suck” are registered trademarks of Bare Bones Software, Inc.
    [Show full text]