Non-Printable and Special Characters? ... BYTE me!

Louise Sims
PHASTAR, Unit 2A, Bollo Lane, London, W4 5LE

What are NPSC?

Computer Encoding Sets

• American Standard Code for Information Interchange (ASCII): a 7-bit single-byte character set (SBCS) which contains 128 coding points (0 to 127).

• Extended ASCII set: an 8-bit extension of the ASCII coding set which contains coding points 128 to 255. BEWARE: there are different versions of the Extended ASCII set, e.g. the ISO 8859-1 version and the CP437 version.

• Extended Binary Coded Decimal Interchange Code (EBCDIC) set: an 8-bit SBCS with coding points 0 to 255.

• Unicode set: much larger than ASCII and EBCDIC, with each character between 8 bits and 32 bits in size. It contains characters which cover most of the world's languages.

RANK function

Syntax: RANK (expression)
where expression is a character string.

Example:

    data ie;
        set sdtm.ie;
        ntprnt=NOTPRINT(ietest);
        id=RANK(SUBSTR(ietest,ntprnt,1));
    run;

The NOTPRINT function shows the first occurrence of NPSC is at position 76. The RANK function shows the NPSC is ASCII coding point 10, the line feed.

NPSC in Outputs

Adding directly to the data

Example: Adding the Greek letter µ to replace the u in the laboratory unit umol/L.

    data adlb;
        set adam.adlb;
        where paramcd="CREAT_S";
        si_unit=BYTE(181)||SUBSTR(param,14,5);
    run;

The BYTE function can be used to add NPSC to the data so that they follow through to the output.

Adding directly to the output code

    ods escapechar="^";

    compute before page / style={just=l};
        line @1 "Total Bilirubin (^{Unicode mu}mol/L)";
    endcomp;

The ODS escape character can be used to add NPSC directly to an output title, heading or label.

Removing NPSC from Data

COMPRESS function

Syntax: COMPRESS (source <, characters><, modifiers>)
where source is the character string, characters (optional) are specified characters which should be removed from the string, and modifiers (optional) are constants which modify the function.
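The effect of the COMPRESS modifiers can be checked on a small string before running it on study data. The following is a minimal sketch, not the author's code: the dataset name demo and the variables raw, clean, len_raw and len_clean are made up for illustration.

```sas
/* Sketch only: demo, raw, clean, len_raw and len_clean are hypothetical names. */
data demo;
    length raw clean $40;

    /* Build a test string that contains a tab ('09'x) and a line feed ('0A'x) */
    raw = 'Dose' || '09'x || 'reduced' || '0A'x || 'twice';

    /* k = keep the listed characters instead of removing them       */
    /* w = add the printable (writable) characters to the list       */
    /* Together they keep printable characters and drop the rest.    */
    clean = COMPRESS(raw, , 'kw');

    len_raw   = LENGTH(raw);
    len_clean = LENGTH(clean);

    /* Write both lengths and the cleaned value to the log */
    put len_raw= len_clean= clean=;
run;
```

Note that the tab and line feed are deleted rather than replaced, so the words on either side run together. If a separator is needed, replacing the character (for example with TRANWRD) may be a better fit than removing it.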

Example:

    data co_raw;
        set raw.comments_all;
        cmnt=COMPRESS(comment, , "kw");
    run;

NPSC are removed from the string using the k and w modifiers in the COMPRESS function.

Hidden Dragon/Invisible Character

Example:

    data final1;
        set final;
        if ord=3 then do;
            if LENGTH(txt)>52 then col1=SUBSTR(txt,1,51)||"       "||SUBSTR(txt,52);
            else col1=txt;
        end;
    run;

The quoted string in the middle of the concatenation holds blank spaces obtained by pressing Alt+255 on the number keypad 7 times.
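Because the Alt+255 blanks look identical to ordinary spaces in the editor, it can be easier to maintain the program if the invisible characters are built in code. The sketch below is one possible alternative, not the author's method. It assumes a WLATIN1 session, where coding point 160 is the non-breaking space that Alt+255 usually produces on Windows; the input rows and the variable pad are made up so the example runs on its own.

```sas
/* Made-up input so the sketch is self-contained */
data final;
    length txt $100;
    ord = 3;
    txt = REPEAT('x', 59);   /* 60 characters, longer than the 52-character limit */
    output;
run;

data final1;
    set final;
    length col1 $120 pad $7;

    /* ASSUMPTION: WLATIN1 session, where BYTE(160) is the non-breaking space. */
    /* REPEAT returns the argument plus n more copies, so this is 7 characters. */
    pad = REPEAT(BYTE(160), 6);

    if ord=3 then do;
        if LENGTH(txt)>52 then col1 = SUBSTR(txt,1,51) || pad || SUBSTR(txt,52);
        else col1 = txt;
    end;

    drop pad;
run;
```

In a session with a different encoding (for example UTF-8) a single byte 160 is not a valid character, so the coding point would need to be checked first with proc options option=encoding.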

Which coding points cause NPSC?

• Control characters: can cause NPSC. Since the control characters are non-printable, these can appear as blank spaces or as odd symbols. Note this does not apply to 127 (Delete) which is printable.

• Printable characters: do not cause NPSC. Since these are printable, they usually do not cause any problems within data.

• Extended ASCII characters: can cause NPSC. Since there are different versions of the extended ASCII set, if the input encoding set is different to the encoding set used to view the data, different coding points will correspond to different characters. These cannot be displayed properly so can cause NPSC.

Check your SAS session encoding version!

    proc options option=encoding;
    run;

In this example the log shows the current encoding version is WLATIN1. This is the default encoding version in Western European countries.

Identifying NPSC

NOTPRINT function

Syntax: NOTPRINT ("character string" <,start>)
where character string is the text in which to search for NPSC, and start (optional) is the starting position within the character string to start searching from.

Example:

    data ie;
        set sdtm.ie;
        ntprnt=NOTPRINT(ietest);
    run;

Replacing NPSC

BYTE function

Syntax: BYTE (n)
where n is a numeric value between 0 and 255 and represents the coding point in ASCII/EBCDIC.

Example:

    data cm;
        set raw.conmeds;
        check=NOTPRINT(cmverb);
        /* Only look up and replace a character when an NPSC was found */
        if check>0 then do;
            id=RANK(SUBSTR(cmverb,check,1));
            cmverb_=COMPBL(TRANWRD(cmverb, BYTE(id), " "));
        end;
        else cmverb_=cmverb;
    run;

[Figure: cmverb before and after the NPSC is replaced.]

BYTE and TRANWRD functions

Example:

    data co;
        set sdtm.co;
        id=RANK(SUBSTR(coval,83,1));
        coval_=TRANWRD(coval, BYTE(id), "'");
    run;

The RANK function identifies the ASCII value of the arrow at position 83, the BYTE function converts that value back into the character, and the TRANWRD function replaces all occurrences of the arrow with an apostrophe.

[Figure: coval before and after the arrow is replaced.]

Importing Data with NPSC into SAS

Step 1: Open the external data within Notepad++.
Step 2: Open the Find and Replace tool in Notepad++, ensuring the option "Extended (\n, \r, \t, \0, \x...)" is selected under Search Mode.
Step 3: Search for NPSC and replace with a blank.

\n = line feeds, \r = carriage returns, \t = tab, \0 = null character and \xddd = ASCII character where ddd is the ASCII/Unicode coding value.

[Figure: External data with NPSC in Notepad++, where it is easier to see the NPSC affecting the data.]
[Figure: External data with NPSC imported into SAS.]
[Figure: External data in Notepad++, with NPSC removed. Each record appears on one line, enabling SAS to process the data correctly when it is imported.]
[Figure: External data with NPSC removed imports into SAS correctly.]
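NOTPRINT only reports the first NPSC it finds, but its optional start argument makes it possible to walk through a string and list every occurrence. The following is a minimal sketch under stated assumptions rather than part of the original poster: the dataset npsc_positions, the variable str and its test value are made up for illustration, and an ASCII-based session is assumed.

```sas
/* Sketch only: npsc_positions and str are hypothetical names. */
data npsc_positions;
    length str $60;

    /* Test string with a line feed ('0A'x) at position 9 and a tab ('09'x) at position 18 */
    str = 'Line one' || '0A'x || 'Line two' || '09'x || 'end';

    /* First NPSC, or 0 if there is none */
    pos = NOTPRINT(str);

    do while (pos > 0);
        /* Coding point of the NPSC found at this position */
        id = RANK(SUBSTR(str, pos, 1));
        output;

        /* Resume the search one character further on; NOTPRINT returns 0 */
        /* when nothing more is found or start is past the end of str.    */
        pos = NOTPRINT(str, pos + 1);
    end;

    keep pos id;
run;
```

Each output row holds one position and its coding point, which can then be passed to BYTE and TRANWRD, or used to decide whether COMPRESS is the safer option.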