IBM Education Assistance for z/OS V2R1

Item: Unicode Services Exploitation
Element/Component: z/OS UNIX System Services

Material is current as of March 2013 © 2013 IBM Corporation Filename: zOS V2R1 UNIX Unicode

Agenda

■ Trademarks
■ Presentation Objectives
■ Overview
■ Usage & Invocation
■ Interactions & Dependencies
■ Migration & Coexistence Considerations
■ Presentation Summary
■ Appendix

Trademarks

■ See url http://www.ibm.com/legal/copytrade.shtml for a list of trademarks.

■ Additional trademarks: Unicode is a registered trademark of Unicode, Inc. in the United States and other countries.

Presentation Objectives

■ Describe how z/OS UNIX System Services exploits z/OS Unicode Services, and why this is needed.
■ External interfaces.
■ Examples of usage.

Overview

■ Problem: Files can reside on systems or in geographic locations other than the ones where they originated, so conversion between code pages may be necessary.

■ Solution:
  1) Allow z/OS UNIX files to be tagged with any code page.
  2) Allow auto conversion to occur when file I/O is issued by a program.

■ z/OS UNIX will use z/OS Unicode Services to allow a program to read/write a file tagged with a code page and convert file data to/from the program's code page.

■ Benefit/Value: z/OS UNIX now fully participates in the text conversion arena and supports auto conversion for all IBM code pages.

Overview (continued)

■ Design is based on the z/OS V1R2 Enhanced ASCII function:
  1) Each program or thread has a code page (default CCSID = 1047).
  2) UNIX files can have a code page (that is, a file tag).
  3) The run-time environment can be enabled for automatic conversion.

Usage & Invocation

■ Assign a CCSID (the numeric ID of a code page) to a thread or program:
  ➢ Default = 1047
  ➢ _BPXK_PCCSID=ccsid
  ➢ Compile the C program with the ASCII option (CCSID = 819)
  ➢ fcntl(fd,F_CONTROL_CVT,...) can assign the program CCSID for an open file

■ Tag a file:
  ➢ Default = no tag, that is, no conversion
  ➢ /bin/chtag -tc 1208 file1
  ➢ /bin/cp file1 file2 (tag propagation)
  ➢ Compile C program with FILETAG(...,AUTOTAG) run option
  ➢ fcntl(fd,F_SETTAG,&tag); /* assign file CCSID for this file */
  ➢ fcntl(fd,F_CONTROL_CVT,...); /* just for life of this open */

Usage & Invocation (continued)

■ Enable the conversion environment:
  ➢ For the system:
    ➢ SYS1.PARMLIB(BPXPRMxx): AUTOCVT(ALL)
    ➢ SETOMVS AUTOCVT=ALL
    ➢ SET OMVS=(xx)
    ➢ D OMVS,OPTIONS displays the parmlib values
  ➢ For a session or program:
    ➢ export _BPXK_AUTOCVT=ALL
    ➢ setenv("_BPXK_AUTOCVT","ALL",1);
  ➢ For a single open file, before the first read or write:
    ➢ fcntl(fd,F_CONTROL_CVT,...) enables conversion only for this open file

Example: > _BPXK_AUTOCVT=ALL mypgm mytaggedfile
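
The program-level setting can be sketched in C as follows. This is a minimal illustration, not IBM sample code: the function name enable_autocvt_all is hypothetical, the 16-bit CCSID range check is an assumption, and the _POSIX_C_SOURCE define only makes setenv visible on strict compilers. The only interfaces taken from the slides are the two environment variable names and the setenv call.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: enable automatic conversion for this process and
 * record the program CCSID through the environment variables named on
 * the slides. Call it before the first read or write.
 * Returns 0 on success, -1 if an environment variable cannot be set. */
int enable_autocvt_all(unsigned int pccsid)
{
    char ccsid_text[16];

    /* Assumption: a usable CCSID is a nonzero value that fits in 16 bits. */
    assert(pccsid > 0 && pccsid < 65535);
    snprintf(ccsid_text, sizeof ccsid_text, "%u", pccsid);

    if (setenv("_BPXK_AUTOCVT", "ALL", 1) != 0)
        return -1;
    if (setenv("_BPXK_PCCSID", ccsid_text, 1) != 0)
        return -1;
    return 0;
}
```

A program would call enable_autocvt_all(1047) early in main(), before opening and reading any tagged file.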

Details

■ Function resides in the Logical File System (LFS).
■ Standard read/write I/O is supported; special v_rdwr is not supported.
■ Unlike Enhanced ASCII, character special file conversion is not allowed.
  ➢ Existing ASCII programs using standard streams need AUTOCVT=ON
■ REXX syscalls are also supported:
  ➢ f_settag, f_control_cvt, environment()
■ Environment variables:
  ➢ _BPXK_AUTOCVT = OFF | ON | ALL
  ➢ _BPXK_PCCSID = ccsid
  ➢ _BPXK_UNICODE_TECHNIQUE = (LMREC0-9)
  ➢ _BPXK_UNICODE_SUB = YES | NO (substitution action)
  ➢ _BPXK_UNICODE_MAL = YES | NO (malformed character action)
  ➢ Tip: Set these up before the first read/write

Details (continued)

■ Pass-through of Unicode Services conversion errors:
  ➢ z/OS UNIX reason codes E400ccrr and E401ccrr indicate a Unicode Services error with return code cc and reason code rr.
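
The E400ccrr / E401ccrr layout can be unpacked with a few shifts. The sketch below assumes the reason code is held as a 32-bit value whose high halfword is E400 or E401, followed by one byte of cc and one byte of rr, as the slide describes. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the two Unicode Services codes. */
struct unicode_err {
    uint8_t return_code;  /* cc */
    uint8_t reason_code;  /* rr */
};

/* If reason has the form E400ccrr or E401ccrr, store cc and rr in *out
 * and return true. Otherwise return false and leave *out unchanged. */
bool decode_unicode_reason(uint32_t reason, struct unicode_err *out)
{
    uint32_t prefix = reason >> 16;

    assert(out != NULL);
    if (prefix != 0xE400u && prefix != 0xE401u)
        return false;

    out->return_code = (uint8_t)((reason >> 8) & 0xFFu);
    out->reason_code = (uint8_t)(reason & 0xFFu);
    return true;
}
```

The extracted cc and rr values can then be looked up in the Unicode Services documentation listed in the Appendix.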

■ LFS gets three large buffers to hold and convert the data.
  ➢ For reads, any extra data is discarded when the I/O completes.
  ➢ For reads and writes, any ending partial character is cached for the next read or write.
    ➢ close() can cause loss of a cached character.
  ➢ Only converted data is supplied to the program or file.
  ➢ The read/write return value reflects the bytes supplied to or taken from the program, not the amount read from or written to the file.
■ The cursor reflects the actual amount LFS reads from or writes to the physical file system.
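
To make the "ending partial character" case concrete, the sketch below counts how many bytes at the end of a buffer begin a multibyte character without completing it. It uses UTF-8 (CCSID 1208) purely as an example encoding; it is not the LFS implementation, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Length of a UTF-8 sequence judged from its lead byte.
 * Returns 0 for a continuation byte or an invalid lead byte. */
static size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 0;
}

/* Count the bytes at the end of buf that start a character but do not
 * finish it. A converter would hold these back for the next I/O. */
size_t utf8_partial_tail(const unsigned char *buf, size_t len)
{
    size_t back = 0;

    assert(buf != NULL || len == 0);

    /* Step back over at most three continuation bytes (10xxxxxx). */
    while (back < len && back < 3 && (buf[len - 1 - back] & 0xC0) == 0x80)
        back++;
    if (back == len)
        return 0;               /* empty, or no lead byte in view */

    size_t need = utf8_seq_len(buf[len - 1 - back]);
    size_t have = back + 1;
    return (need > have) ? have : 0;
}
```

If a read ends with the first two bytes of a three-byte character, the function reports 2. Those are the bytes that would have to be carried over, and the ones at risk if the file is closed first.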

Details (continued)

■ The internal buffers used for conversion are persistent kernel storage above the bar (64-bit).
  ➢ Excessive usage can cause significant paging and virtual 31 storage accumulation.
  ➢ Parmlib keyword MAXIOBUFUSER(nnnnn) limits each user's storage for conversion to nnnnnM (megabytes).
    ➢ MAXIOBUFUSER(2G) = 2P (petabytes) of storage
    ➢ MAXIOBUFUSER(2048) = 2G is the default
    ➢ SETOMVS MAXIOBUFUSER=nnnnn is supported
■ Multi-threaded effects:
  ➢ Each thread sharing a single open file must have the same program CCSID.
  ➢ Simultaneous reads or writes resulting in partial characters can cause an error.
  ➢ Additional threads use temporary above-the-bar storage for buffers.
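
Because MAXIOBUFUSER is expressed in megabytes, the arithmetic behind the two values on the slide is a single shift. The sketch below assumes binary units (1M = 2^20 bytes); the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a MAXIOBUFUSER value (a count of megabytes) to bytes,
 * assuming 1 megabyte = 2^20 bytes. */
uint64_t maxiobufuser_bytes(uint64_t megabytes)
{
    assert(megabytes <= (UINT64_MAX >> 20));  /* guard against overflow */
    return megabytes << 20;
}
```

Under that assumption, the default of 2048 gives 2^31 bytes (2G), and a value of 2G megabytes (2^31) gives 2^51 bytes, which is the 2P figure quoted above.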

Details (continued)

■ lseek problems:
  1) lseek to a non-character boundary, followed by a read or write.
  2) lseek from one code page to another (MBCS files).
  3) lseek to a non-character boundary causing a false positive conversion.

An lseek to other than the current position or file beginning results in a subsequent I/O conversion error for DBCS/MBCS files.

Bypass: (????)

lseek(fd,position,SEEK_SET); ------> lseek(fd,0,SEEK_SET); read(fd,buf,position);

■ Character boundary problems only occur with DBCS/MBCS code pages.
■ Sequential reading/writing is the preferred I/O method when converting.
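
The bypass (rewind, then read forward to the target position) can be sketched in portable C. The function names are hypothetical and the 4096-byte scratch buffer is an arbitrary choice. Reading in chunks avoids needing a buffer as large as position. With conversion active, position counts bytes as delivered to the program, because that is what the read return value reflects.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Rewind fd, then read and discard `position` bytes so that the next
 * read starts there. Returns 0 on success, -1 on an I/O error or if
 * the file ends before `position`. */
int reposition_by_reading(int fd, off_t position)
{
    char scratch[4096];

    assert(position >= 0);
    if (lseek(fd, 0, SEEK_SET) == (off_t)-1)
        return -1;

    while (position > 0) {
        size_t want = position < (off_t)sizeof scratch
                          ? (size_t)position : sizeof scratch;
        ssize_t got = read(fd, scratch, want);
        if (got <= 0)
            return -1;
        position -= got;
    }
    return 0;
}

/* Open path read-only and advance to `position` by sequential reads.
 * Returns the open descriptor, or -1 on failure. */
int open_and_skip(const char *path, off_t position)
{
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (reposition_by_reading(fd, position) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Every byte before the target passes through the converter in order, so no read starts in the middle of a character. The cost is time proportional to position.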

Interactions & Dependencies

■ Software Dependencies: None
■ Hardware Dependencies: None
■ Exploiters:
  ➢ UNIX Shell & Utilities (e.g., /bin/cat, …)
  ➢ C Run-Time Library (e.g., fopen, fread, fwrite, fcntl)

Presentation Summary

■ z/OS UNIX has extended Enhanced ASCII function to provide any code page to any code page conversion.

■ z/OS UNIX is exploiting z/OS Unicode Services to provide this function.

■ Read and write conversion occurs when the environment is enabled and the program CCSID and file CCSID do not match.

Appendix

■ z/OS UNIX System Services Planning (GA22-7800)
■ z/OS UNIX System Services File System Interface Reference (SA23-2285)
■ z/OS UNIX System Services Command Reference (SA23-2280)
■ z/OS Using REXX and z/OS UNIX System Services (SA23-2283)
■ z/OS MVS System Commands (SA38-0666)
■ z/OS MVS Initialization and Tuning Reference (SA23-1380)
■ z/OS XL C/C++ Run-Time Library Reference
■ z/OS Unicode Services User's Guide & Reference (SA38-0689)
■ Character Data Representation Architecture (SC09-2190) and http://www-01.ibm.com/software/globalization/cdra
