NAME SYNOPSIS ABSTRACT DESCRIPTION Note on ISO-2022-JP(-1)?

NAME SYNOPSIS ABSTRACT DESCRIPTION Note on ISO-2022-JP(-1)?

Perl version 5.10.0 documentation - Encode::JP NAME Encode::JP - Japanese Encodings SYNOPSIS use Encode qw/encode decode/; $euc_jp = encode("euc-jp", $utf8); # loads Encode::JP implicitly $utf8 = decode("euc-jp", $euc_jp); # ditto ABSTRACT This module implements Japanese charset encodings. Encodingssupported are as follows. Canonical AliasDescription -------------------------------------------------------------------- euc-jp /\beuc.*jp$/iEUC (Extended Unix Character) /\bjp.*euc/i /\bujis$/i shiftjis /\bshift.*jis$/iShift JIS (aka MS Kanji) /\bsjis$/i 7bit-jis /\bjis$/i7bit JIS iso-2022-jpISO-2022-JP [RFC1468] = 7bit JIS with all Halfwidth Kana converted to Fullwidth iso-2022-jp-1ISO-2022-JP-1 [RFC2237] = ISO-2022-JP with JIS X 0212-1990 support. See below MacJapanese Shift JIS + Apple vendor mappings cp932 /\bwindows-31j$/i Code Page 932 = Shift JIS + MS/IBM vendor mappings jis0201-raw JIS0201, raw format jis0208-raw JIS0201, raw format jis0212-raw JIS0201, raw format -------------------------------------------------------------------- DESCRIPTION To find out how to use this module in detail, see Encode. Note on ISO-2022-JP(-1)? ISO-2022-JP-1 (RFC2237) is a superset of ISO-2022-JP (RFC1468) whichadds support for JIS X 0212-1990. That means you can use the samecode to decode to utf8 but not vice versa. $utf8 = decode('iso-2022-jp-1', $stream); and $utf8 = decode('iso-2022-jp', $stream); yield the same result but $with_0212 = encode('iso-2022-jp-1', $utf8); is now different from $without_0212 = encode('iso-2022-jp', $utf8 ); In the latter case, characters that map to 0212 are first convertedto U+3013 (0xA2AE in EUC-JP; a http://perldoc.perl.org Page 1 Perl version 5.10.0 documentation - Encode::JP white square also known as 'Tofu' or'geta mark') then fed to the decoding engine. U+FFFD is not used,in order to preserve text layout as much as possible. BUGS The ASCII region (0x00-0x7f) is preserved for all encodings, eventhough this conflicts with mappings by the Unicode Consortium. See http://www.debian.or.jp/~kubota/unicode-symbols.html.en to find out why it is implemented that way. SEE ALSO Encode http://perldoc.perl.org Page 2.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    2 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us