=encoding euc-jp

=head1 NAME

=begin original

encoding - allows you to write your script in non-ascii or non-utf8

=end original

encoding - 非 ascii や非 utf8 でスクリプトを書けるようにする

=head1 SYNOPSIS

  use encoding "greek";  # Perl like Greek to you?
  use encoding "euc-jp"; # Jperl!

=begin original

  # or you can even do this if your shell supports your native encoding

=end original

  # ネイティブエンコーディングをシェルが対応しているのなら以下のようにも書ける

  perl -Mencoding=latin2 -e '...' # Feeling centrally European?
  perl -Mencoding=euc-kr -e '...' # Or Korean?

  # more control

=begin original

  # A simple euc-cn => utf-8 converter
  use encoding "euc-cn", STDOUT => "utf8";  while(<>){print};

=end original

  # 単純な euc-cn → utf-8 コンバータ
  use encoding "euc-cn", STDOUT => "utf8";  while(<>){print};

=begin original

  # "no encoding;" supported (but not scoped!)
  no encoding;

=end original

  # "no encoding;" もサポートされている (しかしスコープはありません!)
  no encoding;

=begin original

  # an alternate way, Filter
  use encoding "euc-jp", Filter=>1;
  # now you can use kanji identifiers -- in euc-jp!

=end original

  # 別のやり方、Filter
  use encoding "euc-jp", Filter=>1;
  # これで漢字の識別子が使えます -- euc-jp で!

  # switch on locale -
  # note that this probably means that unless you have a complete control
  # over the environments the application is ever going to be run, you should
  # NOT use the feature of encoding pragma allowing you to write your script
  # in any recognized encoding because changing locale settings will wreck
  # the script; you can of course still use the other features of the pragma.
  use encoding ':locale';

=head1 ABSTRACT

=begin original

Let's start with a bit of history: Perl 5.6.0 introduced Unicode
support.  You could apply C<substr()> and regexes even to complex CJK
characters -- so long as the script was written in UTF-8.  But back
then, text editors that supported UTF-8 were still rare and many users
instead chose to write scripts in legacy encodings, giving up a whole
new feature of Perl 5.6.

=end original

ちょっとした歴史から始めましょう: Perl 5.6.0 で Unicode サポートが
導入されました。
C<substr()> や正規表現を複雑な CJK 文字に適用することができるように
なりました -- スクリプトが UTF-8 で書かれていれば。
しかし以前は、UTF-8 をサポートしたテキストエディタはあまり存在していなくて、
多くのユーザは従来のエンコーディングでスクリプトを書いてしまって
Perl5.6 の新機能全体を使うことをあきらめていました。

=begin original

Rewind to the future: starting from perl 5.8.0 with the B<encoding>
pragma, you can write your script in any encoding you like (so long
as the C<Encode> module supports it) and still enjoy Unicode support.
This pragma achieves that by doing the following:

=end original

未来へ話を戻しましょう: perl 5.8.0 で B<encoding> プラグマが導入され、
(C<Encode> モジュールが対応していれば)任意のエンコーディングで
スクリプトを書けるようになり、Unicode サポートも以前同様に
使うことができます。
このプラグマは以下に挙げるようなことをすることで達成されます:

=over

=item *

=begin original

Internally converts all literals (C<q//,qq//,qr//,qw///, qx//>) from
the encoding specified to utf8.  In Perl 5.8.1 and later, literals in
C<tr///> and C<DATA> pseudo-filehandle are also converted.

=end original

すべてのリテラル(C<q//,qq//,qr//,qw///, qx//>)を内部的に指定された
エンコーディングから utf8 に変換します。
Perl 5.8.1 以降では、C<tr///> のリテラルと擬似ファイルハンドル C<DATA> も
同様に変換されます。

=item *

=begin original

Changing PerlIO layers of C<STDIN> and C<STDOUT> to the encoding
specified.

=end original

C<STDIN> と C<STDOUT> の PerlIO 層を指定されたエンコーディングに変更します。

=back

=head2 Literal Conversions

(リテラル変換)

=begin original

You can write code in EUC-JP as follows:

=end original

EUC-JP で次のようにコードを書くことができます:

=begin original

  my $Rakuda = "\xF1\xD1\xF1\xCC"; # Camel in Kanji
               #<-char-><-char->   # 4 octets
  s/\bCamel\b/$Rakuda/;

=end original

  my $Rakuda = "\xF1\xD1\xF1\xCC"; # 駱駝
               #<-char-><-char->   # 4 オクテット
  s/\bCamel\b/$Rakuda/;

=begin original

And with C<use encoding "euc-jp"> in effect, it is the same thing as
the code in UTF-8:

=end original

C<use encoding "euc-jp"> が有効の場合、これは UTF-8 で
次のように書いたコードと同じです:

=begin original

  my $Rakuda = "\x{99F1}\x{99DD}"; # two Unicode Characters
  s/\bCamel\b/$Rakuda/;

=end original

  my $Rakuda = "\x{99F1}\x{99DD}"; # 二つの Unicode 文字
  s/\bCamel\b/$Rakuda/;

=head2 PerlIO layers for C<STD(IN|OUT)>

(C<STD(IN|OUT)> のための PerlIO 層)

=begin original

The B<encoding> pragma also modifies the filehandle layers of
STDIN and STDOUT to the specified encoding.  Therefore,

=end original

B<encoding> プラグマはまた、STDIN および STDOUT のファイルハンドル層を、
指定されたエンコーディングに変更します。
したがって、

  use encoding "euc-jp";
  my $message = "Camel is the symbol of perl.\n";
  my $Rakuda = "\xF1\xD1\xF1\xCC"; # Camel in Kanji
  $message =~ s/\bCamel\b/$Rakuda/;
  print $message;

=begin original

Will print "\xF1\xD1\xF1\xCC is the symbol of perl.\n",
not "\x{99F1}\x{99DD} is the symbol of perl.\n".

=end original

これは "\x{99F1}\x{99DD} is the symbol of perl.\n" ではなく
"\xF1\xD1\xF1\xCC is the symbol of perl.\n" を出力します。

=begin original

You can override this by giving extra arguments; see below.

=end original

追加の引数を与えることによってこれをオーバーライドできます。
以下を参照してください。

=head2 Implicit upgrading for byte strings

(バイト文字列の暗黙の昇格)

=begin original

By default, if strings operating under byte semantics and strings
with Unicode character data are concatenated, the new string will
be created by decoding the byte strings as I<ISO 8859-1 (Latin-1)>.

=end original

デフォルトでは、バイトセマンティクスで操作している文字列と
Unicode 文字データの文字列を連結すると、新しい文字列は
バイト文字列を I<ISO 8859-1 (Latin-1)> としてデコードしたものから
作られます。

=begin original

The B<encoding> pragma changes this to use the specified encoding
instead.  For example:

=end original

B<encoding> プラグマは、代わりに指定されたエンコーディングを使うことで
これを変更します。
例えば:

    use encoding 'utf8';
    my $string = chr(20000); # a Unicode string
    utf8::encode($string);   # now it's a UTF-8 encoded byte string
    # concatenate with another Unicode string
    print length($string . chr(20000));

=begin original

Will print C<2>, because C<$string> is upgraded as UTF-8.  Without
C<use encoding 'utf8';>, it will print C<4> instead, since C<$string>
is three octets when interpreted as Latin-1.

=end original

これは C<2> を表示します; なぜなら C<$string> は UTF-8 に
昇格されるからです。
C<use encoding 'utf8';> がない場合、代わりに C<4> を表示します;
なぜなら C<$string> は Latin-1 として解釈されると 3 オクテットだからです。

=head2 Side effects

(副作用)

=begin original

If the C<encoding> pragma is in scope then the lengths returned are
calculated from the length of C<$/> in Unicode characters, which is not
always the same as the length of C<$/> in the native encoding.

=end original

スコープ内に C<encoding> プラグマがある場合、
返される長さは Unicode 文字での C<$/> の長さから計算され、
これはネイティブエンコーディングでの C<$/> の長さと常に同じとは
限りません。

=begin original

This pragma affects utf8::upgrade, but not utf8::downgrade.

=end original

このプラグマは utf8::upgrade に影響を与えますが、utf8::downgrade には
影響を与えません。

=head1 FEATURES THAT REQUIRE 5.8.1

(5.8.1 が必要な機能)

=begin original

Some of the features offered by this pragma requires perl 5.8.1.  Most
of these are done by Inaba Hiroto.  Any other features and changes
are good for 5.8.0.

=end original

本プラグマで提供される機能の幾つかは perl 5.8.1 を要求します。
これらの機能の大部分は Inaba Hiroto により行われました。
その他の機能と変更点は 5.8.0 で使えます。

=over

=item "NON-EUC" doublebyte encodings

(「非 EUC」2 バイトエンコーディング)

=begin original

Because perl needs to parse script before applying this pragma, such
encodings as Shift_JIS and Big-5 that may contain '\' (BACKSLASH;
\x5c) in the second byte fails because the second byte may
accidentally escape the quoting character that follows.  Perl 5.8.1
or later fixes this problem.

=end original

perl はこのプラグマを適用する前にスクリプトを解析する必要があるので、
Shift_JIS や Big-5 のような、2 バイト目に'\'(バックスラッシュ; \x5c)を含む
可能性があるエンコーディングで失敗していました。
なぜなら、二バイト目が誤って後続するクォート文字をエスケープしてしまう
可能性があるからです。
Perl 5.8.1 以降ではこの問題は解決されました。

=item tr// 

=begin original

C<tr//> was overlooked by Perl 5 porters when they released perl 5.8.0
See the section below for details.

=end original

C<tr//> は perl 5.8.0 リリースのときには Perl 5 porters が見逃して
しまっていました。
詳しくは後述のセクションを参照してください。

=item DATA pseudo-filehandle

(DATA 疑似ファイルハンドル)

=begin original

Another feature that was overlooked was C<DATA>. 

=end original

もう一つ見逃されていた機能は C<DATA> です。

=back

=head1 USAGE

(使用法)

=over 4

=item use encoding [I<ENCNAME>] ;

=begin original

Sets the script encoding to I<ENCNAME>.  And unless ${^UNICODE} 
exists and non-zero, PerlIO layers of STDIN and STDOUT are set to
":encoding(I<ENCNAME>)".

=end original

スクリプトのエンコーディングを I<ENCNAME> に設定します。
${^UNICODE} が存在していてそれが非ゼロでない限り、STDIN および STDOUT の
PerlIO 層は ":encoding(I<ENCNAME>)" に設定されます。

=begin original

Note that STDERR WILL NOT be changed.

=end original

STDERR は変更されないことに注意してください。

=begin original

Also note that non-STD file handles remain unaffected.  Use C<use
open> or C<binmode> to change layers of those.

=end original

同様に、非 STD ファイルハンドルも影響を受けないことに注意してください。
これらの層を変更するには C<use open> または C<binmode> を使います。

=begin original

If no encoding is specified, the environment variable L<PERL_ENCODING>
is consulted.  If no encoding can be found, the error C<Unknown encoding
'I<ENCNAME>'> will be thrown.

=end original

エンコーディングが指定されていない場合、環境変数 L<PERL_ENCODING> が
参照されます。
エンコーディングが見つからなかった場合には、
C<Unknown encoding 'I<ENCNAME>'> というエラーになります。

=item use encoding I<ENCNAME> [ STDIN =E<gt> I<ENCNAME_IN> ...] ;

=begin original

You can also individually set encodings of STDIN and STDOUT via the
C<< STDIN => I<ENCNAME> >> form.  In this case, you cannot omit the
first I<ENCNAME>.  C<< STDIN => undef >> turns the IO transcoding
completely off.

=end original

C<< STDIN => ENCNAME >> 形式を使うことによって、STDIN と STDOUT の
エンコーディングを独立に設定できます。
この場合、最初の I<ENCNAME> を省略することはできません。
C<< STDIN => undef >> は入出力の変換(transcoding)を完全にオフにします。

=begin original

When ${^UNICODE} exists and non-zero, these options will completely
ignored.  ${^UNICODE} is a variable introduced in perl 5.8.1.  See
L<perlrun> see L<perlvar/"${^UNICODE}"> and L<perlrun/"-C"> for
details (perl 5.8.1 and later).

=end original

${^UNICODE} が存在していており、かつそれが非ゼロであった場合、これらの
オプションは完全に無視されます。
$<^UNICDOE>は perl 5.8.1 で導入された変数です。
L<perlrun>, L<perlvar/"${^UNICODE}">, L<perlrun/"-C"> を
参照してください (perl 5.8.1 以降)。

=item use encoding I<ENCNAME> Filter=E<gt>1;

=begin original

This turns the encoding pragma into a source filter.  While the
default approach just decodes interpolated literals (in qq() and
qr()), this will apply a source filter to the entire source code.  See
L</"The Filter Option"> below for details.

=end original

これはエンコーディングプラグラマをソースフィルタに適用します。
デフォルトのアプローチが(qq()やqr()中で) 変数展開されたリテラルを
デコードするだけなのに対して、本プラグマはソースコード全体に
ソースフィルタを適用します。
詳しくは後述する L</"The Filter Option"> を参照してください。

=item no encoding;

=begin original

Unsets the script encoding. The layers of STDIN, STDOUT are
reset to ":raw" (the default unprocessed raw stream of bytes).

=end original

スクリプトエンコーディングを解除します。
STDIN、STDOUT の層は ":raw" (デフォルトの、バイトの生ストリームを
処理しない)にリセットされます。

=back

=head1 The Filter Option

(Filter オプション)

=begin original

The magic of C<use encoding> is not applied to the names of
identifiers.  In order to make C<${"\x{4eba}"}++> ($human++, where human
is a single Han ideograph) work, you still need to write your script
in UTF-8 -- or use a source filter.  That's what 'Filter=>1' does.

=end original

C<use encoding> の魔法は識別子には適用されません。
C<${"\x{4eba}"}++> (漢字一文字の'人'、$human++)が動作するようにするには、
UTF-8でスクリプトを記述する必要があります -- あるいはソースフィルタを
使います。
つまり 'Filter=>1' とします。

=begin original

What does this mean?  Your source code behaves as if it is written in
UTF-8 with 'use utf8' in effect.  So even if your editor only supports
Shift_JIS, for example, you can still try examples in Chapter 15 of
C<Programming Perl, 3rd Ed.>.  For instance, you can use UTF-8
identifiers.

=end original

これは何を意味するのでしょう?
ソースコードは 'use utf8' を指定して UTF-8で書いたかのように振る舞います。
だから使っているエディタがたとえば Shift_JIS しかサポートしていなくても、
C<Programming Perl, 3rd Ed.> の第 15 章にある例を試すことがでます。
たとえば、UTF-8 識別子を使えます。

=begin original

This option is significantly slower and (as of this writing) non-ASCII
identifiers are not very stable WITHOUT this option and with the
source code written in UTF-8.

=end original

このオプションは非常に遅く、(これを書いている時点では) ASCII でない
識別子は、このオプション抜きでかつソースコードが UTF-8 で記述されている
場合には、全く安定していません。

=head2 Filter-related changes at Encode version 1.87

(Encode バージョン 1.87 での Filter 関連の変更)

=over

=item *

=begin original

The Filter option now sets STDIN and STDOUT like non-filter options.
And C<< STDIN=>I<ENCODING> >> and C<< STDOUT=>I<ENCODING> >> work like
non-filter version.

=end original

現在、Filter オプションは非フィルタオプションのように STDIN および STDOUT を
設定します。
そして C<< STDIN=>I<ENCODING> >> や C<< STDOUT=>I<ENCODING> >> は
非フィルタ版と同様に動作します。

=item *

=begin original

C<use utf8> is implicitly declared so you no longer have to C<use
utf8> to C<${"\x{4eba}"}++>.

=end original

C<use utf8> は暗黙に宣言されるので、 C<${"\x{4eba}"}++> とするために
C<use utf8> とする必要はもはやありません。

=back

=head1 CAVEATS

(警告)

=head2 NOT SCOPED

(スコープではありません)

=begin original

The pragma is a per script, not a per block lexical.  Only the last
C<use encoding> or C<no encoding> matters, and it affects 
B<the whole script>.  However, the <no encoding> pragma is supported and 
B<use encoding> can appear as many times as you want in a given script. 
The multiple use of this pragma is discouraged.

=end original

このプラグマはスクリプト毎のものであってブロックレキシカル毎では
ありません。
最後に現れた C<use encoding> もしくは C<no encoding> だけが意味を持ち、
スクリプト全体に影響を及ぼします。
しかしながら、B<no encoding> プラグマはサポートされ、
B<use encoding> はスクリプトの中で何回現れてもかまいません。
このプラグマを複数回使用することは非推奨です。

=begin original

By the same reason, the use this pragma inside modules is also
discouraged (though not as strongly discouraged as the case above.  
See below).

=end original

同じ理由で、本プラグマをモジュールの中で使うことも非推奨です
(しかし、上述の場合ほど強い非推奨ではありません。以降を参照してください)。

=begin original

If you still have to write a module with this pragma, be very careful
of the load order.  See the codes below;

=end original

それでもこのプラグマを使ったモジュールを書く必要があるのなら、ロードされる
順番に十分に注意してください。
以下のコードを見てみましょう;

  # called module
  package Module_IN_BAR;
  use encoding "bar";
  # stuff in "bar" encoding here
  1;

  # caller script
  use encoding "foo"
  use Module_IN_BAR;
  # surprise! use encoding "bar" is in effect.

=begin original

The best way to avoid this oddity is to use this pragma RIGHT AFTER
other modules are loaded.  i.e.

=end original

この現象を避ける最善の方法は他のモジュールをロードした後でこのプラグマを
使うというものです。
例:

  use Module_IN_BAR;
  use encoding "foo";

=head2 DO NOT MIX MULTIPLE ENCODINGS

(複数のエンコーディングを混ぜてはいけません)

=begin original

Notice that only literals (string or regular expression) having only
legacy code points are affected: if you mix data like this

=end original

従来の文字位置を持っているリテラル(文字列もしくは正規表現)のみが
影響されるということに注意してください: 次のように書いた場合

    \xDF\x{100}

=begin original

the data is assumed to be in (Latin 1 and) Unicode, not in your native
encoding.  In other words, this will match in "greek":

=end original

このデータはネイティブエンコーディングではなく(Latin 1 と)Unicode で
あるとみなされます。
言い換えると、以下の例は "greek" でマッチします:

    "\xDF" =~ /\x{3af}/

=begin original

but this will not

=end original

しかし次の例では違います

    "\xDF\x{100}" =~ /\x{3af}\x{100}/

=begin original

since the C<\xDF> (ISO 8859-7 GREEK SMALL LETTER IOTA WITH TONOS) on
the left will B<not> be upgraded to C<\x{3af}> (Unicode GREEK SMALL
LETTER IOTA WITH TONOS) because of the C<\x{100}> on the left.  You
should not be mixing your legacy data and Unicode in the same string.

=end original

なぜならば左辺にある C<\xDF>
(ISO 8859-7 GREEK SMALL LETTER IOTA WITH TONOS) は、同じく左辺に
C<\x{100}> があるために C<\x{3af}> に B<昇格されない> からです。
従来のデータと Unicode を同じ文字列の中で混ぜるべきではありません。

=begin original

This pragma also affects encoding of the 0x80..0xFF code point range:
normally characters in that range are left as eight-bit bytes (unless
they are combined with characters with code points 0x100 or larger,
in which case all characters need to become UTF-8 encoded), but if
the C<encoding> pragma is present, even the 0x80..0xFF range always
gets UTF-8 encoded.

=end original

本プラグマは 0x80 から 0xFF の範囲の文字位置のエンコーディングにも
影響します: この範囲にある通常の文字は (UTF-8 エンコードが必要となる
0x100 以上の文字と組み合わされていない限り) 8 ビットバイトで
あるかのように放っておかれますが、
もし C<encoding> プラグマが使われているなら、0x80 から 0xFF の
範囲であっても UTF-8 エンコードされます。

=begin original

After all, the best thing about this pragma is that you don't have to
resort to \x{....} just to spell your name in a native encoding.
So feel free to put your strings in your encoding in quotes and
regexes.

=end original

以上のことを踏まえて、このプラグマに関して最も良いことは、
ネイティブエンコーディングであなたの名前を書くのに \x{....} と
書かなくてもすむということです。
だから、希望するエンコーディングで文字列や正規表現を自由に書いて
構いません。

=head2 tr/// with ranges

(tr/// の範囲指定)

=begin original

The B<encoding> pragma works by decoding string literals in
C<q//,qq//,qr//,qw///, qx//> and so forth.  In perl 5.8.0, this
does not apply to C<tr///>.  Therefore,

=end original

B<encoding> プラグマは C<q//,qq//,qr//,qw///, qx//> 中の文字列リテラルを
デコードすることによって動作します。
perl 5.8.0 では、これは C<tr///> には適用されていませんでした。
このため、

  use encoding 'euc-jp';
  #....
  $kana =~ tr/\xA4\xA1-\xA4\xF3/\xA5\xA1-\xA5\xF3/;
  #           -------- -------- -------- --------

=begin original

Does not work as

=end original

これは次の例のようには動作しませんでした

  $kana =~ tr/\x{3041}-\x{3093}/\x{30a1}-\x{30f3}/;

=over

=item Legend of characters above

(上の例で使った文字)

  utf8     euc-jp   charnames::viacode()
  -----------------------------------------
  \x{3041} \xA4\xA1 HIRAGANA LETTER SMALL A
  \x{3093} \xA4\xF3 HIRAGANA LETTER N
  \x{30a1} \xA5\xA1 KATAKANA LETTER SMALL A
  \x{30f3} \xA5\xF3 KATAKANA LETTER N

=back

=begin original

This counterintuitive behavior has been fixed in perl 5.8.1.

=end original

この非直感的な動作は perl 5.8.1で修正されました。

=head3 workaround to tr///;

(tr///; の回避策)

=begin original

In perl 5.8.0, you can work around as follows;

=end original

perl 5.8.0 では、以下のような回避策がありました。

  use encoding 'euc-jp';
  #  ....
  eval qq{ \$kana =~ tr/\xA4\xA1-\xA4\xF3/\xA5\xA1-\xA5\xF3/ };

=begin original

Note the C<tr//> expression is surrounded by C<qq{}>.  The idea behind
is the same as classic idiom that makes C<tr///> 'interpolate'.

=end original

C<tr//> 式が C<qq{}> に囲まれている点に注意してください。
このアイデアは C<tr///> に '展開'(interpolate)させる古典的なイディオムと
同じです。

   tr/$from/$to/;            # wrong!
   eval qq{ tr/$from/$to/ }; # workaround.

=begin original

Nevertheless, in case of B<encoding> pragma even C<q//> is affected so
C<tr///> not being decoded was obviously against the will of Perl5
Porters so it has been fixed in Perl 5.8.1 or later.

=end original

いずれにしろ、B<encoding> プラグマは C<q//> の場合であっても
影響を及ぼすので、C<tr///> は Perl 5 Porters の目にはデコードすることが
明らかなものとして写りませんでした。
5.8.1 以降のPerlではこれは修正されています。

=head1 EXAMPLE - Greekperl

(例 - Greekperl)

    use encoding "iso 8859-7";

    # \xDF in ISO 8859-7 (Greek) is \x{3af} in Unicode.

    $a = "\xDF";
    $b = "\x{100}";

    printf "%#x\n", ord($a); # will print 0x3af, not 0xdf

    $c = $a . $b;

    # $c will be "\x{3af}\x{100}", not "\x{df}\x{100}".

    # chr() is affected, and ...

    print "mega\n"  if ord(chr(0xdf)) == 0x3af;

    # ... ord() is affected by the encoding pragma ...

    print "tera\n" if ord(pack("C", 0xdf)) == 0x3af;

    # ... as are eq and cmp ...

    print "peta\n" if "\x{3af}" eq  pack("C", 0xdf);
    print "exa\n"  if "\x{3af}" cmp pack("C", 0xdf) == 0;

    # ... but pack/unpack C are not affected, in case you still
    # want to go back to your native encoding

    print "zetta\n" if unpack("C", (pack("C", 0xdf))) == 0xdf;

=head1 KNOWN PROBLEMS

(既知の問題)

=over

=item literals in regex that are longer than 127 bytes

(127 バイトを超える正規表現中のリテラル)

=begin original

For native multibyte encodings (either fixed or variable length),
the current implementation of the regular expressions may introduce
recoding errors for regular expression literals longer than 127 bytes.

=end original

ネイティブのマルチバイトエンコーディング(固定長であれ可変長であれ)に対して、
現在の正規表現の実装は 127 バイトを超える正規表現リテラルに対して
エラーを発生します。

=item EBCDIC

=begin original

The encoding pragma is not supported on EBCDIC platforms.
(Porters who are willing and able to remove this limitation are
welcome.)

=end original

エンコーディングプラグマはEBCDICプラットフォームをサポートしていません
(この制限を取り払おうとする Porters を歓迎します)。

=item format

(フォーマット)

=begin original

This pragma doesn't work well with format because PerlIO does not
get along very well with it.  When format contains non-ascii
characters it prints funny or gets "wide character warnings".
To understand it, try the code below.

=end original

このプラグマはフォーマットと一緒にはうまく使えません;
なぜなら、PerlIO がフォーマットにうまく対処していないからです。
フォーマットが非 ASCII 文字を含んでいた場合おかしな結果となるか、
"wide character warnings" となる。
これを理解するために、次のコードを試してみましょう。

  # Save this one in utf8
  # replace *non-ascii* with a non-ascii string
  my $camel;
  format STDOUT =
  *non-ascii*@>>>>>>>
  $camel
  .
  $camel = "*non-ascii*";
  binmode(STDOUT=>':encoding(utf8)'); # bang!
  write;              # funny 
  print $camel, "\n"; # fine

=begin original

Without binmode this happens to work but without binmode, print()
fails instead of write().

=end original

binmode がない場合にはうまくいくように見えますが、binmode がなければ
write() の代わりに print() が失敗することになります。

=begin original

At any rate, the very use of format is questionable when it comes to
unicode characters since you have to consider such things as character
width (i.e. double-width for ideographs) and directions (i.e. BIDI for
Arabic and Hebrew).

=end original

とにかく、Unicode 文字がある場合のフォーマットの使用は、
文字の幅(例: 表意文字の倍幅)や方向(例: アラビア語やヘブライ語のBIDI) を
考えねばならないので、疑わしいです。

=item Thread safety

(スレッドセーフ性)

=begin original

C<use encoding ...> is not thread-safe (i.e., do not use in threaded
applications).

=end original

C<use encoding ...> はスレッドセーフではありません (つまり、スレッドを
使うアプリケーションでは使わないでください)。

=back

=head2 The Logic of :locale

(:locale のロジック)

=begin original

The logic of C<:locale> is as follows:

=end original

:localeのロジックは以下の通りです:

=over 4

=item 1.

=begin original

If the platform supports the langinfo(CODESET) interface, the codeset
returned is used as the default encoding for the open pragma.

=end original

プラットフォームが langinfo(CODESET) インターフェースに
対応していれば、それが返すコードセットが open プラグマの
デフォルトエンコーディングとして使用されます。

=item 2.

=begin original

If 1. didn't work but we are under the locale pragma, the environment
variables LC_ALL and LANG (in that order) are matched for encodings
(the part after C<.>, if any), and if any found, that is used 
as the default encoding for the open pragma.

=end original

1. が成り立たないが、locale プラグマが有効の場合、環境変数 LC_ALL と
LANG が(この順番で検索されます)エンコーディング(もしあれば C<.> の後の部分)と
マッチしてそれが見つかれば、open プラグマのデフォルトエンコーディングとして
使用されます。

=item 3.

=begin original

If 1. and 2. didn't work, the environment variables LC_ALL and LANG
(in that order) are matched for anything looking like UTF-8, and if
any found, C<:utf8> is used as the default encoding for the open
pragma.

=end original

1. も2. も失敗したならば、環境変数 LC_ALL と LANG から(この順番で)
UTF-8 のような何かを見つけ出そうとし、もし見つかれば C<:utf8> が
open プラグマのデフォルトエンコーディングとして使用されます。

=back

=begin original

If your locale environment variables (LC_ALL, LC_CTYPE, LANG)
contain the strings 'UTF-8' or 'UTF8' (case-insensitive matching),
the default encoding of your STDIN, STDOUT, and STDERR, and of
B<any subsequent file open>, is UTF-8.

=end original

ロケール関係の環境変数(LC_ALL, LC_CTYPE, LANG)が
'UTF-8' もしくは 'UTF8' (大小文字の違いは無視されます)という文字列を
含んでいたならば、STDIN, STDOUT, STDERR および
それ以降にオープンされたファイルのエンコーディングは UTF-8 となります。

=head1 HISTORY

(歴史)

=begin original

This pragma first appeared in Perl 5.8.0.  For features that require 
5.8.1 and better, see above.

=end original

この本プラグマは Perl 5.8.0 で最初に導入されました。
5.8.1 以降を要求する機能については前のほうで説明しました。

=begin original

The C<:locale> subpragma was implemented in 2.01, or Perl 5.8.6.

=end original

C<:locale> サブプラグマはバージョン 2.01、つまり Perl 5.8.6 で
実装されました。

=head1 SEE ALSO

L<perlunicode>, L<Encode>, L<open>, L<Filter::Util::Call>,

Ch. 15 of C<Programming Perl (3rd Edition)>
by Larry Wall, Tom Christiansen, Jon Orwant;
O'Reilly & Associates; ISBN 0-596-00027-8

=begin meta

Translate: KIMURA Koichi (1.47)
Update: Kentaro Shirakata <argrath@ub32.org> (2.6)
Status: completed

=end meta

=cut