Perl built-in function: split

perl-5.42.0

split /PATTERN/,EXPR,LIMIT

split /PATTERN/,EXPR

split /PATTERN/

split

Splits the string EXPR into a list of strings and returns the list in list context, or the size of the list in scalar context. (Prior to Perl 5.11, it also overwrote @_ with the list in void and scalar context. If you target old perls, beware.)
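
The difference between the two contexts can be sketched as follows. This is a minimal illustration; the variable names and the sample string are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical sample data for illustration.
my $line = "alpha,beta,gamma";

# List context: split returns the fields themselves.
my @fields = split(/,/, $line);

# Scalar context: split returns the number of fields.
my $count = split(/,/, $line);

print "fields: @fields\n";   # fields: alpha beta gamma
print "count:  $count\n";    # count:  3
```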

If only PATTERN is given, EXPR defaults to $_.

Anything in EXPR that matches PATTERN is taken to be a separator that separates the EXPR into substrings (called "fields") that do not include the separator. Note that a separator may be longer than one character or even have no characters at all (the empty string, which is a zero-width match).

The PATTERN need not be constant; an expression may be used to specify a pattern that varies at runtime.
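
As a small sketch of a runtime pattern, the delimiter below is held in a variable and interpolated into the pattern. The variable $delimiter is hypothetical; \Q...\E is used on the assumption that the delimiter should be matched literally rather than as a regex.

```perl
use strict;
use warnings;

# $delimiter is a hypothetical value that might come from configuration
# or user input.
my $delimiter = "|";

# \Q...\E quotes regex metacharacters, so "|" is matched literally
# instead of being treated as alternation.
my @parts = split(/\Q$delimiter\E/, "a|b|c");

print join(",", @parts), "\n";   # a,b,c
```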

If PATTERN matches the empty string, the EXPR is split at the match position (between characters). As an example, the following:

    my @x = split(/b/, "abc"); # ("a", "c")

uses the b in 'abc' as a separator to produce the list ("a", "c"). However, this:

    my @x = split(//, "abc"); # ("a", "b", "c")

uses empty string matches as separators; thus, the empty string may be used to split EXPR into a list of its component characters.
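
Zero-width separators are not limited to the empty pattern. As an illustrative sketch (the lookahead pattern and the sample string are chosen for this example only), a lookahead assertion matches between characters without consuming any, so no characters are lost from the fields:

```perl
use strict;
use warnings;

# (?=[A-Z]) is a zero-width match just before each uppercase letter,
# so the string is cut at those positions and nothing is removed.
my @words = split(/(?=[A-Z])/, "fooBarBaz");

print join(" ", @words), "\n";   # foo Bar Baz
```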

As a special case for split, the empty pattern given in match operator syntax (//) specifically matches the empty string, which is contrary to its usual interpretation as the last successful match.

If PATTERN is /^/, then it is treated as if it used the multiline modifier (/^/m), since it isn't much use otherwise.
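
A short sketch of what the /^/ special case gives in practice, using a made-up three-line string: the text is cut at the start of each line, and every element keeps its trailing newline because the separator is zero-width.

```perl
use strict;
use warnings;

# Hypothetical multi-line input.
my $text = "first\nsecond\nthird\n";

# /^/ is treated as /^/m, so the split points are the line beginnings.
my @lines = split(/^/, $text);

print scalar(@lines), "\n";   # 3
print $lines[1];              # second (followed by its newline)
```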

/m and any of the other pattern modifiers valid for qr (summarized in "qr/STRING/msixpodualn" in perlop) may be specified explicitly.

As another special case, split emulates the default behavior of the command line tool awk when the PATTERN is either omitted or a string composed of a single space character (such as ' ' or "\x20", but not e.g. / /). In this case, any leading whitespace in EXPR is removed before splitting occurs, and the PATTERN is instead treated as if it were /\s+/; in particular, this means that any contiguous whitespace (not just a single space character) is used as a separator.

    my @x = split(" ", "  Quick brown fox\n");
    # ("Quick", "brown", "fox")

    my @x = split(" ", "RED\tGREEN\tBLUE");
    # ("RED", "GREEN", "BLUE")

Using split in this fashion is very similar to how qw// works.

However, this special treatment can be avoided by specifying the pattern / / instead of the string " ", thereby allowing only a single space character to be a separator. In earlier Perls this special case was restricted to the use of a plain " " as the pattern argument to split; in Perl 5.18.0 and later this special case is triggered by any expression which evaluates to the simple string " ".
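
The contrast between the string " " and the pattern / / can be sketched like this. The sample string is made up, and the last call assumes Perl 5.18.0 or later, where an expression evaluating to " " also triggers the awk emulation.

```perl
use strict;
use warnings;

# Hypothetical input with leading and repeated spaces.
my $text = "  a  b";

# String " ": awk emulation, leading whitespace dropped, runs collapsed.
my @awk = split(" ", $text);      # ("a", "b")

# Pattern / /: every single space is a separator, so empty fields appear.
my @single = split(/ /, $text);   # ("", "", "a", "", "b")

# Assumes Perl 5.18.0 or later: a variable holding " " behaves like " ".
my $sep  = " ";
my @expr = split($sep, $text);    # ("a", "b")

print scalar(@awk), " ", scalar(@single), "\n";   # 2 5
```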

As of Perl 5.28, this special-cased whitespace splitting works as expected in the scope of "use feature 'unicode_strings'". In previous versions, and outside the scope of that feature, it exhibits the "Unicode Bug" described in perlunicode: characters that are whitespace according to Unicode rules but not according to ASCII rules can be treated as part of fields rather than as field separators, depending on the string's internal encoding.

As of Perl 5.39.9 the /x default modifier does NOT affect split STRING but does affect split PATTERN; this means that split " " will produce the expected awk emulation regardless of whether it is used in the scope of a use re "/x" statement. If you want to split by spaces under use re "/x" you must do something like split /(?-x: )/ or split /\x{20}/ instead of split / /.
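
The two workarounds named above can be sketched as follows, with made-up input. Both forms describe a single space in a way that /x does not discard, so they should behave the same on any Perl that supports use re with default flags.

```perl
use strict;
use warnings;

my (@hex, @group);
{
    # Within this block, /x is applied to patterns by default.
    use re '/x';

    # \x{20} is an escape, not literal whitespace, so /x keeps it.
    @hex = split(/\x{20}/, "a b c");

    # (?-x: ) switches /x off inside the group, making the space literal.
    @group = split(/(?-x: )/, "a b c");
}

print join(",", @hex), "\n";     # a,b,c
print join(",", @group), "\n";   # a,b,c
```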

If omitted, PATTERN defaults to a single space, " ", triggering the previously described awk emulation.
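
Combining both defaults, a bare split works on $_ and splits on whitespace. A minimal sketch, with the input string borrowed from the earlier example and the loop used only to set $_:

```perl
use strict;
use warnings;

my @words;
for ("  Quick brown fox\n") {
    # Bare split: EXPR defaults to $_ and PATTERN defaults to " ".
    @words = split;
}

print join("|", @words), "\n";   # Quick|brown|fox
```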

If LIMIT is specified and positive, it represents the maximum number of fields into which the EXPR may be split; in other words, LIMIT is one greater than the maximum number of times EXPR may be split. Thus, the LIMIT value 1 means that EXPR may be split a maximum of zero times, producing a maximum of one field (namely, the entire value of EXPR). For instance:

    my @x = split(/,/, "a,b,c", 1); # ("a,b,c")
    my @x = split(/,/, "a,b,c", 2); # ("a", "b,c")
    my @x = split(/,/, "a,b,c", 3); # ("a", "b", "c")
    my @x = split(/,/, "a,b,c", 4); # ("a", "b", "c")

If LIMIT is negative, it is treated as if it were instead arbitrarily large; as many fields as possible are produced.

If LIMIT is omitted (or, equivalently, zero), then it is usually treated as if it were instead negative but with the exception that trailing empty fields are stripped (empty leading fields are always preserved); if all fields are empty, then all fields are considered to be trailing (and are thus stripped in this case). Thus, the following:

    my @x = split(/,/, "a,b,c,,,"); # ("a", "b", "c")

produces only a three-element list, whereas:

    my @x = split(/,/, "a,b,c,,,", -1); # ("a", "b", "c", "", "", "")

produces a six-element list.

In time-critical applications, it is worthwhile to avoid splitting into more fields than necessary. Thus, when assigning to a list, if LIMIT is omitted (or zero), then LIMIT is treated as though it were one larger than the number of variables in the list; for the following, LIMIT is implicitly 3:

    my ($login, $passwd) = split(/:/);
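
To make the implicit LIMIT concrete, here is a sketch with a made-up passwd-style record. With two variables on the left, the implicit LIMIT of 3 leaves the remainder of the record unsplit and discards it; an explicit LIMIT of 2 is shown for comparison.

```perl
use strict;
use warnings;

# Hypothetical record for illustration.
my $record = "root:x:0:0:root";

# Implicit LIMIT of 3: fields are "root", "x", and the unsplit rest
# "0:0:root", which is not assigned to anything.
my ($login, $passwd) = split(/:/, $record);

# Explicit LIMIT of 2: the second field keeps everything after the
# first colon.
my ($name, $rest) = split(/:/, $record, 2);

print "$login $passwd\n";   # root x
print "$name $rest\n";      # root x:0:0:root
```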

Note that splitting an EXPR that evaluates to the empty string always produces zero fields, regardless of the LIMIT specified.
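
A brief sketch of this edge case, next to a string that consists of a single separator. The inputs are chosen only to illustrate the rule.

```perl
use strict;
use warnings;

# Empty EXPR: zero fields, even with a negative LIMIT.
my @none = split(/,/, "", -1);

# A lone separator: two empty fields are kept with LIMIT -1 ...
my @pair = split(/,/, ",", -1);

# ... but with LIMIT omitted, all fields are empty and therefore stripped.
my @stripped = split(/,/, ",");

print scalar(@none), " ", scalar(@pair), " ", scalar(@stripped), "\n";   # 0 2 0
```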

An empty leading field is produced when there is a positive-width match at the beginning of EXPR. For instance:

    my @x = split(/ /, " abc"); # ("", "abc")

splits into two elements. However, a zero-width match at the beginning of EXPR never produces an empty field, so that:

    my @x = split(//, " abc"); # (" ", "a", "b", "c")

splits into four elements instead of five.

An empty trailing field, on the other hand, is produced when there is a match at the end of EXPR, regardless of the length of the match (of course, unless a non-zero LIMIT is given explicitly, such fields are removed, as in the previous example). Thus:

    my @x = split(//, " abc", -1); # (" ", "a", "b", "c", "")

If the PATTERN contains capturing groups, then for each separator, an additional field is produced for each substring captured by a group (in the order in which the groups are specified, as per backreferences); if any group does not match, then it captures the undef value instead of a substring. Also, note that any such additional field is produced whenever there is a separator (that is, whenever a split occurs), and such an additional field does not count towards the LIMIT. Consider the following expressions evaluated in list context (each returned list is provided in the associated comment):

    my @x = split(/-|,/    , "1-10,20", 3);
    # ("1", "10", "20")

    my @x = split(/(-|,)/  , "1-10,20", 3);
    # ("1", "-", "10", ",", "20")

    my @x = split(/-|(,)/  , "1-10,20", 3);
    # ("1", undef, "10", ",", "20")

    my @x = split(/(-)|,/  , "1-10,20", 3);
    # ("1", "-", "10", undef, "20")

    my @x = split(/(-)|(,)/, "1-10,20", 3);
    # ("1", "-", undef, "10", undef, ",", "20")
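
One practical use of capturing groups is keeping the separators as tokens. The sketch below uses a made-up arithmetic string: the operator is captured while the surrounding whitespace is not. The second call illustrates that captured fields do not count towards LIMIT.

```perl
use strict;
use warnings;

# The operator is captured and returned; the optional whitespace around
# it is part of the separator but outside the group, so it is dropped.
my @tokens = split(/\s*([+\-*\/])\s*/, "12 + 30*4 - 5");

# LIMIT 2 allows one split; the captured "," is an extra field on top
# of the two regular fields.
my @limited = split(/(,)/, "a,b,c", 2);

print join(" ", @tokens), "\n";    # 12 + 30 * 4 - 5
print join("|", @limited), "\n";   # a|,|b,c
```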