=encoding euc-jp

=head1 NAME

=begin original

perlrequick - Perl regular expressions quick start

=end original

perlrequick - Perl regular expressions quick start

=head1 DESCRIPTION

=begin original

This page covers the very basics of understanding, creating and
using regular expressions ('regexes') in Perl.

=end original

This page covers the very basics of understanding, creating, and
using regular expressions ('regexes') in Perl.

=head1 The Guide

=begin original

This page assumes you already know things, like what a "pattern" is, and
the basic syntax of using them.  If you don't, see L<perlretut>.

=end original

This page assumes you already know what a "pattern" is and the basic
syntax for using one.
If you don't, see L<perlretut>.

=head2 Simple word matching

=begin original

The simplest regex is simply a word, or more generally, a string of
characters.  A regex consisting of a word matches any string that
contains that word:

=end original

The simplest regex is a word or, more generally, a string of
characters.  A regex consisting of a word matches any string that
contains that word:

    "Hello World" =~ /World/;  # matches

=begin original

In this statement, C<World> is a regex and the C<//> enclosing
C</World/> tells Perl to search a string for a match.  The operator
C<=~> associates the string with the regex match and produces a true
value if the regex matched, or false if the regex did not match.  In
our case, C<World> matches the second word in C<"Hello World">, so the
expression is true.  This idea has several variations.

=end original

In this statement, C<World> is a regex, and the C<//> enclosing
C</World/> tells Perl to search a string for a match.  The operator
C<=~> associates the string with the regex match and produces a true
value if the regex matched, or false if it did not.  Here, C<World>
matches the second word in C<"Hello World">, so the expression is
true.  This idea has several variations.

=begin original

Expressions like this are useful in conditionals:

=end original

Expressions like this are useful in conditionals:

    print "It matches\n" if "Hello World" =~ /World/;

=begin original

The sense of the match can be reversed by using C<!~> operator:

=end original

The sense of the match can be reversed with the C<!~> operator:

    print "It doesn't match\n" if "Hello World" !~ /World/;

=begin original

The literal string in the regex can be replaced by a variable:

=end original

The literal string in the regex can be replaced by a variable:

    $greeting = "World";
    print "It matches\n" if "Hello World" =~ /$greeting/;

=begin original

If you're matching against C<$_>, the C<$_ =~> part can be omitted:

=end original

If you're matching against C<$_>, the C<$_ =~> part can be omitted:

    $_ = "Hello World";
    print "It matches\n" if /World/;
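
The forms above can be combined in one short script.  This is only a
sketch: the sample string and the variable names C<$text> and C<$word>
are invented for illustration.

    use strict;
    use warnings;

    my $text = "Hello World";   # sample string (an assumption)
    my $word = "World";         # pattern text held in a variable

    my $found  = $text =~ /$word/ ? 1 : 0;   # 1: "World" occurs in $text
    my $absent = $text !~ /world/ ? 1 : 0;   # 1: matching is case sensitive

    my $implicit = 0;
    for ("Hello World") {            # 'for' aliases $_ to the string
        $implicit = 1 if /World/;    # same as: $_ =~ /World/
    }

    print "found=$found absent=$absent implicit=$implicit\n";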

=begin original

Finally, the C<//> default delimiters for a match can be changed to
arbitrary delimiters by putting an C<'m'> out front:

=end original

Finally, the default C<//> delimiters for a match can be changed to
arbitrary delimiters by putting an C<'m'> in front:

=begin original

    "Hello World" =~ m!World!;   # matches, delimited by '!'
    "Hello World" =~ m{World};   # matches, note the matching '{}'
    "/usr/bin/perl" =~ m"/perl"; # matches after '/usr/bin',
                                 # '/' becomes an ordinary char

=end original

    "Hello World" =~ m!World!;   # matches, delimited by '!'
    "Hello World" =~ m{World};   # matches, note the matching '{}'
    "/usr/bin/perl" =~ m"/perl"; # matches after '/usr/bin',
                                 # '/' becomes an ordinary char

=begin original

Regexes must match a part of the string I<exactly> in order for the
statement to be true:

=end original

A regex must match a part of the string I<exactly> for the statement
to be true:

=begin original

    "Hello World" =~ /world/;  # doesn't match, case sensitive
    "Hello World" =~ /o W/;    # matches, ' ' is an ordinary char
    "Hello World" =~ /World /; # doesn't match, no ' ' at end

=end original

    "Hello World" =~ /world/;  # doesn't match, case sensitive
    "Hello World" =~ /o W/;    # matches, ' ' is an ordinary char
    "Hello World" =~ /World /; # doesn't match, no ' ' at end

=begin original

Perl will always match at the earliest possible point in the string:

=end original

Perl always matches at the earliest possible point in the string:

=begin original

    "Hello World" =~ /o/;       # matches 'o' in 'Hello'
    "That hat is red" =~ /hat/; # matches 'hat' in 'That'

=end original

    "Hello World" =~ /o/;       # matches 'o' in 'Hello'
    "That hat is red" =~ /hat/; # matches 'hat' in 'That'

=begin original

Not all characters can be used 'as is' in a match.  Some characters,
called B<metacharacters>, are considered special, and reserved for use
in regex notation.  The metacharacters are

=end original

Not all characters can be used 'as is' in a match.  Some characters,
called B<metacharacters>, are special and reserved for use in regex
notation.  The metacharacters are

    {}[]()^$.|*+?\

=begin original

A metacharacter can be matched literally by putting a backslash before
it:

=end original

A metacharacter can be matched literally by putting a backslash before
it:

=begin original

    "2+2=4" =~ /2+2/;    # doesn't match, + is a metacharacter
    "2+2=4" =~ /2\+2/;   # matches, \+ is treated like an ordinary +
    'C:\WIN32' =~ /C:\\WIN/;                       # matches
    "/usr/bin/perl" =~ /\/usr\/bin\/perl/;  # matches

=end original

    "2+2=4" =~ /2+2/;    # doesn't match, + is a metacharacter
    "2+2=4" =~ /2\+2/;   # matches, \+ is treated like an ordinary +
    'C:\WIN32' =~ /C:\\WIN/;                # matches
    "/usr/bin/perl" =~ /\/usr\/bin\/perl/;  # matches

=begin original

In the last regex, the forward slash C<'/'> is also backslashed,
because it is used to delimit the regex.

=end original

In the last regex, the forward slash C<'/'> is also backslashed,
because it is used to delimit the regex.

=begin original

Most of the metacharacters aren't always special, and other characters
(such as the ones delimiting the pattern) become special under various
circumstances.  This can be confusing and lead to unexpected results.
L<S<C<use re 'strict'>>|re/'strict' mode> can notify you of potential
pitfalls.

=end original

Most of the metacharacters aren't always special, and other characters
(such as the ones delimiting the pattern) become special under various
circumstances.  This can be confusing and lead to unexpected results.
L<S<C<use re 'strict'>>|re/'strict' mode> can notify you of potential
pitfalls.
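
To make the escaping rules concrete, here is a small sketch.  The
sample strings are assumptions chosen for illustration.  Changing the
delimiter removes the need to backslash C</>, while true metacharacters
such as C<+> and C<.> still need a backslash to match literally.

    use strict;
    use warnings;

    my $path = "/usr/bin/perl";          # sample path (an assumption)

    # Same pattern twice: escaped slashes versus alternative delimiters.
    my $escaped = $path =~ /\/usr\/bin\/perl/ ? 1 : 0;
    my $braces  = $path =~ m{/usr/bin/perl}   ? 1 : 0;

    # A literal '+' and a literal '.' still need a backslash.
    my $sum = "2+2=4" =~ /2\+2/  ? 1 : 0;
    my $dot = "v5x36" =~ /5\.36/ ? 1 : 0;   # 0: 'x' is not a literal '.'

    print "$escaped $braces $sum $dot\n";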

=begin original

Non-printable ASCII characters are represented by B<escape sequences>.
Common examples are C<\t> for a tab, C<\n> for a newline, and C<\r>
for a carriage return.  Arbitrary bytes are represented by octal
escape sequences, e.g., C<\033>, or hexadecimal escape sequences,
e.g., C<\x1B>:

=end original

Non-printable ASCII characters are represented by B<escape sequences>.
Common examples are C<\t> for a tab, C<\n> for a newline, and C<\r>
for a carriage return.  Arbitrary bytes are represented by octal
escape sequences, e.g., C<\033>, or hexadecimal escape sequences,
e.g., C<\x1B>:

=begin original

    "1000\t2000" =~ m(0\t2)  # matches
    "cat" =~ /\143\x61\x74/  # matches in ASCII, but
                             # a weird way to spell cat

=end original

    "1000\t2000" =~ m(0\t2)  # matches
    "cat" =~ /\143\x61\x74/  # matches in ASCII, but
                             # a weird way to spell cat

=begin original

Regexes are treated mostly as double-quoted strings, so variable
substitution works:

=end original

Regexes are treated mostly as double-quoted strings, so variable
substitution works:

=begin original

    $foo = 'house';
    'cathouse' =~ /cat$foo/;   # matches
    'housecat' =~ /${foo}cat/; # matches

=end original

    $foo = 'house';
    'cathouse' =~ /cat$foo/;   # matches
    'housecat' =~ /${foo}cat/; # matches

=begin original

With all of the regexes above, if the regex matched anywhere in the
string, it was considered a match.  To specify I<where> it should
match, we would use the B<anchor> metacharacters C<^> and C<$>.  The
anchor C<^> means match at the beginning of the string and the anchor
C<$> means match at the end of the string, or before a newline at the
end of the string.  Some examples:

=end original

In all of the regexes above, a match anywhere in the string counted as
a match.  To specify I<where> it should match, use the B<anchor>
metacharacters C<^> and C<$>.  The anchor C<^> means match at the
beginning of the string, and the anchor C<$> means match at the end of
the string, or before a newline at the end of the string.  Some
examples:

=begin original

    "housekeeper" =~ /keeper/;         # matches
    "housekeeper" =~ /^keeper/;        # doesn't match
    "housekeeper" =~ /keeper$/;        # matches
    "housekeeper\n" =~ /keeper$/;      # matches
    "housekeeper" =~ /^housekeeper$/;  # matches

=end original

    "housekeeper" =~ /keeper/;         # matches
    "housekeeper" =~ /^keeper/;        # doesn't match
    "housekeeper" =~ /keeper$/;        # matches
    "housekeeper\n" =~ /keeper$/;      # matches
    "housekeeper" =~ /^housekeeper$/;  # matches
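
Using both anchors together is a common way to test a whole string.
The helper below is hypothetical (its name C<is_keeper_line> is made up
for this sketch); it shows that C<$> still tolerates one trailing
newline.

    use strict;
    use warnings;

    # Hypothetical helper: true only if $line is exactly 'keeper',
    # optionally followed by a single trailing newline.
    sub is_keeper_line {
        my ($line) = @_;
        return $line =~ /^keeper$/ ? 1 : 0;
    }

    print is_keeper_line("keeper"),      "\n";  # 1
    print is_keeper_line("keeper\n"),    "\n";  # 1, '$' allows a final newline
    print is_keeper_line("housekeeper"), "\n";  # 0, '^' pins the start
    print is_keeper_line("keepers"),     "\n";  # 0, '$' pins the end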

=head2 Using character classes

=begin original

A B<character class> allows a set of possible characters, rather than
just a single character, to match at a particular point in a regex.
There are a number of different types of character classes, but usually
when people use this term, they are referring to the type described in
this section, which are technically called "Bracketed character
classes", because they are denoted by brackets C<[...]>, with the set of
characters to be possibly matched inside.  But we'll drop the "bracketed"
below to correspond with common usage.  Here are some examples of
(bracketed) character classes:

=end original

A B<character class> allows a set of possible characters, rather than
just a single character, to match at a particular point in a regex.
There are several types of character classes, but the term usually
refers to the type described in this section, technically called
"bracketed character classes" because they are denoted by brackets
C<[...]> with the set of characters to be possibly matched inside.
We drop the "bracketed" below to follow common usage.  Here are some
examples of (bracketed) character classes:

=begin original

    /cat/;            # matches 'cat'
    /[bcr]at/;        # matches 'bat', 'cat', or 'rat'
    "abc" =~ /[cab]/; # matches 'a'

=end original

    /cat/;            # matches 'cat'
    /[bcr]at/;        # matches 'bat', 'cat', or 'rat'
    "abc" =~ /[cab]/; # matches 'a'

=begin original

In the last statement, even though C<'c'> is the first character in
the class, the earliest point at which the regex can match is C<'a'>.

=end original

In the last statement, even though C<'c'> is the first character in
the class, the earliest point at which the regex can match is C<'a'>.

=begin original

    /[yY][eE][sS]/; # match 'yes' in a case-insensitive way
                    # 'yes', 'Yes', 'YES', etc.
    /yes/i;         # also match 'yes' in a case-insensitive way

=end original

    /[yY][eE][sS]/; # match 'yes' in a case-insensitive way
                    # 'yes', 'Yes', 'YES', etc.
    /yes/i;         # also match 'yes' in a case-insensitive way

=begin original

The last example shows a match with an C<'i'> B<modifier>, which makes
the match case-insensitive.

=end original

The last example shows a match with an C<'i'> B<modifier>, which makes
the match case-insensitive.

=begin original

Character classes also have ordinary and special characters, but the
sets of ordinary and special characters inside a character class are
different than those outside a character class.  The special
characters for a character class are C<-]\^$> and are matched using an
escape:

=end original

Character classes also have ordinary and special characters, but the
sets of ordinary and special characters inside a character class
differ from those outside a character class.  The special characters
for a character class are C<-]\^$>, and they are matched using an
escape:

=begin original

   /[\]c]def/; # matches ']def' or 'cdef'
   $x = 'bcr';
   /[$x]at/;   # matches 'bat', 'cat', or 'rat'
   /[\$x]at/;  # matches '$at' or 'xat'
   /[\\$x]at/; # matches '\at', 'bat', 'cat', or 'rat'

=end original

   /[\]c]def/; # matches ']def' or 'cdef'
   $x = 'bcr';
   /[$x]at/;   # matches 'bat', 'cat', or 'rat'
   /[\$x]at/;  # matches '$at' or 'xat'
   /[\\$x]at/; # matches '\at', 'bat', 'cat', or 'rat'

=begin original

The special character C<'-'> acts as a range operator within character
classes, so that the unwieldy C<[0123456789]> and C<[abc...xyz]>
become the svelte C<[0-9]> and C<[a-z]>:

=end original

The special character C<'-'> acts as a range operator within character
classes, so that the unwieldy C<[0123456789]> and C<[abc...xyz]>
become the svelte C<[0-9]> and C<[a-z]>:

=begin original

    /item[0-9]/;  # matches 'item0' or ... or 'item9'
    /[0-9a-fA-F]/;  # matches a hexadecimal digit

=end original

    /item[0-9]/;    # matches 'item0' or ... or 'item9'
    /[0-9a-fA-F]/;  # matches a hexadecimal digit

=begin original

If C<'-'> is the first or last character in a character class, it is
treated as an ordinary character.

=end original

If C<'-'> is the first or last character in a character class, it is
treated as an ordinary character.

=begin original

The special character C<^> in the first position of a character class
denotes a B<negated character class>, which matches any character but
those in the brackets.  Both C<[...]> and C<[^...]> must match a
character, or the match fails.  Then

=end original

The special character C<^> in the first position of a character class
denotes a B<negated character class>, which matches any character but
those in the brackets.  Both C<[...]> and C<[^...]> must match a
character, or the match fails.  For example:

=begin original

    /[^a]at/;  # doesn't match 'aat' or 'at', but matches
               # all other 'bat', 'cat', '0at', '%at', etc.
    /[^0-9]/;  # matches a non-numeric character
    /[a^]at/;  # matches 'aat' or '^at'; here '^' is ordinary

=end original

    /[^a]at/;  # doesn't match 'aat' or 'at', but matches
               # all other 'bat', 'cat', '0at', '%at', etc.
    /[^0-9]/;  # matches a non-numeric character
    /[a^]at/;  # matches 'aat' or '^at'; here '^' is ordinary
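
The following sketch filters a made-up word list with a plain class, a
negated class, and a class with a trailing C<'-'>.  It uses Perl's
built-in C<grep>, which this page does not otherwise cover, and anchors
each pattern so the whole word is tested.

    use strict;
    use warnings;

    my @words = qw(bat cat rat aat at %at);   # sample data (an assumption)

    my @in_set  = grep { /^[bcr]at$/ } @words;   # bat cat rat
    my @negated = grep { /^[^a]at$/  } @words;   # bat cat rat %at

    # A trailing '-' is an ordinary character, not a range operator.
    my @dash = grep { /^[a-c-]$/ } ('a', '-', 'd');   # a -

    print "@in_set\n@negated\n@dash\n";

Note that C<'at'> is rejected by the negated class: C<[^a]> must still
consume one character.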

=begin original

Perl has several abbreviations for common character classes. (These
definitions are those that Perl uses in ASCII-safe mode with the C</a> modifier.
Otherwise they could match many more non-ASCII Unicode characters as
well.  See L<perlrecharclass/Backslash sequences> for details.)

=end original

Perl has several abbreviations for common character classes.  (These
definitions are the ones Perl uses in ASCII-safe mode with the C</a>
modifier.  Otherwise they can match many more non-ASCII Unicode
characters as well.  See L<perlrecharclass/Backslash sequences> for
details.)

=over 4

=item *

=begin original

\d is a digit and represents

=end original

C<\d> is a digit and represents

    [0-9]

=item *

=begin original

\s is a whitespace character and represents

=end original

C<\s> is a whitespace character and represents

    [\ \t\r\n\f]

=item *

=begin original

\w is a word character (alphanumeric or _) and represents

=end original

C<\w> is a word character (alphanumeric or _) and represents

    [0-9a-zA-Z_]

=item *

=begin original

\D is a negated \d; it represents any character but a digit

=end original

C<\D> is a negated C<\d>; it represents any character but a digit

    [^0-9]

=item *

=begin original

\S is a negated \s; it represents any non-whitespace character

=end original

C<\S> is a negated C<\s>; it represents any non-whitespace character

    [^\s]

=item *

=begin original

\W is a negated \w; it represents any non-word character

=end original

C<\W> is a negated C<\w>; it represents any non-word character

    [^\w]

=item *

=begin original

The period '.' matches any character but "\n"

=end original

The period C<'.'> matches any character but C<"\n">

=back

=begin original

The C<\d\s\w\D\S\W> abbreviations can be used both inside and outside
of character classes.  Here are some in use:

=end original

The C<\d\s\w\D\S\W> abbreviations can be used both inside and outside
of character classes.  Here are some in use:

=begin original

    /\d\d:\d\d:\d\d/; # matches a hh:mm:ss time format
    /[\d\s]/;         # matches any digit or whitespace character
    /\w\W\w/;         # matches a word char, followed by a
                      # non-word char, followed by a word char
    /..rt/;           # matches any two chars, followed by 'rt'
    /end\./;          # matches 'end.'
    /end[.]/;         # same thing, matches 'end.'

=end original

    /\d\d:\d\d:\d\d/; # matches a hh:mm:ss time format
    /[\d\s]/;         # matches any digit or whitespace character
    /\w\W\w/;         # matches a word char, followed by a
                      # non-word char, followed by a word char
    /..rt/;           # matches any two chars, followed by 'rt'
    /end\./;          # matches 'end.'
    /end[.]/;         # same thing, matches 'end.'
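
As a sketch of the abbreviations in use, the hypothetical helper
C<looks_like_time> below (the name is an assumption) checks that a
string has the hh:mm:ss shape.  It checks the shape only, so a value
such as C<99:99:99> still passes.

    use strict;
    use warnings;

    # Hypothetical helper: true if $s is two digits, a colon, two
    # digits, a colon, and two digits, with nothing else around them.
    sub looks_like_time {
        my ($s) = @_;
        return $s =~ /^\d\d:\d\d:\d\d$/ ? 1 : 0;
    }

    print looks_like_time("12:34:56"), "\n";   # 1
    print looks_like_time("12:34"),    "\n";   # 0, seconds are missing
    print looks_like_time("ab:cd:ef"), "\n";   # 0, \d needs digits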

=begin original

The S<B<word anchor> > C<\b> matches a boundary between a word
character and a non-word character C<\w\W> or C<\W\w>:

=end original

The B<word anchor> C<\b> matches a boundary between a word character
and a non-word character, that is, C<\w\W> or C<\W\w>:

=begin original

    $x = "Housecat catenates house and cat";
    $x =~ /\bcat/;  # matches cat in 'catenates'
    $x =~ /cat\b/;  # matches cat in 'housecat'
    $x =~ /\bcat\b/;  # matches 'cat' at end of string

=end original

    $x = "Housecat catenates house and cat";
    $x =~ /cat/;      # matches cat in 'Housecat'
    $x =~ /\bcat/;    # matches cat in 'catenates'
    $x =~ /cat\b/;    # matches cat in 'Housecat'
    $x =~ /\bcat\b/;  # matches 'cat' at end of string

=begin original

In the last example, the end of the string is considered a word
boundary.

=end original

In the last example, the end of the string is considered a word
boundary.

=begin original

For natural language processing (so that, for example, apostrophes are
included in words), use instead C<\b{wb}>

=end original

For natural language processing (so that, for example, apostrophes are
included in words), use C<\b{wb}> instead:

    "don't" =~ / .+? \b{wb} /x;  # matches the whole string

=head2 Matching this or that

=begin original

We can match different character strings with the B<alternation>
metacharacter C<'|'>.  To match C<dog> or C<cat>, we form the regex
C<dog|cat>.  As before, Perl will try to match the regex at the
earliest possible point in the string.  At each character position,
Perl will first try to match the first alternative, C<dog>.  If
C<dog> doesn't match, Perl will then try the next alternative, C<cat>.
If C<cat> doesn't match either, then the match fails and Perl moves to
the next position in the string.  Some examples:

=end original

We can match different character strings with the B<alternation>
metacharacter C<'|'>.  To match C<dog> or C<cat>, we form the regex
C<dog|cat>.  As before, Perl tries to match the regex at the earliest
possible point in the string.  At each character position, Perl first
tries the first alternative, C<dog>.  If C<dog> doesn't match, Perl
then tries the next alternative, C<cat>.  If C<cat> doesn't match
either, the match fails at that position and Perl moves to the next
position in the string.  Some examples:

=begin original

    "cats and dogs" =~ /cat|dog|bird/;  # matches "cat"
    "cats and dogs" =~ /dog|cat|bird/;  # matches "cat"

=end original

    "cats and dogs" =~ /cat|dog|bird/;  # matches "cat"
    "cats and dogs" =~ /dog|cat|bird/;  # matches "cat"

=begin original

Even though C<dog> is the first alternative in the second regex,
C<cat> is able to match earlier in the string.

=end original

Even though C<dog> is the first alternative in the second regex,
C<cat> is able to match earlier in the string.

=begin original

    "cats"          =~ /c|ca|cat|cats/; # matches "c"
    "cats"          =~ /cats|cat|ca|c/; # matches "cats"

=end original

    "cats"          =~ /c|ca|cat|cats/; # matches "c"
    "cats"          =~ /cats|cat|ca|c/; # matches "cats"

=begin original

At a given character position, the first alternative that allows the
regex match to succeed will be the one that matches. Here, all the
alternatives match at the first string position, so the first matches.

=end original

At a given character position, the first alternative that allows the
regex match to succeed is the one that matches.  Here, all the
alternatives match at the first string position, so the first one
listed wins.
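
The two rules (earliest position first, then first listed alternative)
can be observed directly.  In this sketch the parentheses serve only to
capture what matched, and the variable names are made up for
illustration.

    use strict;
    use warnings;

    # Earliest position wins, whatever the order of the alternatives.
    my ($first) = "cats and dogs" =~ /(dog|cat|bird)/;   # 'cat'

    # At the same position, the first alternative that succeeds wins.
    my ($short) = "cats" =~ /(c|ca|cat|cats)/;           # 'c'
    my ($long)  = "cats" =~ /(cats|cat|ca|c)/;           # 'cats'

    print "$first $short $long\n";

When one alternative is a prefix of another, listing the longer one
first is the usual way to prefer the longer match.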

=head2 Grouping things and hierarchical matching

=begin original

The B<grouping> metacharacters C<()> allow a part of a regex to be
treated as a single unit.  Parts of a regex are grouped by enclosing
them in parentheses.  The regex C<house(cat|keeper)> means match
C<house> followed by either C<cat> or C<keeper>.  Some more examples
are

=end original

The B<grouping> metacharacters C<()> allow a part of a regex to be
treated as a single unit.  Parts of a regex are grouped by enclosing
them in parentheses.  The regex C<house(cat|keeper)> means match
C<house> followed by either C<cat> or C<keeper>.  Some more examples
are

=begin original

    /(a|b)b/;    # matches 'ab' or 'bb'
    /(^a|b)c/;   # matches 'ac' at start of string or 'bc' anywhere

=end original

    /(a|b)b/;    # matches 'ab' or 'bb'
    /(^a|b)c/;   # matches 'ac' at start of string or 'bc' anywhere

=begin original

    /house(cat|)/;  # matches either 'housecat' or 'house'
    /house(cat(s|)|)/;  # matches either 'housecats' or 'housecat' or
                        # 'house'.  Note groups can be nested.

=end original

    /house(cat|)/;  # matches either 'housecat' or 'house'
    /house(cat(s|)|)/;  # matches either 'housecats' or 'housecat' or
                        # 'house'.  Note groups can be nested.

=begin original

    "20" =~ /(19|20|)\d\d/;  # matches the null alternative '()\d\d',
                             # because '20\d\d' can't match

=end original

    "20" =~ /(19|20|)\d\d/;  # matches the null alternative '()\d\d',
                             # because '20\d\d' can't match

=head2 Extracting matches

=begin original

The grouping metacharacters C<()> also allow the extraction of the
parts of a string that matched.  For each grouping, the part that
matched inside goes into the special variables C<$1>, C<$2>, etc.
They can be used just as ordinary variables:

=end original

The grouping metacharacters C<()> also allow the extraction of the
parts of a string that matched.  For each grouping, the part that
matched inside goes into the special variables C<$1>, C<$2>, etc.
They can be used just like ordinary variables:

=begin original

    # extract hours, minutes, seconds
    $time =~ /(\d\d):(\d\d):(\d\d)/;  # match hh:mm:ss format
    $hours = $1;
    $minutes = $2;
    $seconds = $3;

=end original

    # extract hours, minutes, seconds
    $time =~ /(\d\d):(\d\d):(\d\d)/;  # match hh:mm:ss format
    $hours = $1;
    $minutes = $2;
    $seconds = $3;

=begin original

In list context, a match C</regex/> with groupings will return the
list of matched values C<($1,$2,...)>.  So we could rewrite it as

=end original

In list context, a match C</regex/> with groupings returns the list
of matched values C<($1,$2,...)>.  So we could rewrite it as

    ($hours, $minutes, $seconds) = ($time =~ /(\d\d):(\d\d):(\d\d)/);
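
Because a failed match returns an empty list in list context, the same
expression doubles as a success test.  The sketch below wraps it in a
hypothetical C<parse_time> helper; the helper name and the sample
strings are assumptions.

    use strict;
    use warnings;

    # Hypothetical helper: returns (hours, minutes, seconds) as
    # strings, or an empty list if no hh:mm:ss pattern is found.
    sub parse_time {
        my ($text) = @_;
        my @parts = $text =~ /(\d\d):(\d\d):(\d\d)/;
        return @parts;
    }

    my @ok  = parse_time("leaves at 09:15:30 sharp");  # ('09', '15', '30')
    my @bad = parse_time("no time here");              # ()

    print scalar(@ok), " ", scalar(@bad), "\n";        # prints "3 0"

Checking the returned list avoids reading C<$1>, C<$2>, and C<$3>
after a failed match, when they still hold values from an earlier
successful match.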

=begin original

If the groupings in a regex are nested, C<$1> gets the group with the
leftmost opening parenthesis, C<$2> the next opening parenthesis,
etc.  For example, here is a complex regex and the matching variables
indicated below it:

=end original

If the groupings in a regex are nested, C<$1> gets the group with the
leftmost opening parenthesis, C<$2> the next opening parenthesis,
etc.  For example, here is a complex regex with the matching variable
numbers indicated below it:

    /(ab(cd|ef)((gi)|j))/;
     1  2      34

=begin original

Associated with the matching variables C<$1>, C<$2>, ... are
the B<backreferences> C<\g1>, C<\g2>, ...  Backreferences are
matching variables that can be used I<inside> a regex:

=end original

Associated with the matching variables C<$1>, C<$2>, ... are
the B<backreferences> C<\g1>, C<\g2>, ...  Backreferences are
matching variables that can be used I<inside> a regex:

=begin original

    /(\w\w\w)\s\g1/; # find sequences like 'the the' in string

=end original

    /(\w\w\w)\s\g1/; # find sequences like 'the the' in string

=begin original

C<$1>, C<$2>, ... should only be used outside of a regex, and C<\g1>,
C<\g2>, ... only inside a regex.

=end original

C<$1>, C<$2>, ... should only be used outside of a regex, and C<\g1>,
C<\g2>, ... only inside a regex.
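
A short sketch of a backreference at work: the pattern below looks for
a word that is immediately repeated.  The sample sentence and the
variable name C<$dup> are assumptions; the C<\b> anchors keep the match
from starting or ending in the middle of a word.

    use strict;
    use warnings;

    my $text = "Paris in the the spring";   # sample sentence (assumed)

    # \g1 must repeat exactly the text that group 1 captured.
    my ($dup) = $text =~ /\b(\w+)\s+\g1\b/;

    print(defined($dup) ? "doubled word: $dup\n" : "no doubled word\n");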

=head2 Matching repetitions

=begin original

The B<quantifier> metacharacters C<?>, C<*>, C<+>, and C<{}> allow us
to determine the number of repeats of a portion of a regex we
consider to be a match.  Quantifiers are put immediately after the
character, character class, or grouping that we want to specify.  They
have the following meanings:

=end original

The B<quantifier> metacharacters C<?>, C<*>, C<+>, and C<{}> let us
specify how many times a portion of a regex may repeat and still
count as a match.  A quantifier is put immediately after the
character, character class, or grouping it applies to.  Quantifiers
have the following meanings:

=over 4

=item *

=begin original

C<a?> = match 'a' 1 or 0 times

=end original

C<a?> = match 'a' 1 or 0 times

=item *

=begin original

C<a*> = match 'a' 0 or more times, i.e., any number of times

=end original

C<a*> = match 'a' 0 or more times, i.e., any number of times

=item *

=begin original

C<a+> = match 'a' 1 or more times, i.e., at least once

=end original

C<a+> = match 'a' 1 or more times, i.e., at least once

=item *

=begin original

C<a{n,m}> = match at least C<n> times, but not more than C<m>
times.

=end original

C<a{n,m}> = match at least C<n> times, but not more than C<m>
times

=item *

=begin original

C<a{n,}> = match at least C<n> or more times

=end original

C<a{n,}> = match at least C<n> times

=item *

=begin original

C<a{n}> = match exactly C<n> times

=end original

C<a{n}> = match exactly C<n> times

=back

=begin original

Here are some examples:

=end original

Here are some examples:

=begin original

    /[a-z]+\s+\d*/;  # match a lowercase word, at least some space, and
                     # any number of digits
    /(\w+)\s+\g1/;    # match doubled words of arbitrary length
    $year =~ /^\d{2,4}$/;  # make sure year is at least 2 but not more
                           # than 4 digits
    $year =~ /^\d{4}$|^\d{2}$/; # better match; throw out 3 digit dates

=end original

    /[a-z]+\s+\d*/;  # match a lowercase word, at least some space, and
                     # any number of digits
    /(\w+)\s+\g1/;    # match doubled words of arbitrary length
    $year =~ /^\d{2,4}$/;  # make sure year is at least 2 but not more
                           # than 4 digits
    $year =~ /^\d{4}$|^\d{2}$/; # better match; throw out 3 digit dates
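
The difference between the two year patterns is easier to see on
data.  The list of candidate strings below is made up for this sketch,
and C<grep> is used only to collect the strings that match.

    use strict;
    use warnings;

    my @years = qw(99 1999 123 12345);   # sample input (an assumption)

    my @loose  = grep { /^\d{2,4}$/ }       @years;   # 99 1999 123
    my @strict = grep { /^\d{4}$|^\d{2}$/ } @years;   # 99 1999

    print "@loose\n@strict\n";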

=begin original

These quantifiers will try to match as much of the string as possible,
while still allowing the regex to match.  So we have

=end original

These quantifiers try to match as much of the string as possible,
while still allowing the regex to match.  So we have

=begin original

    $x = 'the cat in the hat';
    $x =~ /^(.*)(at)(.*)$/; # matches,
                            # $1 = 'the cat in the h'
                            # $2 = 'at'
                            # $3 = ''   (0 matches)

=end original

    $x = 'the cat in the hat';
    $x =~ /^(.*)(at)(.*)$/; # matches,
                            # $1 = 'the cat in the h'
                            # $2 = 'at'
                            # $3 = ''   (0 matches)

=begin original

The first quantifier C<.*> grabs as much of the string as possible
while still having the regex match. The second quantifier C<.*> has
no string left to it, so it matches 0 times.

=end original

The first quantifier C<.*> grabs as much of the string as possible
while still having the regex match.  The second quantifier C<.*> has
no string left to it, so it matches 0 times.

=head2 More matching

=begin original

There are a few more things you might want to know about matching
operators.
The global modifier C</g> allows the matching operator to match
within a string as many times as possible.  In scalar context,
successive matches against a string will have C</g> jump from match
to match, keeping track of position in the string as it goes along.
You can get or set the position with the C<pos()> function.
For example,

=end original

There are a few more things you might want to know about matching
operators.
The global modifier C</g> allows the matching operator to match
within a string as many times as possible.  In scalar context,
successive matches against a string make C</g> jump from match to
match, keeping track of the position in the string as it goes along.
You can get or set the position with the C<pos()> function.
For example,

    $x = "cat dog house"; # 3 words
    while ($x =~ /(\w+)/g) {
        print "Word is $1, ends at position ", pos $x, "\n";
    }

=begin original

prints

=end original

prints

    Word is cat, ends at position 3
    Word is dog, ends at position 7
    Word is house, ends at position 13

=begin original

A failed match or changing the target string resets the position.  If
you don't want the position reset after failure to match, add the
C</c>, as in C</regex/gc>.

=end original

A failed match or changing the target string resets the position.  If
you don't want the position reset after a failed match, add the C</c>
modifier, as in C</regex/gc>.
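
The effect of C</c> can be checked with C<pos()>.  This is a minimal
sketch with a made-up string; each match is assigned to a scalar so
that it runs in scalar context.

    use strict;
    use warnings;

    my $x = "cat dog";           # sample string (an assumption)

    my $hit  = $x =~ /cat/g;     # scalar context: succeeds
    my $p1   = pos($x);          # 3, just past 'cat'
    my $miss = $x =~ /bird/g;    # fails, so the position is reset
    my $p2   = pos($x);          # undef

    $hit  = $x =~ /cat/g;        # search starts from the beginning again
    $miss = $x =~ /bird/gc;      # fails, but /c keeps the position
    my $p3 = pos($x);            # still 3

    printf "%s %s %s\n", $p1, defined($p2) ? $p2 : 'undef', $p3;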

=begin original

In list context, C</g> returns a list of matched groupings, or if
there are no groupings, a list of matches to the whole regex.  So

=end original

In list context, C</g> returns a list of matched groupings, or if
there are no groupings, a list of matches to the whole regex.  So

=begin original

    @words = ($x =~ /(\w+)/g);  # matches,
                                # $words[0] = 'cat'
                                # $words[1] = 'dog'
                                # $words[2] = 'house'

=end original

    @words = ($x =~ /(\w+)/g);  # matches,
                                # $words[0] = 'cat'
                                # $words[1] = 'dog'
                                # $words[2] = 'house'
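
Both list-context forms are shown in the sketch below, once without
groupings and once with two.  The input line and the variable names
are invented for illustration.

    use strict;
    use warnings;

    my $line = "3 apples, 12 pears and 7 plums";   # sample input (assumed)

    my @numbers = $line =~ /\d+/g;             # no groups: whole matches
    my @pairs   = $line =~ /(\d+)\s+(\w+)/g;   # two groups: $1, $2 per match

    print "@numbers\n";   # 3 12 7
    print "@pairs\n";     # 3 apples 12 pears 7 plums

With more than one grouping, the captures of every match are returned
as one flat list, in order.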

=head2 Search and replace

=begin original

Search and replace is performed using C<s/regex/replacement/modifiers>.
The C<replacement> is a Perl double-quoted string that replaces in the
string whatever is matched with the C<regex>.  The operator C<=~> is
also used here to associate a string with C<s///>.  If matching
against C<$_>, the S<C<$_ =~>> can be dropped.  If there is a match,
C<s///> returns the number of substitutions made; otherwise it returns
false.  Here are a few examples:

=end original

Search and replace is performed using C<s/regex/replacement/modifiers>.
The C<replacement> is a Perl double-quoted string that replaces whatever
part of the string the C<regex> matched.
The C<=~> operator is also used here to associate a string with
C<s///>.
When matching against C<$_>, the S<C<$_ =~>> can be dropped.
If there is a match, C<s///> returns the number of substitutions made;
otherwise it returns false.
Here are a few examples:

=begin original

    $x = "Time to feed the cat!";
    $x =~ s/cat/hacker/;   # $x contains "Time to feed the hacker!"
    $y = "'quoted words'";
    $y =~ s/^'(.*)'$/$1/;  # strip single quotes,
                           # $y contains "quoted words"

=end original

    $x = "Time to feed the cat!";
    $x =~ s/cat/hacker/;   # $x contains "Time to feed the hacker!"
    $y = "'quoted words'";
    $y =~ s/^'(.*)'$/$1/;  # strip single quotes,
                           # $y contains "quoted words"

=begin original

With the C<s///> operator, the matched variables C<$1>, C<$2>, etc.
are immediately available for use in the replacement expression. With
the global modifier, C<s///g> will search and replace all occurrences
of the regex in the string:

=end original

With the C<s///> operator, the match variables C<$1>, C<$2>, etc. are
immediately available for use in the replacement expression.
With the global modifier, C<s///g> searches for and replaces all
occurrences of the regex in the string:

=begin original

    $x = "I batted 4 for 4";
    $x =~ s/4/four/;   # $x contains "I batted four for 4"
    $x = "I batted 4 for 4";
    $x =~ s/4/four/g;  # $x contains "I batted four for four"

=end original

    $x = "I batted 4 for 4";
    $x =~ s/4/four/;   # $x contains "I batted four for 4"
    $x = "I batted 4 for 4";
    $x =~ s/4/four/g;  # $x contains "I batted four for four"
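
As stated earlier, C<s///> returns the number of substitutions made, or
false when nothing matched.  A minimal sketch of using that return
value, with illustrative variable names:

    use strict;
    use warnings;

    my $x = "I batted 4 for 4";

    my $count = ($x =~ s/4/four/g);    # 2: both occurrences were replaced
    my $none  = ($x =~ s/7/seven/g);   # false: there was nothing to replace

    print "replaced $count occurrence(s)\n" if $count;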

=begin original

The non-destructive modifier C<s///r> causes the result of the substitution
to be returned instead of modifying C<$_> (or whatever variable the
substitute was bound to with C<=~>):

=end original

The non-destructive modifier C<s///r> returns the result of the
substitution instead of modifying C<$_> (or whatever variable the
substitution was bound to with C<=~>):

    $x = "I like dogs.";
    $y = $x =~ s/dogs/cats/r;
    print "$x $y\n"; # prints "I like dogs. I like cats."

    $x = "Cats are great.";
    print $x =~ s/Cats/Dogs/r =~ s/Dogs/Frogs/r =~
        s/Frogs/Hedgehogs/r, "\n";
    # prints "Hedgehogs are great."

    @foo = map { s/[a-z]/X/r } qw(a b c 1 2 3);
    # @foo is now qw(X X X 1 2 3)

=begin original

The evaluation modifier C<s///e> wraps an C<eval{...}> around the
replacement string and the evaluated result is substituted for the
matched substring.  Some examples:

=end original

The evaluation modifier C<s///e> wraps an C<eval{...}> around the
replacement string, and the evaluated result is substituted for the
matched substring.
Some examples:

=begin original

    # reverse all the words in a string
    $x = "the cat in the hat";
    $x =~ s/(\w+)/reverse $1/ge;   # $x contains "eht tac ni eht tah"

=end original

    # reverse all the words in a string
    $x = "the cat in the hat";
    $x =~ s/(\w+)/reverse $1/ge;   # $x contains "eht tac ni eht tah"

=begin original

    # convert percentage to decimal
    $x = "A 39% hit rate";
    $x =~ s!(\d+)%!$1/100!e;       # $x contains "A 0.39 hit rate"

=end original

    # convert percentage to decimal
    $x = "A 39% hit rate";
    $x =~ s!(\d+)%!$1/100!e;       # $x contains "A 0.39 hit rate"

=begin original

The last example shows that C<s///> can use other delimiters, such as
C<s!!!> and C<s{}{}>, and even C<s{}//>.  If single quotes are used
C<s'''>, then the regex and replacement are treated as single-quoted
strings.

=end original

As the last example shows, C<s///> can use other delimiters, such as
C<s!!!> and C<s{}{}>, and even C<s{}//>.
If single quotes are used, as in C<s'''>, the regex and replacement are
treated as single-quoted strings, and no variable interpolation is
performed.
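
The effect of the single-quote delimiters is easiest to see side by
side.  In this sketch (the strings and variable names are invented for
illustration) the same substitution is written with C<s///> and with
C<s'''>:

    use strict;
    use warnings;

    my $who = "World";
    my $dq  = "Hello NAME";
    my $sq  = "Hello NAME";

    $dq =~ s/NAME/$who/;   # replacement is interpolated: "Hello World"
    $sq =~ s'NAME'$who';   # replacement is taken literally: 'Hello $who'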

=head2 The split operator

=begin original

C<split /regex/, string> splits C<string> into a list of substrings
and returns that list.  The regex determines the character sequence
that C<string> is split with respect to.  For example, to split a
string into words, use

=end original

C<split /regex/, string> splits C<string> into a list of substrings
and returns that list.
The regex determines the character sequence on which C<string> is split.
For example, to split a string into words, use

    $x = "Calvin and Hobbes";
    @word = split /\s+/, $x;  # $word[0] = 'Calvin'
                              # $word[1] = 'and'
                              # $word[2] = 'Hobbes'

=begin original

To extract a comma-delimited list of numbers, use

=end original

To extract a comma-delimited list of numbers, use

    $x = "1.618,2.718,   3.142";
    @const = split /,\s*/, $x;  # $const[0] = '1.618'
                                # $const[1] = '2.718'
                                # $const[2] = '3.142'

=begin original

If the empty regex C<//> is used, the string is split into individual
characters.  If the regex has groupings, then the list produced contains
the matched substrings from the groupings as well:

=end original

If the empty regex C<//> is used, the string is split into individual
characters.
If the regex has groupings, the list produced contains the matched
substrings from the groupings as well:

    $x = "/usr/bin";
    @parts = split m!(/)!, $x;  # $parts[0] = ''
                                # $parts[1] = '/'
                                # $parts[2] = 'usr'
                                # $parts[3] = '/'
                                # $parts[4] = 'bin'

=begin original

Since the first character of $x matched the regex, C<split> prepended
an empty initial element to the list.

=end original

Since the first character of C<$x> matched the regex, C<split> prepended
an empty initial element to the list.
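
Two cases mentioned above have no example of their own, so here is a
brief sketch: splitting on the empty regex, and splitting the same path
without a grouping, in which case the separators are not returned but
the empty leading element still is.  The variable names are
illustrative:

    use strict;
    use warnings;

    # Empty regex: one element per character.
    my @chars = split //, "abc";           # ('a', 'b', 'c')

    # No grouping: the '/' separators are dropped from the result.
    my @parts = split m!/!, "/usr/bin";    # ('', 'usr', 'bin')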

=head2 C<use re 'strict'>

=begin original

New in v5.22, this applies stricter rules than otherwise when compiling
regular expression patterns.  It can find things that, while legal, may
not be what you intended.

=end original

New in v5.22, this applies stricter rules than usual when compiling
regular expression patterns.
It can find things that, while legal, may not be what you intended.

=begin original

See L<'strict' in re|re/'strict' mode>.

=end original

See L<'strict' in re|re/'strict' mode>.

=head1 BUGS

=begin original

None.

=end original

None.

=head1 SEE ALSO

=begin original

This is just a quick start guide.  For a more in-depth tutorial on
regexes, see L<perlretut> and for the reference page, see L<perlre>.

=end original

This is just a quick start guide.
For a more in-depth tutorial on regexes, see L<perlretut>; for the
reference page, see L<perlre>.

=head1 AUTHOR AND COPYRIGHT

Copyright (c) 2000 Mark Kvale
All rights reserved.

This document may be distributed under the same terms as Perl itself.

=head2 Acknowledgments

The author would like to thank Mark-Jason Dominus, Tom Christiansen,
Ilya Zakharevich, Brad Hughes, and Mike Giroux for all their helpful
comments.

=cut

=begin meta

Translate: SHIRAKATA Kentaro <argrath@ub32.org>
Status: completed

=end meta