Differences between perlrecharclass 5.10.1 and 5.40.0

1	1
2		=encoding euc-jp
	2	=encoding utf8
3	3
4	4	=head1 NAME
	5	X<character class>
5	6
6	7	=begin original
7	8
8	9	perlrecharclass - Perl Regular Expression Character Classes
9	10
10	11	=end original
11	12
12	13	perlrecharclass - Perl 正規表現文字クラス
13	14
14	15	=head1 DESCRIPTION
15	16
16	17	=begin original
17	18
18	19	The top level documentation about Perl regular expressions
19	20	is found in L<perlre>.
20	21
21	22	=end original
22	23
23	24	Perl 正規表現に関する最上位文書は L<perlre> です。
24	25
25	26	=begin original
26	27
27	28	This manual page discusses the syntax and use of character
28		classes in Perl Regular Expressions.
	29	classes in Perl regular expressions.
29	30
30	31	=end original
31	32
32	33	このマニュアルページは Perl 正規表現の文字クラスの文法と使用法について
33	34	議論します。
34	35
35	36	=begin original
36	37
37		A character class is a way of denoting a set of characters,
	38	A character class is a way of denoting a set of characters
38	39	in such a way that one character of the set is matched.
39		It's important to remember that matching a character class
	40	It's important to remember that: matching a character class
40	41	consumes exactly one character in the source string. (The source
41	42	string is the string the regular expression is matched against.)
42	43
43	44	=end original
44	45
45	46	文字クラスは、集合の中の一文字がマッチングするというような方法で、
46	47	文字の集合を指定するための方法です。
47		文字集合はソース文字列の中から正確に一文字だけを消費するということを
	48	次のことを覚えておくことは重要です: 文字集合はソース文字列の中から正確に
48		覚えておくことは重要です。
	49	一文字だけを消費します。
49	50	(ソース文字列とは正規表現がマッチングしようとしている文字列です。)
50	51
51	52	=begin original
52	53
53	54	There are three types of character classes in Perl regular
54		expressions: the dot, backslashed sequences, and the bracketed form.
	55	expressions: the dot, backslash sequences, and the form enclosed in square
	56	brackets. Keep in mind, though, that often the term "character class" is used
	57	to mean just the bracketed form. Certainly, most Perl documentation does that.
55	58
56	59	=end original
57	60
58	61	Perl 正規表現には 3 種類の文字クラスがあります: ドット、
59		逆スラッシュシーケンス、かっこ付き形式です。
	62	逆スラッシュシーケンス、大かっこで囲まれた形式です。
	63	しかし、「文字クラス」という用語はしばしば大かっこ形式だけを意味するために
	64	使われることに注意してください。
	65	確かに、ほとんどの Perl 文書ではそうなっています。
60	66
61	67	=head2 The dot
62	68
63	69	(ドット)
64	70
65	71	=begin original
66	72
67	73	The dot (or period), C<.> is probably the most used, and certainly
68	74	the most well-known character class. By default, a dot matches any
69		character, except for the newline. The default can be changed to
	75	character, except for the newline. That default can be changed to
70		add matching the newline with the I<single line> modifier: either
	76	add matching the newline by using the I<single line> modifier:
71		for the entire regular expression using the C</s> modifier, or
	77	for the entire regular expression with the C</s> modifier, or
72		locally using C<(?s)>.
	78	locally with C<(?s)> (and even globally within the scope of
	79	L<C<use re '/s'>\|re/'E<sol>flags' mode>). (The C<L</\N>> backslash
	80	sequence, described
	81	below, matches any character except newline without regard to the
	82	I<single line> modifier.)
73	83
74	84	=end original
75	85
76	86	ドット (またはピリオド) C<.> はおそらくもっともよく使われ、そして確実に
77	87	もっともよく知られている文字クラスです。
78	88	デフォルトでは、ドットは改行を除く任意の文字にマッチングします。
79		デフォルトは I<単一行> 修飾子を使うことで改行にもマッチングするように
	89	このデフォルトは I<単一行> 修飾子を使うことで改行にもマッチングするように
80	90	変更されます: 正規表現全体に対して C</s> 修飾子を使うか、ローカルには
81		C<(?s)> を使います。
	91	C<(?s)> を使います
	92	(そしてグローバルに L<C<use re '/s'>\|re/'E<sol>flags' mode> の
	93	スコープ内の場合でもそうです)。
	94	(後述する C<L</\N>> 逆スラッシュシーケンスでは、I<単一行> 修飾子に
	95	関わりなく改行以外の任意の文字にマッチングします。)
82	96
83	97	=begin original
84	98
85	99	Here are some examples:
86	100
87	101	=end original
88	102
89	103	以下は例です:
90	104
91	105	=begin original
92	106
93	107	"a" =~ /./ # Match
94	108	"." =~ /./ # Match
95	109	"" =~ /./ # No match (dot has to match a character)
96	110	"\n" =~ /./ # No match (dot does not match a newline)
97	111	"\n" =~ /./s # Match (global 'single line' modifier)
98	112	"\n" =~ /(?s:.)/ # Match (local 'single line' modifier)
99	113	"ab" =~ /^.$/ # No match (dot matches one character)
100	114
101	115	=end original
102	116
103	117	"a" =~ /./ # マッチングする
104	118	"." =~ /./ # マッチングする
105	119	"" =~ /./ # マッチングしない (ドットは文字にマッチングする必要がある)
106	120	"\n" =~ /./ # マッチングしない (ドットは改行にはマッチングしない)
107	121	"\n" =~ /./s # マッチングする (グローバル「単一行」修飾子)
108	122	"\n" =~ /(?s:.)/ # マッチングする (ローカル「単一行」修飾子)
109	123	"ab" =~ /^.$/ # マッチングしない (ドットは一文字にマッチングする)
110	124
111		=head2 Backslashed sequences
	125	=head2 Backslash sequences
	126	X<\w> X<\W> X<\s> X<\S> X<\d> X<\D> X<\p> X<\P>
	127	X<\N> X<\v> X<\V> X<\h> X<\H>
	128	X<word> X<whitespace>
112	129
113	130	(逆スラッシュシーケンス)
114	131
115	132	=begin original
116	133
117		Perl regular expressions contain many backslashed sequences that
	134	A backslash sequence is a sequence of characters, the first one of which is a
118		constitute a character class. That is, they will match a single
	135	backslash. Perl ascribes special meaning to many such sequences, and some of
119		character, if that character belongs to a specific set of characters
	136	these are character classes. That is, they match a single character each,
120		(defined by the sequence). A backslashed sequence is a sequence of
	137	provided that the character belongs to the specific set of characters defined
121		characters starting with a backslash. Not all backslashed sequences
	138	by the sequence.
122		are character class; for a full list, see L<perlrebackslash>.
123	139
124	140	=end original
125	141
126		Perl 正規表現には、文字クラスを構成する多くの逆スラッシュシーケンスを
	142	逆スラッシュシーケンスは、最初が逆スラッシュの文字並びです。
127		持ちます。
	143	Perl はそのような並びの多くに特別な意味を持たせていて、
128		これは(シーケンスで定義される)ある特定の文字集合に属する一つの文字に
	144	その一部は文字クラスです。
129		マッチングします。
	145	つまり、それらはそれぞれ並びによって定義されている特定の文字の集合に
130		逆スラッシュシーケンスは、逆スラッシュで始まる並びです。
	146	帰属する一文字にマッチングします。
131		全ての逆スラッシュシーケンスが文字クラスというわけではありません; 完全な
132		リストは、L<perlrebackslash> を参照してください。
133	147
134	148	=begin original
135	149
136		Here's a list of the backslashed sequences, which are discussed in
	150	Here's a list of the backslash sequences that are character classes. They
137		more detail below.
	151	are discussed in more detail below. (For the backslash sequences that aren't
	152	character classes, see L<perlrebackslash>.)
138	153
139	154	=end original
140	155
141		以下は逆スラッシュシーケンスの一覧です; 以下でさらに詳細に議論します。
	156	以下は文字クラスの逆スラッシュシーケンスの一覧です。
	157	以下でさらに詳細に議論します。
	158	(文字クラスではない逆スラッシュシーケンスについては、L<perlrebackslash> を
	159	参照してください。)
142	160
143	161	=begin original
144	162
145		\d Match a digit character.
	163	\d Match a decimal digit character.
146		\D Match a non-digit character.
	164	\D Match a non-decimal-digit character.
147	165	\w Match a "word" character.
148	166	\W Match a non-"word" character.
149		\s Match a white space character.
	167	\s Match a whitespace character.
150		\S Match a non-white space character.
	168	\S Match a non-whitespace character.
151		\h Match a horizontal white space character.
	169	\h Match a horizontal whitespace character.
152		\H Match a character that isn't horizontal white space.
	170	\H Match a character that isn't horizontal whitespace.
153		\v Match a vertical white space character.
	171	\v Match a vertical whitespace character.
154		\V Match a character that isn't vertical white space.
	172	\V Match a character that isn't vertical whitespace.
155		\pP, \p{Prop} Match a character matching a Unicode property.
	173	\N Match a character that isn't a newline.
156		\PP, \P{Prop} Match a character that doesn't match a Unicode property.
	174	\pP, \p{Prop} Match a character that has the given Unicode property.
	175	\PP, \P{Prop} Match a character that doesn't have the Unicode property
157	176
158	177	=end original
159	178
160		\d 数字にマッチング。
	179	\d 10 進数字にマッチング。
161		\D 非数字にマッチング。
	180	\D 非 10 進数字にマッチング。
162	181	\w 「単語」文字にマッチング。
163	182	\W 非「単語」文字にマッチング。
164	183	\s 空白文字にマッチング。
165	184	\S 非空白文字にマッチング。
166	185	\h 水平空白文字にマッチング。
167	186	\H 水平空白でない文字にマッチング。
168	187	\v 垂直空白文字にマッチング。
169	188	\V 垂直空白でない文字にマッチング。
170		\pP, \p{Prop} Unicode 特性にマッチする文字にマッチング。
	189	\N 改行以外の文字にマッチング。
171		\PP, \P{Prop} Unicode 特性にマッチしない文字にマッチング。
	190	\pP, \p{Prop} 指定された Unicode 特性を持つ文字にマッチング。
	191	\PP, \P{Prop} 指定された Unicode 特性を持たない文字にマッチング。
172	192
	193	=head3 \N
	194
	195	=begin original
	196
	197	C<\N>, available starting in v5.12, like the dot, matches any
	198	character that is not a newline. The difference is that C<\N> is not influenced
	199	by the I<single line> regular expression modifier (see L</The dot> above). Note
	200	that the form C<\N{...}> may mean something completely different. When the
	201	C<{...}> is a L<quantifier\|perlre/Quantifiers>, it means to match a non-newline
	202	character that many times. For example, C<\N{3}> means to match 3
	203	non-newlines; C<\N{5,}> means to match 5 or more non-newlines. But if C<{...}>
	204	is not a legal quantifier, it is presumed to be a named character. See
	205	L<charnames> for those. For example, none of C<\N{COLON}>, C<\N{4F}>, and
	206	C<\N{F4}> contain legal quantifiers, so Perl will try to find characters whose
	207	names are respectively C<COLON>, C<4F>, and C<F4>.
	208
	209	=end original
	210
	211	v5.12 から利用可能な C<\N> は、ドットのように、
	212	改行以外の任意の文字にマッチングします。
	213	違いは、C<\N> は I<単一行> 正規表現修飾子の影響を受けないことです
	214	(上述の L</The dot> 参照)。
	215	C<\N{...}> 形式は何か全く違うものを意味するかも知れないことに
	216	注意してください。
	217	C<{...}> が L<量指定子\|perlre/Quantifiers> なら、これは指定された回数の
	218	非改行文字にマッチングします。
	219	例えば、C<\N{3}> は三つの非改行にマッチングします;
	220	C<\N{5,}> は五つ以上の非改行にマッチングします。
	221	しかし、C<{...}> が有効な量指定子でない場合、これは名前付き文字と
	222	推定されます。
	223	これについては L<charnames> を参照してください。
	224	例えば、C<\N{COLON}>, C<\N{4F}>, C<\N{F4}> はどれも有効な
	225	量指定子ではないので、Perl はそれぞれ C<COLON>, C<4F>, C<F4> という名前の
	226	文字を探そうとします。
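
The difference between the dot and C<\N>, and the two readings of C<\N{...}>, can be checked with a short script. This is a minimal sketch, assuming Perl v5.12 or later; the variable names are illustrative and do not come from the original document. C<use charnames ':full'> is included only so that C<\N{COLON}> also resolves on releases older than v5.16.

```perl
use strict;
use warnings;
use charnames ':full';    # lets \N{COLON} resolve on Perls older than v5.16

# The dot honors the single line modifier; \N never matches a newline.
my $dot_default = ("\n" =~ /./)   ? 1 : 0;
my $dot_single  = ("\n" =~ /./s)  ? 1 : 0;
my $n_single    = ("\n" =~ /\N/s) ? 1 : 0;

# \N{3} is \N with a quantifier: exactly three non-newline characters.
my $three_ok    = ("abc" =~ /\A\N{3}\z/) ? 1 : 0;
my $three_short = ("ab"  =~ /\A\N{3}\z/) ? 1 : 0;

# {COLON} is not a legal quantifier, so \N{COLON} is a named character.
my $named_colon = (":" =~ /\A\N{COLON}\z/) ? 1 : 0;

print "dot: $dot_default, dot with /s: $dot_single, \\N with /s: $n_single\n";
print "\\N{3} on 'abc': $three_ok, on 'ab': $three_short\n";
print "\\N{COLON} on ':': $named_colon\n";
```

Only the dot changes its answer when C</s> is added; C<\N> rejects the newline in both cases.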
	227
173	228	=head3 Digits
174	229
175	230	(数字)
176	231
177	232	=begin original
178	233
179		C<\d> matches a single character that is considered to be a I<digit>.
	234	C<\d> matches a single character considered to be a decimal I<digit>.
180		What is considered a digit depends on the internal encoding of
	235	If the C</a> regular expression modifier is in effect, it matches [0-9].
181		the source string. If the source string is in UTF-8 format, C<\d>
	236	Otherwise, it
182		not only matches the digits '0' - '9', but also Arabic, Devanagari and
	237	matches anything that is matched by C<\p{Digit}>, which includes [0-9].
183		digits from other languages. Otherwise, if there is a locale in effect,
	238	(An unlikely possible exception is that under locale matching rules, the
184		it will match whatever characters the locale considers digits. Without
	239	current locale might not have C<[0-9]> matched by C<\d>, and/or might match
185		a locale, C<\d> matches the digits '0' to '9'.
	240	other characters whose code point is less than 256. The only such locale
186		See L</Locale, Unicode and UTF-8>.
	241	definitions that are legal would be to match C<[0-9]> plus another set of
	242	10 consecutive digit characters; anything else would be in violation of
	243	the C language standard, but Perl doesn't currently assume anything in
	244	regard to this.)
187	245
188	246	=end original
189	247
190		C<\d> は I<数字> と考えられる単一の文字にマッチングします。
	248	C<\d> は 10 進 I<数字> と考えられる単一の文字にマッチングします。
191		何が数字と考えられるかはソース文字列の内部エンコーディングに依存します。
	249	C</a> 正規表現修飾子が有効の場合、これは [0-9] にマッチングします。
192		ソース文字列が UTF-8 形式なら、C<\d> は数字 '0' - '9' だけでなく、Arabic,
	250	さもなければ、これは C<[0-9]> を含む、C<\p{Digit}> にマッチングするものに
193		Devanagari およびその他の言語の数字もマッチングします。
194		さもなければ、ロケールが有効なら、ロケールが数字と考える文字に
195	251	マッチングします。
196		ロケールがなければ、C<\d> は '0' から '9' の数字にマッチングします。
	252	(ありそうもない例外はロケールマッチングの下で、現在のロケールが
197		L</Locale, Unicode and UTF-8> を参照してください。
	253	C<\d> にマッチングする [0-9] がないか、
	254	符号位置が 256 未満の他の文字にマッチングすることです。
	255	唯一正当なロケール定義は、C<[0-9]> に加えてもう一つの 10 の連続した
	256	数字の集合にマッチングするもので、
	257	それ以外は C 言語標準に違反していますが、
	258	Perl は今のところこれに関して何も仮定しません。)
198	259
199	260	=begin original
200	261
201		Any character that isn't matched by C<\d> will be matched by C<\D>.
	262	What this means is that unless the C</a> modifier is in effect C<\d> not
	263	only matches the digits '0' - '9', but also Arabic, Devanagari, and
	264	digits from other languages. This may cause some confusion, and some
	265	security issues.
202	266
203	267	=end original
204	268
	269	これが意味することは、C</a> 修飾子が有効でない限り、C<\d> は数字
	270	'0' - '9' だけでなく、アラビア文字、デバナーガリ文字、およびその他の言語の
	271	数字にもマッチングするということです。
	272	これは混乱やセキュリティ問題を引き起こすことがあります。
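
The effect of C</a> on C<\d> can be seen by testing digits from several scripts. This is a minimal sketch, assuming Perl v5.14 or later (where C</a> is available); the sample characters and the C<%matched> hash are illustrative choices, not taken from the original text.

```perl
use strict;
use warnings;

# Three decimal digit characters, each with the numeric value seven.
my @digits = (
    [ 'ASCII'        => "7" ],
    [ 'Arabic-Indic' => "\x{0667}" ],    # ARABIC-INDIC DIGIT SEVEN
    [ 'Devanagari'   => "\x{096D}" ],    # DEVANAGARI DIGIT SEVEN
);

my %matched;
for my $entry (@digits) {
    my ($name, $char) = @$entry;
    $matched{$name} = {
        unicode => ($char =~ /\A\d\z/)  ? 1 : 0,    # default rules
        ascii   => ($char =~ /\A\d\z/a) ? 1 : 0,    # /a restricts \d to [0-9]
    };
    printf "%-12s \\d: %d   \\d under /a: %d\n",
        $name, $matched{$name}{unicode}, $matched{$name}{ascii};
}
```

An application that expects only ASCII digits would use the C</a> form, or an explicit C<[0-9]>.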
	273
	274	=begin original
	275
	276	Some digits that C<\d> matches look like some of the [0-9] ones, but
	277	have different values. For example, BENGALI DIGIT FOUR (U+09EA) looks
	278	very much like an ASCII DIGIT EIGHT (U+0038), and LEPCHA DIGIT SIX
	279	(U+1C46) looks very much like an ASCII DIGIT FIVE (U+0035). An
	280	application that
	281	is expecting only the ASCII digits might be misled, or if the match is
	282	C<\d+>, the matched string might contain a mixture of digits from
	283	different writing systems that look like they signify a number different
	284	than they actually do. L<Unicode::UCD/num()> can
	285	be used to safely
	286	calculate the value, returning C<undef> if the input string contains
	287	such a mixture. Otherwise, for example, a displayed price might be
	288	deliberately different than it appears.
	289
	290	=end original
	291
	292	C<\d> にマッチングする数字には、[0-9] のように見えるけれども、
	293	異なる値を持つものもあります。
	294	例えば、BENGALI DIGIT FOUR (U+09EA) は ASCII DIGIT EIGHT (U+0038) に
	295	とてもよく似ていて、
	296	LEPCHA DIGIT SIX (U+1C46) は ASCII DIGIT FIVE (U+0035) に
	297	とてもよく似ています。
	298	ASCII 数字のみを想定しているアプリケーションはミスリードされるかも知れず、
	299	マッチングが C<\d+> の場合、
	300	マッチングした文字列は、実際と異なる値を示しているように見える、
	301	異なった書記体系からの数字が混ざったものかもしれません。
	302	L<Unicode::UCD/num()> は値を安全に計算するのに使えます;
	303	入力文字列がこのような混合を含んでいる場合は C<undef> を返します。
	304	さもなければ、例えば、表示された価格は見た目と意図的に違うものに
	305	なるかもしれません。
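
The mixed-script risk described above, and the way C<num()> guards against it, can be sketched as follows. This assumes a Perl recent enough to ship C<Unicode::UCD> with C<num()> (v5.14 or later); the sample strings and variable names are illustrative only.

```perl
use strict;
use warnings;
use Unicode::UCD qw(num);

my $mixed = "4\x{09EA}";    # ASCII DIGIT FOUR followed by BENGALI DIGIT FOUR

# \d+ accepts the mixture, because both characters are decimal digits.
my $looks_numeric = ($mixed =~ /\A\d+\z/) ? 1 : 0;

my $ascii_value   = num("42");          # digits from a single script
my $bengali_value = num("\x{09EA}");    # a single Bengali digit
my $mixed_value   = num($mixed);        # digits from two scripts

for my $case (
    [ 'ASCII "42"'         => $ascii_value ],
    [ 'BENGALI DIGIT FOUR' => $bengali_value ],
    [ 'mixed scripts'      => $mixed_value ],
) {
    my ($label, $value) = @$case;
    printf "%-20s => %s\n", $label, defined $value ? $value : 'undef';
}
print "mixed string matches \\d+: $looks_numeric\n";
```

Checking the result of C<num()> for C<undef> before using a matched string as a number is one way to reject such mixtures.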
	306
	307	=begin original
	308
	309	What C<\p{Digit}> means (and hence C<\d> except under the C</a>
	310	modifier) is C<\p{General_Category=Decimal_Number}>, or synonymously,
	311	C<\p{General_Category=Digit}>. Starting with Unicode version 4.1, this
	312	is the same set of characters matched by C<\p{Numeric_Type=Decimal}>.
	313	But Unicode also has a different property with a similar name,
	314	C<\p{Numeric_Type=Digit}>, which matches a completely different set of
	315	characters. These characters are things such as C<CIRCLED DIGIT ONE>
	316	or subscripts, or are from writing systems that lack all ten digits.
	317
	318	=end original
	319
	320	C<\p{Digit}> が意味するもの(つまり、C</a> 修飾子の下でない C<\d>)は、
	321	C<\p{General_Category=Decimal_Number}>、または同義語として
	322	C<\p{General_Category=Digit}> です。
	323	Unicode バージョン 4.1 以降では、これは C<\p{Numeric_Type=Decimal}> に
	324	マッチングする文字集合と同じです。
	325	ただし、Unicode には、C<\p{Numeric_Type=Digit}> という類似した名前を持つ
	326	別の特性もあります; これは完全に異なる文字集合とマッチングします。
	327	これらの文字は、C<CIRCLED DIGIT ONE> や添字のようなものであるか、
	328	10 の数字すべてが揃っていない書記体系からのものです。
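
The two similarly named properties can be compared directly. The following is a minimal sketch under the assumption that the running Perl's Unicode tables classify these three sample characters as described above; the C<%class> hash and the choice of samples are illustrative.

```perl
use strict;
use warnings;

my %sample = (
    'DIGIT ONE'              => "1",
    'ARABIC-INDIC DIGIT ONE' => "\x{0661}",
    'CIRCLED DIGIT ONE'      => "\x{2460}",
);

my %class;
for my $name (sort keys %sample) {
    my $char = $sample{$name};
    $class{$name} = {
        d       => ($char =~ /\A\d\z/)                        ? 1 : 0,
        decimal => ($char =~ /\A\p{Numeric_Type=Decimal}\z/)  ? 1 : 0,
        digit   => ($char =~ /\A\p{Numeric_Type=Digit}\z/)    ? 1 : 0,
    };
    printf "%-24s \\d=%d  Numeric_Type=Decimal=%d  Numeric_Type=Digit=%d\n",
        $name, @{ $class{$name} }{qw(d decimal digit)};
}
```

C<\d> agrees with C<Numeric_Type=Decimal> for all three samples, while C<CIRCLED DIGIT ONE> is the only one that falls under C<Numeric_Type=Digit>.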
	329
	330	=begin original
	331
	332	The design intent is for C<\d> to exactly match the set of characters
	333	that can safely be used with "normal" big-endian positional decimal
	334	syntax, where, for example 123 means one 'hundred', plus two 'tens',
	335	plus three 'ones'. This positional notation does not necessarily apply
	336	to characters that match the other type of "digit",
	337	C<\p{Numeric_Type=Digit}>, and so C<\d> doesn't match them.
	338
	339	=end original
	340
	341	設計意図は、C<\d> が「通常の」ビッグエンディアンの
	342	位置 10 進構文 (例えば、123 は一つの「100」に二つの「10」と三つの「1」を
	343	加えたものを意味する) で安全に使用できる文字集合と
	344	正確にマッチングするようにすることです。
	345	この位置表記は、他のタイプの「digit」である C<\p{Numeric_Type=Digit}> に
	346	マッチングする文字には必ずしも適用されないため、
	347	C<\d> はこれらの文字にマッチングしません。
	348
	349	=begin original
	350
	351	The Tamil digits (U+0BE6 - U+0BEF) can also legally be
	352	used in old-style Tamil numbers in which they would appear no more than
	353	one in a row, separated by characters that mean "times 10", "times 100",
	354	etc. (See L<https://www.unicode.org/notes/tn21>.)
	355
	356	=end original
	357
	358	タミル語の数字(U+0BE6-U+0BEF)は、古い様式のタミル語の
	359	数字でも合法的に使用することができます;
	360	この数字は、「×10」や「×100」などを意味する文字で区切られて、
	361	2 つ以上連続して現れることはありません。
	362	(L<https://www.unicode.org/notes/tn21>を参照してください。)
	363
	364	=begin original
	365
	366	Any character not matched by C<\d> is matched by C<\D>.
	367
	368	=end original
	369
205	370	C<\d> にマッチングしない任意の文字は C<\D> にマッチングします。
206	371
207	372	=head3 Word characters
208	373
209	374	(単語文字)
210	375
211	376	=begin original
212	377
213		C<\w> matches a single I<word> character: an alphanumeric character
	378	A C<\w> matches a single alphanumeric character (an alphabetic character, or a
214		(that is, an alphabetic character, or a digit), or the underscore (C<_>).
	379	decimal digit); or a connecting punctuation character, such as an
215		What is considered a word character depends on the internal encoding
	380	underscore ("_"); or a "mark" character (like some sort of accent) that
216		of the string. If it's in UTF-8 format, C<\w> matches those characters
	381	attaches to one of those. It does not match a whole word. To match a
217		that are considered word characters in the Unicode database. That is, it
	382	whole word, use C<\w+>. This isn't the same thing as matching an
218		not only matches ASCII letters, but also Thai letters, Greek letters, etc.
	383	English word, but in the ASCII range it is the same as a string of
219		If the source string isn't in UTF-8 format, C<\w> matches those characters
	384	Perl-identifier characters.
220		that are considered word characters by the current locale. Without
221		a locale in effect, C<\w> matches the ASCII letters, digits and the
222		underscore.
223	385
224	386	=end original
225	387
226		C<\w> は単一の I<単語> 文字にマッチングします: これは英数字(つまり英字または
	388	C<\w> は単語全体ではなく、単一の英数字(つまり英字または数字)または
227		数字)および下線 (C<_>) です。
	389	下線(C<_>) のような接続句読点
228		何が単語文字と考えられるかは文字列の内部エンコーディングに依存します。
	390	またはこれらの一つに付いている(ある種のアクセントのような)「マーク」文字に
229		UTF-8 形式の場合、C<\w> は Unicode データベースで単語文字と考えられるものに
230	391	マッチングします。
231		これは、ASCII の文字だけではなく、タイの文字、ギリシャの文字、などにも
	392	これは単語全体にはマッチングしません。
232		マッチングするということです。
	393	単語全体にマッチングするには、C<\w+> を使ってください。
233		ソース文字列が UTF-8 形式でない場合、C<\w> は現在のロケールで単語文字と
	394	これは英語の単語にマッチングするのと同じことではありませんが、
234		考えられるものにマッチングします。
	395	ASCII の範囲では、Perl の識別子文字の文字列と同じです。
235		ロケールが有効でない場合、C<\w> は ASCII 文字、数字、下線に
	397	=over
	398
	399	=item If the C</a> modifier is in effect ...
	400
	401	(C</a> 修飾子が有効なら ...)
	402
	403	=begin original
	404
	405	C<\w> matches the 63 characters [a-zA-Z0-9_].
	406
	407	=end original
	408
	409	C<\w> は 63 文字 [a-zA-Z0-9_] にマッチングします。
	410
	411	=item otherwise ...
	412
	413	(さもなければ ...)
	414
	415	=over
	416
	417	=item For code points above 255 ...
	418
	419	(256 以上の符号位置では ...)
	420
	421	=begin original
	422
	423	C<\w> matches the same as C<\p{Word}> matches in this range. That is,
	424	it matches Thai letters, Greek letters, etc. This includes connector
	425	punctuation (like the underscore) which connect two words together, or
	426	diacritics, such as a C<COMBINING TILDE> and the modifier letters, which
	427	are generally used to add auxiliary markings to letters.
	428
	429	=end original
	430
	431	C<\w> はこの範囲で C<\p{Word}> がマッチングするものと同じものに
236	432	マッチングします。
	433	つまり、タイ文字、ギリシャ文字などです。
	434	これには(下線のような)二つの単語を繋ぐ接続句読点、
	435	C<COMBINING TILDE> や一般的に文字に追加のマークを付けるために
	436	使われる修飾字のようなダイアクリティカルマークが含まれます。
237	437
	438	=item For code points below 256 ...
	439
	440	(255 以下の符号位置では ...)
	441
	442	=over
	443
	444	=item if locale rules are in effect ...
	445
	446	(ロケール規則が有効なら ...)
	447
238	448	=begin original
239	449
240		Any character that isn't matched by C<\w> will be matched by C<\W>.
	450	C<\w> matches the platform's native underscore character plus whatever
	451	the locale considers to be alphanumeric.
241	452
242	453	=end original
243	454
	455	C<\w> は、プラットフォームのネイティブな下線に加えてロケールが英数字と
	456	考えるものにマッチングします。
	457
	458	=item if, instead, Unicode rules are in effect ...
	459
	460	(そうではなく、Unicode 規則が有効なら ...)
	461
	462	=begin original
	463
	464	C<\w> matches exactly what C<\p{Word}> matches.
	465
	466	=end original
	467
	468	C<\w> は C<\p{Word}> がマッチングするものと同じものにマッチングします。
	469
	470	=item otherwise ...
	471
	472	(さもなければ ...)
	473
	474	=begin original
	475
	476	C<\w> matches [a-zA-Z0-9_].
	477
	478	=end original
	479
	480	C<\w> は [a-zA-Z0-9_] にマッチングします。
	481
	482	=back
	483
	484	=back
	485
	486	=back
	487
	488	=begin original
	489
	490	Which rules apply are determined as described in L<perlre/Which character set modifier is in effect?>.
	491
	492	=end original
	493
	494	どの規則を適用するかは L<perlre/Which character set modifier is in effect?> で
	495	記述されている方法で決定されます。
	496
	497	=begin original
	498
	499	There are a number of security issues with the full Unicode list of word
	500	characters. See L<https://unicode.org/reports/tr36>.
	501
	502	=end original
	503
	504	完全な Unicode の単語文字の一覧には多くのセキュリティ問題があります。
	505	L<https://unicode.org/reports/tr36> を参照してください。
	506
	507	=begin original
	508
	509	Also, for a somewhat finer-grained set of characters that are in programming
	510	language identifiers beyond the ASCII range, you may wish to instead use the
	511	more customized L</Unicode Properties>, C<\p{ID_Start}>,
	512	C<\p{ID_Continue}>, C<\p{XID_Start}>, and C<\p{XID_Continue}>. See
	513	L<http://unicode.org/reports/tr31>.
	514
	515	=end original
	516
	517	また、ASCII の範囲を超えたプログラミング言語識別子のための
	518	より高精度の文字集合のためには、代わりによりカスタマイズされた
	519	L<Unicode 特性\|/Unicode Properties>である
	520	C<\p{ID_Start}>,
	521	C<\p{ID_Continue}>, C<\p{XID_Start}>, C<\p{XID_Continue}> を
	522	使った方がよいでしょう。
	523	L<http://unicode.org/reports/tr31> を参照してください。
	524
	525	=begin original
	526
	527	Any character not matched by C<\w> is matched by C<\W>.
	528
	529	=end original
	530
244	531	C<\w> にマッチングしない任意の文字は C<\W> にマッチングします。
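
The decision tree for C<\w> above can be exercised by forcing each rule set with an explicit modifier. This is a minimal sketch, assuming Perl v5.14 or later (for the C</a>, C</u>, and C</d> modifiers) and no locale in effect; the sample characters and the C<%word> hash are illustrative.

```perl
use strict;
use warnings;

my $e_acute = "\x{E9}";     # LATIN SMALL LETTER E WITH ACUTE, code point below 256
my $alpha   = "\x{3B1}";    # GREEK SMALL LETTER ALPHA, code point above 255
my $tilde   = "\x{303}";    # COMBINING TILDE, a "mark" character

my %word = (
    # Below 256: the answer depends on which rules are in effect.
    e_acute_native  => ($e_acute =~ /\w/d) ? 1 : 0,    # neither locale nor Unicode rules
    e_acute_unicode => ($e_acute =~ /\w/u) ? 1 : 0,    # Unicode rules
    e_acute_ascii   => ($e_acute =~ /\w/a) ? 1 : 0,    # /a: only [a-zA-Z0-9_]
    # Above 255: \w follows \p{Word} unless /a is in effect.
    alpha_default   => ($alpha =~ /\w/)    ? 1 : 0,
    alpha_ascii     => ($alpha =~ /\w/a)   ? 1 : 0,
    tilde_default   => ($tilde =~ /\w/)    ? 1 : 0,
    underscore      => ("_" =~ /\w/a)      ? 1 : 0,
);

printf "%-16s %d\n", $_, $word{$_} for sort keys %word;
```

The C</d> case shows the "otherwise" branch: a string that is not in UTF-8 format and holds a code point below 256 gets the ASCII-only definition.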
245	532
246		=head3 White space
	533	=head3 Whitespace
247	534
248	535	(空白)
249	536
250	537	=begin original
251	538
252		C<\s> matches any single character ~~that is~~ consider white space. ~~In the~~
	539	C<\s> matches any single character considered whitespace.
253		ASCII range, C<\s> matches the horizontal tab (C<\t>), the new line
254		(C<\n>), the form feed (C<\f>), the carriage return (C<\r>), and the
255		space (the vertical tab, C<\cK> is not matched by C<\s>). The exact set
256		of characters matched by C<\s> depends on whether the source string is
257		in UTF-8 format. If it is, C<\s> matches what is considered white space
258		in the Unicode database. Otherwise, if there is a locale in effect, C<\s>
259		matches whatever is considered white space by the current locale. Without
260		a locale, C<\s> matches the five characters mentioned in the beginning
261		of this paragraph. Perhaps the most notable difference is that C<\s>
262		matches a non-breaking space only if the non-breaking space is in a
263		UTF-8 encoded string.
264	540
265	541	=end original
266	542
267	543	C<\s> は空白と考えられる単一の文字にマッチングします。
268		ASCII の範囲では、C<\s> は水平タブ(C<\t>)、改行(C<\n>)、ページ送り(C<\f>)、
269		復帰(C<\r>)、スペースにマッチングします (垂直タブ C<\cK> は C<\s> に
270		マッチングしません)。
271		C<\s> がマッチングする文字の正確な集合はソース文字列が UTF-8 形式かどうかに
272		依存します。
273		もしそうなら、C<\s> は Unicode データベースで空白と考えられるものに
274		マッチングします。
275		さもなければ、ロケールが有効なら、C<\s> は現在のロケールで空白と
276		考えられるものにマッチングします。
277		ロケールなしでは、C<\s> はこの段落の始めに言及した五つの文字に
278		マッチングします。
279		おそらくもっとも顕著な違いは、non-breaking space は UTF-8 エンコードされた
280		文字列にある場合にのみ、C<\s> にマッチングするということです。
281	544
	545	=over
	546
	547	=item If the C</a> modifier is in effect ...
	548
	549	(C</a> 修飾子が有効なら ...)
	550
282	551	=begin original
283	552
284		Any character that isn't matched by C<\s> will be matched by C<\S>.
	553	In all Perl versions, C<\s> matches the 5 characters [\t\n\f\r ]; that
	554	is, the horizontal tab,
	555	the newline, the form feed, the carriage return, and the space.
	556	Starting in Perl v5.18, it also matches the vertical tab, C<\cK>.
	557	See note C<[1]> below for a discussion of this.
285	558
286	559	=end original
287	560
288		C<\s> にマッチングしない任意の文字は C<\S> にマッチングします。
	561	全ての Perl バージョンで、C<\s> は [\t\n\f\r ] の 5 文字にマッチングします;
	562	つまり、水平タブ、改行、改頁、復帰、スペースです。
	563	Perl 5.18 から、垂直タブ C<\cK> にもマッチングします。
	564	ここでの議論については後述する C<[1]> を参照してください。
289	565
	566	=item otherwise ...
	567
	568	(さもなければ ...)
	569
	570	=over
	571
	572	=item For code points above 255 ...
	573
	574	(256 以上の符号位置では ...)
	575
290	576	=begin original
291	577
292		C<\h> will match any character that is considered horizontal white space;
	578	C<\s> matches exactly the code points above 255 shown with an "s" column
293		this includes the space and the tab characters. C<\H> will match any character
	579	in the table below.
294		that is not considered horizontal white space.
295	580
296	581	=end original
297	582
298		C<\h> は水平空白と考えられる任意の文字にマッチングします; これはスペースと
	583	C<\s> は、後述する表の "s" の列で示されている、
299		タブ文字です。
	584	255 を超える符号位置に正確にマッチングします。
300		C<\H> は水平空白と考えられない文字にマッチングします。
301	585
	586	=item For code points below 256 ...
	587
	588	(255 以下の符号位置では ...)
	589
	590	=over
	591
	592	=item if locale rules are in effect ...
	593
	594	(ロケール規則が有効なら ...)
	595
302	596	=begin original
303	597
304		C<\v> will match any character that is considered vertical white space;
	598	C<\s> matches whatever the locale considers to be whitespace.
305		this includes the carriage return and line feed characters (newline).
306		C<\V> will match any character that is not considered vertical white space.
307	599
308	600	=end original
309	601
310		C<\v> は垂直空白と考えられる任意の文字にマッチングします; これは復帰と
	602	C<\s> はロケールが空白だと考えるものにマッチングします。
311		行送り(改行)文字です。
312		C<\V> は垂直空白と考えられない任意の文字にマッチングします。
313	603
	604	=item if, instead, Unicode rules are in effect ...
	605
	606	(そうではなく、Unicode 規則が有効なら ...)
	607
314	608	=begin original
315	609
316		C<\R> matches anything that can be considered a newline under Unicode
	610	C<\s> matches exactly the characters shown with an "s" column in the
317		rules. It's not a character class, as it can match a multi-character
	611	table below.
318		sequence. Therefore, it cannot be used inside a bracketed character
319		class. Details are discussed in L<perlrebackslash>.
320	612
321	613	=end original
322	614
323		C<\R> は Unicode の規則で改行と考えられるものにマッチングします。
	615	C<\s> は正確に以下の表で "s" の列にある文字にマッチングします。
324		複数文字の並びにマッチングすることもあるので、これは
325		文字クラスではありません。
326		従って、大かっこ文字クラスの中では使えません。
327		詳細は L<perlrebackslash> で議論しています。
328	616
	617	=item otherwise ...
	618
	619	(さもなければ ...)
	620
329	621	=begin original
330	622
331		C<\h>, C<\H>, C<\v>, C<\V>, and C<\R> are new in perl 5.10.0.
	623	C<\s> matches [\t\n\f\r ] and, starting in Perl
	624	v5.18, the vertical tab, C<\cK>.
	625	(See note C<[1]> below for a discussion of this.)
	626	Note that this list doesn't include the non-breaking space.
332	627
333	628	=end original
334	629
335		C<\h>, C<\H>, C<\v>, C<\V>, C<\R> は perl 5.10.0 の新機能です。
	630	C<\s> は [\t\n\f\r ] にマッチングし、Perl v5.18 から、
	631	垂直タブ C<\cK> にもマッチングします。
	632	(これの議論については後述する C<[1]> を参照してください。)
	633	この一覧にはノーブレークスペースが含まれていないことに注意してください。
336	634
	635	=back
	636
	637	=back
	638
	639	=back
	640
337	641	=begin original
338	642
339		Note that unlike C<\s>, C<\d> and C<\w>, C<\h> and C<\v> always match
	643	Which rules apply are determined as described in L<perlre/Which character set modifier is in effect?>.
340		the same characters, regardless whether the source string is in UTF-8
341		format or not. The set of characters they match is also not influenced
342		by locale.
343	644
344	645	=end original
345	646
346		C<\s>, C<\d>, C<\w> と違って、C<\h> および C<\v> はソース文字列が
	647	どの規則を適用するかは L<perlre/Which character set modifier is in effect?> で
347		UTF-8 形式かどうかに関わらず同じ文字にマッチングします。
	648	記述されている方法で決定されます。
348		マッチングする文字の集合はロケールの影響も受けません。
349	649
350	650	=begin original
351	651
352		One might think that C<\s> is equivalent with C<[\h\v]>. This is not true.
	652	Any character not matched by C<\s> is matched by C<\S>.
353		The vertical tab (C<"\x0b">) is not matched by C<\s>, it is however
354		considered vertical white space. Furthermore, if the source string is
355		not in UTF-8 format, the next line (C<"\x85">) and the no-break space
356		(C<"\xA0">) are not matched by C<\s>, but are by C<\v> and C<\h> respectively.
357		If the source string is in UTF-8 format, both the next line and the
358		no-break space are matched by C<\s>.
359	653
360	654	=end original
361	655
	656	C<\s> にマッチングしない任意の文字は C<\S> にマッチングします。
	657
	658	=begin original
	659
	660	C<\h> matches any character considered horizontal whitespace;
	661	this includes the platform's space and tab characters and several others
	662	listed in the table below. C<\H> matches any character
	663	not considered horizontal whitespace. They use the platform's native
	664	character set, and do not consider any locale that may otherwise be in
	665	use.
	666
	667	=end original
	668
	669	C<\h> は水平空白と考えられる任意の文字にマッチングします; これは
	670	プラットフォームのスペースとタブ文字および以下の表に挙げられている
	671	いくつかのその他の文字です。
	672	C<\H> は水平空白と考えられない文字にマッチングします。
	673	これらはプラットフォームのネイティブな文字集合を使い、
	674	他の場所では有効なロケールを考慮しません。
	675
	676	=begin original
	677
	678	C<\v> matches any character considered vertical whitespace;
	679	this includes the platform's carriage return and line feed characters (newline)
	680	plus several other characters, all listed in the table below.
	681	C<\V> matches any character not considered vertical whitespace.
	682	They use the platform's native character set, and do not consider any
	683	locale that may otherwise be in use.
	684
	685	=end original
	686
	687	C<\v> は垂直空白と考えられる任意の文字にマッチングします; これは
	688	プラットフォームの復帰と行送り(改行)文字に加えていくつかのその他の文字です;
	689	全ては以下の表に挙げられています。
	690	C<\V> は垂直空白と考えられない任意の文字にマッチングします。
	691	これらはプラットフォームのネイティブな文字集合を使い、
	692	他の場所では有効なロケールを考慮しません。
	693
	694	=begin original
	695
	696	C<\R> matches anything that can be considered a newline under Unicode
	697	rules. It can match a multi-character sequence. It cannot be used inside
	698	a bracketed character class; use C<\v> instead (vertical whitespace).
	699	It uses the platform's
	700	native character set, and does not consider any locale that may
	701	otherwise be in use.
	702	Details are discussed in L<perlrebackslash>.
	703
	704	=end original
	705
	706	C<\R> は Unicode の規則で改行と考えられるものにマッチングします。
	707	複数文字の並びにマッチングすることもあります。
	708	従って、大かっこ文字クラスの中では使えません; 代わりに C<\v> (垂直空白) を
	709	使ってください。
	710	これはプラットフォームのネイティブな文字集合を使い、
	711	他の場所では有効なロケールを考慮しません。
	712	詳細は L<perlrebackslash> で議論しています。
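
Because C<\R> can consume two characters at once, it behaves differently from C<\v> on a CR LF pair. The following is a minimal sketch with illustrative variable names and sample data; C<\z> is used instead of C<$> so that a trailing newline cannot satisfy the end anchor early.

```perl
use strict;
use warnings;

my $crlf = "\r\n";

my $r_whole  = ($crlf =~ /\A\R\z/)    ? 1 : 0;    # \R takes CR LF as one unit
my $v_single = ($crlf =~ /\A\v\z/)    ? 1 : 0;    # \v matches exactly one character
my $v_pair   = ($crlf =~ /\A\v{2}\z/) ? 1 : 0;    # two vertical whitespace characters

# Splitting on \R handles CR LF, LF, and LINE SEPARATOR (U+2028) alike.
my @lines = split /\R/, "one\r\ntwo\nthree\x{2028}four";

print "\\R matches CRLF as a whole: $r_whole\n";
print "\\v matches CRLF as a whole: $v_single (as two characters: $v_pair)\n";
print scalar(@lines), " lines: @lines\n";
```

Inside a bracketed class, where C<\R> is not allowed, C<[\v]> is the closest single-character substitute.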
	713
	714	=begin original
	715
	716	Note that unlike C<\s> (and C<\d> and C<\w>), C<\h> and C<\v> always match
	717	the same characters, without regard to other factors, such as the active
	718	locale or whether the source string is in UTF-8 format.
	719
	720	=end original
	721
	722	C<\s> (および C<\d> と C<\w>) と違って、C<\h> および C<\v> は、現在の
	723	ロケールやソース文字列が UTF-8 形式かどうかといった他の要素に関わらず
	724	同じ文字にマッチングします。
	725
	726	=begin original
	727
	728	One might think that C<\s> is equivalent to C<[\h\v]>. This is indeed true
	729	starting in Perl v5.18, but prior to that, the sole difference was that the
	730	vertical tab (C<"\cK">) was not matched by C<\s>.
	731
	732	=end original
	733
362	734	C<\s> が C<[\h\v]> と等価と考える人がいるかもしれません。
363		これは正しくありません。
	735	Perl 5.18 からは確かに正しいです; しかしそれより前では、
364		垂直タブ (C<"\x0b">) は C<\s> にマッチングしませんが、垂直空白と
	736	唯一の違いは、垂直タブ (C<"\cK">) が C<\s> にマッチングしなかったということです。
365		考えられます。
366		さらに、ソース文字列が UTF-8 形式でなければ、next line (C<"\x85">) と
367		no-break space (C<"\xA0">) は C<\s> にマッチングしませんが、
368		それぞれ C<\v> および C<\h> にはマッチングします。
369		ソース文字列が UTF-8 形式なら、next line と no-break space は C<\s> に
370		マッチングします。
371	737
372	738	=begin original
373	739
374	740	The following table is a complete listing of characters matched by
375		C<\s>, C<\h> and C<\v>.
	741	C<\s>, C<\h> and C<\v> as of Unicode 14.0.
376	742
377	743	=end original
378	744
379		以下の表は C<\s>, C<\h>, C<\v> にマッチングする文字の~~完全な一覧です。~~
	745	以下の表は Unicode 14.0 現在で C<\s>, C<\h>, C<\v> にマッチングする文字の
	746	完全な一覧です。
380	747
381	748	=begin original
382	749
383		The first column gives the code point of the character (in hex format),
	750	The first column gives the Unicode code point of the character (in hex format),
384	751	the second column gives the (Unicode) name. The third column indicates
385		by which class(es) the character is matched.
	752	by which class(es) the character is matched (assuming no locale is in
	753	effect that changes the C<\s> matching).
386	754
387	755	=end original
388	756
389		最初の列は文字の符号位置(16 進形式)、2 番目の列は (Unicode の)~~名前です。~~
	757	最初の列は文字の Unicode 符号位置(16 進形式)、2 番目の列は (Unicode の)
390		~~3 番目の列はどのクラスにマッチングするかを示していま~~す。
	758	名前です。
	759	3 番目の列はどのクラスにマッチングするかを示しています
	760	(C<\s> のマッチングを変更するようなロケールが
	761	有効でないことを仮定しています)。
391	762
392		0x00009 CHARACTER TABULATION h s
	763	0x0009 CHARACTER TABULATION h s
393		0x0000a LINE FEED (LF) vs
	764	0x000a LINE FEED (LF) vs
394		0x0000b LINE TABULATION v
	765	0x000b LINE TABULATION vs [1]
395		0x0000c FORM FEED (FF) vs
	766	0x000c FORM FEED (FF) vs
396		0x0000d CARRIAGE RETURN (CR) vs
	767	0x000d CARRIAGE RETURN (CR) vs
397		0x00020 SPACE h s
	768	0x0020 SPACE h s
398		0x00085 NEXT LINE (NEL) vs [1]
	769	0x0085 NEXT LINE (NEL) vs [2]
399		0x000a0 NO-BREAK SPACE h s [1]
	770	0x00a0 NO-BREAK SPACE h s [2]
400		0x01680 OGHAM SPACE MARK h s
	771	0x1680 OGHAM SPACE MARK h s
401		0x0180e ~~MONGOLIAN~~ ~~VOWEL~~ SEPA~~RATOR~~ h s
	772	0x2000 EN QUAD h s
402		0x02000 EN QUAD h s
	773	0x2001 EM QUAD h s
403		0x02001 EM QUAD h s
	774	0x2002 EN SPACE h s
404		0x02002 EN SPACE h s
	775	0x2003 EM SPACE h s
405		0x02003 EM SPACE h s
	776	0x2004 THREE-PER-EM SPACE h s
406		0x02004 THREE-PER-EM SPACE h s
	777	0x2005 FOUR-PER-EM SPACE h s
407		0x02005 ~~FOUR~~-PER-EM SPACE h s
	778	0x2006 SIX-PER-EM SPACE h s
408		0x02006 SI~~X-PE~~R-EM SPACE h s
	779	0x2007 FIGURE SPACE h s
409		0x02007 ~~FIG~~URE SPACE h s
	780	0x2008 PUNCTUATION SPACE h s
410		0x02008 ~~PUNC~~T~~UAT~~ION SPACE h s
	781	0x2009 THIN SPACE h s
411		0x02009 THIN SPACE h s
	782	0x200a HAIR SPACE h s
412		0x0200a HAIR SPACE h s
	783	0x2028 LINE SEPARATOR vs
413		0x02028 ~~LINE~~ SEPARATOR vs
	784	0x2029 PARAGRAPH SEPARATOR vs
414		0x02029 PARAGRAPH SEPA~~RATOR~~ vs
	785	0x202f NARROW NO-BREAK SPACE h s
415		0x0202f NA~~RROW NO-BR~~EAK SPACE h s
	786	0x205f MEDIUM MATHEMATICAL SPACE h s
416		0x0205f ~~MEDIUM~~ ~~MATH~~EMATICAL SPACE h s
	787	0x3000 IDEOGRAPHIC SPACE h s
417		0x03000 IDEOGRAPHIC SPACE h s
418	788
419	789	=over 4
420	790
421	791	=item [1]
422	792
423	793	=begin original
424	794
425		~~NEXT LINE and NO-BREAK S~~P~~ACE~~ only ~~match~~ C<\s> if the ~~sour~~ce string ~~is in~~
	795	Prior to Perl v5.18, C<\s> did not match the vertical tab.
426		~~UTF-8~~ format.
	796	C<[^\S\cK]> (obscurely) matches what C<\s> traditionally did.
427	797
428	798	=end original
429	799
430		~~NEXT~~ ~~LINE~~ ~~と NO-BREAK SPA~~CE は~~ソース文字列が UTF-8 形式の時~~にのみ
	800	Perl v5.18 より前では、C<\s> は垂直タブにマッチングしませんでした。
431		C<\s> に~~マッチングします。~~
	801	C<[^\S\cK]> は(分かりにくいですが)C<\s> が伝統的に
	802	マッチングしていたものにマッチングします。
432	803
433		=~~back~~
	804	=item [2]
434	805
435	806	=begin original
436	807
437		It is worth not~~ing~~ that C<\d>, ~~C<\w>,~~ e~~tc, match s~~ing~~le characters, not~~
	808	NEXT LINE and NO-BREAK SPACE may or may not match C<\s> depending
438		co~~mple~~te numbers ~~or words. To match a~~ n~~umber~~ ~~(that~~ c~~onsis~~ts of ~~int~~ege~~rs),~~
	809	on the rules in effect. See
439		use ~~C<\d+>;~~ to match ~~a w~~o~~rd, u~~se ~~C<\w+~~>.
	810	L<the beginning of this section|/Whitespace>.
440	811
441	812	=end original
442	813
443		C~~<\d>,~~ C<\w> ~~などは単語や数値全体ではなく単一の文字~~に~~マッチングすると~~
	814	NEXT LINE と NO-BREAK SPACE はどの規則が有効かによって C<\s> に
444		~~いうことは注意する価値があ~~ります。
	815	マッチングしたりマッチングしなかったりします。
445		~~(整数で構成される)数値にマッチングするには、C~~<~~\d+~~> を使ってください~~; 単語に~~
	816	L<the beginning of this section|/Whitespace> を参照してください。
446		マッチングするには、C<\w+> を使ってください。
447	817
	818	=back
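
To check a few rows of the table on a given Perl, something like the following sketch could be used. It assumes no locale is in effect; the selection of code points and the variable names are choices made for this example only. The C<s> column for LINE TABULATION depends on the Perl version, as described in note [1].

```perl
use strict;
use warnings;

# A few code points picked from the table: TAB, LF, VT, SPACE.
my %name = (
    0x09 => "CHARACTER TABULATION",
    0x0a => "LINE FEED (LF)",
    0x0b => "LINE TABULATION",
    0x20 => "SPACE",
);

for my $cp (sort { $a <=> $b } keys %name) {
    my $ch = chr $cp;
    # Build the third column of the table from the actual matches.
    my $classes = join "",
        ($ch =~ /\h/ ? "h" : " "),
        ($ch =~ /\v/ ? "v" : " "),
        ($ch =~ /\s/ ? "s" : " ");
    printf "0x%04x %-22s %s\n", $cp, $name{$cp}, $classes;
}

# The negated class from note [1]: whitespace other than the vertical tab.
my $old_s = qr/[^\S\cK]/;
print "VT matches [^\\S\\cK]: ", ("\cK" =~ $old_s ? "yes" : "no"), "\n";
```

The double negative in C<[^\S\cK]> reads as "not (non-whitespace or vertical tab)", which is why it excludes the vertical tab while keeping every other C<\s> character.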
	819
448	820	=head3 Unicode Properties
449	821
450	822	(Unicode 特性)
451	823
452	824	=begin original
453	825
454		C<\pP> and C<\p{Prop}> are character classes to match characters that
	826	C<\pP> and C<\p{Prop}> are character classes to match characters that fit given
455		~~fit given~~ Unicode ~~class~~es. One letter classes can be used in the C<\pP>
	827	Unicode properties. One letter property names can be used in the C<\pP> form,
456		~~form,~~ with the ~~class~~ name following the C<\p>, otherwise, the property
	828	with the property name following the C<\p>; otherwise, braces are required.
457		~~nam~~e ~~is e~~n~~closed~~ in braces, and fo~~llo~~ws the ~~C<\~~p~~>. F~~or instance~~, a~~
	829	When using braces, there is a single form, which is just the property name
458		~~mat~~c~~h f~~o~~r a numb~~er can be wr~~itt~~en as ~~C</\~~p~~N/>~~ or as C</\p{Number}/>.
	830	enclosed in the braces, and a compound form which looks like C<\p{name=value}>,
459		Lowercase let~~ters~~ ~~are~~ matched by the property ~~I<Low~~erca~~seLe~~tter> which
	831	which means to match if the property "name" for the character has that particular
460		~~has as short form I<Ll>. They ha~~v~~e to be written~~ a~~s C</\p{L~~l~~}/> or~~
	832	"value".
461		~~C</\p{L~~owercas~~eLe~~tte~~r}/>.~~ ~~C</\pLl/>~~ is va~~lid,~~ but means ~~som~~e~~thing~~ different.
	833	For instance, a match for a number can be written as C</\pN/> or as
	834	C</\p{Number}/>, or as C</\p{Number=True}/>.
	835	Lowercase letters are matched by the property I<Lowercase_Letter> which
	836	has the short form I<Ll>. They need the braces, so are written as C</\p{Ll}/> or
	837	C</\p{Lowercase_Letter}/>, or C</\p{General_Category=Lowercase_Letter}/>
	838	(the underscores are optional).
	839	C</\pLl/> is valid, but means something different.
462	840	It matches a two character string: a letter (Unicode property C<\pL>),
463	841	followed by a lowercase C<l>.
464	842
465	843	=end original
466	844
467		C<\pP> と C<\p{Prop}> は指定された Unicode ~~クラス~~に一致する文字に
	845	C<\pP> と C<\p{Prop}> は指定された Unicode 特性に一致する文字に
468	846	マッチングする文字クラスです。
469		一文字~~クラス~~は C<\pP> 形式で、C<\p> に引き続いて~~クラス~~名です; さもなければ
	847	一文字特性は C<\pP> 形式で、C<\p> に引き続いて特性名です; さもなければ
470		~~特性名は~~中かっこで~~囲まれて、C<\p> に引き続きま~~す。
	848	中かっこが必要です。
471		~~例えば~~、数字に~~マッチングするものは C</\pN/> または C</\p{Number}/>~~ と
	849	中かっこを使うとき、単に特性名を中かっこで囲んだ単一形式と、
472		~~書けます。~~
	850	C<\p{name=value}> のような形で、文字の特性 "name" が特定の "value" を
	851	持つものにマッチングすることになる複合形式があります。
	852	例えば、数字にマッチングするものは C</\pN/> または C</\p{Number}/> または
	853	C</\p{Number=True}/> と書けます。
473	854	小文字は I<Lowercase_Letter> 特性にマッチングします; これには
474	855	I<Ll> と言う短縮形式があります。
475		C</\p{Ll}/> または C</\p{LowercaseLetter}/> ~~と書く必要があり~~ます。
	856	中かっこが必要なので、C</\p{Ll}/> または C</\p{Lowercase_Letter}/> または
	857	C</\p{General_Category=Lowercase_Letter}/> と書きます(下線はオプションです)。
476	858	C</\pLl/> も妥当ですが、違う意味になります。
477	859	これは 2 文字にマッチングします: 英字 (Unicode 特性 C<\pL>)に引き続いて
478	860	小文字の C<l> です。
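
The forms described above can be compared side by side. This is a minimal sketch; the test strings and the C<@tests> and C<@results> names are illustrative assumptions, not part of the original text.

```perl
use strict;
use warnings;

# Each entry pairs a test string with one spelling of a property match.
my @tests = (
    [ "7",  qr/\A\pN\z/ ],                                   # one-letter form
    [ "7",  qr/\A\p{Number}\z/ ],                            # braced form
    [ "a",  qr/\A\p{Ll}\z/ ],                                # short name
    [ "a",  qr/\A\p{Lowercase_Letter}\z/ ],                  # long name
    [ "a",  qr/\A\p{General_Category=Lowercase_Letter}\z/ ], # compound form
    [ "a",  qr/\A\pLl\z/ ],   # \pL followed by a literal "l": needs 2 chars
    [ "al", qr/\A\pLl\z/ ],   # a letter, then "l"
);

my @results = map { $_->[0] =~ $_->[1] ? 1 : 0 } @tests;

for my $i (0 .. $#tests) {
    printf "%-4s %s\n", qq("$tests[$i][0]"),
        $results[$i] ? "match" : "no match";
}
```

Only the sixth entry fails to match, because C</\pLl/> requires two characters.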
479	861
480	862	=begin original
481	863
482		~~For~~ a ~~lis~~t of ~~poss~~ible properties, see
	864	What a Unicode property matches is never subject to locale rules, and
483		~~L<perlun~~ico~~de/Uni~~code ~~Cha~~r~~act~~er Properties>. It is also po~~ssible~~ to
	865	if locale rules are not otherwise in effect, the use of a Unicode
484		defi~~ned~~ your ~~own~~ properties~~. Thi~~s is discussed in
	866	property will force the regular expression into using Unicode rules, if
	867	it isn't already.
	868
	869	=end original
	870
	871	Unicode 特性が何にマッチングするかは決してロケールの規則に影響されず、
	872	ロケール規則が有効でない場合、Unicode 特性を使うと
	873	正規表現に (まだそうでなければ) Unicode 規則を使うように強制します。
	874
	875	=begin original
	876
	877	Note that almost all properties are immune to case-insensitive matching.
	878	That is, adding a C</i> regular expression modifier does not change what
	879	they match. But there are two sets that are affected. The first set is
	880	C<Uppercase_Letter>,
	881	C<Lowercase_Letter>,
	882	and C<Titlecase_Letter>,
	883	all of which match C<Cased_Letter> under C</i> matching.
	884	The second set is
	885	C<Uppercase>,
	886	C<Lowercase>,
	887	and C<Titlecase>,
	888	all of which match C<Cased> under C</i> matching.
	889	(The difference between these sets is that some things, such as Roman
	890	numerals, come in both upper and lower case, so they are C<Cased>, but
	891	aren't considered to be letters, so they aren't C<Cased_Letter>s. They're
	892	actually C<Letter_Number>s.)
	893	This set also includes its subsets C<PosixUpper> and C<PosixLower>, both
	894	of which under C</i> match C<PosixAlpha>.
	895
	896	=end original
	897
	898	ほとんど全ての特性は大文字小文字を無視したマッチングから免除されることに
	899	注意してください。
	900	つまり、C</i> 正規表現修飾子はこれらがマッチングするものに影響を
	901	与えないということです。
	902	しかし、影響を与える二つの集合があります。
	903	一つ目の集合は
	904	C<Uppercase_Letter>,
	905	C<Lowercase_Letter>,
	906	C<Titlecase_Letter> で、全て C</i> マッチングの下で
	907	C<Cased_Letter> にマッチングします。
	908	二つ目の集合は
	909	C<Uppercase>,
	910	C<Lowercase>,
	911	C<Titlecase> で、全てC</i> マッチングの下で
	912	C<Cased> にマッチングします。
	913	(これらの集合の違いは、ローマ数字のような一部のものは、
	914	大文字と小文字があるので C<Cased> ですが、
	915	文字とは扱われないので C<Cased_Letter> ではありません。
	916	これらは実際には C<Letter_Number> です。)
	917	この集合はその部分集合である C<PosixUpper> と C<PosixLower> を含みます;
	918	これら両方は C</i> マッチングの下では C<PosixAlpha> にマッチングします。
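
The effect of C</i> on these two sets can be observed with a short sketch such as the one below. The choice of U+2170 (SMALL ROMAN NUMERAL ONE) as the cased non-letter, and the variable names, are assumptions made for illustration.

```perl
use strict;
use warnings;

my $small_roman_one = "\x{2170}";   # SMALL ROMAN NUMERAL ONE, a Letter_Number

my @checks = (
    # label                        result
    [ '"a" =~ /\p{Lu}/',           ("a" =~ /\p{Lu}/  ? 1 : 0) ],
    [ '"a" =~ /\p{Lu}/i',          ("a" =~ /\p{Lu}/i ? 1 : 0) ],
    [ 'U+2170 =~ /\p{Lu}/i',       ($small_roman_one =~ /\p{Lu}/i ? 1 : 0) ],
    [ 'U+2170 =~ /\p{Uppercase}/', ($small_roman_one =~ /\p{Uppercase}/  ? 1 : 0) ],
    [ 'U+2170 =~ /\p{Uppercase}/i',($small_roman_one =~ /\p{Uppercase}/i ? 1 : 0) ],
    [ '"7" =~ /\p{Lu}/i',          ("7" =~ /\p{Lu}/i ? 1 : 0) ],
);

printf "%-28s %s\n", $_->[0], ($_->[1] ? "match" : "no match") for @checks;
```

Under C</i>, C<\p{Lu}> behaves like C<\p{Cased_Letter}> and so accepts C<"a"> but not the Roman numeral, while C<\p{Uppercase}> behaves like C<\p{Cased}> and accepts the numeral as well.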
	919
	920	=begin original
	921
	922	For more details on Unicode properties, see L<perlunicode/Unicode
	923	Character Properties>; for a
	924	complete list of possible properties, see
	925	L<perluniprops/Properties accessible through \p{} and \P{}>,
	926	which notes all forms that have C</i> differences.
	927	It is also possible to define your own properties. This is discussed in
485	928	L<perlunicode/User-Defined Character Properties>.
486	929
487	930	=end original
488	931
489		~~特性のリストについては、L<perlunicode/~~Unicode ~~Character Properties> を~~
	932	Unicode 特性に関するさらなる詳細については、
490		参照してください。
	933	L<perlunicode/Unicode Character Properties> を参照してください; 特性の完全な
	934	一覧については、C</i> に違いのある全ての形式について記されている
	935	L<perluniprops/Properties accessible through \p{} and \P{}> を参照して
	936	ください。
491	937	独自の特性を定義することも可能です。
492		これは L<perlunicode/User-Defined Character Properties> で~~議論されています。~~
	938	これは L<perlunicode/User-Defined Character Properties> で
	939	議論されています。
493	940
	941	=begin original
	942
	943	Unicode properties are defined (surprise!) only on Unicode code points.
	944	Starting in v5.20, when matching against C<\p> and C<\P>, Perl treats
	945	non-Unicode code points (those above the legal Unicode maximum of
	946	0x10FFFF) as if they were typical unassigned Unicode code points.
	947
	948	=end original
	949
	950	Unicode 特性は (驚くべきことに!) Unicode 符号位置に対してのみ
	951	定義されています。
	952	v5.20 から、C<\p> と C<\P> に対してマッチングするとき、
	953	Perl は
	954	非 Unicode 符号位置 (正当な Unicode の上限の 0x10FFFF を超えるもの) を、
	955	典型的な未割り当て Unicode 符号位置であるかのように扱います。
	956
	957	=begin original
	958
	959	Prior to v5.20, Perl raised a warning and made all matches fail on
	960	non-Unicode code points. This could be somewhat surprising:
	961
	962	=end original
	963
	964	v5.20 より前では、非 Unicode 符号位置に対しては全てのマッチングは失敗して、
	965	Perl は警告を出していました。
	966	これは驚かされるものだったかもしれません。
	967
	968	chr(0x110000) =~ /\p{ASCII_Hex_Digit=True}/ # Fails on Perls < v5.20.
	969	chr(0x110000) =~ /\p{ASCII_Hex_Digit=False}/ # Also fails on Perls
	970	# < v5.20
	971
	972	=begin original
	973
	974	Even though these two matches might be thought of as complements, until
	975	v5.20 they were so only on Unicode code points.
	976
	977	=end original
	978
	979	これら二つのマッチングは補集合と考えるかもしれませんが、
	980	v5.20 までは、Unicode 符号位置に対してのみそうなっていました。
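
A runnable form of the two matches might look like the sketch below. It assumes Perl v5.20 or later. The C<non_unicode> warning category is switched off here only as a precaution to keep the output quiet, and the variable names are illustrative.

```perl
use strict;
use warnings;
no warnings 'non_unicode';   # code points above 0x10FFFF are intentional here

my $above_unicode = chr(0x110000);

# On v5.20 and later the code point is treated like an unassigned one,
# so exactly one of the two complementary properties matches.
my $is_true  = $above_unicode =~ /\p{ASCII_Hex_Digit=True}/  ? 1 : 0;
my $is_false = $above_unicode =~ /\p{ASCII_Hex_Digit=False}/ ? 1 : 0;

print "ASCII_Hex_Digit=True:  ", ($is_true  ? "match" : "no match"), "\n";
print "ASCII_Hex_Digit=False: ", ($is_false ? "match" : "no match"), "\n";
```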
	981
	982	=begin original
	983
	984	Starting in perl v5.30, wildcards are allowed in Unicode property
	985	values. See L<perlunicode/Wildcards in Property Values>.
	986
	987	=end original
	988
	989	perl v5.30 から、Unicode 特性の値にワイルドカードを使えます。
	990	L<perlunicode/Wildcards in Property Values> を参照してください。
	991
494	992	=head4 Examples
495	993
496	994	(例)
497	995
498	996	=begin original
499	997
500	998	"a" =~ /\w/ # Match, "a" is a 'word' character.
501	999	"7" =~ /\w/ # Match, "7" is a 'word' character as well.
502	1000	"a" =~ /\d/ # No match, "a" isn't a digit.
503	1001	"7" =~ /\d/ # Match, "7" is a digit.
504		" " =~ /\s/ # Match, a space is white space.
	1002	" " =~ /\s/ # Match, a space is whitespace.
505	1003	"a" =~ /\D/ # Match, "a" is a non-digit.
506	1004	"7" =~ /\D/ # No match, "7" is not a non-digit.
507		" " =~ /\S/ # No match, a space is not non-white space.
	1005	" " =~ /\S/ # No match, a space is not non-whitespace.
508	1006
509	1007	=end original
510	1008
511	1009	"a" =~ /\w/ # マッチング; "a" は「単語」文字。
512	1010	"7" =~ /\w/ # マッチング; "7" も「単語」文字。
513	1011	"a" =~ /\d/ # マッチングしない; "a" は数字ではない。
514	1012	"7" =~ /\d/ # マッチング; "7" は数字。
515	1013	" " =~ /\s/ # マッチング; スペースは空白。
516	1014	"a" =~ /\D/ # マッチング; "a" は非数字。
517	1015	"7" =~ /\D/ # マッチングしない; "7" は非数字ではない。
518	1016	" " =~ /\S/ # マッチングしない; スペースは非空白ではない。
519	1017
520	1018	=begin original
521	1019
522		" " =~ /\h/ # Match, space is horizontal white space.
	1020	" " =~ /\h/ # Match, space is horizontal whitespace.
523		" " =~ /\v/ # No match, space is not vertical white space.
	1021	" " =~ /\v/ # No match, space is not vertical whitespace.
524		"\r" =~ /\v/ # Match, a return is vertical white space.
	1022	"\r" =~ /\v/ # Match, a return is vertical whitespace.
525	1023
526	1024	=end original
527	1025
528	1026	" " =~ /\h/ # マッチング; スペースは水平空白。
529	1027	" " =~ /\v/ # マッチングしない; スペースは垂直空白ではない。
530	1028	"\r" =~ /\v/ # マッチング; 復帰は垂直空白。
531	1029
532	1030	=begin original
533	1031
534	1032	"a" =~ /\pL/ # Match, "a" is a letter.
535	1033	"a" =~ /\p{Lu}/ # No match, /\p{Lu}/ matches upper case letters.
536	1034
537	1035	=end original
538	1036
539	1037	"a" =~ /\pL/ # マッチング; "a" は英字。
540	1038	"a" =~ /\p{Lu}/ # マッチングしない; /\p{Lu}/ は大文字にマッチングする。
541	1039
542	1040	=begin original
543	1041
544	1042	"\x{0e0b}" =~ /\p{Thai}/ # Match, \x{0e0b} is the character
545	1043	# 'THAI CHARACTER SO SO', and that's in
546	1044	# Thai Unicode class.
547		"a" =~ /\P{Lao}/ # Match, as "a" is not a Laoian character.
	1045	"a" =~ /\P{Lao}/ # Match, as "a" is not a Laotian character.
548	1046
549	1047	=end original
550	1048
551	1049	"\x{0e0b}" =~ /\p{Thai}/ # マッチング; \x{0e0b} は文字
552	1050	# 'THAI CHARACTER SO SO' で、これは
553	1051	# Thai Unicode クラスにある。
554	1052	"a" =~ /\P{Lao}/ # マッチング; "a" はラオス文字ではない。
555	1053
	1054	=begin original
	1055
	1056	It is worth emphasizing that C<\d>, C<\w>, etc, match single characters, not
	1057	complete numbers or words. To match a number (that consists of digits),
	1058	use C<\d+>; to match a word, use C<\w+>. But be aware of the security
	1059	considerations in doing so, as mentioned above.
	1060
	1061	=end original
	1062
	1063	C<\d>, C<\w> などは数値や単語全体ではなく、1 文字にマッチングすることは
	1064	強調する価値があります。
	1065	(数字で構成される) 数値にマッチングするには C<\d+> を使います;
	1066	単語にマッチングするには C<\w+> を使います。
	1067	しかし前述したように、そうする場合のセキュリティ問題について
	1068	注意してください。
	1069
556	1070	=head2 Bracketed Character Classes
557	1071
558	1072	(かっこ付き文字クラス)
559	1073
560	1074	=begin original
561	1075
562	1076	The third form of character class you can use in Perl regular expressions
563		is the bracketed form. In its simplest form, it lists the characters
	1077	is the bracketed character class. In its simplest form, it lists the characters
564		that may be matched inside square brackets, like this: C<[aeiou]>.
	1078	that may be matched, surrounded by square brackets, like this: C<[aeiou]>.
565		This matches one of C<a>, C<e>, C<i>, C<o> or C<u>. ~~Just~~ as the other
	1079	This matches one of C<a>, C<e>, C<i>, C<o> or C<u>. Like the other
566		character classes, exactly one character will be matched. To match
	1080	character classes, exactly one character is matched.* To match
567		a longer string consisting of characters mentioned in the characters
	1081	a longer string consisting of characters mentioned in the character
568		class, follow the character class with a quantifier. For ~~instance,~~
	1082	class, follow the character class with a L<quantifier|perlre/Quantifiers>. For
569		C<[aeiou]+> matches ~~a string~~ o~~f o~~ne or more lowercase ~~ASCII~~ vowels.
	1083	instance, C<[aeiou]+> matches one or more lowercase English vowels.
570	1084
571	1085	=end original
572	1086
573		Perl 正規表現で使える文字クラスの第 3 の形式は大かっこ形式です。
	1087	Perl 正規表現で使える文字クラスの第 3 の形式は大かっこ文字クラスです。
574	1088	もっとも単純な形式では、以下のように大かっこの中にマッチングする文字を
575	1089	リストします: C<[aeiou]>。
576	1090	これは C<a>, C<e>, C<i>, C<o>, C<u> のどれかにマッチングします。
577	1091	他の文字クラスと同様、正確に一つの文字にマッチングします。
578	1092	文字クラスで言及した文字で構成されるより長い文字列にマッチングするには、
579		文字クラスに量指定子を付けます。
	1093	文字クラスに L<量指定子|perlre/Quantifiers> を付けます。
580		例えば、C<[aeiou]+> は一つまたはそれ以上の小文字 ~~ASCII~~ 母音に
	1094	例えば、C<[aeiou]+> は一つまたはそれ以上の小文字英語母音に
581	1095	マッチングします。
582	1096
583	1097	=begin original
584	1098
585	1099	Repeating a character in a character class has no
586	1100	effect; it's considered to be in the set only once.
587	1101
588	1102	=end original
589	1103
590	1104	文字クラスの中で文字を繰り返しても効果はありません; 一度だけ現れたものと
591	1105	考えられます。
592	1106
593	1107	=begin original
594	1108
595	1109	Examples:
596	1110
597	1111	=end original
598	1112
599	1113	例:
600	1114
601	1115	=begin original
602	1116
603	1117	"e" =~ /[aeiou]/ # Match, as "e" is listed in the class.
604	1118	"p" =~ /[aeiou]/ # No match, "p" is not listed in the class.
605	1119	"ae" =~ /^[aeiou]$/ # No match, a character class only matches
606	1120	# a single character.
607	1121	"ae" =~ /^[aeiou]+$/ # Match, due to the quantifier.
608	1122
609	1123	=end original
610	1124
611	1125	"e" =~ /[aeiou]/ # マッチング; "e" はクラスにある。
612	1126	"p" =~ /[aeiou]/ # マッチングしない; "p" はクラスにない。
613	1127	"ae" =~ /^[aeiou]$/ # マッチングしない; 一つの文字クラスは
614	1128	# 一文字だけにマッチングする。
615	1129	"ae" =~ /^[aeiou]+$/ # マッチング; 量指定子により。
616	1130
	1131	-------
	1132
	1133	=begin original
	1134
	1135	* There are two exceptions to a bracketed character class matching a
	1136	single character only. Each requires special handling by Perl to make
	1137	things work:
	1138
	1139	=end original
	1140
	1141	* 大かっこ文字クラスは単一の文字にのみマッチングするということには
	1142	二つの例外があります。
	1143	それぞれは Perl がうまく動くために特別な扱いが必要です:
	1144
	1145	=over
	1146
	1147	=item *
	1148
	1149	=begin original
	1150
	1151	When the class is to match caselessly under C</i> matching rules, and a
	1152	character that is explicitly mentioned inside the class matches a
	1153	multiple-character sequence caselessly under Unicode rules, the class
	1154	will also match that sequence. For example, Unicode says that the
	1155	letter C<LATIN SMALL LETTER SHARP S> should match the sequence C<ss>
	1156	under C</i> rules. Thus,
	1157
	1158	=end original
	1159
	1160	クラスが C</i> マッチング規則の下で大文字小文字を無視したマッチングを
	1161	して、クラスの中で明示的に記述された文字が Unicode の規則の下で複数文字並びに
	1162	大文字小文字を無視してマッチングするとき、
	1163	そのクラスはその並びにもマッチングします。
	1164	例えば、Unicode は文字 C<LATIN SMALL LETTER SHARP S> は C</i> 規則の下では
	1165	並び C<ss> にマッチングするとしています。
	1166	従って:
	1167
	1168	'ss' =~ /\A\N{LATIN SMALL LETTER SHARP S}\z/i # Matches
	1169	'ss' =~ /\A[aeioust\N{LATIN SMALL LETTER SHARP S}]\z/i # Matches
	1170
	1171	=begin original
	1172
	1173	For this to happen, the class must not be inverted (see L</Negation>)
	1174	and the character must be explicitly specified, and not be part of a
	1175	multi-character range (not even as one of its endpoints). (L</Character
	1176	Ranges> will be explained shortly.) Therefore,
	1177
	1178	=end original
	1179
	1180	これが起きるためには、
	1181	そのクラスは否定 (L</Negation> 参照) ではなく、
	1182	その文字は明示的に指定され、複数文字範囲の一部
	1183	(たとえその端でも)でない必要があります。
	1184	(L</Character Ranges> はすぐ後で説明します。)
	1185	従って:
	1186
	1187	'ss' =~ /\A[\0-\x{ff}]\z/ui # Doesn't match
	1188	'ss' =~ /\A[\0-\N{LATIN SMALL LETTER SHARP S}]\z/ui # No match
	1189	'ss' =~ /\A[\xDF-\xDF]\z/ui # Matches on ASCII platforms, since
	1190	# \xDF is LATIN SMALL LETTER SHARP S,
	1191	# and the range is just a single
	1192	# element
	1193
	1194	=begin original
	1195
	1196	Note that it isn't a good idea to specify these types of ranges anyway.
	1197
	1198	=end original
	1199
	1200	どちらにしろこれらの種類の範囲を指定するのは良い考えではありません。
	1201
	1202	=item *
	1203
	1204	=begin original
	1205
	1206	Some names known to C<\N{...}> refer to a sequence of multiple characters,
	1207	instead of the usual single character. When one of these is included in
	1208	the class, the entire sequence is matched. For example,
	1209
	1210	=end original
	1211
	1213	C<\N{...}> で知られているいくつかの名前は、通常の単一の文字ではなく、
	1214	複数の文字の並びを参照します。
	1215	その一つがこのクラスに含まれている場合、並び全体がマッチングします。
	1216	例えば:
	1217
	1218	"\N{TAMIL LETTER KA}\N{TAMIL VOWEL SIGN AU}"
	1219	=~ / ^ [\N{TAMIL SYLLABLE KAU}] $ /x;
	1220
	1221	=begin original
	1222
	1223	matches, because C<\N{TAMIL SYLLABLE KAU}> is a named sequence
	1224	consisting of the two characters matched against. Like the other
	1225	instance where a bracketed class can match multiple characters, and for
	1226	similar reasons, the class must not be inverted, and the named sequence
	1227	may not appear in a range, even one where it is both endpoints. If
	1228	these happen, it is a fatal error if the character class is within the
	1229	scope of L<C<use re 'strict'>|re/'strict' mode>, or within an extended
	1230	L<C<(?[...])>|/Extended Bracketed Character Classes> class; otherwise
	1231	only the first code point is used (with a C<regexp>-type warning
	1232	raised).
	1233
	1234	=end original
	1235
	1236	これはマッチングします; なぜなら C<\N{TAMIL SYLLABLE KAU}> は
	1237	マッチングする二つの文字からなる名前付き並びだからです。
	1238	大かっこクラスが複数の文字にマッチングするその他の例と同じように、
	1239	そして同様の理由で、クラスは否定できず、
	1240	名前付き並びは、たとえそれが範囲の両端であっても、範囲の中には使えません。
	1241	これらが起きたとき、文字クラスが
	1242	L<C<use re 'strict'>|re/'strict' mode> のスコープ内か、
	1243	拡張された L<C<(?[...])>|/Extended Bracketed Character Classes> クラスの
	1244	中の場合には致命的エラーになります;
	1245	さもなければ、最初の符号位置のみが使われます
	1246	(そして C<regexp> 系の警告が発生します)。
	1247
	1248	=back
	1249
617	1250	=head3 Special Characters Inside a Bracketed Character Class
618	1251
619	1252	(かっこ付き文字クラスの中の特殊文字)
620	1253
621	1254	=begin original
622	1255
623	1256	Most characters that are meta characters in regular expressions (that
624		is, characters that carry a special meaning like C<*> or C<(>) lose
	1257	is, characters that carry a special meaning like C<.>, C<*>, or C<(>) lose
625	1258	their special meaning and can be used inside a character class without
626	1259	the need to escape them. For instance, C<[()]> matches either an opening
627	1260	parenthesis, or a closing parenthesis, and the parens inside the character
628		class don't group or capture.
	1261	class don't group or capture. Be aware that, unless the pattern is
	1262	evaluated in single-quotish context, variable interpolation will take
	1263	place before the bracketed class is parsed:
629	1264
630	1265	=end original
631	1266
632		正規表現内でメタ文字(つまり、C<*> や C<(> のように特別な意味を持つ~~文字)となる~~
	1267	正規表現内でメタ文字(つまり、C<.>, C<*>, C<(> のように特別な意味を持つ
633		ほとんどの文字は文字クラス内ではエスケープしなくても特別な意味を~~失うので、~~
	1268	文字)となるほとんどの文字は文字クラス内ではエスケープしなくても特別な意味を
634		エスケープする必要はありません。
	1269	失うので、エスケープする必要はありません。
635	1270	例えば、C<[()]> は開きかっこまたは閉じかっこにマッチングし、文字クラスの中の
636	1271	かっこはグループや捕捉にはなりません。
	1272	パターンがシングルクォート風コンテキストの中で評価されない限り、
	1273	変数展開は大かっこクラスがパースされる前に行われることに注意してください:
637	1274
	1275	$, = "\t| ";
	1276	$x =~ m'[$,]'; # single-quotish: matches '$' or ','
	1277	$x =~ q{[$,]}; # same
	1278	$x =~ m/[$,]/; # double-quotish: Because we made an
	1279	# assignment to $, above, this now
	1280	# matches "\t", "|", or " "
	1281
638	1282	=begin original
639	1283
640	1284	Characters that may carry a special meaning inside a character class are:
641	1285	C<\>, C<^>, C<->, C<[> and C<]>, and are discussed below. They can be
642	1286	escaped with a backslash, although this is sometimes not needed, in which
643	1287	case the backslash may be omitted.
644	1288
645	1289	=end original
646	1290
647	1291	文字クラスの中でも特別な意味を持つ文字は:
648	1292	C<\>, C<^>, C<->, C<[>, C<]> で、以下で議論します。
649	1293	これらは逆スラッシュでエスケープできますが、不要な場合もあり、そのような
650	1294	場合では逆スラッシュは省略できます。
651	1295
652	1296	=begin original
653	1297
654	1298	The sequence C<\b> is special inside a bracketed character class. While
655		outside the character class C<\b> is an assertion indicating a point
	1299	outside the character class, C<\b> is an assertion indicating a point
656	1300	that does not have either two word characters or two non-word characters
657	1301	on either side, inside a bracketed character class, C<\b> matches a
658	1302	backspace character.
659	1303
660	1304	=end original
661	1305
662	1306	シーケンス C<\b> は大かっこ文字クラスの内側では特別です。
663	1307	文字クラスの外側では C<\b> は、両側が共に単語文字でも共に非単語文字でもない
664	1308	位置を示す表明ですが、大かっこ文字クラスの内側では C<\b> は後退文字に
665	1309	マッチングします。
666	1310
667	1311	=begin original
668	1312
669		~~A C<[> is not special inside a c~~h~~aract~~er ~~cla~~ss, unles~~s it's the start~~
	1313	The sequences
670		of a ~~POSIX character class (see below). It normally does not need escaping.~~
	1314	C<\a>,
	1315	C<\c>,
	1316	C<\e>,
	1317	C<\f>,
	1318	C<\n>,
	1319	C<\N{I<NAME>}>,
	1320	C<\N{U+I<hex char>}>,
	1321	C<\r>,
	1322	C<\t>,
	1323	and
	1324	C<\x>
	1325	are also special and have the same meanings as they do outside a
	1326	bracketed character class.
671	1327
672	1328	=end original
673	1329
674		~~C<[> は、POSIX 文字クラス(後述)の開始でない限りは文字クラスの中では~~
	1330	並び
675		~~特別ではありません。~~
	1331	C<\a>,
	1332	C<\c>,
	1333	C<\e>,
	1334	C<\f>,
	1335	C<\n>,
	1336	C<\N{I<NAME>}>,
	1337	C<\N{U+I<hex char>}>,
	1338	C<\r>,
	1339	C<\t>,
	1340	C<\x>
	1341	も特別で、大かっこ文字クラスの外側と同じ意味を持ちます。
	1342
	1343	=begin original
	1344
	1345	Also, a backslash followed by two or three octal digits is considered an octal
	1346	number.
	1347
	1348	=end original
	1349
	1350	また、逆スラッシュに引き続いて 2 または 3 桁の 8 進数字があると 8 進数として
	1351	扱われます。
	1352
	1353	=begin original
	1354
	1355	A C<[> is not special inside a character class, unless it's the start of a
	1356	POSIX character class (see L</POSIX Character Classes> below). It normally does
	1357	not need escaping.
	1358
	1359	=end original
	1360
	1361	C<[> は、POSIX 文字クラス(後述の L</POSIX Character Classes> 参照)の
	1362	開始でない限りは文字クラスの中では特別ではありません。
676	1363	これは普通エスケープは不要です。
677	1364
678	1365	=begin original
679	1366
680		A C<]> is either the end of a POSIX character class (see ~~below), or it~~
	1367	A C<]> is normally either the end of a POSIX character class (see
681		signals the end of the bracketed ~~character class. Normally it needs~~
	1368	L</POSIX Character Classes> below), or it signals the end of the bracketed
682		esca~~ping~~ if you want to include a C<]> in the set of characters.
	1369	character class. If you want to include a C<]> in the set of characters, you
	1370	must generally escape it.
	1371
	1372	=end original
	1373
	1374	C<]> は普通は POSIX 文字クラス(後述の L</POSIX Character Classes> 参照)の
	1375	終わりか、大かっこ文字クラスの終了を示すかどちらかです。
	1376	文字集合に C<]> を含める必要がある場合、一般的には
	1377	エスケープしなければなりません。
	1378
	1379	=begin original
	1380
683	1381	However, if the C<]> is the I<first> (or the second if the first
684	1382	character is a caret) character of a bracketed character class, it
685	1383	does not denote the end of the class (as you cannot have an empty class)
686	1384	and is considered part of the set of characters that can be matched without
687	1385	escaping.
688	1386
689	1387	=end original
690	1388
691		A C<]> は POSIX 文字クラス(後述)の終わりか、大かっこ文字クラスの終了を
692		示すかどちらかです。
693		通常、文字集合に C<]> を含める場合はエスケープする必要があります。
694	1389	しかし、C<]> が大かっこ文字クラスの I<最初> (または最初の文字がキャレットなら
695	1390	2 番目) の文字の場合、(空クラスを作ることはできないので)これはクラスの
696	1391	終了を意味せず、エスケープなしでマッチングできる文字の集合の一部と
697	1392	考えられます。
698	1393
699	1394	=begin original
700	1395
701	1396	Examples:
702	1397
703	1398	=end original
704	1399
705	1400	例:
706	1401
707	1402	=begin original
708	1403
709	1404	"+" =~ /[+?*]/ # Match, "+" in a character class is not special.
710	1405	"\cH" =~ /[\b]/ # Match, \b inside a character class
711		# is equivalent with a backspace.
	1406	# is equivalent to a backspace.
712		"]" =~ /[][]/ # Match, as the character class contains.
	1407	"]" =~ /[][]/ # Match, as the character class contains
713	1408	# both [ and ].
714	1409	"[]" =~ /[[]]/ # Match, the pattern contains a character class
715		# containing just ], and the character class is
	1410	# containing just [, and the character class is
716	1411	# followed by a ].
717	1412
718	1413	=end original
719	1414
720	1415	"+" =~ /[+?*]/ # マッチング; 文字クラス内の "+" は特別ではない。
721	1416	"\cH" =~ /[\b]/ # マッチング; 文字クラスの内側の \b は後退と
722	1417	# 等価。
723	1418	"]" =~ /[][]/ # マッチング; 文字クラスに [ と ] の両方を
724	1419	# 含んでいる。
725		"[]" =~ /[[]]/ # マッチング; パターンは ] だけを含んでいる
	1420	"[]" =~ /[[]]/ # マッチング; パターンは [ だけを含んでいる
726	1421	# 文字クラスと、それに引き続く
727	1422	# ] からなる。
728	1423
	1424	=head3 Bracketed Character Classes and the C</xx> pattern modifier
	1425
	1426	=begin original
	1427
	1428	Normally SPACE and TAB characters have no special meaning inside a
	1429	bracketed character class; they are just added to the list of characters
	1430	matched by the class. But if the L<C</xx>|perlre/E<sol>x and E<sol>xx>
	1431	pattern modifier is in effect, they are generally ignored and can be
	1432	added to improve readability. They can't be added in the middle of a
	1433	single construct:
	1434
	1435	=end original
	1436
	1437	通常、大かっこ文字クラスの内側では SPACE と TAB の文字は
	1438	特別な意味はありません; これらは単にクラスによってマッチングされる文字の
	1439	リストに加えられます。
	1440	しかし、L<C</xx>|perlre/E<sol>x and E<sol>xx> パターン修飾子が有効の場合、
	1441	これらは一般的に無視されるので、可読性を向上させるために追加できます。
	1442	これらは単一の構文の途中には追加できません:
	1443
	1444	/ [ \x{10 FFFF} ] /xx # WRONG!
	1445
	1446	=begin original
	1447
	1448	The SPACE in the middle of the hex constant is illegal.
	1449
	1450	=end original
	1451
	1452	16 進定数の中の SPACE は不正です。
	1453
	1454	=begin original
	1455
	1456	To specify a literal SPACE character, you can escape it with a
	1457	backslash, like:
	1458
	1459	=end original
	1460
	1461	リテラルな SPACE 文字を指定するには、次のように逆スラッシュで
	1462	エスケープします:
	1463
	1464	/[ a e i o u \ ]/xx
	1465
	1466	=begin original
	1467
	1468	This matches the English vowels plus the SPACE character.
	1469
	1470	=end original
	1471
	1472	これは英語の母音と SPACE 文字に一致します。
	1473
	1474	=begin original
	1475
	1476	For clarity, you should already have been using C<\t> to specify a
	1477	literal tab, and C<\t> is unaffected by C</xx>.
	1478
	1479	=end original
	1480
	1481	明確にするために言うと、リテラルなタブには既に C<\t> を使っているはずで、
	1482	C<\t> は C</xx> の影響を受けません。
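
The behaviour of blanks under C</xx> can be tried out with a sketch along these lines. It assumes Perl 5.26 or later, where C</xx> is available; the variable names and test strings are made up for the example.

```perl
use v5.26;      # /xx needs Perl 5.26 or later; also enables strict and say
use warnings;

# Blanks inside the brackets are ignored under /xx; "\ " is a literal SPACE.
my $vowel_or_space = qr/ \A [ a e i o u \  ] \z /xx;

for my $s ("a", " ", "b", "\t") {
    my $shown = $s eq "\t" ? '\t' : $s;
    say "'$shown': ", ($s =~ $vowel_or_space ? "match" : "no match");
}
```

Only the vowel and the escaped SPACE match. The unescaped blanks, and a TAB in the input, play no part in the class.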
	1483
729	1484	=head3 Character Ranges
730	1485
731	1486	(文字範囲)
732	1487
733	1488	=begin original
734	1489
735	1490	It is not uncommon to want to match a range of characters. Luckily, instead
736		of listing all ~~the~~ characters in the range, one may use the hyphen (C<->).
	1491	of listing all characters in the range, one may use the hyphen (C<->).
737	1492	If inside a bracketed character class you have two characters separated
738		by a hyphen, it's treated as if all ~~the~~ characters between the two are in
	1493	by a hyphen, it's treated as if all characters between the two were in
739	1494	the class. For instance, C<[0-9]> matches any ASCII digit, and C<[a-m]>
740	1495	matches any lowercase letter from the first half of the ASCII alphabet.
741	1496
742	1497	=end original
743	1498
744	1499	文字のある範囲にマッチングしたいというのは珍しくありません。
745	1500	幸運なことに、その範囲の文字を全て一覧に書く代わりに、ハイフン (C<->) を
746	1501	使えます。
747	1502	大かっこ文字クラスの内側で二つの文字がハイフンで区切られていると、
748	1503	二つの文字の間の全ての文字がクラスに書かれているかのように扱われます。
749	1504	例えば、C<[0-9]> は任意の ASCII 数字にマッチングし、C<[a-m]> は
750	1505	ASCII アルファベットの前半分の小文字にマッチングします。
751	1506
752	1507	=begin original
753	1508
754	1509	Note that the two characters on either side of the hyphen are not
755		necessary both letters or both digits. Any character is possible,
	1510	necessarily both letters or both digits. Any character is possible,
756	1511	although not advisable. C<['-?]> contains a range of characters, but
757		most people will not know which characters that ~~will b~~e. Furthermore,
	1512	most people will not know which characters that means. Furthermore,
758	1513	such ranges may lead to portability problems if the code has to run on
759	1514	a platform that uses a different character set, such as EBCDIC.
760	1515
761	1516	=end original
762	1517
763	1518	ハイフンのそれぞれの側の二つの文字は両方とも英字であったり両方とも
764		数字であったりする必要は~~ありませんが、勧められ~~ないことに注意してください。
	1519	数字であったりする必要はないことに注意してください。
	1520	任意の文字が可能ですが、勧められません。
765	1521	C<['-?]> は文字の範囲を含みますが、ほとんどの人はどの文字が含まれるか
766	1522	分かりません。
767	1523	さらに、このような範囲は、コードが EBCDIC のような異なった文字集合を使う
768	1524	プラットフォームで実行されると移植性の問題を引き起こします。
769	1525
770	1526	=begin original
771	1527
772		If a hyphen in a character class cannot be part of a range, for ~~instance~~
	1528	If a hyphen in a character class cannot syntactically be part of a range, for
773		because it is the first or the last character of the character class,
	1529	instance because it is the first or the last character of the character class,
774		or if it immediately follows a range, the hyphen isn't special, and ~~will~~ be
	1530	or if it immediately follows a range, the hyphen isn't special, and so is
775		considered a character t~~hat~~ ~~may~~ be matched. You have to ~~esc~~ape th~~e h~~yphen
	1531	considered a character to be matched literally. If you want a hyphen in
776		~~with~~ ~~a back~~s~~lash~~ if ~~you w~~ant to ~~hav~~e a h~~yph~~en in your set of ~~charac~~ters to
	1532	your set of characters to be matched and its position in the class is such
777		~~be ma~~tch~~ed,~~ and its position in the ~~class~~ is such that ~~it can b~~e con~~sidered~~
	1533	that it could be considered part of a range, you must escape that hyphen
778		~~par~~t of a ra~~nge~~.
	1534	with a backslash.
779	1535
780	1536	=end original
781	1537
782	1538	例えば文字クラスの最初または最後であったり、範囲の直後のために、文字クラスの
783		中のハイフンが範囲の一部となれない場合、ハイフンは特別ではなく、
	1539	中のハイフンが文法的に範囲の一部となれない場合、ハイフンは特別ではなく、
784		マッチングするべき文字として扱われます。
	1540	リテラルにマッチングするべき文字として扱われます。
785	1541	マッチングする文字の集合にハイフンを入れたいけれどもその位置が範囲の
786		一部として考えられる場合はハイフンを逆スラッシュで~~エスケープする~~
	1542	一部として考えられる場合はハイフンを逆スラッシュで
787		~~必要があ~~ります。
	1543	エスケープしなければなりません。
788	1544
789	1545	=begin original
790	1546
791	1547	Examples:
792	1548
793	1549	=end original
794	1550
795	1551	例:
796	1552
797	1553	=begin original
798	1554
799	1555	[a-z] # Matches a character that is a lower case ASCII letter.
800		[a-fz] # Matches any letter between 'a' and 'f' (inclusive) or ~~the~~
	1556	[a-fz] # Matches any letter between 'a' and 'f' (inclusive) or
801		# letter 'z'.
	1557	# the letter 'z'.
802	1558	[-z] # Matches either a hyphen ('-') or the letter 'z'.
803	1559	[a-f-m] # Matches any letter between 'a' and 'f' (inclusive), the
804	1560	# hyphen ('-'), or the letter 'm'.
805	1561	['-?] # Matches any of the characters '()*+,-./0123456789:;<=>?
806	1562	# (But not on an EBCDIC platform).
	1563	[\N{APOSTROPHE}-\N{QUESTION MARK}]
	1564	# Matches any of the characters '()*+,-./0123456789:;<=>?
	1565	# even on an EBCDIC platform.
	1566	[\N{U+27}-\N{U+3F}] # Same. (U+27 is "'", and U+3F is "?")
807	1567
808	1568	=end original
809	1569
810	1570	[a-z] # 小文字 ASCII 英字にマッチング。
811	1571	[a-fz] # 'a' から 'f' の英字および 'z' の英字に
812	1572	# マッチング。
813	1573	[-z] # ハイフン ('-') または英字 'z' にマッチング。
814	1574	[a-f-m] # 'a' から 'f' の英字、ハイフン ('-')、英字 'm' に
815	1575	# マッチング。
816	1576	['-?] # 文字 '()*+,-./0123456789:;<=>? のどれかにマッチング
817	1577	# (しかし EBCDIC プラットフォームでは異なります)。
	1578	[\N{APOSTROPHE}-\N{QUESTION MARK}]
	1579	# たとえ EBCDIC プラットフォームでも '()*+,-./0123456789:;<=>?
	1580	# のいずれかの文字にマッチング。
	1581	[\N{U+27}-\N{U+3F}] # 同じ。 (U+27 は "'", U+3F は "?")
818	1582
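The hyphen rules above can be checked with a short script. This is a minimal sketch, assuming an ASCII platform; the helper name C<members> is hypothetical and only lists which ASCII characters a class matches.

```perl
use strict;
use warnings;

# Hypothetical helper: return, as one string, every ASCII character
# (code points 0..127) that the given compiled class matches.
sub members {
    my ($class) = @_;
    return join '', grep { /$class/ } map { chr } 0 .. 127;
}

my $leading = members(qr/[-z]/);     # hyphen first: literal
my $after   = members(qr/[a-f-m]/);  # hyphen right after a range: literal
my $escaped = members(qr/[a\-z]/);   # escaped hyphen: literal, no range formed
my $range   = members(qr/[a-z]/);    # hyphen between two characters: a range

print "[-z]     matches: $leading\n";
print "[a-f-m]  matches: $after\n";
print "[a\\-z]   matches: $escaped\n";
print "[a-z]    matches ", length($range), " characters\n";
```

Escaping the hyphen in C<[a\-z]> turns a 26-character range into a three-character set, which is why the position of an unescaped hyphen matters.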
	1583	=begin original
	1584
	1585	As the final two examples above show, you can achieve portability to
	1586	non-ASCII platforms by using the C<\N{...}> form for the range
	1587	endpoints. These indicate that the specified range is to be interpreted
	1588	using Unicode values, so C<[\N{U+27}-\N{U+3F}]> means to match
	1589	C<\N{U+27}>, C<\N{U+28}>, C<\N{U+29}>, ..., C<\N{U+3D}>, C<\N{U+3E}>,
	1590	and C<\N{U+3F}>, whatever the native code point versions for those are.
	1591	These are called "Unicode" ranges. If either end is of the C<\N{...}>
	1592	form, the range is considered Unicode. A C<regexp> warning is raised
	1593	under C<S<"use re 'strict'">> if the other endpoint is specified
	1594	non-portably:
	1595
	1596	=end original
	1597
	1598	前述の最後の二つの例が示すように、範囲の端点に
	1599	C<\N{...}> 形式を使用することで、非 ASCII プラットフォームへの
	1600	移植性を実現できます。
	1601	これらは、指定された範囲が Unicode 値を使用して解釈されることを示しています;
	1602	したがって、C<[\N{U+27}-\N{U+3F}]>は、C<\N{U+27}>、C<\N{U+28}>、
	1603	C<\N{U+29}>、...、C<\N{U+3D}>、C<\N{U+3E}>、C<\N{U+3F}> に
	1604	マッチングすることを意味します;
	1605	これらのネイティブ符号位置のバージョンが何であっても一致します。
	1606	これらは "Unicode" 範囲と呼ばれます。
	1607	いずれかの端点が C<\N{...}> 形式の場合、範囲は Unicode と見なされます。
	1608	もう一方の端点が移植性がない形で指定されている場合、
	1609	C<S<"use re 'strict'">> の下で C<regexp> 警告が発生します:
	1610
	1611	[\N{U+00}-\x09] # Warning under re 'strict'; \x09 is non-portable
	1612	[\N{U+00}-\t] # No warning;
	1613
	1614	=begin original
	1615
	1616	Both of the above match the characters C<\N{U+00}>, C<\N{U+01}>, ...
	1617	C<\N{U+08}>, C<\N{U+09}>, but the C<\x09> looks like it could be a
	1618	mistake so the warning is raised (under C<re 'strict'>) for it.
	1619
	1620	=end original
	1621
	1622	前述の両方とも文字 C<\N{U+00}>, C<\N{U+01}>, ...
	1623	C<\N{U+08}>, C<\N{U+09}> にマッチングしますが、
	1624	C<\x09> は誤りのように見えるので、
	1625	(C<re 'strict'> の下で) 警告が発生します。
	1626
	1627	=begin original
	1628
	1629	Perl also guarantees that the ranges C<A-Z>, C<a-z>, C<0-9>, and any
	1630	subranges of these match what an English-only speaker would expect them
	1631	to match on any platform. That is, C<[A-Z]> matches the 26 ASCII
	1632	uppercase letters;
	1633	C<[a-z]> matches the 26 lowercase letters; and C<[0-9]> matches the 10
	1634	digits. Subranges, like C<[h-k]>, match correspondingly, in this case
	1635	just the four letters C<"h">, C<"i">, C<"j">, and C<"k">. This is the
	1636	natural behavior on ASCII platforms where the code points (ordinal
	1637	values) for C<"h"> through C<"k"> are consecutive integers (0x68 through
	1638	0x6B). But special handling to achieve this may be needed on platforms
	1639	with a non-ASCII native character set. For example, on EBCDIC
	1640	platforms, the code point for C<"h"> is 0x88, C<"i"> is 0x89, C<"j"> is
	1641	0x91, and C<"k"> is 0x92. Perl specially treats C<[h-k]> to exclude the
	1642	seven code points in the gap: 0x8A through 0x90. This special handling is
	1643	only invoked when the range is a subrange of one of the ASCII uppercase,
	1644	lowercase, and digit ranges, AND each end of the range is expressed
	1645	either as a literal, like C<"A">, or as a named character (C<\N{...}>,
	1646	including the C<\N{U+...}> form).
	1647
	1648	=end original
	1649
	1650	Perl はまた、範囲 C<A-Z>、C<a-z>、C<0-9>、およびこれらの部分範囲が、
	1651	英語のみの話者が一致すると予想する範囲とどのプラットフォームでも
	1652	一致することを保証します。
	1653	つまり、C<[A-Z]> はASCII の大文字 26 文字と一致します;
	1654	C<[a-z]> は小文字 26 文字と一致します;
	1655	C<[0-9]>は 10 の数字と一致します。
	1656	C<[h-k]> のような部分範囲もこれに対応して一致します;
	1657	この場合、4 文字 C<"h">、C<"i">、C<"j">、C<"k"> だけが一致します。
	1658	これは、C<"h"> から C<"k"> までの符号位置(序数値)が連続した
	1659	整数(0x68 から 0x6B)である ASCII プラットフォームでの自然な動作です。
	1660	しかし、非 ASCII ネイティブ文字集合を持つプラットフォームでは、
	1661	これを実現するための特別な処理が必要になるかもしれません。
	1662	たとえば、EBCDIC プラットフォームでは、C<"h"> のコードポイントは
	1663	0x88、C<"i"> は 0x89、C<"j"> は 0x91、C<"k"> は 0x92 です。
	1664	Perl は C<[h-k]> を特別に扱い、隙間にある七つの符号位置
	1665	(0x8A から 0x90)を除外します。
	1666	この特殊処理は、範囲が ASCII の大文字、小文字、数字の範囲の
	1667	いずれかの部分範囲であり、範囲の両端が C<"A"> のようなリテラル
	1668	または名前付き文字(C<\N{...}>(C<\N{U+...}> 形式を含む))として表現されている
	1669	場合にのみ呼び出されます。
	1670
	1671	=begin original
	1672
	1673	EBCDIC Examples:
	1674
	1675	=end original
	1676
	1677	EBCDIC の例:
	1678
	1679	[i-j] # Matches either "i" or "j"
	1680	[i-\N{LATIN SMALL LETTER J}] # Same
	1681	[i-\N{U+6A}] # Same
	1682	[\N{U+69}-\N{U+6A}] # Same
	1683	[\x{89}-\x{91}] # Matches 0x89 ("i"), 0x8A .. 0x90, 0x91 ("j")
	1684	[i-\x{91}] # Same
	1685	[\x{89}-j] # Same
	1686	[i-J] # Matches, 0x89 ("i") .. 0xC1 ("J"); special
	1687	# handling doesn't apply because range is mixed
	1688	# case
	1689
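The equivalence of the literal and C<\N{...}> range forms can be observed directly on an ASCII platform. The following is a sketch under that assumption (it cannot show the EBCDIC behavior); it also assumes Perl 5.16 or later so that character names load automatically, and the helper name C<members> is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: list every character in 0..255 matched by a class.
sub members {
    my ($class) = @_;
    return grep { /$class/ } map { chr } 0 .. 255;
}

my @literal = members(qr/['-?]/);                               # native-order range
my @named   = members(qr/[\N{APOSTROPHE}-\N{QUESTION MARK}]/);  # Unicode range
my @hex     = members(qr/[\N{U+27}-\N{U+3F}]/);                 # Unicode range
my @hk      = members(qr/[h-k]/);                               # guaranteed subrange

print "literal range matches ", scalar(@literal), " characters\n";
print "named range matches   ", scalar(@named),   " characters\n";
print "hex range matches     ", scalar(@hex),     " characters\n";
print "[h-k] matches: @hk\n";
```

On an ASCII platform all three spellings of the apostrophe-to-question-mark range select the same 25 characters; only the two C<\N{...}> forms are documented to keep that meaning on EBCDIC.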
819	1690	=head3 Negation
820	1691
821	1692	(否定)
822	1693
823	1694	=begin original
824	1695
825	1696	It is also possible to instead list the characters you do not want to
826	1697	match. You can do so by using a caret (C<^>) as the first character in the
827		character class. For instance, C<[^a-z]> matches a character that is not a
	1698	character class. For instance, C<[^a-z]> matches any character that is not a
828		lowercase ASCII letter.
	1699	lowercase ASCII letter, which therefore includes more than a million
	1700	Unicode code points. The class is said to be "negated" or "inverted".
829	1701
830	1702	=end original
831	1703
832	1704	代わりにマッチングしたくない文字の一覧を指定することも可能です。
833	1705	文字クラスの先頭の文字としてキャレット (C<^>) を使うことで実現します。
834		例えば、C<[^a-z]> 小文字の ASCII 英字以外の文字にマッチングします。
	1706	例えば、C<[^a-z]> は小文字の ASCII 英字以外の文字にマッチングします;
	1707	従って 100 万種類以上の Unicode 符号位置が含まれます。
	1708	このクラスは「否定」("negated") や「反転」("inverted")と呼ばれます。
835	1709
836	1710	=begin original
837	1711
838	1712	This syntax makes the caret a special character inside a bracketed character
839	1713	class, but only if it is the first character of the class. So if you want
840		to have ~~the~~ caret as one of the characters ~~you wan~~t to match, ~~you~~ either
	1714	the caret as one of the characters to match, either escape the caret or
841		~~hav~~e ~~to e~~s~~cap~~e ~~the caret,~~ or not list it first.
	1715	else don't list it first.
842	1716
843	1717	=end original
844	1718
845	1719	この文法はキャレットを大かっこ文字クラスの内側で特別な文字にしますが、
846	1720	クラスの最初の文字の場合のみです。
847	1721	それでマッチングしたい文字の一つでキャレットを使いたい場合、キャレットを
848		エスケープするか、最初以外の位置に書く~~必要があります~~。
	1722	エスケープするか、最初以外の位置に書いてください。
849	1723
850	1724	=begin original
851	1725
	1726	In inverted bracketed character classes, Perl ignores the Unicode rules
	1727	that normally say that named sequences and certain characters should
	1728	match a sequence of multiple characters under caseless C</i>
	1729	matching. Following those rules could lead to highly confusing
	1730	situations:
	1731
	1732	=end original
	1733
	1734	否定大かっこ文字クラスでは、通常は大文字小文字を無視した C</i> マッチングの
	1735	下では名前付きシーケンスとある種の文字が複数の文字並びにマッチングするという
	1736	Unicode の規則を Perl は無視します。
	1737	これらの規則に従うととても混乱する状況を引き起こすことになるからです:
	1738
	1739	"ss" =~ /^[^\xDF]+$/ui; # Matches!
	1740
	1741	=begin original
	1742
	1743	This should match any sequence of characters that are neither C<\xDF> nor
	1744	what C<\xDF> matches under C</i>. C<"s"> isn't C<\xDF>, but Unicode
	1745	says that C<"ss"> is what C<\xDF> matches under C</i>. So which one
	1746	"wins"? Do you fail the match because the string has C<ss> or accept it
	1747	because it has an C<s> followed by another C<s>? Perl has chosen the
	1748	latter. (See note in L</Bracketed Character Classes> above.)
	1749
	1750	=end original
	1751
	1752	これは C</i> の下では C<\xDF> または C<\xDF> にマッチングするもの以外の
	1753	任意の文字並びにマッチングするべきです。
	1754	C<"s"> は C<\xDF> ではありませんが、
	1755	C</i> の下では C<"ss"> は C<\xDF> がマッチングするものと Unicode は
	1756	言っています。
	1757	ではどちらが「勝つ」のでしょうか?
	1758	文字列は C<ss> だからマッチングに失敗するのでしょうか、
	1759	それともこれは C<s> の後にもう一つの C<s> があるから成功するのでしょうか?
	1760	Perl は後者を選択しました。
	1761	(前述の L</Bracketed Character Classes> を参照してください。)
	1762
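The choice described above can be observed with a small script. This is a sketch assuming Perl 5.14 or later (for the C</u> modifier); the variable names are illustrative only.

```perl
use strict;
use warnings;

# \xDF is LATIN SMALL LETTER SHARP S; /u forces Unicode rules.

# Inverted class: the multi-character fold of \xDF to "ss" is ignored,
# so each "s" is accepted on its own.
my $inverted = ("ss" =~ /^[^\xDF]+$/ui) ? 1 : 0;

# The sharp s itself is still excluded by the inverted class.
my $sharp_s = ("\xDF" =~ /^[^\xDF]+$/ui) ? 1 : 0;

# For comparison, a non-inverted class: here the multi-character fold
# may apply, so "ss" can match a class that lists only \xDF.
my $plain = ("ss" =~ /^[\xDF]$/ui) ? 1 : 0;

print "inverted class vs 'ss':     $inverted\n";
print "inverted class vs sharp s:  $sharp_s\n";
print "non-inverted class vs 'ss': $plain\n";
```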
	1763	=begin original
	1764
852	1765	Examples:
853	1766
854	1767	=end original
855	1768
856	1769	例:
857	1770
858	1771	=begin original
859	1772
860	1773	"e" =~ /[^aeiou]/ # No match, the 'e' is listed.
861	1774	"x" =~ /[^aeiou]/ # Match, as 'x' isn't a lowercase vowel.
862	1775	"^" =~ /[^^]/ # No match, matches anything that isn't a caret.
863	1776	"^" =~ /[x^]/ # Match, caret is not special here.
864	1777
865	1778	=end original
866	1779
867	1780	"e" =~ /[^aeiou]/ # マッチングしない; 'e' がある。
868	1781	"x" =~ /[^aeiou]/ # マッチング; 'x' は小文字の母音ではない。
869	1782	"^" =~ /[^^]/ # マッチングしない; キャレット以外全てにマッチング。
870	1783	"^" =~ /[x^]/ # マッチング; キャレットはここでは特別ではない。
871	1784
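The two ways of matching a literal caret (escaping it, or not listing it first) can be compared side by side. A minimal sketch; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $negated  = ("^" =~ /[^^]/)  ? 1 : 0;  # 0: first caret negates, second is the excluded character
my $trailing = ("^" =~ /[x^]/)  ? 1 : 0;  # 1: caret is not first, so it is literal
my $escaped  = ("^" =~ /[\^x]/) ? 1 : 0;  # 1: escaped caret is literal even when first
my $other    = ("y" =~ /[\^x]/) ? 1 : 0;  # 0: the class lists only '^' and 'x'

print "[^^]  vs '^': $negated\n";
print "[x^]  vs '^': $trailing\n";
print "[\\^x] vs '^': $escaped\n";
print "[\\^x] vs 'y': $other\n";
```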
872	1785	=head3 Backslash Sequences
873	1786
874	1787	(逆スラッシュシーケンス)
875	1788
876	1789	=begin original
877	1790
878		You can put a backslash sequence character class i~~nside~~ ~~a bracke~~ted c~~haract~~er
	1791	You can put any backslash sequence character class (with the exception of
879		~~class,~~ and it wi~~ll act ju~~s~~t as~~ if ~~you put~~ all the characters matc~~hed~~ by
	1792	C<\N> and C<\R>) inside a bracketed character class, and it will act just
880		the backslash sequence inside the ~~character class. For instance,~~
	1793	as if you had put all characters matched by the backslash sequence inside the
881		C<[a-f\d]> ~~will~~ match any digit, or any ~~of the lowercase letters between~~
	1794	character class. For instance, C<[a-f\d]> matches any decimal digit, or any
882		'a' and 'f' inclusive.
	1795	of the lowercase letters between 'a' and 'f' inclusive.
883	1796
884	1797	=end original
885	1798
886		大かっこ文字クラスの中に逆スラッシュシーケンス~~文字クラスを置くことができ、~~
	1799	大かっこ文字クラスの中に(C<\N> と C<\R> を例外として)逆スラッシュシーケンス
887		逆スラッシュシーケンスにマッチングする全ての~~文字を文字クラスの中に~~
	1800	文字クラスを置くことができ、逆スラッシュシーケンスにマッチングする全ての
888		置いたかのように動作します。
	1801	文字を文字クラスの中に置いたかのように動作します。
889		例えば、C<[a-f\d]> は任意の数字、あるいは 'a' から 'f' までの小文字に
	1802	例えば、C<[a-f\d]> は任意の 10 進数字、あるいは 'a' から 'f' までの小文字に
890	1803	マッチングします。
891	1804
892	1805	=begin original
893	1806
	1807	C<\N> within a bracketed character class must be of the forms C<\N{I<name>}>
	1808	or C<\N{U+I<hex char>}>, and NOT be the form that matches non-newlines,
	1809	for the same reason that a dot C<.> inside a bracketed character class loses
	1810	its special meaning: it matches nearly anything, which generally isn't what you
	1811	want to happen.
	1812
	1813	=end original
	1814
	1815	大かっこ文字クラスの中のドット C<.> が特別な意味を持たないのと同じ理由で、
	1816	大かっこ文字クラスの中の C<\N> は C<\N{I<name>}> または
	1817	C<\N{U+I<hex char>}> の形式で、かつ非改行マッチング形式でない形でなければ
	1818	なりません: これはほとんど何でもマッチングするので、一般的には起こって
	1819	欲しいことではありません。
	1820
	1821	=begin original
	1822
894	1823	Examples:
895	1824
896	1825	=end original
897	1826
898	1827	例:
899	1828
900	1829	=begin original
901	1830
902	1831	/[\p{Thai}\d]/ # Matches a character that is either a Thai
903	1832	# character, or a digit.
904	1833	/[^\p{Arabic}()]/ # Matches a character that is neither an Arabic
905	1834	# character, nor a parenthesis.
906	1835
907	1836	=end original
908	1837
909	1838	/[\p{Thai}\d]/ # タイ文字または数字の文字に
910	1839	# マッチングする。
911	1840	/[^\p{Arabic}()]/ # アラビア文字でもかっこでもない文字に
912	1841	# マッチングする。
913	1842
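The examples above can be exercised as follows. This is a sketch; U+0E01 (THAI CHARACTER KO KAI) is chosen here only as a sample Thai character, and the variable names are illustrative.

```perl
use strict;
use warnings;

# [a-f\d] behaves as if every decimal digit were listed next to a-f.
my @hits = grep { /^[a-f\d]$/ } qw(a f g 0 9 F);
print "matched by [a-f\\d]: @hits\n";    # a f 0 9

# A Unicode property inside a bracketed class.
my $thai    = "\N{U+0E01}";              # sample Thai character (assumption)
my $thai_ok = ($thai =~ /[\p{Thai}\d]/) ? 1 : 0;

# A negated class mixing a property with literal characters.
my $paren_ok  = ("(" =~ /[^\p{Arabic}()]/) ? 1 : 0;  # 0: parenthesis is excluded
my $letter_ok = ("x" =~ /[^\p{Arabic}()]/) ? 1 : 0;  # 1: neither Arabic nor a parenthesis

print "Thai character matches [\\p{Thai}\\d]: $thai_ok\n";
print "'(' matches [^\\p{Arabic}()]: $paren_ok\n";
print "'x' matches [^\\p{Arabic}()]: $letter_ok\n";
```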
914	1843	=begin original
915	1844
916	1845	Backslash sequence character classes cannot form one of the endpoints
917		of a range.
	1846	of a range. Thus, you can't say:
918	1847
919	1848	=end original
920	1849
921	1850	逆スラッシュシーケンス文字クラスは範囲の端点の一つにはできません。
	1851	従って、以下のようにはできません:
922	1852
923		=head3 ~~Posix~~ ~~Character~~ ~~Classes~~
	1853	/[\p{Thai}-\d]/ # Wrong!
924	1854
925		(P~~osix~~ ~~文字クラス)~~
	1855	=head3 POSIX Character Classes
	1856	X<character class> X<\p> X<\p{}>
	1857	X<alpha> X<alnum> X<ascii> X<blank> X<cntrl> X<digit> X<graph>
	1858	X<lower> X<print> X<punct> X<space> X<upper> X<word> X<xdigit>
926	1859
	1860	(POSIX 文字クラス)
	1861
927	1862	=begin original
928	1863
929		P~~osix~~ character classes have the form C<[:class:]>, where I<class> is
	1864	POSIX character classes have the form C<[:class:]>, where I<class> is the
930		name, and the C<[:> and C<:]> delimiters. P~~osix~~ character classes appear
	1865	name, and C<[:> and C<:]> are the delimiters. POSIX character classes only appear
931	1866	I<inside> bracketed character classes, and are a convenient and descriptive
932		way of listing a group of characters. ~~Be careful about the syntax,~~
	1867	way of listing a group of characters.
933	1868
934	1869	=end original
935	1870
936		P~~osix~~ 文字クラスは C<[:class:]> の形式で、I<class> は名前、C<[:> と C<:]> は
	1871	POSIX 文字クラスは C<[:class:]> の形式で、I<class> は名前、C<[:> と C<:]> は
937	1872	デリミタです。
938		P~~osix~~ 文字クラスは大かっこ文字クラスの I<内側> に現れ、文字のグループを
	1873	POSIX 文字クラスは大かっこ文字クラスの I<内側> にのみ現れ、文字のグループを
939	1874	一覧するのに便利で記述的な方法です。
	1875
	1876	=begin original
	1877
	1878	Be careful about the syntax,
	1879
	1880	=end original
	1881
940	1882	文法について注意してください、
941	1883
942	1884	# Correct:
943	1885	$string =~ /[[:alpha:]]/
944	1886
945	1887	# Incorrect (will warn):
946	1888	$string =~ /[:alpha:]/
947	1889
948	1890	=begin original
949	1891
950	1892	The latter pattern would be a character class consisting of a colon,
951	1893	and the letters C<a>, C<l>, C<p> and C<h>.
	1894	POSIX character classes can be part of a larger bracketed character class.
	1895	For example,
952	1896
953	1897	=end original
954	1898
955	1899	後者のパターンは、コロンおよび C<a>, C<l>, C<p>, C<h> の文字からなる
956	1900	文字クラスです。
	1901	これら文字クラスはより大きな大かっこ文字クラスの一部にできます。
	1902	例えば:
957	1903
	1904	[01[:alpha:]%]
	1905
958	1906	=begin original
959	1907
960		~~Per~~l ~~recog~~n~~izes~~ the fol~~low~~ing ~~POSIX~~ character classes:
	1908	is valid and matches '0', '1', any alphabetic character, and the percent sign.
961	1909
962	1910	=end original
963	1911
964		~~Perl 以下~~の ~~POSIX 文~~字ク~~ラスを認識~~します:
	1912	これは妥当で、'0'、'1'、任意の英字、パーセントマークにマッチングします。
965	1913
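The difference between the correct and incorrect syntax can be seen by listing what each class accepts. A minimal sketch; the C<regexp> warning for the incorrect form is silenced here only so that the output stays readable.

```perl
use strict;
use warnings;

# POSIX class embedded in a larger bracketed class.
my @ok = grep { /^[01[:alpha:]%]$/ } qw(0 1 2 x Z % :);
print "matched by [01[:alpha:]%]: @ok\n";      # 0 1 x Z %

# Without the outer brackets, [:alpha:] is just a class containing
# a colon and the letters a, l, p and h.
my @colon_class;
{
    no warnings 'regexp';   # this form normally warns
    @colon_class = grep { /^[:alpha:]$/ } qw(a l p h : b z);
}
print "matched by [:alpha:]: @colon_class\n";  # a l p h :
```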
966	1914	=begin original
967	1915
968		al~~pha~~ Any ~~alp~~habe~~tica~~l character.
	1916	Perl recognizes the following POSIX character classes:
969		alnum Any alphanumerical character.
970		ascii Any ASCII character.
971		blank A GNU extension, equal to a space or a horizontal tab ("\t").
972		cntrl Any control character.
973		digit Any digit, equivalent to "\d".
974		graph Any printable character, excluding a space.
975		lower Any lowercase character.
976		print Any printable character, including a space.
977		punct Any punctuation character.
978		space Any white space character. "\s" plus the vertical tab ("\cK").
979		upper Any uppercase character.
980		word Any "word" character, equivalent to "\w".
981		xdigit Any hexadecimal digit, '0' - '9', 'a' - 'f', 'A' - 'F'.
982	1917
983	1918	=end original
984	1919
985		al~~pha~~ 任意の英字。
	1920	Perl は以下の POSIX 文字クラスを認識します:
986		alnum 任意の英数字。
987		ascii 任意の ASCII 文字。
988		blank GNU 拡張; スペースまたは水平タブ ("\t") と同じ。
989		cntrl 任意の制御文字。
990		digit 任意の数字; "\d" と等価。
991		graph 任意の表示文字; スペースを除く。
992		lower 任意の小文字。
993		print 任意の表示文字; スペースを含む。
994		punct 任意の句読点文字。
995		space 任意の空白文字。"\s" に加えて水平タブ ("\cK")。
996		upper 任意の大文字。
997		word 任意の「単語」文字; "\w" と等価。
998		xdigit 任意の 16 進文字; '0' - '9', 'a' - 'f', 'A' - 'F'。
999	1921
1000	1922	=begin original
1001	1923
1002		~~The~~ exa~~ct set of c~~ha~~racters~~ ~~matched~~ ~~depe~~nds ~~on w~~het~~her~~ th~~e sou~~rc~~e s~~tring
	1924	alpha Any alphabetical character (e.g., [A-Za-z]).
1003		is ~~intern~~al~~ly i~~n ~~UTF-8 for~~mat or n~~ot.~~ See ~~L</Lo~~cale, ~~Unicod~~e a~~nd UTF~~-8>.
	1925	alnum Any alphanumeric character (e.g., [A-Za-z0-9]).
	1926	ascii Any character in the ASCII character set.
	1927	blank Any horizontal whitespace character (e.g. space or horizontal
	1928	tab ("\t")).
	1929	cntrl Any control character. See Note [2] below.
	1930	digit Any decimal digit (e.g., [0-9]), equivalent to "\d".
	1931	graph Any printable character, excluding a space. See Note [3] below.
	1932	lower Any lowercase character (e.g., [a-z]).
	1933	print Any printable character, including a space. See Note [4] below.
	1934	punct Any graphical character excluding "word" characters. Note [5].
	1935	space Any whitespace character. "\s" including the vertical tab
	1936	("\cK").
	1937	upper Any uppercase character (e.g., [A-Z]).
	1938	word A Perl extension (e.g., [A-Za-z0-9_]), equivalent to "\w".
	1939	xdigit Any hexadecimal digit (e.g., [0-9a-fA-F]). Note [7].
1004	1940
1005	1941	=end original
1006	1942
1007		~~マッチングする文字~~の~~正確な集合はソース文~~字~~列が内部で~~ ~~UTF-8~~ ~~形式かどうかに~~
	1943	alpha 任意の英字 (例: [A-Za-z])。
1008		~~依存します~~。
	1944	alnum 任意の英数字。(例: [A-Za-z0-9])
1009		~~L</Locale,~~ ~~Uni~~c~~ode~~ ~~and~~ ~~UTF-8>~~ ~~を参照してください~~。
	1945	ascii 任意の ASCII 文字集合の文字。
	1946	blank 任意の水平空白文字 (スペース、水平タブ ("\t") など)
	1947	cntrl 任意の制御文字。後述の [2] 参照。
	1948	digit 任意の 10 進数字 (例: [0-9]); "\d" と等価。
	1949	graph 任意の表示文字; スペースを除く。後述の [3] 参照。
	1950	lower 任意の小文字 (例: [a-z])。
	1951	print 任意の表示文字; スペースを含む。後述の [4] 参照。
	1952	punct 任意の「単語」文字を除く表示文字。[5] 参照。
	1953	space 任意の空白文字。垂直タブ ("\cK") を含む "\s"。
	1954	upper 任意の大文字 (例: [A-Z])。
	1955	word Perl 拡張 (例: [A-Za-z0-9_]); "\w" と等価。
	1956	xdigit 任意の 16 進文字 (例: [0-9a-fA-F])。[7] 参照。
1010	1957
1011	1958	=begin original
1012	1959
1013		~~Most~~ ~~POSIX charac~~ter c~~lass~~es ~~have C<\~~p~~> c~~ounterparts. ~~The~~ diffe~~rence~~
	1960	Like the L<Unicode properties\|/Unicode Properties>, most of the POSIX
1014		is that the ~~C<\p> cla~~s~~ses will~~ a~~lways~~ m~~atch~~ a~~cco~~rd~~ing~~ to the Uni~~cod~~e
	1961	properties match the same regardless of whether case-insensitive (C</i>)
1015		~~properties, reg~~a~~rdless whe~~th~~er the str~~ing is in ~~UTF-8~~ f~~orma~~t or not.
	1962	matching is in effect or not. The two exceptions are C<[:upper:]> and
	1963	C<[:lower:]>. Under C</i>, they each match the union of C<[:upper:]> and
	1964	C<[:lower:]>.
1016	1965
1017	1966	=end original
1018	1967
1019		~~ほとんどの~~ ~~POSIX~~ ~~文字クラスは対応する C<\~~p> ~~を持っています。~~
	1968	L<Unicode properties\|/Unicode Properties> と同様、
1020		違いは、文字列が ~~UTF-8~~ 形式かどうかに関わらず ~~C<\p> クラスは常に~~
	1969	ほとんどの POSIX 特性は、大文字小文字無視 (C</i>) が有効かどうかに関わらず
1021		~~Unicode 特性~~に~~従って~~マッチング~~するということで~~す。
	1970	同じものにマッチングします。
	1971	二つの例外は C<[:upper:]> と C<[:lower:]> です。
	1972	C</i> の下では、これらそれぞれ C<[:upper:]> と C<[:lower:]> の和集合に
	1973	マッチングします。
1022	1974
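The effect of C</i> on C<[:upper:]> and C<[:lower:]> can be checked with a few matches. A minimal sketch; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $upper_plain = ("a" =~ /[[:upper:]]/)  ? 1 : 0;  # 0: 'a' is not uppercase
my $upper_i     = ("a" =~ /[[:upper:]]/i) ? 1 : 0;  # 1: under /i, upper also covers lower
my $lower_i     = ("A" =~ /[[:lower:]]/i) ? 1 : 0;  # 1: and lower also covers upper
my $digit_i     = ("5" =~ /[[:upper:]]/i) ? 1 : 0;  # 0: a digit is in neither class

print "'a' vs [[:upper:]]:   $upper_plain\n";
print "'a' vs [[:upper:]]/i: $upper_i\n";
print "'A' vs [[:lower:]]/i: $lower_i\n";
print "'5' vs [[:upper:]]/i: $digit_i\n";
```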
1023	1975	=begin original
1024	1976
1025		~~The f~~o~~llowing table~~ s~~hows~~ the ~~relation between~~ POSIX character classes
	1977	Most POSIX character classes have two Unicode-style C<\p> property
1026		and the Unicode properties:
	1978	counterparts. (They are not official Unicode properties, but Perl extensions
	1979	derived from official Unicode properties.) The table below shows the relation
	1980	between POSIX character classes and these counterparts.
1027	1981
1028	1982	=end original
1029	1983
1030		以下の表は POSIX 文字クラスと Unicode 特性~~との関係を示しています:~~
	1984	ほとんどの POSIX 文字クラスには、対応する二つの Unicode 式の C<\p> 特性が
	1985	あります。
	1986	(これは公式 Unicode 特性ではなく、公式 Unicode 特性から派生した Perl
	1987	エクステンションです。)
	1988	以下の表は POSIX 文字クラスと対応するものとの関連を示します。
1031	1989
1032		~~[[:...:]] \p{...}~~ ba~~cks~~l~~ash~~
	1990	=begin original
1033	1991
1034		alpha IsA~~lph~~a
	1992	One counterpart, in the column labelled "ASCII-range Unicode" in
1035		alnum Is~~Alnum~~
	1993	the table, matches only characters in the ASCII character set.
1036		ascii IsASCII
1037		blank
1038		cntrl IsCntrl
1039		digit IsDigit \d
1040		graph IsGraph
1041		lower IsLower
1042		print IsPrint
1043		punct IsPunct
1044		space IsSpace
1045		IsSpacePerl \s
1046		upper IsUpper
1047		word IsWord
1048		xdigit IsXDigit
1049	1994
	1995	=end original
	1996
	1997	対応物の一つである、表で "ASCII-range Unicode" と書かれた列のものは、
	1998	ASCII 文字集合の文字にのみマッチングします。
	1999
1050	2000	=begin original
1051	2001
1052		~~Som~~e character classes may have a n~~on-obv~~ious name:
	2002	The other counterpart, in the column labelled "Full-range Unicode", matches any
	2003	appropriate characters in the full Unicode character set. For example,
	2004	C<\p{Alpha}> matches not just the ASCII alphabetic characters, but any
	2005	character in the entire Unicode character set considered alphabetic.
	2006	An entry in the column labelled "backslash sequence" is a (short)
	2007	equivalent.
1053	2008
1054	2009	=end original
1055	2010
1056		一部の~~文字クラスは明らか~~で~~ない名前を持ちます:~~
	2011	もう一つの対応物である、"Full-range Unicode" と書かれた列のものは、
	2012	Unicode 文字集合全体の中の適切な任意の文字にマッチングします。
	2013	例えば、C<\p{Alpha}> は単に ASCII アルファベット文字だけでなく、
	2014	Unicode 文字集合全体の中からアルファベットと考えられる任意の文字に
	2015	マッチングします。
	2016	"backslash sequence" の列は (短い) 同義語です。
1057	2017
	2018	[[:...:]]  ASCII-range       Full-range         backslash  Note
	2019	           Unicode           Unicode            sequence
	2020	---------------------------------------------------------------
	2021	alpha      \p{PosixAlpha}    \p{XPosixAlpha}
	2022	alnum      \p{PosixAlnum}    \p{XPosixAlnum}
	2023	ascii      \p{ASCII}
	2024	blank      \p{PosixBlank}    \p{XPosixBlank}    \h         [1]
	2025	                             or \p{HorizSpace}             [1]
	2026	cntrl      \p{PosixCntrl}    \p{XPosixCntrl}               [2]
	2027	digit      \p{PosixDigit}    \p{XPosixDigit}    \d
	2028	graph      \p{PosixGraph}    \p{XPosixGraph}               [3]
	2029	lower      \p{PosixLower}    \p{XPosixLower}
	2030	print      \p{PosixPrint}    \p{XPosixPrint}               [4]
	2031	punct      \p{PosixPunct}    \p{XPosixPunct}               [5]
	2032	           \p{PerlSpace}     \p{XPerlSpace}     \s         [6]
	2033	space      \p{PosixSpace}    \p{XPosixSpace}               [6]
	2034	upper      \p{PosixUpper}    \p{XPosixUpper}
	2035	word       \p{PosixWord}     \p{XPosixWord}     \w
	2036	xdigit     \p{PosixXDigit}   \p{XPosixXDigit}              [7]
	2037
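The difference between the two C<\p> columns can be illustrated with characters outside the ASCII range. This is a sketch; U+00E9 (LATIN SMALL LETTER E WITH ACUTE) and U+00A0 (NO-BREAK SPACE) are sample characters chosen here for illustration.

```perl
use strict;
use warnings;

my $e_acute = "\N{U+E9}";   # alphabetic, but not ASCII
my $nbsp    = "\N{U+A0}";   # horizontal whitespace, but not ASCII

my $alpha_ascii = ($e_acute =~ /\p{PosixAlpha}/)  ? 1 : 0;  # 0: ASCII-range column
my $alpha_full  = ($e_acute =~ /\p{XPosixAlpha}/) ? 1 : 0;  # 1: Full-range column

my $blank_ascii = ($nbsp =~ /\p{PosixBlank}/)  ? 1 : 0;     # 0
my $blank_full  = ($nbsp =~ /\p{XPosixBlank}/) ? 1 : 0;     # 1
my $blank_h     = ($nbsp =~ /\h/)              ? 1 : 0;     # 1: \h is the short equivalent

print "U+00E9: PosixAlpha=$alpha_ascii XPosixAlpha=$alpha_full\n";
print "U+00A0: PosixBlank=$blank_ascii XPosixBlank=$blank_full \\h=$blank_h\n";
```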
1058	2038	=over 4
1059	2039
1060		=item ~~cntrl~~
	2040	=item [1]
1061	2041
1062	2042	=begin original
1063	2043
1064		Any con~~trol~~ ~~cha~~racter. ~~Usu~~a~~lly, cont~~r~~ol charact~~ers don~~'t pr~~o~~duce output~~
	2044	C<\p{Blank}> and C<\p{HorizSpace}> are synonyms.
1065		as such, but instead control the terminal somehow: for example newline
1066		and backspace are control characters. All characters with C<ord()> less
1067		than 32 are usually classified as control characters (in ASCII, the ISO
1068		Latin character sets, and Unicode), as is the character C<ord()> value
1069		of 127 (C<DEL>).
1070	2045
1071	2046	=end original
1072	2047
1073		~~任意の制御文字~~。
	2048	C<\p{Blank}> と C<\p{HorizSpace}> は同義語です。
1074		~~普通は、制御文字はそれ自体は出力されず、何か端末を制御します: 例えば~~
	2050	=item [2]
	2051
	2052	=begin original
	2053
	2054	Control characters don't produce output as such, but instead usually control
	2055	the terminal somehow: for example, newline and backspace are control characters.
	2056	On ASCII platforms, in the ASCII range, characters whose code points are
	2057	between 0 and 31 inclusive, plus 127 (C<DEL>) are control characters; on
	2058	EBCDIC platforms, their counterparts are control characters.
	2059
	2060	=end original
	2061
	2062	制御文字はそれ自体は出力されず、普通は何か端末を制御します: 例えば
1075	2063	改行と後退は制御文字です。
1076		(ASCII、ISO ~~Latin 文字集合~~、~~Unicode)~~ で ~~C<ord()>~~ が 32 未満の~~全ての文字および~~
	2064	ASCII プラットフォームで、ASCII の範囲では、符号位置が 0 から 31 までの
1077		~~C<ord()>~~ 値が 127 ~~の文字~~ (C<DEL>) ~~は普通は~~制御文字~~に分類されま~~す。
	2065	範囲の文字および 127 (C<DEL>) が制御文字です;
	2066	EBCDIC プラットフォームでは、対応するものは制御文字です。
1078	2067
1079		=item ~~graph~~
	2068	=item [3]
1080	2069
1081	2070	=begin original
1082	2071
1083	2072	Any character that is I<graphical>, that is, visible. This class consists
1084		of all ~~the~~ alphanumerical characters and all punctuation characters.
	2073	of all alphanumeric characters and all punctuation characters.
1085	2074
1086	2075	=end original
1087	2076
1088	2077	I<graphical>、つまり見える文字。
1089	2078	このクラスは全ての英数字と全ての句読点文字。
1090	2079
1091		=item ~~print~~
	2080	=item [4]
1092	2081
1093	2082	=begin original
1094	2083
1095		All printable characters, which is the set of all ~~the~~ graphical characters
	2084	All printable characters, which is the set of all graphical characters
1096		plus the space.
	2085	plus those whitespace characters which are not also controls.
1097	2086
1098	2087	=end original
1099	2088
1100		全ての表示可能な文字; 全ての graphical 文字と空白。
	2089	全ての表示可能な文字; 全ての graphical 文字に加えて制御文字でない空白文字。
1101	2090
1102		=item ~~punct~~
	2091	=item [5]
1103	2092
1104	2093	=begin original
1105	2094
1106		~~Any~~ punctua~~tio~~n (speci~~al)~~ character.
	2095	C<\p{PosixPunct}> and C<[[:punct:]]> in the ASCII range match all
	2096	non-controls, non-alphanumeric, non-space characters:
	2097	C<[-!"#$%&'()*+,./:;<=E<gt>?@[\\\]^_`{\|}~]> (although if a locale is in effect,
	2098	it could alter the behavior of C<[[:punct:]]>).
1107	2099
1108	2100	=end original
1109	2101
1110		任意の~~句読点(特殊)文~~字。
	2102	ASCII の範囲の C<\p{PosixPunct}> と C<[[:punct:]]> は全ての非制御、非英数字、
	2103	非空白文字にマッチングします:
	2104	C<[-!"#$%&'()*+,./:;<=E<gt>?@[\\\]^_`{\|}~]> (しかしロケールが有効なら、
	2105	C<[[:punct:]]> の振る舞いが変わります)。
1111	2106
	2107	=begin original
	2108
	2109	The similarly named property, C<\p{Punct}>, matches a somewhat different
	2110	set in the ASCII range, namely
	2111	C<[-!"#%&'()*,./:;?@[\\\]_{}]>. That is, it is missing the nine
	2112	characters C<[$+E<lt>=E<gt>^`\|~]>.
	2113	This is because Unicode splits what POSIX considers to be punctuation into two
	2114	categories, Punctuation and Symbols.
	2115
	2116	=end original
	2117
	2118	似たような名前の特性 C<\p{Punct}> は、ASCII 範囲の異なる集合である
	2119	C<[-!"#%&'()*,./:;?@[\\\]_{}]> にマッチングします。
	2120	つまり、C<[$+E<lt>=E<gt>^`\|~]> の 9 文字はありません。
	2121	これは、Unicode は POSIX が句読点と考えるものを二つのカテゴリ
	2122	Punctuation と Symbols に分けているからです。
	2123
	2124	=begin original
	2125
	2126	C<\p{XPosixPunct}> and (under Unicode rules) C<[[:punct:]]>, match what
	2127	C<\p{PosixPunct}> matches in the ASCII range, plus what C<\p{Punct}>
	2128	matches. This is different than strictly matching according to
	2129	C<\p{Punct}>. Another way to say it is that
	2130	if Unicode rules are in effect, C<[[:punct:]]> matches all characters
	2131	that Unicode considers punctuation, plus all ASCII-range characters that
	2132	Unicode considers symbols.
	2133
	2134	=end original
	2135
	2136	C<\p{XPosixPunct}> と (Unicode の規則の下での) C<[[:punct:]]> は、
	2137	ASCII の範囲で C<\p{PosixPunct}> がマッチングする物に加えて、
	2138	C<\p{Punct}> がマッチングする物にマッチングします。
	2139	これは C<\p{Punct}> に従って正確にマッチングする物と異なります。
	2140	Unicode 規則が有効な場合のもう一つの言い方は、C<[[:punct:]]> は Unicode が
	2141	句読点として扱うものに加えて、Unicode が "symbols" として扱う ASCII 範囲の
	2142	全ての文字にマッチングします。
	2143
	2144	=item [6]
	2145
	2146	=begin original
	2147
	2148	C<\p{XPerlSpace}> and C<\p{Space}> match identically starting with Perl
	2149	v5.18. In earlier versions, these differ only in that in non-locale
	2150	matching, C<\p{XPerlSpace}> did not match the vertical tab, C<\cK>.
	2151	Same for the two ASCII-only range forms.
	2152
	2153	=end original
	2154
	2155	C<\p{XPerlSpace}> と C<\p{Space}> は、Perl v5.18 からは同じように
	2156	マッチングします。
	2157	以前のバージョンでは、これらの違いは、非ロケールマッチングでは
	2158	C<\p{XPerlSpace}> は垂直タブ C<\cK> にマッチングしなかったということだけです。
	2159	二つの ASCII のみの範囲の形式では同じです。
	2160
	2161	=item [7]
	2162
	2163	=begin original
	2164
	2165	Unlike C<[[:digit:]]> which matches digits in many writing systems, such
	2166	as Thai and Devanagari, there are currently only two sets of hexadecimal
	2167	digits, and it is unlikely that more will be added. This is because you
	2168	not only need the ten digits, but also the six C<[A-F]> (and C<[a-f]>)
	2169	to correspond. That means only the Latin script is suitable for these,
	2170	and Unicode has only two sets of these, the familiar ASCII set, and the
	2171	fullwidth forms starting at U+FF10 (FULLWIDTH DIGIT ZERO).
	2172
	2173	=end original
	2174
	2175	タイ文字やデバナーガリ文字のように多くの書記体系の数字にマッチングする
	2176	C<[[:digit:]]> と異なり、16 進数の二つの集合だけで、これ以上追加されることは
	2177	おそらくありません。
	2178	これは、対応するのに 10 の数字だけでなく、6 個の C<[A-F]> (および C<[a-f]>) も
	2179	必要だからです。
	2180	これは、Latin 用字のみがこれらに適合していて、
	2181	Unicode はこれらの二つの集合、つまり慣れ親しんだ
	2182	ASCII 集合と、U+FF10 (FULLWIDTH DIGIT ZERO) から始まる全角形式のみを
	2183	持つということです。
	2184
1112	2185	=back
1113	2186
1114		=he~~ad4 Ne~~gation
	2187	=begin original
1115	2188
1116		~~(否定)~~
	2189	There are various other synonyms that can be used besides the names
	2190	listed in the table. For example, C<\p{XPosixAlpha}> can be written as
	2191	C<\p{Alpha}>. All are listed in
	2192	L<perluniprops/Properties accessible through \p{} and \P{}>.
1117	2193
	2194	=end original
	2195
	2196	表に挙げられている名前以外にも様々なその他の同義語が使えます。
	2197	例えば、C<\p{XPosixAlpha}> は C<\p{Alpha}> と書けます。
	2198	全ての一覧は
	2199	L<perluniprops/Properties accessible through \p{} and \P{}> に
	2200	あります。
	2201
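Notes [5] and [7] and the synonym rule above can be checked with a few sample characters. This is a sketch; the dollar sign, U+20AC (EURO SIGN), U+FF10 (FULLWIDTH DIGIT ZERO) and U+00E9 are sample inputs chosen here for illustration.

```perl
use strict;
use warnings;

# Note [5]: '$' is a symbol to Unicode but punctuation to POSIX.
my $dollar       = '$';
my $posix_punct  = ($dollar =~ /\p{PosixPunct}/)  ? 1 : 0;  # 1
my $uni_punct    = ($dollar =~ /\p{Punct}/)       ? 1 : 0;  # 0
my $xposix_punct = ($dollar =~ /\p{XPosixPunct}/) ? 1 : 0;  # 1

# Only ASCII-range symbols are added to XPosixPunct; the euro sign is not.
my $euro_xposix = ("\N{U+20AC}" =~ /\p{XPosixPunct}/) ? 1 : 0;  # 0

# Note [7]: the fullwidth digits are the second set of hexadecimal digits.
my $fw_zero    = "\N{U+FF10}";
my $fw_full    = ($fw_zero =~ /\p{XPosixXDigit}/) ? 1 : 0;  # 1
my $fw_ascii   = ($fw_zero =~ /\p{PosixXDigit}/)  ? 1 : 0;  # 0
my $fw_posix   = ($fw_zero =~ /[[:xdigit:]]/)     ? 1 : 0;  # 1: above 255, Full-range rules
my $fw_posix_a = ($fw_zero =~ /[[:xdigit:]]/a)    ? 1 : 0;  # 0: /a restricts to ASCII

# Synonyms: \p{Alpha} is another name for \p{XPosixAlpha}.
my $syn = (("\N{U+E9}" =~ /\p{Alpha}/) && ("\N{U+E9}" =~ /\p{XPosixAlpha}/)) ? 1 : 0;

print "'\$': PosixPunct=$posix_punct Punct=$uni_punct XPosixPunct=$xposix_punct\n";
print "U+20AC: XPosixPunct=$euro_xposix\n";
print "U+FF10: XPosixXDigit=$fw_full PosixXDigit=$fw_ascii ",
      "[[:xdigit:]]=$fw_posix [[:xdigit:]]/a=$fw_posix_a\n";
print "U+00E9 matches both \\p{Alpha} and \\p{XPosixAlpha}: $syn\n";
```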
1118	2202	=begin original
1119	2203
	2204	Both the C<\p> counterparts always assume Unicode rules are in effect.
	2205	On ASCII platforms, this means they assume that the code points from 128
	2206	to 255 are Latin-1, and that means that using them under locale rules is
	2207	unwise unless the locale is guaranteed to be Latin-1 or UTF-8. In contrast, the
	2208	POSIX character classes are useful under locale rules. They are
	2209	affected by the actual rules in effect, as follows:
	2210
	2211	=end original
	2212
	2213	C<\p> に対応するものの両方は常に Unicode の規則が有効であることを仮定します。
	2214	これは、ASCII プラットフォームでは、128 から 255 の符号位置は
	2215	Latin-1 であることを仮定するということで、ロケールの規則の下で
	2216	これらを使うということは、ロケールが Latin-1 か UTF-8 であることが
	2217	保証されていない限り賢明ではないということです。
	2218	一方、POSIX 文字クラスはロケールの規則の下で有用です。
	2219	これらは次のように、実際に有効な規則に影響を受けます:
	2220
	2221	=over
	2222
	2223	=item If the C</a> modifier is in effect ...
	2224
	2225	(C</a> が有効なら...)
	2226
	2227	=begin original
	2228
	2229	Each of the POSIX classes matches exactly the same as its ASCII-range
	2230	counterpart.
	2231
	2232	=end original
	2233
	2234	それぞれの POSIX クラスは ASCII の範囲で対応する正確に同じものに
	2235	マッチングします。
	2236
	2237	=item otherwise ...
	2238
	2239	(さもなければ ...)
	2240
	2241	=over
	2242
	2243	=item For code points above 255 ...
	2244
	2245	(256 以上の符号位置では ...)
	2246
	2247	=begin original
	2248
	2249	The POSIX class matches the same as its Full-range counterpart.
	2250
	2251	=end original
	2252
	2253	POSIX クラスはその Full の範囲で対応する同じものにマッチングします。
	2254
	2255	=item For code points below 256 ...
	2256
	2257	(255 以下の符号位置では ...)
	2258
	2259	=over
	2260
	2261	=item if locale rules are in effect ...
	2262
	2263	(ロケール規則が有効なら ...)
	2264
	2265	=begin original
	2266
	2267	The POSIX class matches according to the locale, except:
	2268
	2269	=end original
	2270
	2271	POSIX クラスはロケールに従ってマッチングします; 例外は:
	2272
	2273	=over
	2274
	2275	=item C<word>
	2276
	2277	=begin original
	2278
	2279	also includes the platform's native underscore character, no matter what
	2280	the locale is.
	2281
	2282	=end original
	2283
	2284	それに加えて、ロケールが何かに関わらず、プラットフォームのネイティブな
	2285	下線文字も含みます。
	2286
	2287	=item C<ascii>
	2288
	2289	=begin original
	2290
	2291	on platforms that don't have the POSIX C<ascii> extension, this matches
	2292	just the platform's native ASCII-range characters.
	2293
	2294	=end original
	2295
	2296	POSIX C<ascii> 拡張を持たないプラットフォームでは、
	2297	これは単にプラットフォームのネイティブな ASCII の範囲の文字に
	2298	マッチングします。
	2299
	2300	=item C<blank>
	2301
	2302	=begin original
	2303
	2304	on platforms that don't have the POSIX C<blank> extension, this matches
	2305	just the platform's native tab and space characters.
	2306
	2307	=end original
	2308
	2309	POSIX C<blank> 拡張を
	2310	持たないプラットフォームでは、
	2311	これは単にプラットフォームのネイティブなタブとスペース文字に
	2312	マッチングします。
	2313
	2314	=back
	2315
	2316	=item if, instead, Unicode rules are in effect ...
	2317
	2318	(そうではなく、Unicode 規則が有効なら ...)
	2319
	2320	=begin original
	2321
	2322	The POSIX class matches the same as the Full-range counterpart.
	2323
	2324	=end original
	2325
	2326	POSIX クラスは Full の範囲の対応する同じものにマッチングします。
	2327
	2328	=item otherwise ...
	2329
	2330	(さもなければ ...)
	2331
	2332	=begin original
	2333
	2334	The POSIX class matches the same as the ASCII range counterpart.
	2335
	2336	=end original
	2337
	2338	POSIX クラスは ASCII の範囲の同じものにマッチングします。
	2339
	2340	=back
	2341
	2342	=back
	2343
	2344	=back
	2345
	2346	=begin original
	2347
	2348	Which rules apply are determined as described in
	2349	L<perlre/Which character set modifier is in effect?>.
	2350
	2351	=end original
	2352
	2353	どの規則を適用するかは L<perlre/Which character set modifier is in effect?> で
	2354	記述されている方法で決定されます。
	2355
	2356	=head4 Negation of POSIX character classes
	2357	X<character class, negation>
	2358
	2359	(POSIX 文字クラスの否定)
	2360
	2361	=begin original
	2362
1120	2363	A Perl extension to the POSIX character class is the ability to
1121	2364	negate it. This is done by prefixing the class name with a caret (C<^>).
1122	2365	Some examples:
1123	2366
1124	2367	=end original
1125	2368
1126	2369	POSIX 文字クラスに対する Perl の拡張は否定の機能です。
1127	2370	これはクラス名の前にキャレット (C<^>) を置くことで実現します。
1128	2371	いくつかの例です:
1129	2372
1130		POSIX Un~~icod~~e Backslash
	2373	 POSIX         ASCII-range     Full-range  backslash
1131		~~[[:^d~~i~~git:]]~~ ~~\P{IsD~~i~~git}~~ \D
	2374	                Unicode         Unicode    sequence
1132		~~[[:^space:]] \P{IsSpace} \S~~
	2375	 -----------------------------------------------------
1133		[[:^~~wor~~d:]] \P{~~IsW~~ord} \W
	2376	 [[:^digit:]]   \P{PosixDigit}  \P{XPosixDigit}   \D
	2377	 [[:^space:]]   \P{PosixSpace}  \P{XPosixSpace}
	2378	                \P{PerlSpace}   \P{XPerlSpace}    \S
	2379	 [[:^word:]]    \P{PerlWord}    \P{XPosixWord}    \W
1134	2380
	2381	=begin original
	2382
	2383	The backslash sequence can mean either ASCII- or Full-range Unicode,
	2384	depending on various factors as described in L<perlre/Which character set modifier is in effect?>.
	2385
	2386	=end original
	2387
	2388	逆スラッシュシーケンスは ASCII- か Full-range Unicode のどちらかを意味します;
	2389	どちらが使われるかは L<perlre/Which character set modifier is in effect?> で
	2390	記述されている様々な要素に依存します。
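As a rough illustration of the table above, the following sketch checks that each negated POSIX class and its backslash counterpart agree on a handful of ASCII sample characters. The sample list and the variable names are assumptions made only for this example.

```perl
use strict;
use warnings;

# Pair each negated POSIX class with the backslash sequence from the table.
my %negated = (
    digit => [ qr/[[:^digit:]]/, qr/\D/ ],
    space => [ qr/[[:^space:]]/, qr/\S/ ],
    word  => [ qr/[[:^word:]]/,  qr/\W/ ],
);

# Illustrative ASCII samples: a digit, a letter, a space, '_' and '-'.
my @samples = ('7', 'x', ' ', '_', '-');
my $disagreements = 0;

for my $name (sort keys %negated) {
    my ($posix, $backslash) = @{ $negated{$name} };
    for my $ch (@samples) {
        my $via_posix     = $ch =~ $posix     ? 1 : 0;
        my $via_backslash = $ch =~ $backslash ? 1 : 0;
        $disagreements++ if $via_posix != $via_backslash;
        printf "%-5s '%s': POSIX=%d backslash=%d\n",
            $name, $ch, $via_posix, $via_backslash;
    }
}

print "disagreements: $disagreements\n";
```

Both forms are compiled under the same character set rules here, so they are expected to agree on every sample.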
	2391
1135	2392	=head4 [= =] and [. .]
1136	2393
1137	2394	([= =] と [. .])
1138	2395
1139	2396	=begin original
1140	2397
1141		Perl ~~will~~ recognize the POSIX character classes C<[=class=]>, and
	2398	Perl recognizes the POSIX character classes C<[=class=]> and
1142		C<[.class.]>, but does not (yet?) support this const~~ruc~~t~~. Us~~e of
	2399	C<[.class.]>, but does not (yet?) support them. Any attempt to use
1143		~~suc~~h a construct wi~~ll l~~ead to an error.
	2400	either construct raises an exception.
1144	2401
1145	2402	=end original
1146	2403
1147	2404	Perl は POSIX 文字クラス C<[=class=]> と C<[.class.]> を認識しますが、
1148		これら~~の構文~~には(まだ?)対応していません。
	2405	これらには(まだ?)対応していません。
1149		このような構文~~の使用はエラー~~を~~引き起こ~~します。
	2406	このような構文を使おうとすると例外が発生します。
1150	2407
1151	2408	=head4 Examples
1152	2409
1153	2410	(例)
1154	2411
1155	2412	=begin original
1156	2413
1157	2414	/[[:digit:]]/ # Matches a character that is a digit.
1158	2415	/[01[:lower:]]/ # Matches a character that is either a
1159	2416	# lowercase letter, or '0' or '1'.
1160		/[[:digit:][:^xdigit:]]/ # Matches a character that can be anything,
	2417	/[[:digit:][:^xdigit:]]/ # Matches a character that can be anything
1161		# but the letters 'a' to 'f' in ~~either~~ ~~case.~~
	2418	# except the letters 'a' to 'f' and 'A' to
1162		# This is because the character ~~class contains~~
	2419	# 'F'. This is because the main character
1163		# all di~~git~~s, and ~~anything~~ t~~hat~~ ~~isn't~~ a
	2420	# class is composed of two POSIX character
1164		# hex ~~digi~~t, re~~sul~~ting ~~in a class c~~onta~~ining~~
	2421	# classes that are ORed together, one that
1165		# all ch~~aract~~ers, but the letters 'a' t~~o 'f'~~
	2422	# matches any digit, and the other that
1166		# and ~~'A'~~ to 'F'.
	2423	# matches anything that isn't a hex digit.
	2424	# The OR adds the digits, leaving only the
	2425	# letters 'a' to 'f' and 'A' to 'F' excluded.
1167	2426
1168	2427	=end original
1169	2428
1170	2429	/[[:digit:]]/ # 数字の文字にマッチングする。
1171	2430	/[01[:lower:]]/ # 小文字、'0'、'1' のいずれかの文字に
1172	2431	# マッチングする。
1173		/[[:digit:][:^xdigit:]]/ # ~~どんな~~文字に~~もマッチングしますが、大文字小文字の~~
	2432	/[[:digit:][:^xdigit:]]/ # 'a' から 'f' と 'A' から 'F' 以外の任意の文字に
1174		# ~~'a' から 'f' を除きます~~。
	2433	# マッチング。これはメインの文字クラスでは二つの
1175		# これは全ての数字~~と 16 進文字でない全ての文字を~~
	2434	# POSIX 文字クラスが OR され、一つは任意の数字に
1176		# 含む文字~~クラス~~なの~~で、このクラスには~~
	2435	# マッチングし、もう一つは 16 進文字でない全ての
1177		# ~~'a'~~ ~~から 'f' および 'A' から 'F'~~ を
	2436	# 文字にマッチングします。OR は数字を加え、
1178		# ~~除く全て~~の~~文字に~~
	2437	# 'a' から 'f' および 'A' から 'F' のみが
1179		# ~~マッチングすることにな~~ります。
	2438	# 除外されて残ります。
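The last example can be checked with a short sketch such as the one below. The two sample lists are assumptions chosen for illustration: the first holds characters the class should accept, the second holds the hex letters it should reject.

```perl
use strict;
use warnings;

# Union of "any digit" and "anything that is not a hex digit".
my $re = qr/\A[[:digit:][:^xdigit:]]\z/;

# Digits and non-hex characters are accepted.
my @accepted = grep { $_ =~ $re } ('0', '9', 'g', 'Z', '-', ' ');

# Only the hex letters are left out.
my @rejected = grep { $_ !~ $re } ('a', 'f', 'A', 'F');

print "accepted: ", scalar(@accepted), " of 6\n";
print "rejected: ", scalar(@rejected), " of 4\n";
```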
1180	2439
1181		=head2 ~~Local~~e~~, U~~n~~ico~~de and ~~UTF-8~~
	2440	=head3 Extended Bracketed Character Classes
	2441	X<character class>
	2442	X<set operations>
1182	2443
1183		(~~ロケール、Unicode、UTF-8~~)
	2444	(拡張大かっこ文字クラス)
1184	2445
1185	2446	=begin original
1186	2447
1187		~~Some~~ of the character classes have a ~~som~~e~~what~~ differ~~ent~~ ~~behavi~~our de~~pending~~
	2448	This is a fancy bracketed character class that can be used for more
1188		~~on the inte~~rnal encod~~ing~~ ~~of th~~e source s~~tring~~, and t~~he l~~o~~cal~~e that is
	2449	readable and less error-prone classes, and to perform set operations,
1189		in effect.
	2450	such as intersection. An example is
1190	2451
1191	2452	=end original
1192	2453
1193		~~ソース~~文字~~列の内部~~エンコー~~ディングと有効なロケールに依存~~し~~て少し異なった~~
	2454	これはしゃれた大かっこ文字クラスで、より読みやすく、エラーが発生しにくい
1194		~~振る舞いをする文字~~クラス~~もあり~~ます。
	2455	クラスや、交差などの集合演算を実行するために使用できます。
	2456	例は:
1195	2457
	2458	/(?[ \p{Thai} & \p{Digit} ])/
	2459
1196	2460	=begin original
1197	2461
1198		~~C<\~~w>, ~~C<\d>,~~ ~~C<\s>~~ and the ~~POSIX~~ character cla~~sses~~ (and their ~~neg~~ations,
	2462	This will match all the digit characters that are in the Thai script.
1199		including C<\W>, C<\D>, C<\S>) suffer from this behaviour.
1200	2463
1201	2464	=end original
1202	2465
1203		~~C<\w>, C<\d>, C<\s> および POSIX 文字クラ~~ス ~~(および C<\W>, C<\D>, C<\S> を~~
	2466	これは、タイ用字に含まれるすべての数字文字にマッチングします。
1204		含むこれらの否定) はこの振る舞いの影響を受けます。
1205	2467
1206	2468	=begin original
1207	2469
1208		Th~~e rule~~ is ~~that i~~f the source ~~str~~ing i~~s i~~n ~~UTF-~~8 ~~form~~at, the ~~cha~~racter
	2470	This feature became available in Perl 5.18, as experimental; accepted in
1209		~~classes match according to the Unicode properties~~. ~~If the source string~~
	2471	5.36.
1210		isn't, then the character classes match according to whatever locale is
1211		in effect. If there is no locale, they match the ASCII defaults
1212		(52 letters, 10 digits and underscore for C<\w>, 0 to 9 for C<\d>, etc).
1213	2472
1214	2473	=end original
1215	2474
1216		~~ソース文字列が UTF-8 形式なら、文字クラス~~は ~~Unicod~~e 特性に~~従って~~
	2475	この機能は Perl 5.18 で実験的に利用可能になりました;
1217		~~マッチングするという規則~~です。
	2476	5.36 で受け入れられました。
1218		ソース文字列が UTF-8 形式ではなければ、文字クラスはロケールが
1219		有効かどうかに従ってマッチングします。
1220		ロケールがなければ、ASCII のデフォルト (C<\w> では 52 の英字、10 の数字と
1221		下線、C<\d> では 0 から 9など) にマッチングします。
1222	2477
1223	2478	=begin original
1224	2479
1225		This u~~sua~~l~~ly m~~eans ~~that~~ if you are ~~matching again~~st ~~cha~~racters whose C<o~~rd()>~~
	2480	The rules used by L<C<use re 'strict'>|re/'strict' mode> apply to this
1226		~~values are between 128 and 255 in~~c~~lusive, y~~o~~ur charac~~ter c~~lass may ma~~tch
	2481	construct.
1227		or not depending on the current locale, and whether the source string is
1228		in UTF-8 format. The string will be in UTF-8 format if it contains
1229		characters whose C<ord()> value exceeds 255. But a string may be in UTF-8
1230		format without it having such characters.
1231	2482
1232	2483	=end original
1233	2484
1234		~~これは普通、~~C<ord()> の~~値が 128 から 255 の範囲の~~文字に~~マッチングするなら、~~
	2485	L<C<use re 'strict'>|re/'strict' mode> で使われる規則はこの構文に
1235		~~その文字クラスは現在のロケールおよびソース文字列が UTF-8 形式かどうかに~~
	2486	適用されます。
1236		依存してマッチングしたりしなかったりします。
1237		C<ord()> 値が 255 を超える文字が含まれているなら文字列は UTF-8 形式です。
1238		しかしそのような文字がなくても UTF-8 形式かもしれません。
1239	2487
1240	2488	=begin original
1241	2489
1242		~~For~~ ~~port~~a~~bility~~ re~~aso~~ns, it ~~may b~~e better to ~~not us~~e ~~C<\w>, C<\d>, C<\s>~~
	2490	We can extend the example above:
1243		or the POSIX character classes, and use the Unicode properties instead.
1244	2491
1245	2492	=end original
1246	2493
1247		~~移植性~~の~~理由により、C<\w>, C<\d>, C<\s> や POSIX 文字クラスは使わず、~~
	2494	上記の例を拡張できます:
1248		Unicode 特性を使う方が良いです。
1249	2495
1250		=head4 Examp~~les~~
	2496	/(?[ ( \p{Thai} + \p{Lao} ) & \p{Digit} ])/
1251	2497
1252		~~(例)~~
	2498	=begin original
1253	2499
	2500	This matches digits that are in either the Thai or Laotian scripts.
	2501
	2502	=end original
	2503
	2504	これはタイ語またはラオス語のいずれかの数字と一致します。
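A minimal sketch of the two patterns above, assuming Perl 5.18 or later. The sample code points (U+0E51, U+0ED1, U+0E01) and the variable names are chosen only for illustration.

```perl
use 5.018;
use strict;
use warnings;
no warnings 'experimental::regex_sets';   # only needed before Perl 5.36

my $thai_digit        = qr/\A(?[ \p{Thai} & \p{Digit} ])\z/;
my $thai_or_lao_digit = qr/\A(?[ ( \p{Thai} + \p{Lao} ) & \p{Digit} ])\z/;

# Illustrative samples, identified by their Unicode names.
my @samples = (
    [ 'THAI DIGIT ONE'        => "\x{0E51}" ],
    [ 'LAO DIGIT ONE'         => "\x{0ED1}" ],
    [ 'DIGIT ONE (ASCII)'     => '1'        ],
    [ 'THAI CHARACTER KO KAI' => "\x{0E01}" ],
);

my %result;
for my $sample (@samples) {
    my ($name, $char) = @$sample;
    $result{$name} = [
        ($char =~ $thai_digit        ? 1 : 0),
        ($char =~ $thai_or_lao_digit ? 1 : 0),
    ];
    printf "%-22s thai=%d thai_or_lao=%d\n", $name, @{ $result{$name} };
}
```

Only the names are printed, not the characters themselves, so no output encoding layer is required.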
	2505
1254	2506	=begin original
1255	2507
1256		$str = ~~"\xDF";~~ ~~# $~~str is ~~not in UTF-8 form~~at.
	2508	Notice the white space in these examples. This construct always has
1257		$str ~~=~ /^\w/; # N~~o m~~atch,~~ ~~as $s~~tr isn't in ~~UTF-8 forma~~t.
	2509	the C<E<sol>xx> modifier turned on within it.
1258		$str .= "\x{0e0b}"; # Now $str is in UTF-8 format.
1259		$str =~ /^\w/; # Match! $str is now in UTF-8 format.
1260		chop $str;
1261		$str =~ /^\w/; # Still a match! $str remains in UTF-8 format.
1262	2510
1263	2511	=end original
1264	2512
1265		~~$str = "\xDF"; # $str は UTF-8 形式ではな~~い。
	2513	これらの例の中の空白に注意してください。
1266		$str ~~=~ /^\w/; # マッチ~~ングしない~~; $str は UTF-8 形式ではない~~。
	2514	この構文では、その中では常に C<E<sol>xx> 修飾子がオンになっています。
1267		$str .= "\x{0e0b}"; # ここで $str は UTF-8 形式。
1268		$str =~ /^\w/; # マッチング! $str は UTF-8 形式。
1269		chop $str;
1270		$str =~ /^\w/; # まだマッチング! $str は UTF-8 形式のまま。
1271	2515
	2516	=begin original
	2517
	2518	The available binary operators are:
	2519
	2520	=end original
	2521
	2522	使用可能な 2 項演算子は次のとおりです:
	2523
	2524	 &  intersection
	2525	 +  union
	2526	 |  another name for '+', hence means union
	2527	 -  subtraction (the result matches the set consisting of those
	2528	    code points matched by the first operand, excluding any that
	2529	    are also matched by the second operand)
	2530	 ^  symmetric difference (the union minus the intersection).  This
	2531	    is like an exclusive or, in that the result is the set of code
	2532	    points that are matched by either, but not both, of the
	2533	    operands.
	2534
	2535	=begin original
	2536
	2537	There is one unary operator:
	2538
	2539	=end original
	2540
	2541	単項演算子が一つあります。
	2542
	2543	! complement
	2544
	2545	=begin original
	2546
	2547	All the binary operators left associate; C<"&"> is higher precedence
	2548	than the others, which all have equal precedence. The unary operator
	2549	right associates, and has highest precedence. Thus this follows the
	2550	normal Perl precedence rules for logical operators. Use parentheses to
	2551	override the default precedence and associativity.
	2552
	2553	=end original
	2554
	2555	すべての二項演算子は左結合です; C<"&"> はその他よりも高い優先順位を持ち、
	2556	それ以外は同等の優先順位を持ちます。
	2557	単項演算子は右結合で、最も高い優先順位を持ちます。
	2558	従って、これは通常の Perl の論理演算子に関する優先順位規則に従います。
	2559	デフォルトの優先順位と結合を上書きするにはかっこを使います。
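The operators and the precedence rule can be tried out with a sketch like the following. It assumes Perl 5.26 or later, where `&` binds more tightly than the other binary operators. The helper `members` and the letter ranges are hypothetical and exist only for this example.

```perl
use 5.026;
use strict;
use warnings;
no warnings 'experimental::regex_sets';   # only needed before Perl 5.36

# Return the letters 'a' .. 'h' that the given one-character pattern matches.
sub members {
    my ($re) = @_;
    return join '', grep { $_ =~ $re } 'a' .. 'h';
}

my %set = (
    intersection   => members(qr/(?[ [a-d] & [c-f] ])/),
    union          => members(qr/(?[ [a-b] + [e-f] ])/),
    union_alias    => members(qr/(?[ [a-b] | [e-f] ])/),
    subtraction    => members(qr/(?[ [a-f] - [c-d] ])/),
    symmetric_diff => members(qr/(?[ [a-d] ^ [c-f] ])/),
    complement     => members(qr/(?[ ! [a-f] ])/),
    # "&" binds tighter than "+": this is [a] + ( [b-c] & [c-d] ).
    default_prec   => members(qr/(?[ [a] + [b-c] & [c-d] ])/),
    # Parentheses force the union to happen first.
    with_parens    => members(qr/(?[ ( [a] + [b-c] ) & [c-d] ])/),
);

printf "%-15s %s\n", $_, $set{$_} for sort keys %set;
```

The last two entries differ only in the parentheses, which is enough to change the resulting set.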
	2560
	2561	=begin original
	2562
	2563	The main restriction is that everything is a metacharacter. Thus,
	2564	you cannot refer to single characters by doing something like this:
	2565
	2566	=end original
	2567
	2568	主な制限は、すべてがメタ文字であるということです。
	2569	したがって、以下のようにして単一文字を参照することはできません:
	2570
	2571	/(?[ a + b ])/ # Syntax error!
	2572
	2573	=begin original
	2574
	2575	The easiest way to specify an individual typable character is to enclose
	2576	it in brackets:
	2577
	2578	=end original
	2579
	2580	タイプ可能な個々の文字を指定する最も簡単な方法は、次のように
	2581	かっこで囲むことです:
	2582
	2583	/(?[ [a] + [b] ])/
	2584
	2585	=begin original
	2586
	2587	(This is the same thing as C<[ab]>.) You could also have said the
	2588	equivalent:
	2589
	2590	=end original
	2591
	2592	(これは C<[ab]> と同じことです。)
	2593	等価なものとして、次のようにも書けます:
	2594
	2595	/(?[[ a b ]])/
	2596
	2597	=begin original
	2598
	2599	(You can, of course, specify single characters by using C<\x{...}>,
	2600	C<\N{...}>, etc.)
	2601
	2602	=end original
	2603
	2604	(もちろん、C<\x{...}> や C<\N{...}> などを使用して 1 文字を
	2605	指定することもできます。)
	2606
	2607	=begin original
	2608
	2609	This last example shows the use of this construct to specify an ordinary
	2610	bracketed character class without additional set operations. Note the
	2611	white space within it. This is allowed because C<E<sol>xx> is
	2612	automatically turned on within this construct.
	2613
	2614	=end original
	2615
	2616	この最後の例では、この構文を使用して、追加の集合操作なしで
	2617	通常の大かっこ文字クラスを指定する方法を示しています。
	2618	この中に空白があることに注意してください。
	2619	C<E<sol>xx> は、この構文の内側で自動的に有効になるのでこれが許されます。
	2620
	2621	=begin original
	2622
	2623	All the other escapes accepted by normal bracketed character classes are
	2624	accepted here as well.
	2625
	2626	=end original
	2627
	2628	通常の大かっこ文字クラスで受け入れられる他のエスケープは
	2629	すべてここでも受け入れられます。
	2630
	2631	=begin original
	2632
	2633	Because this construct compiles under
	2634	L<C<use re 'strict'>|re/'strict' mode>, unrecognized escapes that
	2635	generate warnings in normal classes are fatal errors here, as well as
	2636	all other warnings from these class elements, as well as some
	2637	practices that don't currently warn outside C<re 'strict'>. For example
	2638	you cannot say
	2639
	2640	=end original
	2641
	2642	この構文は L<C<use re 'strict'>|re/'strict' mode> の下でコンパイルされるので、
	2643	通常のクラスで警告を生成する
	2644	認識されないエスケープはここでは致命的なエラーです;
	2645	これらのクラス要素からのその他すべての警告も同様で、
	2646	C<re 'strict'> の外側では、現在警告していないいくつかのプラクティスも
	2647	同様です。
	2648	例えば次のようにはできません:
	2649
	2650	/(?[ [ \xF ] ])/ # Syntax error!
	2651
	2652	=begin original
	2653
	2654	You have to have two hex digits after a braceless C<\x> (use a leading
	2655	zero to make two). These restrictions are to lower the incidence of
	2656	typos causing the class to not match what you thought it would.
	2657
	2658	=end original
	2659
	2660	中かっこのない C<\x> の後には 2 桁の 16 進数が必要です(2 桁にするには
	2661	先頭の 0 を使用します)。
	2662	これらの制限は、クラスが想定したものと一致しない原因となる
	2663	タイプミスの発生を減らすためです。
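The two-hex-digit rule can be observed without aborting the program by compiling the pattern at run time inside `eval`. This is only a sketch: the helper name `compiles` is hypothetical, and the patterns are passed as strings so that the fatal error is raised at run time, where `eval` can trap it.

```perl
use 5.018;
use strict;
use warnings;
no warnings 'experimental::regex_sets';   # only needed before Perl 5.36

# Compile a pattern given as a string and report whether it compiled.
sub compiles {
    my ($pattern) = @_;
    return eval { my $re = qr/$pattern/; 1 } ? 1 : 0;
}

my $one_digit  = compiles('(?[ [ \xF ] ])');    # one hex digit after \x
my $two_digits = compiles('(?[ [ \x0F ] ])');   # padded with a leading zero
my $braced     = compiles('(?[ [ \x{F} ] ])');  # braces make the length explicit

print "one digit:  $one_digit\n";
print "two digits: $two_digits\n";
print "braced:     $braced\n";
```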
	2664
	2665	=begin original
	2666
	2667	If a regular bracketed character class contains a C<\p{}> or C<\P{}> and
	2668	is matched against a non-Unicode code point, a warning may be
	2669	raised, as the result is not Unicode-defined. No such warning will come
	2670	when using this extended form.
	2671
	2672	=end original
	2673
	2674	通常の大かっこ文字クラスに C<\p{}> や C<\P{}> が含まれていて、
	2675	非 Unicode 符号位置に対してマッチングした場合、
	2676	結果は Unicode で定義されていないので、警告が発生することがあります。
	2677	このような警告は、拡張形式を使った場合は発生しません。
	2678
	2679	=begin original
	2680
	2681	The final difference between regular bracketed character classes and
	2682	these, is that it is not possible to get these to match a
	2683	multi-character fold. Thus,
	2684
	2685	=end original
	2686
	2687	通常の大かっこ文字クラスとこれらのクラスの最後の違いは、
	2688	これらを複数文字畳み込みにマッチングさせることができないということです。
	2689	従って:
	2690
	2691	/(?[ [\xDF] ])/iu
	2692
	2693	=begin original
	2694
	2695	does not match the string C<ss>.
	2696
	2697	=end original
	2698
	2699	は文字列 C<ss> と一致しません。
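The difference can be seen by matching the same strings against both forms. This is a sketch assuming Perl 5.18 or later, and the hash keys are names invented for this example.

```perl
use 5.018;
use strict;
use warnings;
no warnings 'experimental::regex_sets';   # only needed before Perl 5.36

my $sharp_s = "\xDF";   # LATIN SMALL LETTER SHARP S

my %matched = (
    # A regular bracketed class may match the multi-character fold "ss".
    plain_vs_ss       => ("ss"     =~ /\A[\xDF]\z/iu        ? 1 : 0),
    # The extended form never matches more than one character.
    extended_vs_ss    => ("ss"     =~ /\A(?[ [\xDF] ])\z/iu ? 1 : 0),
    # The character itself still matches the extended form.
    extended_vs_sharp => ($sharp_s =~ /\A(?[ [\xDF] ])\z/iu ? 1 : 0),
);

printf "%-18s %d\n", $_, $matched{$_} for sort keys %matched;
```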
	2700
	2701	=begin original
	2702
	2703	You don't have to enclose POSIX class names inside double brackets,
	2704	hence both of the following work:
	2705
	2706	=end original
	2707
	2708	POSIX クラス名を二重かっこで囲む必要はありません;
	2709	そのため、以下の両方とも動作します:
	2710
	2711	/(?[ [:word:] - [:lower:] ])/
	2712	/(?[ [[:word:]] - [[:lower:]] ])/
	2713
	2714	=begin original
	2715
	2716	Any contained POSIX character classes, including things like C<\w> and C<\D>
	2717	respect the C<E<sol>a> (and C<E<sol>aa>) modifiers.
	2718
	2719	=end original
	2720
	2721	C<\w> や C<\D> などの POSIX 文字クラスは、C<E<sol>a>
	2722	(および C<E<sol>aa> )修飾子を尊重します。
	2723
	2724	=begin original
	2725
	2726	Note that C<< (?[ ]) >> is a regex-compile-time construct. Any attempt
	2727	to use something which isn't knowable at the time the containing regular
	2728	expression is compiled is a fatal error. In practice, this means
	2729	just three limitations:
	2730
	2731	=end original
	2732
	2733	C<< (?[ ]) >> はコンパイル時正規表現構文であることに注意してください。
	2734	これを含む正規表現がコンパイルされる時点で分からないものを使おうとすると、
	2735	致命的なエラーになります。
	2736	実際には、これは三つの制限を意味します:
	2737
	2738	=over 4
	2739
	2740	=item 1
	2741
	2742	=begin original
	2743
	2744	When compiled within the scope of C<use locale> (or the C<E<sol>l> regex
	2745	modifier), this construct assumes that the execution-time locale will be
	2746	a UTF-8 one, and the generated pattern always uses Unicode rules. What
	2747	gets matched or not thus isn't dependent on the actual runtime locale, so
	2748	tainting is not enabled. But a C<locale> category warning is raised
	2749	if the runtime locale turns out to not be UTF-8.
	2750
	2751	=end original
	2752
	2753	C<use locale> (または C<E<sol>l> 正規表現修飾子)の
	2754	スコープ内でコンパイルされると、この構文は実行時ロケールが
	2755	UTF-8 のものであることを仮定し、
	2756	生成されたパターンは常に Unicode の規則を使います。
	2757	従ってマッチングするかどうかは実際の実行時ロケールには関係なく、
	2758	汚染チェックモードは有効になりません。
	2759	しかし、実行時ロケールが UTF-8 以外になると、
	2760	C<locale> カテゴリの警告が発生します。
	2761
	2762	=item 2
	2763
	2764	=begin original
	2765
	2766	Any
	2767	L<user-defined property|perlunicode/"User-Defined Character Properties">
	2768	used must be already defined by the time the regular expression is
	2769	compiled (but note that this construct can be used instead of such
	2770	properties).
	2771
	2772	=end original
	2773
	2774	使用される
	2775	L<ユーザー定義特性|perlunicode/"User-Defined Character Properties"> は、
	2776	正規表現がコンパイルされるときにすでに定義されている必要があります
	2777	(ただし、このような特性の代わりにこの構文を使用することもできます)。
	2778
	2779	=item 3
	2780
	2781	=begin original
	2782
	2783	A regular expression that otherwise would compile
	2784	using C<E<sol>d> rules, and which uses this construct will instead
	2785	use C<E<sol>u>. Thus this construct tells Perl that you don't want
	2786	C<E<sol>d> rules for the entire regular expression containing it.
	2787
	2788	=end original
	2789
	2790	C<E<sol>d> 規則を使用してコンパイルされ、この構文を使用する正規表現は、
	2791	代わりに C<E<sol>u> を使用します。
	2792	したがって、この構文は、これを含む正規表現全体に対して
	2793	C<E<sol>d> 規則を使いたくないことを Perl に伝えます。
	2794
	2795	=back
	2796
	2797	=begin original
	2798
	2799	Note that skipping white space applies only to the interior of this
	2800	construct. There must not be any space between any of the characters
	2801	that form the initial C<(?[>. Nor may there be space between the
	2802	closing C<])> characters.
	2803
	2804	=end original
	2805
	2806	空白のスキップは、この構文の内部にのみ適用されることに注意してください。
	2807	最初の C<(?[> を形成する文字の間に空白を入れることはできません。
	2808	また、終わりの C<])> 文字の間に空白を入れることもできません。
	2809
	2810	=begin original
	2811
	2812	Just as in all regular expressions, the pattern can be built up by
	2813	including variables that are interpolated at regex compilation time.
	2814	But currently each such sub-component should be an already-compiled
	2815	extended bracketed character class.
	2816
	2817	=end original
	2818
	2819	すべての正規表現と同様に、正規表現コンパイル時に補完される変数を
	2820	含めることでパターンを構築できます。
	2821	しかし、現在の所、このような部分要素のそれぞれは
	2822	すでにコンパイルされた拡張大かっこ文字クラスであるべきです。
	2823
	2824	my $thai_or_lao = qr/(?[ \p{Thai} + \p{Lao} ])/;
	2825	...
	2826	qr/(?[ \p{Digit} & $thai_or_lao ])/;
	2827
	2828	=begin original
	2829
	2830	If you interpolate something else, the pattern may still compile (or it
	2831	may die), but if it compiles, it very well may not behave as you would
	2832	expect:
	2833
	2834	=end original
	2835
	2836	何か違うものを変数展開しても、パターンはコンパイルされるかもしれません
	2837	(あるいは die するかもしれません)が、コンパイルされた場合、想像しているものと
	2838	かなり違う振る舞いになるかもしれません:
	2839
	2840	my $thai_or_lao = '\p{Thai} + \p{Lao}';
	2841	qr/(?[ \p{Digit} & $thai_or_lao ])/;
	2842
	2843	=begin original
	2844
	2845	compiles to
	2846
	2847	=end original
	2848
	2849	これは次のようにコンパイルされます:
	2850
	2851	qr/(?[ \p{Digit} & \p{Thai} + \p{Lao} ])/;
	2852
	2853	=begin original
	2854
	2855	This does not have the effect that someone reading the source code
	2856	would likely expect, as the intersection applies just to C<\p{Thai}>,
	2857	excluding the Laotian.
	2858
	2859	=end original
	2860
	2861	これは、ソースコードを読んでいる人が期待するような効果はありません;
	2862	なぜなら、この交差は C<\p{Thai}> だけに適用され、ラオス語には
	2863	適用されないからです。
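The pitfall can be reproduced with one Lao letter and one Lao digit. This is a sketch assuming Perl 5.18 or later. The sample code points (U+0E81, U+0ED1) and the variable names are assumptions for illustration.

```perl
use 5.018;
use strict;
use warnings;
no warnings 'experimental::regex_sets';   # only needed before Perl 5.36

my $lao_letter = "\x{0E81}";   # LAO LETTER KO, a letter and not a digit
my $lao_digit  = "\x{0ED1}";   # LAO DIGIT ONE

# Recommended: interpolate an already-compiled extended class.
my $compiled = qr/(?[ \p{Thai} + \p{Lao} ])/;
my $safe     = qr/\A(?[ \p{Digit} & $compiled ])\z/;

# Risky: interpolate plain text, which is spliced in before parsing, so
# the intersection applies only to \p{Thai}.
my $text    = '\p{Thai} + \p{Lao}';
my $spliced = qr/\A(?[ \p{Digit} & $text ])\z/;

my %matched = (
    safe_letter    => ($lao_letter =~ $safe    ? 1 : 0),
    safe_digit     => ($lao_digit  =~ $safe    ? 1 : 0),
    spliced_letter => ($lao_letter =~ $spliced ? 1 : 0),
    spliced_digit  => ($lao_digit  =~ $spliced ? 1 : 0),
);

printf "%-15s %d\n", $_, $matched{$_} for sort keys %matched;
```

The spliced pattern accepts the Lao letter because the union with `\p{Lao}` is applied after the intersection, while the compiled sub-pattern behaves as a single operand.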
	2864
	2865	=begin original
	2866
	2867	Due to the way that Perl parses things, your parentheses and brackets
	2868	may need to be balanced, even including comments. If you run into any
	2869	examples, please submit them to L<https://github.com/Perl/perl5/issues>,
	2870	so that we can have a concrete example for this man page.
	2871
	2872	=end original
	2873
	2874	Perl の構文解析方法のため、コメントの中も含めて、かっこと大かっこの
	2875	バランスが取れている必要がある場合があります。
	2876	もし何か例を見つけたら、L<https://github.com/Perl/perl5/issues> に
	2877	登録してください;
	2878	そうすれば、この man ページの具体的な例を得ることができます。
	2879
1272	2880	=begin meta
1273	2881
1274		Translate: SHIRAKATA Kentaro <argrath@ub32.org> (5.10.1)
	2882	Translate: SHIRAKATA Kentaro <argrath@ub32.org> (5.10.1-)
1275	2883	Status: completed
1276	2884
1277	2885	=end meta
1278
1279		=cut
