Differences between URI::Escape 1.35 and 1.54

1	1
2	2	=encoding euc-jp
3	3
4	4	=head1 NAME
5	5
6	6	=begin original
7	7
8	8	URI::Escape - Escape and unescape unsafe characters
9	9
10	10	=end original
11	11
12	13	URI::Escape - Escape and unescape unsafe characters
13	13
14	14	=head1 SYNOPSIS
15	15
16	16	use URI::Escape;
17	17	$safe = uri_escape("10% is enough\n");
18	18	$verysafe = uri_escape("foo", "\0-\377");
19	19	$str = uri_unescape($safe);
20	20
21	21	=head1 DESCRIPTION
22	22
23	23	=begin original
24	24
25	25	This module provides functions to escape and unescape URI strings as
26		defined by RFC 2396 ~~(and updated by RFC 2732)~~.
	26	defined by RFC 3986.
27		A URI consists of a restricted set of characters,
28		denoted as C<uric> in RFC 2396. The restricted set of characters
29		consists of digits, letters, and a few graphic symbols chosen from
30		those common to most of the character encodings and input facilities
31		available to Internet users:
32	27
33	28	=end original
34	29
35		As defined in RFC 2396 ~~(and updated by RFC 2732)~~,
	30	As defined in RFC 3986,
36	31	this module provides functions for escaping and unescaping URI strings.
37		~~A URI consists of the restricted set of characters denoted as C<uric> in RFC 2396.~~
	33	=begin original
	34
	35	A URI consists of a restricted set of characters. The restricted set
	36	of characters consists of digits, letters, and a few graphic symbols
	37	chosen from those common to most of the character encodings and input
	38	facilities available to Internet users. They are made up of the
	39	"unreserved" and "reserved" character sets as defined in RFC 3986.
	40
	41	=end original
	42
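	43	A URI consists of a restricted set of characters.
38	44	The restricted set consists of digits, letters, and a few graphic symbols
39	45	chosen from those common to most character encodings and to the input
40		facilities available to Internet users:
	46	facilities available to Internet users.
	47	They are made up of the "unreserved" and "reserved" character sets
	48	defined in RFC 3986.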
41	49
42		"A" .. ~~"Z",~~ "a" .. "z", "0" .. "9",
	50	unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
43		~~";",~~ ~~"/",~~ ~~"?",~~ ":", ~~"@",~~ "&", ~~"=",~~ "+", ~~"$",~~ ",", "[", "]", ~~# reserved~~
	51	reserved = ":" / "/" / "?" / "#" / "[" / "]" / "@"
44		~~"-",~~ ~~"_",~~ ~~".",~~ "!", "~", "*", "'", "(", ")"
	52	"!" / "$" / "&" / "'" / "(" / ")"
	53	/ "*" / "+" / "," / ";" / "="
45	54
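As a small illustration of the two character sets listed above, the following sketch escapes one string made only of unreserved characters and one that contains reserved delimiters. The sample strings are made up for this example; the expected results hold for the default set of either version in this diff.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape);

# Unreserved characters (ALPHA / DIGIT / "-" / "." / "_" / "~") pass through.
my $unreserved = uri_escape("AZaz09-._~");

# Reserved delimiters are replaced by %XX triplets when treated as data.
my $reserved = uri_escape("a/b?c=d&e");

print "$unreserved\n";   # AZaz09-._~
print "$reserved\n";     # a%2Fb%3Fc%3Dd%26e
```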
46	55	=begin original
47	56
48	57	In addition, any byte (octet) can be represented in a URI by an escape
49	58	sequence: a triplet consisting of the character "%" followed by two
50	59	hexadecimal digits. A byte can also be represented directly by a
51		character, using the US-ASCII character for that octet ~~(iff the~~
	60	character, using the US-ASCII character for that octet.
52		character is part of C<uric>).
53	61
54	62	=end original
55	63
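56	64	In addition, any byte (octet) can be represented in a URI by an escape sequence:
57	65	a triplet consisting of the character "%" followed by two hexadecimal digits.
58		A byte can also be represented directly, using the US-ASCII character for it
	66	A byte can also be represented directly, using the US-ASCII character for it.
59		(if the character is part of C<uric>).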
60	67
61	68	=begin original
62	69
63		Some of the ~~C<uri~~c~~> c~~haracters are I<reserved> for use as delimiters
	70	Some of the characters are I<reserved> for use as delimiters or as
64		~~or as~~ part of certain URI components. These must be escaped if they are
	71	part of certain URI components. These must be escaped if they are to
65		to be treated as ordinary data. Read RFC 2396 for further details.
	72	be treated as ordinary data. Read RFC 3986 for further details.
66	73
67	74	=end original
68	75
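69		Some of the ~~C<uric>~~ characters are reserved for use as delimiters or as part of
	76	Some of the characters are reserved for use as delimiters or as part of
70	77	certain URI components.
71	78	They must be escaped if they are to be treated as ordinary data.
72		Read RFC 2396 for further details.
	79	Read RFC 3986 for further details.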
73	80
74	81	=begin original
75	82
76	83	The functions provided (and exported by default) from this module are:
77	84
78	85	=end original
79	86
80	87	The functions provided by this module (and exported by default)
81	88	are as follows:
82	89
83	90	=over 4
84	91
85	92	=item uri_escape( $string )
86	93
87	94	=item uri_escape( $string, $unsafe )
88	95
89	96	=begin original
90	97
91	98	Replaces each unsafe character in the $string with the corresponding
92	99	escape sequence and returns the result. The $string argument should
93	100	be a string of bytes. The uri_escape() function will croak if given a
94	101	character with code above 255. Use uri_escape_utf8() if you know you
95	102	have such chars or/and want chars in the 128 .. 255 range treated as
96	103	UTF-8.
97	104
98	105	=end original
99	106
100	107	Replaces each unsafe character in $string with the corresponding escape sequence
101	108	and returns the result.
102		The $string argument should
103		be a string of bytes. The uri_escape() function will croak if given a
104		characters with code above 255. Use uri_escape_utf8() if you know you
105		have such chars or/and want chars in the 128 .. 255 range treated as
106		UTF-8.
107		(TBT)
	109	The $string argument must be a string of bytes.
	110	The uri_escape() function croaks if it is given a character with a code point
	111	above 255.
	112	If you know the string contains such characters, or if you
	113	want the 128 .. 255 range treated as UTF-8,
	114	use uri_escape_utf8() instead.
108	115
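The byte-string requirement can be seen in a short sketch. The sample string is invented for this example, and the exact wording of the error message is not assumed here, only that uri_escape() dies on a character above 255 as the paragraph above states.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape uri_escape_utf8);

# Sample text containing U+263A, a code point above 255.
my $wide = "caf\x{e9} \x{263A}";

# uri_escape() expects bytes, so it croaks on the wide character.
my $escaped = eval { uri_escape($wide) };
my $error   = $@;
print "uri_escape failed: $error" unless defined $escaped;

# uri_escape_utf8() encodes to UTF-8 first, so it accepts the same string.
my $ok = uri_escape_utf8($wide);
print "$ok\n";   # caf%C3%A9%20%E2%98%BA
```

Wrapping the call in eval is only one way to handle the failure; validating or encoding the input beforehand would work as well.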
109	116	=begin original
110	117
111	118	The uri_escape() function takes an optional second argument that
112	119	overrides the set of characters that are to be escaped. The set is
113	120	specified as a string that can be used in a regular expression
114	121	character class (between [ ]). E.g.:
115	122
116	123	=end original
117	124
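118	125	The uri_escape() function takes an optional second argument that overrides
119	126	the set of characters to be escaped.
120	127	The set is specified as a string that can be used in a regular expression
121	128	character class (between [ ]).
122	129	For example: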
123	130
124	131	"\x00-\x1f\x7f-\xff" # all control and hi-bit characters
125	132	"a-z" # all lower case characters
126	133	"^A-Za-z" # everything not a letter
127	134
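The three sets above can be tried on one input to see how the second argument changes the result. This is a sketch with a made-up sample string; the variable names are only illustrative.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape);

my $text = "Hello World\n";

# Escape only control and high-bit characters: the newline is the only match.
my $control = uri_escape($text, "\x00-\x1f\x7f-\xff");

# Escape only lower case letters.
my $lower = uri_escape($text, "a-z");

# Escape everything that is not a letter (the leading ^ negates the class).
my $nonletter = uri_escape($text, "^A-Za-z");

print "$control\n";     # Hello World%0A
print "$lower";         # H%65%6C%6C%6F W%6F%72%6C%64 (followed by the raw newline)
print "$nonletter\n";   # Hello%20World%0A
```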
128	135	=begin original
129	136
130	137	The default set of characters to be escaped is all those which are
131		I<not> part of the C<uric> character class shown above as well ~~as the~~
	138	I<not> part of the C<unreserved> character class shown above as well
132		reserved characters. I.e. the default is:
	139	as the reserved characters. I.e. the default is:
133	140
134	141	=end original
135	142
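136		The default set of characters to be escaped is all those that are I<not> part of the ~~C<uric>~~
	143	The default set of characters to be escaped is all those that are I<not> part of the C<unreserved>
137		character class shown above, plus the reserved characters.
	144	character class shown above, plus the reserved characters.
138	145	That is, the default is: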
139	146
140		"^A-Za-z0-9\-_.!~*'()"
	147	"^A-Za-z0-9\-\._~"
141	148
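To see what the change of default means in practice, both default strings from this diff can be passed explicitly as the second argument, which makes the comparison independent of the installed version. The sample text is invented for this sketch.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape);

my $old_default = q{^A-Za-z0-9\-_.!~*'()};   # default shown for 1.35
my $new_default = q{^A-Za-z0-9\-\._~};       # default shown for 1.54

my $text = "it's (fine)!";

# Under the old set, ! * ' ( ) are left alone; under the new set they are escaped.
my $with_old = uri_escape($text, $old_default);
my $with_new = uri_escape($text, $new_default);

print "$with_old\n";   # it's%20(fine)!
print "$with_new\n";   # it%27s%20%28fine%29%21
```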
142	149	=item uri_escape_utf8( $string )
143	150
144	151	=item uri_escape_utf8( $string, $unsafe )
145	152
146	153	=begin original
147	154
148	155	Works like uri_escape(), but will encode chars as UTF-8 before
149	156	escaping them. This makes this function able to deal with characters
150	157	with code above 255 in $string. Note that chars in the 128 .. 255
151	158	range will be escaped differently by this function compared to what
152	159	uri_escape() would. For chars in the 0 .. 127 range there is no
153	160	difference.
154	161
155	162	=end original
156	163
157		Works like uri_escape(), but will encode chars as UTF-8 before
158		escaping them. This makes this function able do deal with characters
159		with code above 255 in $string. Note that chars in the 128 .. 255
160		range will be escaped differently by this function compared to what
161		uri_escape() would. For chars in the 0 .. 127 range there is no
162		difference.
163		(TBT)
	164	Works like uri_escape(), but encodes the characters as UTF-8
	165	before escaping them.
	166	This lets the function handle characters in $string with code points
	167	above 255.
	168	Note that characters in the 128 .. 255 range are escaped differently than by
	169	uri_escape().
	170	Characters in the 0 .. 127 range are unaffected.
164	171
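The difference between the two functions in the 128 .. 255 range is easiest to see on a single character. A minimal sketch, using U+00E9 as an arbitrary example:

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape uri_escape_utf8);

my $e_acute = "\xE9";   # one character in the 128 .. 255 range

# uri_escape() treats it as the single byte 0xE9.
my $as_byte = uri_escape($e_acute);

# uri_escape_utf8() first encodes it as the two UTF-8 bytes 0xC3 0xA9.
my $as_utf8 = uri_escape_utf8($e_acute);

# In the 0 .. 127 range both functions agree.
my $same = uri_escape("a b") eq uri_escape_utf8("a b");

print "$as_byte\n";   # %E9
print "$as_utf8\n";   # %C3%A9
print $same ? "same\n" : "different\n";   # same
```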
165	172	=begin original
166	173
167	174	The call:
168	175
169	176	=end original
170	177
171	178	The call:
172	179
173	180	$uri = uri_escape_utf8($string);
174	181
175	182	=begin original
176	183
177	184	will be the same as:
178	185
179	186	=end original
180	187
181	188	is the same as:
182	189
183	190	use Encode qw(encode);
184	191	$uri = uri_escape(encode("UTF-8", $string));
185	192
186	193	=begin original
187	194
188	195	but will even work for perl-5.6 for chars in the 128 .. 255 range.
189	196
190	197	=end original
191	198
192	199	but it works even on perl-5.6 for characters in the 128 .. 255 range.
193	200
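The stated equivalence can be checked directly on a perl that ships the Encode module. This sketch uses an invented sample string and simply compares the two results.

```perl
use strict;
use warnings;
use Encode qw(encode);
use URI::Escape qw(uri_escape uri_escape_utf8);

# Sample text with characters both below and above code point 255.
my $string = "r\x{e9}sum\x{e9} \x{20AC}5";

my $via_utf8   = uri_escape_utf8($string);
my $via_encode = uri_escape(encode("UTF-8", $string));

print "$via_utf8\n";     # r%C3%A9sum%C3%A9%20%E2%82%AC5
print "$via_encode\n";   # r%C3%A9sum%C3%A9%20%E2%82%AC5
```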
194	201	=begin original
195	202
196		Note: Javascript has a function called escape() that produce the
	203	Note: JavaScript has a function called escape() that produces the
197	204	sequence "%uXXXX" for chars in the 256 .. 65535 range. This function
198	205	has really nothing to do with URI escaping but some folks got confused
199	206	since it "does the right thing" in the 0 .. 255 range. Because of
200	207	this you sometimes see "URIs" with these kinds of escapes. The
201		JavaScript encodeURI() function is similar to uri_escape_utf8().
	208	JavaScript encodeURIComponent() function is similar to uri_escape_utf8().
202	209
203	210	=end original
204	211
205		Note: Javascript has a function called escape() that produce the
206		sequence "%uXXXX" for chars in the 256 .. 65535 range. This function
207		has really nothing to do with URI escaping but some folks got confused
208		since it "does the right thing" in the 0 .. 255 range. Because of
209		this you sometimes see "URIs" with these kind of escapes. The
210		JavaScript encodeURI() function is similar to uri_escape_utf8().
211		(TBT)
	212	Note: JavaScript has a function called escape() that produces
	213	the sequence "%uXXXX" for characters in the 256 .. 65535 range.
	214	This function has nothing to do with URI escaping, but
	215	because it "does the right thing" in the 0 .. 255 range,
	216	some people have been confused by it.
	217	As a result, you sometimes see "URIs" that contain this kind of escape.
	218	The JavaScript encodeURIComponent() function is similar to uri_escape_utf8().
212	219
213	220	=item uri_unescape($string,...)
214	221
215	222	=begin original
216	223
217	224	Returns a string with each %XX sequence replaced with the actual byte
218	225	(octet).
219	226
220	227	=end original
221	228
222	229	Returns a string with each %XX sequence replaced by the actual byte
223	230	(octet).
224	231
225	232	=begin original
226	233
227	234	This does the same as:
228	235
229	236	=end original
230	237
231	238	This does the same as:
232	239
233	240	$string =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
234	241
235	242	=begin original
236	243
237	244	but does not modify the string in-place as this RE would. Using the
238	245	uri_unescape() function instead of the RE might make the code look
239	246	cleaner and is a few characters less to type.
240	247
241	248	=end original
242	249
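243	250	but, unlike this regular expression, it does not modify the original string.
244	251	Using the uri_unescape() function instead of the regular expression can make the code cleaner
245	252	and saves a few characters of typing.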
246	253
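The in-place difference described above can be shown side by side. In this sketch the sample string and variable names are made up; the regular expression is the one quoted in the text.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_unescape);

my $original = "a%20b%2Fc";

# The function returns a decoded copy and leaves its argument untouched.
my $decoded = uri_unescape($original);

# The inline regular expression rewrites the variable it is applied to.
my $in_place = $original;
$in_place =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;

print "$original\n";   # a%20b%2Fc
print "$decoded\n";    # a b/c
print "$in_place\n";   # a b/c
```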
247	254	=begin original
248	255
249	256	In a simple benchmark test I did,
250	257	calling the function (instead of the inline RE above) if a few chars
251	258	were unescaped was something like 40% slower, and something like 700% slower if none were. If
252	259	you are going to unescape a lot of times it might be a good idea to
253	260	inline the RE.
254	261
255	262	=end original
256	263
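257	264	In a simple benchmark test, when a few characters
258	265	needed unescaping, calling the function (instead of the inline regular expression above)
259	266	was about 40% slower.
260	267	When nothing needed unescaping, it was about 700% slower.
261	268	If you are going to unescape many times, inlining the regular expression
262	269	may be a good idea.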
263	270
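The benchmark behind these figures is not shown, so the following is only a sketch of how such a comparison could be set up with the core Benchmark module. The input string and the candidate names are assumptions, and the numbers will vary with the perl and URI versions in use.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use URI::Escape qw(uri_unescape);

# Hypothetical sample input; also try a string with no "%" sequences at all.
my $input = "a%20few%20escapes";

my %candidates = (
    function => sub { return uri_unescape($input) },
    inline   => sub {
        (my $copy = $input) =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
        return $copy;
    },
);

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, \%candidates);
```

Both candidates return the decoded copy, so they do comparable work and differ only in the function call overhead being measured.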
264	271	=begin original
265	272
266	273	If the uri_unescape() function is passed multiple strings, then each
267	274	one is returned unescaped.
268	275
269	276	=end original
270	277
271	278	If the uri_unescape() function is passed multiple strings, each one is
272	279	returned unescaped.
273	280
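A short sketch of the multi-string form follows. The call is made in list context here, on the assumption that this is how the full list of results is obtained; the sample strings are invented.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_unescape);

# Assigning to an array puts the call in list context, one result per argument.
my @decoded = uri_unescape("a%20b", "c%2Fd", "100%25");

print join("|", @decoded), "\n";   # a b|c/d|100%
```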
274	281	=back
275	282
276	283	=begin original
277	284
278	285	The module can also export the C<%escapes> hash, which contains the
279	286	mapping from all 256 bytes to the corresponding escape codes. Lookup
280	287	in this hash is faster than evaluating C<sprintf("%%%02X", ord($byte))>
281	288	each time.
282	289
283	290	=end original
284	291
285	292	The module can also export the C<%escapes> hash, which contains the mapping
286	293	from all 256 bytes to the corresponding escape codes.
287	294	Looking up this hash is faster than evaluating C<sprintf("%%%02X", ord($byte))>
288	295	each time.
289	296
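The hash has to be requested explicitly on import. The sketch below looks up one byte and then uses the hash in a hand-written substitution; the character class in that substitution is an arbitrary choice for this example, not the module's default set.

```perl
use strict;
use warnings;
use URI::Escape qw(%escapes);

# Each key is a single byte; each value is its %XX escape code.
my $newline = $escapes{"\n"};
my $entries = keys %escapes;

# Using the hash directly in a custom substitution.
(my $custom = "a b&c") =~ s/([^A-Za-z0-9])/$escapes{$1}/g;

print "$newline\n";   # %0A
print "$entries\n";   # 256
print "$custom\n";    # a%20b%26c
```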
290	297	=head1 SEE ALSO
291	298
292	299	L<URI>
293	300
294	301	=head1 COPYRIGHT
295	302
296	303	Copyright 1995-2004 Gisle Aas.
297	304
298	305	This program is free software; you can redistribute it and/or modify
299	306	it under the same terms as Perl itself.
300	307
301	308	=begin meta
302	309
303	310	Translate: Hippo2000 <GCD00051@nifty.ne.jp> (1.04)
304		Update: SHIRAKATA Kentaro <argrath@ub32.org> (1.35)
	311	Update: SHIRAKATA Kentaro <argrath@ub32.org> (1.35-)
305		Status: in progress
	312	Status: completed
306	313
307	314	=end meta
308	315
309	316	=cut
