perlpacktut 5.38.0 と 5.10.0 の差分

1	1
2		=encoding u~~tf8~~
	2	=encoding euc-jp
3	3
4	4	=head1 NAME
5	5
6	6	=begin original
7	7
8	8	perlpacktut - tutorial on C<pack> and C<unpack>
9	9
10	10	=end original
11	11
12	12	perlpacktut - C<pack> と C<unpack> のチュートリアル
13	13
14	14	=head1 DESCRIPTION
15	15
16	16	=begin original
17	17
18	18	C<pack> and C<unpack> are two functions for transforming data according
19	19	to a user-defined template, between the guarded way Perl stores values
20	20	and some well-defined representation as might be required in the
21	21	environment of a Perl program. Unfortunately, they're also two of
22	22	the most misunderstood and most often overlooked functions that Perl
23	23	provides. This tutorial will demystify them for you.
24	24
25	25	=end original
26	26
27	27	C<pack> と C<unpack> は、ユーザーが定義したテンプレートに従って、
28	28	Perl が値を保管する保護された方法と、Perl プログラムの環境で必要になる
29	29	かもしれないよく定義された表現の間を変換する二つの関数です。
30	30	残念ながら、これらは Perl が提供する関数の中でもっとも誤解され、
31	31	もっとも見落とされやすい関数でもあります。
32	32	このチュートリアルではこれらを分かりやすく説明します。
33	33
34	34	=head1 The Basic Principle
35	35
36	36	(基本原理)
37	37
38	38	=begin original
39	39
40	40	Most programming languages don't shelter the memory where variables are
41	41	stored. In C, for instance, you can take the address of some variable,
42	42	and the C<sizeof> operator tells you how many bytes are allocated to
43	43	the variable. Using the address and the size, you may access the storage
44	44	to your heart's content.
45	45
46	46	=end original
47	47
48	48	多くのプログラミング言語は変数が格納されているメモリを保護していません。
49	49	例えば、C では、ある変数のアドレスを取得できますし、
50	50	C<sizeof> 演算子は変数に何バイト割り当てられているかを返します。
51	51	アドレスとサイズを使って、心ゆくまでストレージにアクセスできます。
52	52
53	53	=begin original
54	54
55	55	In Perl, you just can't access memory at random, but the structural and
56	56	representational conversion provided by C<pack> and C<unpack> is an
57	57	excellent alternative. The C<pack> function converts values to a byte
58	58	sequence containing representations according to a given specification,
59	59	the so-called "template" argument. C<unpack> is the reverse process,
60	60	deriving some values from the contents of a string of bytes. (Be cautioned,
61	61	however, that not all that has been packed together can be neatly unpacked -
62	62	a very common experience as seasoned travellers are likely to confirm.)
63	63
64	64	=end original
65	65
66	66	Perl では、メモリにランダムにアクセスすることはできませんが、C<pack> と
67	67	C<unpack> によって提供される構造的および表現的な変換は素晴らしい
68	68	代替案です。
69	69	C<pack> 関数は、値を、「テンプレート」引数と呼ばれる仕様に従った表現を含む
70	70	バイト列に変換します。
71	71	C<unpack> は逆処理で、バイトの並びから値を引き出します。
72	72	(しかし、pack された全てのデータがうまく unpack できるというわけでは
73	73	ないということは注意してください - 経験豊かな旅人が確認しそうな、とても
74	74	一般的な経験です。)
75	75
76	76	=begin original
77	77
78	78	Why, you may ask, would you need a chunk of memory containing some values
79	79	in binary representation? One good reason is input and output accessing
80	80	some file, a device, or a network connection, whereby this binary
81	81	representation is either forced on you or will give you some benefit
82	82	in processing. Another cause is passing data to some system call that
83	83	is not available as a Perl function: C<syscall> requires you to provide
84	84	parameters stored in the way it happens in a C program. Even text processing
85	85	(as shown in the next section) may be simplified with judicious usage
86	86	of these two functions.
87	87
88	88	=end original
89	89
90	90	あなたはどうしてバイナリ表現の中に値が含まれているメモリの塊が
91	91	必要なのか、と問うかもしれません。
92	92	よい理由の一つは、ファイル、デバイス、ネットワーク接続にアクセスする
93	93	入出力で、このバイナリ表現が強制されたものか、処理するためにいくらかの
94	94	利益がある場合です。
95	95	もう一つの原因は、Perl 関数として利用できないシステムコールにデータを
96	96	渡すときです: C<syscall> は C プログラムでのような形で保管された引数を
97	97	提供することを要求します。
98	98	(以下の章で示すように) テキスト処理ですら、これら 2 つの関数を賢明に
99	99	使うことで単純化できます。
100	100
101	101	=begin original
102	102
103	103	To see how (un)packing works, we'll start with a simple template
104	104	code where the conversion is in low gear: between the contents of a byte
105	105	sequence and a string of hexadecimal digits. Let's use C<unpack>, since
106	106	this is likely to remind you of a dump program, or some desperate last
107	107	message unfortunate programs are wont to throw at you before they expire
108	108	into the wild blue yonder. Assuming that the variable C<$mem> holds a
109	109	sequence of bytes that we'd like to inspect without assuming anything
110	110	about its meaning, we can write
111	111
112	112	=end original
113	113
114	114	(un)pack がどのように働くのかを見るために、変換がのろのろと行われる単純な
115	115	テンプレートコードから始めましょう: バイトシーケンスと 16 進数の文字列との
116	116	変換です。
117	117	C<unpack> を使いましょう; なぜならこれはダンププログラムや、
118	118	不幸なプログラムが息を引き取る前にあなたに投げかけることが常となっている
119	119	絶望的な最後のメッセージを思い出させそうだからです。
120	120	変数 C<$mem> に、その意味について何の仮定もおかずに調査したいバイト列が
121	121	入っていると仮定すると、以下のように書きます:
122	122
123	123	my( $hex ) = unpack( 'H*', $mem );
124	124	print "$hex\n";
125	125
126	126	=begin original
127	127
128	128	whereupon we might see something like this, with each pair of hex digits
129	129	corresponding to a byte:
130	130
131	131	=end original
132	132
133	133	すると、16 進数 2 文字がそれぞれ 1 バイトに対応する、以下のような
134	134	ものが表示されます:
135	135
136	136	41204d414e204120504c414e20412043414e414c2050414e414d41
137	137
138	138	=begin original
139	139
140	140	What was in this chunk of memory? Numbers, characters, or a mixture of
141	141	both? Assuming that we're on a computer where ASCII (or some similar)
142	142	encoding is used: hexadecimal values in the range C<0x41> - C<0x5A>
143	143	indicate an uppercase letter, and C<0x20> encodes a space. So we might
144	144	assume it is a piece of text, which some are able to read like a tabloid;
145	145	but others will have to get hold of an ASCII table and relive that
146	146	firstgrader feeling. Not caring too much about which way to read this,
147	147	we note that C<unpack> with the template code C<H> converts the contents
148	148	of a sequence of bytes into the customary hexadecimal notation. Since
149	149	"a sequence of" is a pretty vague indication of quantity, C<H> has been
150	150	defined to convert just a single hexadecimal digit unless it is followed
151	151	by a repeat count. An asterisk for the repeat count means to use whatever
152	152	remains.
153	153
154	154	=end original
155	155
156	156	このメモリの塊はなんでしょう?
157	157	数値、文字、あるいはそれらの混合でしょうか?
158	158	使っているコンピュータが ASCII エンコーディング (あるいは似たようなもの) を
159	159	使っていると仮定します: C<0x41> - C<0x5A> 範囲の 16 進数は大文字を
160	160	示していて、C<0x20> は空白をエンコードしたものです。
161	161	それで、これは、タブロイドのように読むことのできるテキストの断片と
162	162	仮定できます; 他の人は ASCII 表を手に入れて、小学 1 年生の気分を追体験する
163	163	必要があります。
164	164	これをどのようにして読むかについてはあまり気にしないことにして、
165	165	C<unpack> のテンプレートコード C<H> はバイト列の内容をいつもの 16 進表記に
166	166	変換することに注目します。
167	167	「列」というのは量についてあいまいなので、
168	168	C<H> は、引き続いて繰り返し回数がない場合は単に 1 つの 16 進数を
169	169	変換するように定義されています。
170	170	繰り返し数でのアスタリスクは、残っているもの全てを使うことを意味します。
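
The effect of the repeat count on C<H> can be tried out directly. The following is a minimal sketch, assuming an ASCII platform; the sample string in C<$mem> is an illustrative value chosen to reproduce the dump shown earlier.

```perl
use strict;
use warnings;

# Sample bytes (an assumption for this sketch); on an ASCII platform
# this is the text hidden in the hex dump above.
my $mem = "A MAN A PLAN A CANAL PANAMA";

my ($one_digit) = unpack( 'H',  $mem );    # no repeat count: a single hex digit
my ($one_byte)  = unpack( 'H2', $mem );    # two digits, i.e. the first byte
my ($all)       = unpack( 'H*', $mem );    # asterisk: whatever remains

print "$one_digit\n$one_byte\n$all\n";
```

Without a count only the high nybble of the first byte is converted, which is rarely what a dump needs; C<H*> is the usual choice.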
171	171
172	172	=begin original
173	173
174	174	The inverse operation - packing byte contents from a string of hexadecimal
175	175	digits - is just as easily written. For instance:
176	176
177	177	=end original
178	178
179	179	逆操作 - 16 進数の文字列からバイトの内容に pack する - は簡単に書けます。
180	180	例えば:
181	181
182		my $s = pack( 'H2' x 10, 30..39 );
	182	my $s = pack( 'H2' x 10, map { "3$_" } ( 0..9 ) );
183	183	print "$s\n";
184	184
185	185	=begin original
186	186
187	187	Since we feed a list of ten 2-digit hexadecimal strings to C<pack>, the
188	188	pack template should contain ten pack codes. If this is run on a computer
189	189	with ASCII character coding, it will print C<0123456789>.
190	190
191	191	=end original
192	192
193	193	16 進で 2 桁の数値を示す文字列 10 個からなるリストを C<pack> に
194	194	渡しているので、pack テンプレートは 10 個の pack コードを含んでいる
195	195	必要があります。
196	196	これが ASCII 文字コードのコンピュータで実行されると、C<0123456789> を
197	197	表示します。
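
As a small sketch of the inverse direction (again assuming ASCII), the ten C<H2> codes can be compared with a single C<H*> code that consumes one long string of hex digits. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $s1 = pack( 'H2' x 10, 30 .. 39 );           # ten pack codes, ten items
my $s2 = pack( 'H*', join( '', 30 .. 39 ) );    # one pack code, one 20-digit item
my ($hex) = unpack( 'H*', $s1 );                # and back to hex digits again

print "$s1\n$s2\n$hex\n";
```

Both calls build the same ten bytes; which form is more convenient depends on whether the digits arrive as a list or as one string.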
198	198
199	199	=head1 Packing Text
200	200
201	201	(テキストを pack する)
202	202
203	203	=begin original
204	204
205	205	Let's suppose you've got to read in a data file like this:
206	206
207	207	=end original
208	208
209	209	以下のようなデータファイルを読み込むことを考えます:
210	210
211	211	Date      |Description                | Income|Expenditure
212		01/24/2001 Zed's Camel Emporium                    1147.99
	212	01/24/2001 Ahmed's Camel Emporium                  1147.99
213	213	01/28/2001 Flea spray                                24.99
214	214	01/29/2001 Camel rides to tourists     1235.00
215	215
216	216	=begin original
217	217
218	218	How do we do it? You might think first to use C<split>; however, since
219	219	C<split> collapses blank fields, you'll never know whether a record was
220	220	income or expenditure. Oops. Well, you could always use C<substr>:
221	221
222	222	=end original
223	223
224	224	どうすればいいでしょう?
225	225	最初に思いつくのは C<split> かもしれません;
226	226	しかし、C<split> は空白のフィールドを壊してしまうので、
227	227	そのレコードが収入だったか支出だったかが分かりません。あらら。
228	228	では、C<substr> を使うとどうでしょう:
229	229
230	230	while (<>) {
231	231	my $date = substr($_, 0, 11);
232	232	my $desc = substr($_, 12, 27);
233	233	my $income = substr($_, 40, 7);
234	234	my $expend = substr($_, 52, 7);
235	235	...
236	236	}
237	237
238	238	=begin original
239	239
240	240	It's not really a barrel of laughs, is it? In fact, it's worse than it
241	241	may seem; the eagle-eyed may notice that the first field should only be
242	242	10 characters wide, and the error has propagated right through the other
243	243	numbers - which we've had to count by hand. So it's error-prone as well
244	244	as horribly unfriendly.
245	245
246	246	=end original
247	247
248	248	これはあまり愉快ではないですよね?
249	249	実際、これは思ったより悪いです; 注意深い人は最初のフィールドが 10 文字分しか
250	250	なく、エラーが他の数値に拡大してしまう - 手で数えなければなりません -
251	251	ことに気付くでしょう。
252	252	従って、これは恐ろしく不親切であると同様、間違いが発生しやすいです。
253	253
254	254	=begin original
255	255
256	256	Or maybe we could use regular expressions:
257	257
258	258	=end original
259	259
260	260	あるいは正規表現も使えます:
261	261
262	262	while (<>) {
263	263	my($date, $desc, $income, $expend) =
264	264	m|(\d\d/\d\d/\d{4}) (.{27}) (.{7})(.*)|;
265	265	...
266	266	}
267	267
268	268	=begin original
269	269
270	270	Urgh. Well, it's a bit better, but - well, would you want to maintain
271	271	that?
272	272
273	273	=end original
274	274
275		うわあ。
	275	うわあ。えーと、少しましです。
276		~~えーと、少しましです、~~しかし - えーと、これを保守したいと思います?
	276	しかし - えーと、これを保守したいと思います?
277	277
278	278	=begin original
279	279
280	280	Hey, isn't Perl supposed to make this sort of thing easy? Well, it does,
281	281	if you use the right tools. C<pack> and C<unpack> are designed to help
282	282	you out when dealing with fixed-width data like the above. Let's have a
283	283	look at a solution with C<unpack>:
284	284
285	285	=end original
286	286
287	287	ねえ、Perl はこの手のことを簡単にできないの?
288	288	ええ、できます、正しい道具を使えば。
289	289	C<pack> と C<unpack> は上記のような固定長データを扱う時の
290	290	助けになるように設計されています。
291	291	C<unpack> による解法を見てみましょう:
292	292
293	293	while (<>) {
294	294	my($date, $desc, $income, $expend) = unpack("A10xA27xA7A*", $_);
295	295	...
296	296	}
297	297
298	298	=begin original
299	299
300	300	That looks a bit nicer; but we've got to take apart that weird template.
301	301	Where did I pull that out of?
302	302
303	303	=end original
304	304
305	305	これはちょっとましに見えます;
306	306	でも変なテンプレートを分析しなければなりません。
307	307	これはどこから来たのでしょう?
308	308
309	309	=begin original
310	310
311	311	OK, let's have a look at some of our data again; in fact, we'll include
312	312	the headers, and a handy ruler so we can keep track of where we are.
313	313
314	314	=end original
315	315
316	316	よろしい、ここでデータをもう一度見てみましょう;
317	317	実際、ヘッダも含めて、何をしているかを追いかけるために
318	318	便利な目盛りも付けています。
319	319
320	320	         1         2         3         4         5
321	321	1234567890123456789012345678901234567890123456789012345678
322	322	Date      |Description                | Income|Expenditure
323	323	01/28/2001 Flea spray                                24.99
324	324	01/29/2001 Camel rides to tourists     1235.00
325	325
326	326	=begin original
327	327
328	328	From this, we can see that the date column stretches from column 1 to
329	329	column 10 - ten characters wide. The C<pack>-ese for "character" is
330	330	C<A>, and ten of them are C<A10>. So if we just wanted to extract the
331	331	dates, we could say this:
332	332
333	333	=end original
334	334
335	335	ここから、日付の桁は 1 桁目から 10 桁目まで - 10 文字の幅があることが
336	336	わかります。
337	337	「文字」のパックは C<A> で、10 文字の場合は C<A10> です。
338	338	それで、もし単に日付を展開したいだけなら、以下のように書けます:
339	339
340	340	my($date) = unpack("A10", $_);
341	341
342	342	=begin original
343	343
344	344	OK, what's next? Between the date and the description is a blank column;
345	345	we want to skip over that. The C<x> template means "skip forward", so we
346	346	want one of those. Next, we have another batch of characters, from 12 to
347	347	38. That's 27 more characters, hence C<A27>. (Don't make the fencepost
348	348	error - there are 27 characters between 12 and 38, not 26. Count 'em!)
349	349
350	350	=end original
351	351
352	352	よろしい、次は?
353	353	日付と説明の間には空白の桁があります;これは読み飛ばしたいです。
354	354	C<x> テンプレートは「読み飛ばす」ことを意味し、
355	355	これで 1 文字読み飛ばせます。
356	356	次に、別の文字の塊が 12 桁から 38 桁まであります。
357	357	これは 27 文字あるので、C<A27> です。
358	358	(数え間違えないように - 12 から 38 の間には 26 ではなく 27 文字あります。)
359	359
360	360	=begin original
361	361
362	362	Now we skip another character and pick up the next 7 characters:
363	363
364	364	=end original
365	365
366	366	次の文字は読み飛ばして、次の 7 文字を取り出します:
367	367
368	368	my($date,$description,$income) = unpack("A10xA27xA7", $_);
369	369
370	370	=begin original
371	371
372	372	Now comes the clever bit. Lines in our ledger which are just income and
373	373	not expenditure might end at column 46. Hence, we don't want to tell our
374	374	C<unpack> pattern that we B<need> to find another 12 characters; we'll
375	375	just say "if there's anything left, take it". As you might guess from
376	376	regular expressions, that's what the C<*> means: "use everything
377	377	remaining".
378	378
379	379	=end original
380	380
381	381	ここで少し賢くやりましょう。
382	382	台帳のうち、収入だけがあって支出がない行は 46 桁目で終わっているかもしれません。
383	383	従って、次の 12 文字を見つける B<必要がある> ということを
384	384	C<unpack> パターンに書きたくはありません;
385	385	単に次のようにします「もし何かが残っていれば、それを取ります」。
386	386	正規表現から推測したかもしれませんが、これが C<*> の意味することです:
387	387	「残っているもの全てを使います」。
388	388
389	389	=over 3
390	390
391	391	=item *
392	392
393	393	=begin original
394	394
395	395	Be warned, though, that unlike regular expressions, if the C<unpack>
396	396	template doesn't match the incoming data, Perl will scream and die.
397	397
398	398	=end original
399	399
400		但し、正規表現とは違うことに注意してください~~; もし C<unpack> テンプレートが~~
	400	但し、正規表現とは違うことに注意してください。
401		入力データと一致しない場合、~~Perl は悲鳴をあげて die します。~~
	401	もし C<unpack> テンプレートが入力データと一致しない場合、
	402	Perl は悲鳴をあげて die します。
402	403
403	404	=back
404	405
405	406	=begin original
406	407
407	408	Hence, putting it all together:
408	409
409	410	=end original
410	411
411	412	従って、これを全部あわせると:
412	413
413		my ($date, $description, $income, $expend) =
	414	my($date,$description,$income,$expend) = unpack("A10xA27xA7xA*", $_);
414		unpack("A10xA27xA7xA*", $_);
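
The complete template can be exercised on a few records. This is only a sketch: the sample lines are built with C<sprintf> so that the columns are exact, and the array C<@records> is a hypothetical container, not part of the tutorial. Note that each line keeps its trailing newline, as a line read with C<< <> >> would; on the short income-only line it is this newline that the last C<x> skips. A chomped 46-character line would leave nothing for that C<x> to skip, and C<unpack> would die.

```perl
use strict;
use warnings;

# Hypothetical sample records; sprintf guarantees the column positions.
my @lines = (
    sprintf( "%-10s %-27s %7s %11s\n", '01/24/2001', "Zed's Camel Emporium", '', '1147.99' ),
    sprintf( "%-10s %-27s %7s %11s\n", '01/28/2001', 'Flea spray',           '', '24.99' ),
    sprintf( "%-10s %-27s %7s\n",      '01/29/2001', 'Camel rides to tourists', '1235.00' ),
);

my @records;
for my $line (@lines) {
    my ( $date, $desc, $income, $expend ) = unpack( "A10xA27xA7xA*", $line );

    # A strips trailing blanks only; right-justified numbers keep their
    # leading blanks, so trim those here.
    s/^\s+// for $income, $expend;

    push @records, [ $date, $desc, $income, $expend ];
    print join( ' | ', $date, $desc, $income, $expend ), "\n";
}
```

An empty field comes back as an empty string, which makes it easy to tell income records from expenditure records.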
415	415
416	416	=begin original
417	417
418	418	Now, that's our data parsed. I suppose what we might want to do now is
419	419	total up our income and expenditure, and add another line to the end of
420	420	our ledger - in the same format - saying how much we've brought in and
421	421	how much we've spent:
422	422
423	423	=end original
424	424
425	425	これで、データがパースできます。
426	426	今ほしいものが収入と支出をそれぞれ足し合わせて、台帳の最後に - 同じ形式で -
427	427	1 行付け加えることで、どれだけの収入と支出があったかを記すことだとします:
428	428
429	429	while (<>) {
430		my ($date, $desc, $income, $expend) =
	430	my($date, $desc, $income, $expend) = unpack("A10xA27xA7xA*", $_);
431		unpack("A10xA27xA7xA*", $_);
432	431	$tot_income += $income;
433	432	$tot_expend += $expend;
434	433	}
435	434
436	435	$tot_income = sprintf("%.2f", $tot_income); # Get them into
437	436	$tot_expend = sprintf("%.2f", $tot_expend); # "financial" format
438	437
439	438	$date = POSIX::strftime("%m/%d/%Y", localtime);
440	439
441	440	# OK, let's go:
442	441
443		print pack("A10xA27xA7xA*", $date, "Totals",
	442	print pack("A10xA27xA7xA*", $date, "Totals", $tot_income, $tot_expend);
444		$tot_income, $tot_expend);
445	443
446	444	=begin original
447	445
448	446	Oh, hmm. That didn't quite work. Let's see what happened:
449	447
450	448	=end original
451	449
452	450	あら、ふうむ。
453	451	これはうまく動きません。
454	452	何が起こったのか見てみましょう:
455	453
456		01/24/2001 Zed's Camel Emporium                    1147.99
	454	01/24/2001 Ahmed's Camel Emporium                  1147.99
457	455	01/28/2001 Flea spray                                24.99
458	456	01/29/2001 Camel rides to tourists     1235.00
459	457	03/23/2001Totals                     1235.001172.98
460	458
461	459	=begin original
462	460
463	461	OK, it's a start, but what happened to the spaces? We put C<x>, didn't
464	462	we? Shouldn't it skip forward? Let's look at what L<perlfunc/pack> says:
465	463
466	464	=end original
467	465
468	466	まあ、これはスタートです; しかしスペースに何が起きたのでしょう?
469	467	C<x> を指定しましたよね?
470	468	これでは飛ばせない?
471	469	L<perlfunc/pack> に書いていることを見てみましょう:
472	470
473	471	x A null byte.
474	472
475	473	=begin original
476	474
477	475	Urgh. No wonder. There's a big difference between "a null byte",
478	476	character zero, and "a space", character 32. Perl's put something
479	477	between the date and the description - but unfortunately, we can't see
480	478	it!
481	479
482	480	=end original
483	481
484		うわあ。
	482	うはあ。
485	483	当たり前です。
486	484	文字コード 0 の「ヌル文字」と、文字コード 32 の「空白」は全然違います。
487	485	Perl は日付と説明の間に何かを書いたのです - しかし残念ながら、
488	486	それは見えません!
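
The invisible byte can be made visible by looking at the length and the character code. A minimal sketch with made-up three-letter fields:

```perl
use strict;
use warnings;

my $packed = pack( "A3xA3", "abc", "def" );    # x inserts a null byte

printf "length %d, middle byte has code %d\n",
    length($packed), ord( substr( $packed, 3, 1 ) );

# When unpacking, the same x simply skips over that byte.
my ( $left, $right ) = unpack( "A3xA3", $packed );

# Widening the first field instead pads with a real space (character 32).
my $padded = pack( "A4A3", "abc", "def" );
print "[$padded]\n";
```

The first string is seven bytes long although only six characters show up on most terminals.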
489	487
490	488	=begin original
491	489
492	490	What we actually need to do is expand the width of the fields. The C<A>
493	491	format pads any non-existent characters with spaces, so we can use the
494	492	additional spaces to line up our fields, like this:
495	493
496	494	=end original
497	495
498	496	実際に必要なことはフィールドの幅を増やすことです。
499	497	C<A> フォーマットは存在しない文字を空白でパッディングするので、
500	498	以下のようにフィールドに空白の分だけ桁数を増やします:
501	499
502		print pack("A11 A28 A8 A*", $date, "Totals",
	500	print pack("A11 A28 A8 A*", $date, "Totals", $tot_income, $tot_expend);
503		$tot_income, $tot_expend);
504	501
505	502	=begin original
506	503
507	504	(Note that you can put spaces in the template to make it more readable,
508	505	but they don't translate to spaces in the output.) Here's what we got
509	506	this time:
510	507
511	508	=end original
512	509
513	510	(テンプレートには読みやすくするために空白を入れることができますが、
514	511	出力には反映されないことに注意してください。)
515	512	これで得られたのは以下のものです:
516	513
517		01/24/2001 Zed's Camel Emporium                    1147.99
	514	01/24/2001 Ahmed's Camel Emporium                  1147.99
518	515	01/28/2001 Flea spray                                24.99
519	516	01/29/2001 Camel rides to tourists     1235.00
520	517	03/23/2001 Totals                      1235.00 1172.98
521	518
522	519	=begin original
523	520
524	521	That's a bit better, but we still have that last column which needs to
525	522	be moved further over. There's an easy way to fix this up:
526	523	unfortunately, we can't get C<pack> to right-justify our fields, but we
527	524	can get C<sprintf> to do it:
528	525
529	526	=end original
530	527
531	528	これで少し良くなりましたが、まだ、最後の桁をもっと向こうに移動させる
532	529	必要があります。
533	530	これを修正する簡単な方法があります:
534	531	残念ながら C<pack> でフィールドを右寄せにすることは出来ませんが、
535	532	C<sprintf> を使えば出来ます:
536	533
537	534	$tot_income = sprintf("%.2f", $tot_income);
538	535	$tot_expend = sprintf("%12.2f", $tot_expend);
539	536	$date = POSIX::strftime("%m/%d/%Y", localtime);
540		print pack("A11 A28 A8 A*", $date, "Totals",
	537	print pack("A11 A28 A8 A*", $date, "Totals", $tot_income, $tot_expend);
541		$tot_income, $tot_expend);
542	538
543	539	=begin original
544	540
545	541	This time we get the right answer:
546	542
547	543	=end original
548	544
549	545	今度は正しい答えを得られました:
550	546
551	547	01/28/2001 Flea spray 24.99
552	548	01/29/2001 Camel rides to tourists 1235.00
553	549	03/23/2001 Totals 1235.00 1172.98
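
The difference between the padding done by C<A> and the justification done by C<sprintf> can be seen in isolation. The values and field widths below are arbitrary choices for this sketch, not the widths of the ledger.

```perl
use strict;
use warnings;

my $left  = pack( "A8", "24.99" );        # padded on the right with spaces
my $right = sprintf( "%8.2f", 24.99 );    # padded on the left with spaces
my $cut   = pack( "A3", "abcdef" );       # values longer than the field are cut off

# A sprintf-formatted value can be handed to pack like any other string.
my $line = pack( "A11 A28 A8 A*", "03/23/2001", "Totals", "1235.00", $right );

print "[$left]\n[$right]\n[$cut]\n[$line]\n";
```

Since C<A*> takes the whole item as it is, the width of the last column is decided entirely by the C<sprintf> format.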
554	550
555	551	=begin original
556	552
557	553	So that's how we consume and produce fixed-width data. Let's recap what
558	554	we've seen of C<pack> and C<unpack> so far:
559	555
560	556	=end original
561	557
562	558	ということで、これが固定長データを読み書きする方法です。
563	559	ここまでで C<pack> と C<unpack> について見たことを復習しましょう:
564	560
565	561	=over 3
566	562
567	563	=item *
568	564
569	565	=begin original
570	566
571	567	Use C<pack> to go from several pieces of data to one fixed-width
572	568	version; use C<unpack> to turn a fixed-width-format string into several
573	569	pieces of data.
574	570
575	571	=end original
576	572
577	573	複数のデータ片を一つの固定長データにするには C<pack> を使います;
578	574	固定長フォーマット文字列を複数のデータ片にするには C<unpack> を使います。
579	575
580	576	=item *
581	577
582	578	=begin original
583	579
584	580	The pack format C<A> means "any character"; if you're C<pack>ing and
585	581	you've run out of things to pack, C<pack> will fill the rest up with
586	582	spaces.
587	583
588	584	=end original
589	585
590	586	pack フォーマット C<A> は「任意の文字」を意味します; もし C<pack> 中に
591	587	pack するものがなくなったら、C<pack> は残りを空白で埋めます。
592	588
593	589	=item *
594	590
595	591	=begin original
596	592
597	593	C<x> means "skip a byte" when C<unpack>ing; when C<pack>ing, it means
598	594	"introduce a null byte" - that's probably not what you mean if you're
599	595	dealing with plain text.
600	596
601	597	=end original
602	598
603	599	C<unpack> での C<x> は「1 バイト読み飛ばす」ことを意味します;
604	600	C<pack> では、「ヌルバイトを生成する」ことを意味します -
605	601	これは、プレーンテキストを扱っている場合はおそらく望んでいるものでは
606	602	ないでしょう。
607	603
608	604	=item *
609	605
610	606	=begin original
611	607
612	608	You can follow the formats with numbers to say how many characters
613	609	should be affected by that format: C<A12> means "take 12 characters";
614	610	C<x6> means "skip 6 bytes" or "character 0, 6 times".
615	611
616	612	=end original
617	613
618	614	フォーマットの後に数値をつけることで、フォーマットに影響される文字数を
619	615	指定します: C<A12> は「12 文字取る」ことを意味します;
620	616	C<x6> は「6 バイト読み飛ばす」や「ヌルバイト 6 つ」を意味します。
621	617
622	618	=item *
623	619
624	620	=begin original
625	621
626	622	Instead of a number, you can use C<*> to mean "consume everything else
627	623	left".
628	624
629	625	=end original
630	626
631	627	数値の代わりに、C<*> で「残っているもの全てを使う」ことを指定できます。
632	628
633	629	=begin original
634	630
635	631	B<Warning>: when packing multiple pieces of data, C<*> only means
636	632	"consume all of the current piece of data". That's to say
637	633
638	634	=end original
639	635
640	636	B<警告>: 複数のデータ片を pack するとき、C<*> は「現在のデータ片を全て
641	637	含む」という意味だけです。
642	638	これは、以下のようにすると:
643	639
644	640	pack("A*A*", $one, $two)
645	641
646	642	=begin original
647	643
648	644	packs all of C<$one> into the first C<A*> and then all of C<$two> into
649	645	the second. This is a general principle: each format character
650	646	corresponds to one piece of data to be C<pack>ed.
651	647
652	648	=end original
653	649
654	650	C<$one> の全てを最初の C<A*> に pack し、それから C<$two> の全てを二番目に
655	651	pack します。
656	652	ここに一般的な原則があります: 各フォーマット文字は C<pack> されるデータ片
657	653	一つに対応します。
658	654
659	655	=back
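
The points of this recap can be checked with a few one-liners. This is a sketch with invented sample strings; the variable names are illustrative.

```perl
use strict;
use warnings;

my ( $one, $two ) = ( "first", "second" );

my $joined = pack( "A*A*", $one, $two );           # each A* takes one whole item
my $padded = pack( "A12", $one );                  # filled up with spaces to 12
my $nulls  = pack( "x6" );                         # character 0, six times
my ($rest) = unpack( "x6 A*", "123456tail" );      # skip 6 bytes, take the rest

print "[$joined] [$padded] ", length($nulls), " [$rest]\n";
```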
660	656
661	657	=head1 Packing Numbers
662	658
663	659	(数値を pack する)
664	660
665	661	=begin original
666	662
667	663	So much for textual data. Let's get onto the meaty stuff that C<pack>
668	664	and C<unpack> are best at: handling binary formats for numbers. There is,
669	665	of course, not just one binary format - life would be too simple - but
670	666	Perl will do all the finicky labor for you.
671	667
672	668	=end original
673	669
674	670	テキストデータについてはこれくらいです。
675	671	C<pack> と C<unpack> が最良である、いやらしい代物: 数値のためのバイナリ
676	672	フォーマットに進みましょう。
677	673	もちろん、バイナリフォーマットはひとつではありません - 人生はそれほど
678	674	単純ではありません - が、Perl は全ての細かい作業を行います。
679	675
680	676	=head2 Integers
681	677
682	678	(整数)
683	679
684	680	=begin original
685	681
686	682	Packing and unpacking numbers implies conversion to and from some
687	683	I<specific> binary representation. Leaving floating point numbers
688	684	aside for the moment, the salient properties of any such representation
689	685	are:
690	686
691	687	=end original
692	688
693	689	数値を pack や unpack するということは、I<特定の> バイナリ表現との間で
694	690	変換するということを意味します。
695	691	今のところ浮動小数点数は脇にやっておくとすると、このような表現の
696	692	主要な性質としては:
697	693
698	694	=over 4
699	695
700	696	=item *
701	697
702	698	=begin original
703	699
704	700	the number of bytes used for storing the integer,
705	701
706	702	=end original
707	703
708	704	整数の保存に使うバイト数。
709	705
710	706	=item *
711	707
712	708	=begin original
713	709
714	710	whether the contents are interpreted as a signed or unsigned number,
715	711
716	712	=end original
717	713
718	714	内容を符号なし数として解釈するか符号付き数として解釈するか。
719	715
720	716	=item *
721	717
722	718	=begin original
723	719
724	720	the byte ordering: whether the first byte is the least or most
725	721	significant byte (or: little-endian or big-endian, respectively).
726	722
727	723	=end original
728	724
729	725	バイト順序:最初のバイトは最下位バイトか最上位バイトか
730	726	(言い換えると: それぞれリトルエンディアンかビッグエンディアンか)。
731	727
732	728	=back
733	729
734	730	=begin original
735	731
736	732	So, for instance, to pack 20302 to a signed 16 bit integer in your
737	733	computer's representation you write
738	734
739	735	=end original
740	736
741	737	それで、例えば、20302 をあなたのコンピュータの符号付き 16 ビット整数に
742	738	pack するとすると、以下のように書きます:
743	739
744	740	my $ps = pack( 's', 20302 );
745	741
746	742	=begin original
747	743
748	744	Again, the result is a string, now containing 2 bytes. If you print
749	745	this string (which is, generally, not recommended) you might see
750	746	C<ON> or C<NO> (depending on your system's byte ordering) - or something
751	747	entirely different if your computer doesn't use ASCII character encoding.
752	748	Unpacking C<$ps> with the same template returns the original integer value:
753	749
754	750	=end original
755	751
756	752	再び、結果は 2 バイトからなる文字列です。
757	753	もしこの文字列を表示する(これは一般的にはお勧めできません)と、
758	754	C<ON> か C<NO> (システムのバイト順に依存します) - または、もし
759	755	コンピューターが ASCII 文字エンコーディングを使っていないなら全く違う
760	756	文字列が表示されます。
761	757	C<$ps> を同じテンプレートで unpack すると、元の整数値が返ります:
762	758
763	759	my( $s ) = unpack( 's', $ps );
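
To see where C<ON> and C<NO> come from, the native code C<s> can be compared with the codes C<v> (little-endian) and C<n> (big-endian) that are introduced further below. A minimal sketch, assuming ASCII:

```perl
use strict;
use warnings;

my $ps  = pack( 's', 20302 );      # native byte order, always 2 bytes
my ($s) = unpack( 's', $ps );      # the original value again

my $le = pack( 'v', 20302 );       # low order byte first
my $be = pack( 'n', 20302 );       # high order byte first

printf "bytes %s, value %d, little-endian %s, big-endian %s\n",
    unpack( 'H*', $ps ), $s, $le, $be;
```

20302 is C<0x4F4E>, and C<0x4F> and C<0x4E> are the ASCII codes of C<O> and C<N>.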
764	760
765	761	=begin original
766	762
767	763	This is true for all numeric template codes. But don't expect miracles:
768	764	if the packed value exceeds the allotted byte capacity, high order bits
769	765	are silently discarded, and unpack certainly won't be able to pull them
770	766	back out of some magic hat. And, when you pack using a signed template
771	767	code such as C<s>, an excess value may result in the sign bit
772	768	getting set, and unpacking this will smartly return a negative value.
773	769
774	770	=end original
775	771
776	772	これは全ての数値テンプレートコードに対して真です。
777	773	しかし奇跡を期待してはいけません:
778	774	もし pack された値が割り当てられたバイト容量を超えると、高位ビットは
779	775	黙って捨てられ、unpack は確実に魔法の帽子からデータを
780	776	取り出すことができません。
781	777	そして、C<s> のような符号付きテンプレートコードを使って pack すると、
782	778	超えた値が符号ビットをセットすることになり、unpack すると負の値が
783	779	返されることになるかもしれません。
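
The silent loss of high order bits is easy to reproduce. The two values below are arbitrary examples that do not fit into 16 bits (signed and unsigned, respectively):

```perl
use strict;
use warnings;

my ($wrapped) = unpack( 's', pack( 's', 40000 ) );    # sign bit gets set
my ($chopped) = unpack( 'S', pack( 'S', 70000 ) );    # bit 16 is discarded

print "$wrapped $chopped\n";
```

The results are 40000 - 65536 and 70000 - 65536; nothing in the packed string records that anything was lost.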
784	780
785	781	=begin original
786	782
787	783	16 bits won't get you too far with integers, but there is C<l> and C<L>
788	784	for signed and unsigned 32-bit integers. And if this is not enough and
789	785	your system supports 64 bit integers you can push the limits much closer
790	786	to infinity with pack codes C<q> and C<Q>. A notable exception is provided
791	787	by pack codes C<i> and C<I> for signed and unsigned integers of the
792	788	"local custom" variety: Such an integer will take up as many bytes as
793	789	a local C compiler returns for C<sizeof(int)>, but it'll use I<at least>
794	790	32 bits.
795	791
796	792	=end original
797	793
798	794	16 ビットは整数に十分とは言えませんが、符号付きと符号なしの 32 ビット
799	795	整数のための C<l> と C<L> もあります。
800	796	そして、これで十分ではなく、システムが 64 ビット整数に対応しているなら、
801	797	pack コード C<q> と C<Q> を使って限界をほぼ無限にまで押しやることができます。
802	798	注目すべき例外は pack コード C<i> と C<I> で、「ローカルに特化した」
803	799	符号付きと符号なしの整数を提供します: このような整数は C<sizeof(int)> と
804	800	したときにローカルな C コンパイラが返す値と同じバイト数ですが、
805	801	I<少なくとも> 32 ビットを使います。
806	802
807	803	=begin original
808	804
809	805	Each of the integer pack codes C<sSlLqQ> results in a fixed number of bytes,
810	806	no matter where you execute your program. This may be useful for some
811	807	applications, but it does not provide for a portable way to pass data
812	808	structures between Perl and C programs (bound to happen when you call
813	809	XS extensions or the Perl function C<syscall>), or when you read or
814	810	write binary files. What you'll need in this case are template codes that
815	811	depend on what your local C compiler compiles when you code C<short> or
816	812	C<unsigned long>, for instance. These codes and their corresponding
817	813	byte lengths are shown in the table below. Since the C standard leaves
818	814	much leeway with respect to the relative sizes of these data types, actual
819	815	values may vary, and that's why the values are given as expressions in
820	816	C and Perl. (If you'd like to use values from C<%Config> in your program
821	817	you have to import it with C<use Config>.)
822	818
823	819	=end original
824	820
825	821	整数 pack コード C<sSlLqQ> のそれぞれは、どこでプログラムが
826	822	実行されたとしても固定長のバイト列となります。
827	823	これは一部のアプリケーションでは有用ですが、(XS エクステンションや
828	824	Perl 関数 C<syscall> を呼び出すときに必要となる)
829	825	Perl と C のプログラムの間でデータ構造を渡す場合や、バイナリファイルを
830		読み書きするときの、移植性のある手段は提供しません。
	826	読み書きするときの、移植性のある手段は提供しません
831	827	この場合に必要なものは、例えば、C<short> や C<unsigned long> と書いたときに
832	828	ローカルの C コンパイラがどのようにコンパイルするかに依存する
833	829	テンプレートコードです。
834	830	これらのコードと、それに対応するバイト長は以下のテーブルの様になります。
835	831	C 標準はそれぞれのデータ型の大きさの点で多くの自由裁量を残しているので、
836	832	実際の値は異なるかもしれず、そしてこれがなぜ値が C と Perl の式として
837	833	与えられているかの理由です。
838	834	(もしプログラムで C<%Config> の値を使いたい場合は、C<use Config> として
839	835	これをインポートする必要があります。)
840	836
841	837	=begin original
842	838
843	839	signed unsigned  byte length in C   byte length in Perl
844	840	  s!     S!      sizeof(short)      $Config{shortsize}
845	841	  i!     I!      sizeof(int)        $Config{intsize}
846	842	  l!     L!      sizeof(long)       $Config{longsize}
847	843	  q!     Q!      sizeof(long long)  $Config{longlongsize}
848	844
849	845	=end original
850	846
851	847	符号付き 符号なし  C でのバイト長     Perl でのバイト長
852	848	  s!     S!      sizeof(short)      $Config{shortsize}
853	849	  i!     I!      sizeof(int)        $Config{intsize}
854	850	  l!     L!      sizeof(long)       $Config{longsize}
855	851	  q!     Q!      sizeof(long long)  $Config{longlongsize}
856	852
857	853	=begin original
858	854
859	855	The C<i!> and C<I!> codes aren't different from C<i> and C<I>; they are
860	856	tolerated for completeness' sake.
861	857
862	858	=end original
863	859
864	860	C<i!> および C<I!> は C<i> および C<I> と違いはありません; これらは
865	861	完全性のために許容されています。
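
The table can be verified on the local system by packing a zero with each code and comparing the length with the C<%Config> entry. This sketch leaves out C<q!>, since it is only available where Perl supports 64 bit integers.

```perl
use strict;
use warnings;
use Config;

# Pairs of native-size pack code and the matching %Config key.
for my $row ( [ 's!', 'shortsize' ], [ 'i!', 'intsize' ], [ 'l!', 'longsize' ] ) {
    my ( $code, $key ) = @$row;
    printf "%-2s packs to %d byte(s), \$Config{%s} is %d\n",
        $code, length( pack( $code, 0 ) ), $key, $Config{$key};
}

# The plain codes have the same size everywhere.
printf "s packs to %d bytes, l packs to %d bytes\n",
    length( pack( 's', 0 ) ), length( pack( 'l', 0 ) );
```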
866	862
867	863	=head2 Unpacking a Stack Frame
868	864
869	865	(スタックフレームを unpack する)
870	866
871	867	=begin original
872	868
873	869	Requesting a particular byte ordering may be necessary when you work with
874	870	binary data coming from some specific architecture whereas your program could
875	871	run on a totally different system. As an example, assume you have 24 bytes
876	872	containing a stack frame as it happens on an Intel 8086:
877	873
878	874	=end original
879	875
880	876	特定のアーキテクチャから来たバイナリに対して作業をする一方、プログラムが
881	877	全く違うシステムで動いている場合、特定のバイト順序の要求が必要になります。
882	878	例として、Intel 8086 のスタックフレームを含む 24 バイトを仮定します:
883	879
884	880	+---------+ +----+----+ +---------+
885	881	TOS: |   IP    |  TOS+4:| FL | FH | FLAGS  TOS+14:|   SI    |
886	882	     +---------+        +----+----+               +---------+
887	883	     |   CS    |        | AL | AH | AX            |   DI    |
888	884	     +---------+        +----+----+               +---------+
889	885	                        | BL | BH | BX            |   BP    |
890	886	                        +----+----+               +---------+
891	887	                        | CL | CH | CX            |   DS    |
892	888	                        +----+----+               +---------+
893	889	                        | DL | DH | DX            |   ES    |
894	890	                        +----+----+               +---------+
895	891
896	892	=begin original
897	893
898	894	First, we note that this time-honored 16-bit CPU uses little-endian order,
899	895	and that's why the low order byte is stored at the lower address. To
900	896	unpack such a (unsigned) short we'll have to use code C<v>. A repeat
901	897	count unpacks all 12 shorts:
902	898
903	899	=end original
904	900
905	901	まず、この伝統のある 16 ビット CPU はリトルエンディアンを使っていて、
906	902	それが低位バイトが低位アドレスに格納されている理由であることに
907	903	注意します。
908	904	このような(符号なし)short を unpack するには、コード C<v> を使う必要が
909	905	あるでしょう。
910	906	繰り返し数によって、12 全ての short を unpack します。
911	907
912		my( $ip, $cs, $flags, $ax, $bx, $cx, $dx, $si, $di, $bp, $ds, $es ) =
	908	my( $ip, $cs, $flags, $ax, $bx, $cd, $dx, $si, $di, $bp, $ds, $es ) =
913	909	unpack( 'v12', $frame );
914	910
915	911	=begin original
916	912
917	913	Alternatively, we could have used C<C> to unpack the individually
918	914	accessible byte registers FL, FH, AL, AH, etc.:
919	915
920	916	=end original
921	917
922	918	あるいは、FL, FH, AL, AH といったバイトレジスタに個々にアクセスできるように
923	919	unpack するための C<C> もあります:
924	920
925	921	my( $fl, $fh, $al, $ah, $bl, $bh, $cl, $ch, $dl, $dh ) =
926	922	unpack( 'C10', substr( $frame, 4, 10 ) );
927	923
928	924	=begin original
929	925
930	926	It would be nice if we could do this in one fell swoop: unpack a short,
931	927	back up a little, and then unpack 2 bytes. Since Perl I<is> nice, it
932	928	proffers the template code C<X> to back up one byte. Putting this all
933	929	together, we may now write:
934	930
935	931	=end original
936	932
937	933	これを 1 回で行えたら素敵でしょう: short を unpack して、少し戻って、
938	934	それから 2 バイト unpack します。
939	935	Perl は I<素敵> なので、1 バイト戻るテンプレートコード C<X> を
940	936	提供しています。
941	937	これら全てを一緒にすると、以下のように書けます:
942	938
943	939	my( $ip, $cs,
944	940	$flags,$fl,$fh,
945	941	$ax,$al,$ah, $bx,$bl,$bh, $cx,$cl,$ch, $dx,$dl,$dh,
946	942	$si, $di, $bp, $ds, $es ) =
947	943	unpack( 'v2' . ('vXXCC' x 5) . 'v5', $frame );
948	944
949	945	=begin original
950	946
951	947	(The clumsy construction of the template can be avoided - just read on!)
952	948
953	949	=end original
954	950
955	951	(この不細工なテンプレート構造は避けられます - 読み進めてください!)
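The combined template can be tried on a synthetic frame. The following is a minimal sketch; the register values are made up for illustration, and the frame is built with C<pack> rather than captured from a real 8086.

```perl
use strict;
use warnings;

# Hypothetical register values for a 24-byte 8086 stack frame.
my @regs = ( 0x0100, 0x2000,                     # IP, CS
             0x0246,                             # FLAGS
             0x1234, 0x5678, 0x9ABC, 0xDEF0,     # AX, BX, CX, DX
             1, 2, 3, 4, 5 );                    # SI, DI, BP, DS, ES
my $frame = pack( 'v12', @regs );                # 12 little-endian shorts

# Each 'vXXCC' reads a short, backs up two bytes, and reads the same
# two bytes again as the low and high byte registers.
my( $ip, $cs,
    $flags, $fl, $fh,
    $ax, $al, $ah,  $bx, $bl, $bh,  $cx, $cl, $ch,  $dx, $dl, $dh,
    $si, $di, $bp, $ds, $es ) =
    unpack( 'v2' . ('vXXCC' x 5) . 'v5', $frame );

printf "AX=%04X AL=%02X AH=%02X\n", $ax, $al, $ah;   # prints AX=1234 AL=34 AH=12
```

Because the frame is little-endian, the low byte (AL) comes first in memory, which is why C<CC> yields AL before AH.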
956	952
957	953	=begin original
958	954
959	955	We've taken some pains to construct the template so that it matches
960	956	the contents of our frame buffer. Otherwise we'd either get undefined values,
961	957	or C<unpack> could not unpack all. If C<pack> runs out of items, it will
962	958	supply null strings (which are coerced into zeroes whenever the pack code
963	959	says so).
964	960
965	961	=end original
966	962
967	963	テンプレートを構築するのに少し苦労したのは、フレームバッファの内容に
968	964	一致させるためです。
969	965	さもなければ未定義値を受け取ることになるか、あるいは
970	966	C<unpack> は全てを unpack できなくなります。
971	967	もし C<pack> で要素がなくなったら、空文字列を補います
972	968	(pack コードがそうするように言っていれば、ゼロに強制されます)。
973	969
974	970	=head2 How to Eat an Egg on a Net
975	971
976	972	(インターネットの卵の食べ方)
977	973
978	974	=begin original
979	975
980	976	The pack code for big-endian (high order byte at the lowest address) is
981	977	C<n> for 16 bit and C<N> for 32 bit integers. You use these codes
982	978	if you know that your data comes from a compliant architecture, but,
983	979	surprisingly enough, you should also use these pack codes if you
984	980	exchange binary data, across the network, with some system that you
985	981	know next to nothing about. The simple reason is that this
986	982	order has been chosen as the I<network order>, and all standard-fearing
987	983	programs ought to follow this convention. (This is, of course, a stern
988	984	backing for one of the Lilliputian parties and may well influence the
989	985	political development there.) So, if the protocol expects you to send
990	986	a message by sending the length first, followed by just so many bytes,
991	987	you could write:
992	988
993	989	=end original
994	990
995	991	ビッグエンディアン(最下位アドレスが最上位バイト)での pack コードは、
996	992	16 ビット整数が C<n>、 32 ビット整数が C<N> です。
997	993	もし準拠したアーキテクチャからデータが来ることが分かっているなら
998	994	これらのコードを使います;
999	995	もしネットワークを通して何も知らない他のシステムとバイナリデータを
1000	996	交換する場合にもこれらの pack コードを使うべきです。
1001	997	理由は単純で、この順序が I<ネットワーク順序> として選ばれていて、標準を
1002	998	恐れる全てのプログラムがこの慣例に従っているはずだからです。
1003	999	(これはもちろんリリパットの党派の一方を強く支持するものであり、そこでの
1004	1000	政治的な発展に影響を与えるかもしれません。)
1005	1001	それで、もし何バイトあるかの長さを先に送ることでメッセージを送ることを
1006	1002	プロトコルが想定しているなら、以下のように書けます:
1007	1003
1008	1004	my $buf = pack( 'N', length( $msg ) ) . $msg;
1009	1005
1010	1006	=begin original
1011	1007
1012	1008	or even:
1013	1009
1014	1010	=end original
1015	1011
1016	1012	あるいは:
1017	1013
1018	1014	my $buf = pack( 'NA*', length( $msg ), $msg );
1019	1015
1020	1016	=begin original
1021	1017
1022	1018	and pass C<$buf> to your send routine. Some protocols demand that the
1023	1019	count should include the length of the count itself: then just add 4
1024		to the data length. (But make sure to read L</"Lengths and Widths"> before
	1020	to the data length. (But make sure to read L<"Lengths and Widths"> before
1025	1021	you really code this!)
1026	1022
1027	1023	=end original
1028	1024
1029	1025	そして C<$buf> を送信ルーチンに渡します。
1030	1026	カウントに、カウント自身の長さも含むことを要求しているプロトコルも
1031	1027	あります: その時は単にデータ長に 4 を足してください。
1032		(しかし、実際にこれをコーディングする前に L</"Lengths and Widths"> を
	1028	(しかし、実際にこれをコーディングする前に L<"Lengths and Widths"> を
1033	1029	読んでください!)
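As a rough illustration of the length-prefix convention, the sketch below builds such a buffer and reads it back on the receiving side. The message text and the variable names C<$payload>, C<$buf2> and C<$total> are assumptions made for this example only.

```perl
use strict;
use warnings;

my $msg = 'Hello, world';

# Sender: 4-byte big-endian count, followed by the message bytes.
my $buf = pack( 'N', length( $msg ) ) . $msg;

# Receiver: read the count first, then take exactly that many bytes.
my $len     = unpack( 'N', $buf );
my $payload = substr( $buf, 4, $len );

# Variant for protocols whose count includes the 4 bytes of the count itself.
my $buf2  = pack( 'N', 4 + length( $msg ) ) . $msg;
my $total = unpack( 'N', $buf2 );

print "$len $total\n";   # prints 12 16
```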
1034	1030
1035	1031	=head2 Byte-order modifiers
1036	1032
1037	1033	(バイト順修飾子)
1038	1034
1039	1035	=begin original
1040	1036
1041	1037	In the previous sections we've learned how to use C<n>, C<N>, C<v> and
1042	1038	C<V> to pack and unpack integers with big- or little-endian byte-order.
1043	1039	While this is nice, it's still rather limited because it leaves out all
1044	1040	kinds of signed integers as well as 64-bit integers. For example, if you
1045	1041	wanted to unpack a sequence of signed big-endian 16-bit integers in a
1046	1042	platform-independent way, you would have to write:
1047	1043
1048	1044	=end original
1049	1045
1050	1046	以前の章で、ビッグエンディアンとリトルエンディアンのバイト順の整数を
1051	1047	pack および unpack するための C<n>, C<N>, C<v>, C<V> の使い方を学びました。
1052	1048	これは素敵ですが、全ての種類の符号付き整数や、64 ビット整数が
1053	1049	外れているので、まだいくらか制限されたものです。
1054	1050	例えば、符号付きビッグエンディアン 16 ビット整数の並びをプラットフォームに
1055	1051	依存しない方法で unpack したいとすると、以下のように書かなければなりません:
1056	1052
1057	1053	my @data = unpack 's*', pack 'S*', unpack 'n*', $buf;
1058	1054
1059	1055	=begin original
1060	1056
1061	1057	This is ugly. As of Perl 5.9.2, there's a much nicer way to express your
1062	1058	desire for a certain byte-order: the C<E<gt>> and C<E<lt>> modifiers.
1063	1059	C<E<gt>> is the big-endian modifier, while C<E<lt>> is the little-endian
1064	1060	modifier. Using them, we could rewrite the above code as:
1065	1061
1066	1062	=end original
1067	1063
1068	1064	これは醜いです。
1069	1065	Perl 5.9.2 から、バイト順に関して望み通りに記述するための、遥かに
1070	1066	素敵な方法があります: C<E<gt>> と C<E<lt>> の修飾子です。
1071	1067	C<E<gt>> はビッグエンディアン修飾子で、C<E<lt>> は
1072	1068	リトルエンディアン修飾子です。
1073	1069	これらを使うと、上述のコードは以下のように書き換えられます:
1074	1070
1075	1071	my @data = unpack 's>*', $buf;
1076	1072
1077	1073	=begin original
1078	1074
1079	1075	As you can see, the "big end" of the arrow touches the C<s>, which is a
1080	1076	nice way to remember that C<E<gt>> is the big-endian modifier. The same
1081	1077	obviously works for C<E<lt>>, where the "little end" touches the code.
1082	1078
1083	1079	=end original
1084	1080
1085	1081	見た通り、不等号の「大きい側」が C<s> に向いていて、C<E<gt>> が
1086	1082	ビッグエンディアン修飾子であることを覚える素敵な方法となっています。
1087	1083	明らかに同じことが、「小さい側」がコードに向いている C<E<lt>> にも働きます。
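To see that the modifier form and the older three-step form agree, here is a small sketch. The six-byte buffer is a made-up sample, and the modifiers need Perl 5.10 or later.

```perl
use strict;
use warnings;

# Sample data: -2, 1 and -32768 stored as big-endian signed 16-bit integers.
my $buf = "\xFF\xFE\x00\x01\x80\x00";

# The roundabout way: unsigned big-endian -> native unsigned -> native signed.
my @old = unpack 's*', pack 'S*', unpack 'n*', $buf;

# The same result with the big-endian modifier.
my @new = unpack 's>*', $buf;

# Reading the same bytes as little-endian gives different numbers.
my @le = unpack 's<*', $buf;

print "@new\n";   # prints -2 1 -32768
print "@le\n";    # prints -257 256 128
```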
1088	1084
1089	1085	=begin original
1090	1086
1091	1087	You will probably find these modifiers even more useful if you have
1092	1088	to deal with big- or little-endian C structures. Be sure to read
1093		L</"Packing and Unpacking C Structures"> for more on that.
	1089	L<"Packing and Unpacking C Structures"> for more on that.
1094	1090
1095	1091	=end original
1096	1092
1097	1093	おそらく、これらの修飾子はビッグエンディアンやリトルエンディアンの C 構造体を
1098	1094	扱うときにもっと便利であることに気付くでしょう。
1099		これに関する詳細は、L</"Packing and Unpacking C Structures"> を
	1095	これに関する詳細は、L<"Packing and Unpacking C Structures"> を
1100	1096	よく読んでください。
1101	1097
1102	1098	=head2 Floating point Numbers
1103	1099
1104	1100	(浮動小数点数)
1105	1101
1106	1102	=begin original
1107	1103
1108	1104	For packing floating point numbers you have the choice between the
1109	1105	pack codes C<f>, C<d>, C<F> and C<D>. C<f> and C<d> pack into (or unpack
1110	1106	from) single-precision or double-precision representation as it is provided
1111	1107	by your system. If your system supports it, C<D> can be used to pack and
1112		unpack (C<long double>) values, which can offer even more resolution
	1108	unpack extended-precision floating point values (C<long double>), which
1113		than C<f> or C<d>. B<Note that there are different long double formats.>
	1109	can offer even more resolution than C<f> or C<d>. C<F> packs an C<NV>,
	1110	which is the floating point type used by Perl internally. (There
	1111	is no such thing as a network representation for reals, so if you want
	1112	to send your real numbers across computer boundaries, you'd better stick
	1113	to ASCII representation, unless you're absolutely sure what's on the other
	1114	end of the line. For the even more adventuresome, you can use the byte-order
	1115	modifiers from the previous section also on floating point codes.)
1114	1116
1115	1117	=end original
1116	1118
1117	1119	浮動小数点数を pack するには、pack コード C<f>, C<d>, C<F>, C<D> の
1118	1120	選択肢があります。
1119	1121	C<f> と C<d> はシステムで提供されている単精度と倍精度の実数に
1120	1122	pack (あるいは unpack) します。
1121	1123	もしシステムが対応していれば、C<f> や C<d> より精度のある、
1122		(C<long double>) 値の pack および unpack のために
	1124	拡張精度浮動小数点数 (C<long double>) の pack および unpack のために
1123	1125	C<D> が使えます。
1124		B<long double の形式には異なるものがあることに注意してください。>
1125
1126		=begin original
1127
1128		C<F> packs an C<NV>, which is the floating point type used by Perl
1129		internally.
1130
1131		=end original
1132
1133	1126	C<F> は、Perl が内部で使用している浮動小数点型である C<NV> を pack します。
	1127	(実数に対してはネットワーク表現のようなものはないので、もし他の
1135		=begin original
1136
1137		There is no such thing as a network representation for reals, so if
1138		you want to send your real numbers across computer boundaries, you'd
1139		better stick to text representation, possibly using the hexadecimal
1140		float format (avoiding the decimal conversion loss), unless you're
1141		absolutely sure what's on the other end of the line. For the even more
1142		adventuresome, you can use the byte-order modifiers from the previous
1143		section also on floating point codes.
1144
1145		=end original
1146
1147		実数に対してはネットワーク表現のようなものはないので、もし他の
1148	1128	コンピュータに実数を送りたい場合は、ネットワークの向こう側で何が起きるかが
1149		完全に分かっているのでない限りは、テキスト表現
	1129	完全に分かっているのでない限りは、ASCII 表現で我慢した方がよいです。
1150		(できれば(10 進変換のロスを防ぐために)16 進浮動小数点形式で) を守った方が
1151		よいです。
1152	1130	より冒険的な場合でも、以前の章で触れたバイト順修飾子を浮動小数点コードにも
1153		使えます。
	1131	使えます。)
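A small sketch of both options mentioned above: byte-order modifiers on C<d>, and a plain text representation. The value 1.5 is chosen because it is exactly representable; the hex dump in the comment assumes the platform's C<double> is IEEE 754 binary64, which Perl does not guarantee.

```perl
use strict;
use warnings;

my $x = 1.5;

# Byte-order modifiers also work on floating point codes.
my $be = pack( 'd>', $x );
my $le = pack( 'd<', $x );

# The two encodings hold the same bytes in opposite order.
my $mirrored = ( $be eq scalar reverse( $le ) ) ? 'yes' : 'no';

# On an IEEE 754 binary64 platform this shows 3ff8000000000000.
print unpack( 'H*', $be ), " mirrored: $mirrored\n";

# The conservative alternative: send the number as text.
my $text = sprintf( '%.17g', $x );
my $back = 0 + $text;
```

Seventeen significant digits are enough to round-trip a binary64 value through decimal text, which is why C<%.17g> is used here.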
1154	1132
1155	1133	=head1 Exotic Templates
1156	1134
1157	1135	(風変わりなテンプレート)
1158	1136
1159	1137	=head2 Bit Strings
1160	1138
1161	1139	(ビット文字列)
1162	1140
1163	1141	=begin original
1164	1142
1165	1143	Bits are the atoms in the memory world. Access to individual bits may
1166	1144	have to be used either as a last resort or because it is the most
1167	1145	convenient way to handle your data. Bit string (un)packing converts
1168	1146	between strings containing a series of C<0> and C<1> characters and
1169	1147	a sequence of bytes each containing a group of 8 bits. This is almost
1170	1148	as simple as it sounds, except that there are two ways the contents of
1171	1149	a byte may be written as a bit string. Let's have a look at an annotated
1172	1150	byte:
1173	1151
1174	1152	=end original
1175	1153
1176	1154	ビットはメモリの世界の原子です。
1177	1155	個々のビットへのアクセスは、最後の手段として行われるか、それがデータを
1178	1156	扱うのに最も便利な方法であるときに行われます。
1179	1157	ビット文字列の (un)pack は、C<0> と C<1> の文字からなる文字列と、
1180	1158	それぞれ 8 ビットを含むバイト列とを変換します。
1181	1159	バイトの内容をビット文字列として書くには 2 つの方法があるということを除けば、
1182	1160	これはほとんど見たままの単純さです。
1183	1161	以下の注釈付きのバイトを見てみましょう:
1184	1162
1185	1163	          7 6 5 4 3 2 1 0
1186	1164	        +-----------------+
1187	1165	        | 1 0 0 0 1 1 0 0 |
1188	1166	        +-----------------+
1189	1167	          MSB           LSB
1190	1168
1191	1169	=begin original
1192	1170
1193	1171	It's egg-eating all over again: Some think that as a bit string this should
1194	1172	be written "10001100" i.e. beginning with the most significant bit, others
1195	1173	insist on "00110001". Well, Perl isn't biased, so that's why we have two bit
1196	1174	string codes:
1197	1175
1198	1176	=end original
1199	1177
1200	1178	卵の食べ方の繰り返しです: これは "10001100" というビット文字列になるべき、
1201	1179	つまり最上位ビットから始めるべき、と考える人もいますし、
1202	1180	"00110001" と主張する人もいます。
1203	1181	ええ、Perl は偏向していないので、これが 2 つのビット文字列コードがある
1204	1182	理由です:
1205	1183
1206	1184	$byte = pack( 'B8', '10001100' ); # start with MSB
1207	1185	$byte = pack( 'b8', '00110001' ); # start with LSB
1208	1186
1209	1187	=begin original
1210	1188
1211	1189	It is not possible to pack or unpack bit fields - just integral bytes.
1212	1190	C<pack> always starts at the next byte boundary and "rounds up" to the
1213	1191	next multiple of 8 by adding zero bits as required. (If you do want bit
1214	1192	fields, there is L<perlfunc/vec>. Or you could implement bit field
1215	1193	handling at the character string level, using split, substr, and
1216	1194	concatenation on unpacked bit strings.)
1217	1195
1218	1196	=end original
1219	1197
1220	1198	ビットフィールドを pack や unpack することはできません -
1221	1199	バイト単位だけです。
1222	1200	C<pack> は常に次のバイト境界から始まり、必要な場合は 0 のビットを
1223	1201	追加することで 8 の倍数に「切り上げ」られます。
1224	1202	(もしビットフィールドがほしいなら、L<perlfunc/vec> があります。
1225	1203	あるいは、split, substr および unpack したビット文字列の結合を使って
1226	1204	文字単位のレベルでビットフィールド操作を実装することも出来ます。)
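The two bit string codes and the zero padding can be checked with a short sketch. The variable names are illustrative only.

```perl
use strict;
use warnings;

# Both spellings describe the same byte, 0x8C.
my $msb_first = pack( 'B8', '10001100' );
my $lsb_first = pack( 'b8', '00110001' );
printf "%02X %02X\n", ord( $msb_first ), ord( $lsb_first );   # prints 8C 8C

# Unpacking the byte gives back the two notations.
print unpack( 'B8', "\x8C" ), ' ', unpack( 'b8', "\x8C" ), "\n";
# prints 10001100 00110001

# Fewer than 8 bits are padded with zero bits up to the byte boundary.
my $padded = pack( 'B3', '101' );
printf "%02X\n", ord( $padded );                               # prints A0
```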
1227	1205
1228	1206	=begin original
1229	1207
1230	1208	To illustrate unpacking for bit strings, we'll decompose a simple
1231	1209	status register (a "-" stands for a "reserved" bit):
1232	1210
1233	1211	=end original
1234	1212
1235	1213	ビット文字列の unpack を図示するために、単純な状態レジスタを分解してみます
1236	1214	("-" は「予約された」ビットを意味します):
1237	1215
1238	1216	   +-----------------+-----------------+
1239	1217	   | S Z - A - P - C | - - - - O D I T |
1240	1218	   +-----------------+-----------------+
1241	1219	    MSB           LSB MSB           LSB
1242	1220
1243	1221	=begin original
1244	1222
1245	1223	Converting these two bytes to a string can be done with the unpack
1246	1224	template C<'b16'>. To obtain the individual bit values from the bit
1247	1225	string we use C<split> with the "empty" separator pattern which dissects
1248	1226	into individual characters. Bit values from the "reserved" positions are
1249	1227	simply assigned to C<undef>, a convenient notation for "I don't care where
1250	1228	this goes".
1251	1229
1252	1230	=end original
1253	1231
1254	1232	これら 2 バイトから文字列への変換は unpack テンプレート C<'b16'> によって
1255	1233	行われます。
1256	1234	ビット文字列から個々のビット値を得るには、C<split> を「空」セパレータで
1257	1235	使うことで個々の文字に切り刻みます。
1258	1236	「予約された」位置からのビット値は単に C<undef> に代入しておきます;
1259	1237	これは「この値がどこに行こうが気にしない」ことを示す便利な記法です。
1260	1238
1261	1239	($carry, undef, $parity, undef, $auxcarry, undef, $zero, $sign,
1262	1240	$trace, $interrupt, $direction, $overflow) =
1263	1241	split( //, unpack( 'b16', $status ) );
1264	1242
1265	1243	=begin original
1266	1244
1267	1245	We could have used an unpack template C<'b12'> just as well, since the
1268	1246	last 4 bits can be ignored anyway.
1269	1247
1270	1248	=end original
1271	1249
1272	1250	ちょうど同じように、unpack テンプレート C<'b12'> も使えます;
1273	1251	最後の 4 ビットはどちらにしろ無視されるからです。
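Here is the register decomposition applied to a concrete value. The two status bytes are a hypothetical sample in which the S, C, D and T bits are set.

```perl
use strict;
use warnings;

# Hypothetical status word: first byte 1000_0001 (S and C set),
# second byte 0000_0101 (D and T set).
my $status = "\x81\x05";

# 'b16' lists each byte's bits starting with the least significant one,
# so the order is C - P - A - Z S, then T I D O.
my( $carry, undef, $parity, undef, $auxcarry, undef, $zero, $sign,
    $trace, $interrupt, $direction, $overflow ) =
    split( //, unpack( 'b16', $status ) );

print "carry=$carry sign=$sign trace=$trace direction=$direction\n";
# prints carry=1 sign=1 trace=1 direction=1
```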
1274	1252
1275	1253	=head2 Uuencoding
1276	1254
1277	1255	(uuencode)
1278	1256
1279	1257	=begin original
1280	1258
1281		Another odd-man-out in the template alphabet is C<u>, which packs a
	1259	Another odd-man-out in the template alphabet is C<u>, which packs an
1282	1260	"uuencoded string". ("uu" is short for Unix-to-Unix.) Chances are that
1283	1261	you won't ever need this encoding technique which was invented to overcome
1284	1262	the shortcomings of old-fashioned transmission mediums that do not support
1285	1263	other than simple ASCII data. The essential recipe is simple: Take three
1286	1264	bytes, or 24 bits. Split them into 4 six-packs, adding a space (0x20) to
1287	1265	each. Repeat until all of the data is blended. Fold groups of 4 bytes into
1288	1266	lines no longer than 60 and garnish them in front with the original byte count
1289	1267	(incremented by 0x20) and a C<"\n"> at the end. - The C<pack> chef will
1290	1268	prepare this for you, a la minute, when you select pack code C<u> on the menu:
1291	1269
1292	1270	=end original
1293	1271
1294	1272	テンプレートの中のもう一つの半端者は C<u> で、「uuencode された文字列」を
1295	1273	pack します。
1296	1274	("uu" は Unix-to-Unix を縮めたものです。)
1297	1275	あなたには、単純な ASCII データしか対応していない旧式の通信メディアの欠点を
1298	1276	克服するために開発されたこのエンコーディング技術が必要になる機会は
1299	1277	なかったかもしれません。
1300	1278	本質的なレシピは単純です: 3 バイト、つまり 24 ビットを取ります。
1301	1279	これを 4 つの 6 ビットに分け、それぞれに空白 (0x20) を加えます。
1302	1280	全てのデータが混ぜられるまで繰り返します。
1303	1281	4 バイトの組を 60 文字を超えない行に折り畳み、元のバイト数(0x20 を
1304	1282	加えたもの)を先頭に置いて、末尾に C<"\n"> を置きます。
1305	1283	- あなたがメニューから pack コード C<u> を選ぶと、C<pack> シェフは
1306	1284	即席で、下ごしらえをしてくれます:
1307	1285
1308	1286	my $uubuf = pack( 'u', $bindat );
1309	1287
1310	1288	=begin original
1311	1289
1312	1290	A repeat count after C<u> sets the number of bytes to put into an
1313	1291	uuencoded line, which is the maximum of 45 by default, but could be
1314	1292	set to some (smaller) integer multiple of three. C<unpack> simply ignores
1315	1293	the repeat count.
1316	1294
1317	1295	=end original
1318	1296
1319	1297	C<u> の後の繰り返し数は uuencode された行にいれるバイト数で、デフォルトでは
1320	1298	最大の 45 ですが、3 の倍数のその他の(より小さい)数にできます。
1321	1299	C<unpack> は単に繰り返し数を無視します。
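A tiny worked example may help. The three bytes of C<"Cat"> split into the 6-bit values 16, 54, 5 and 52; adding 0x20 gives the characters C<0V%T>, and the byte count 3 plus 0x20 gives the leading C<#>. The input strings below are arbitrary samples.

```perl
use strict;
use warnings;

# Three bytes become four printable characters, preceded by the count.
my $uubuf = pack( 'u', 'Cat' );
print $uubuf;                         # prints #0V%T

# unpack reverses the encoding.
my $back = unpack( 'u', $uubuf );

# A repeat count of 3 puts three input bytes on each encoded line.
my $lines = pack( 'u3', 'CatCat' );
print $lines;                         # prints #0V%T twice, one per line
```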
1322	1300
1323	1301	=head2 Doing Sums
1324	1302
1325	1303	(合計を計算する)
1326	1304
1327	1305	=begin original
1328	1306
1329	1307	An even stranger template code is C<%>E<lt>I<number>E<gt>. First, because
1330	1308	it's used as a prefix to some other template code. Second, because it
1331	1309	cannot be used in C<pack> at all, and third, in C<unpack>, doesn't return the
1332	1310	data as defined by the template code it precedes. Instead it'll give you an
1333	1311	integer of I<number> bits that is computed from the data value by
1334	1312	doing sums. For numeric unpack codes, no big feat is achieved:
1335	1313
1336	1314	=end original
1337	1315
1338	1316	さらに不思議なテンプレートコードは C<%>E<lt>I<number>E<gt> です。
1339	1317	第一に、これはその他のテンプレートコードの前置詞として使われるからです。
1340	1318	第二に、C<pack> では全く使えず、第三に、C<unpack> では、先行する
1341	1319	テンプレートコードによって定義された値を返さないからです。
1342	1320	代わりに、これはデータの合計として計算された I<number> ビットの整数を
1343	1321	与えます。
1344	1322	数値 unpack コードでは、大きな離れ業は行われません:
1345	1323
1346	1324	my $buf = pack( 'iii', 100, 20, 3 );
1347	1325	print unpack( '%32i3', $buf ), "\n"; # prints 123
1348	1326
1349	1327	=begin original
1350	1328
1351	1329	For string values, C<%> returns the sum of the byte values saving
1352	1330	you the trouble of a sum loop with C<substr> and C<ord>:
1353	1331
1354	1332	=end original
1355	1333
1356	1334	文字列値に対しては、C<%> はバイト値の合計を返し、C<substr> と C<ord> による
1357	1335	合計計算ループによる問題からあなたを救います:
1358	1336
1359	1337	print unpack( '%32A*', "\x01\x10" ), "\n"; # prints 17
1360	1338
1361	1339	=begin original
1362	1340
1363	1341	Although the C<%> code is documented as returning a "checksum":
1364	1342	don't put your trust in such values! Even when applied to a small number
1365	1343	of bytes, they won't guarantee a noticeable Hamming distance.
1366	1344
1367	1345	=end original
1368	1346
1369	1347	C<%> コードは「チェックサム」を返すと文書化されていますが:
1370	1348	このような値に信頼を置いてはいけません!
1371	1349	少量のバイト列に適用する場合ですら、顕著なハミング距離を保証できません。
1372	1350
1373	1351	=begin original
1374	1352
1375	1353	In connection with C<b> or C<B>, C<%> simply adds bits, and this can be put
1376	1354	to good use to count set bits efficiently:
1377	1355
1378	1356	=end original
1379	1357
1380	1358	C<b> や C<B> と共に使うと、C<%> は単にビットを加えるので、これは
1381	1359	セットされているビットを効率的に数えるためのよい方法となります:
1382	1360
1383	1361	my $bitcount = unpack( '%32b*', $mask );
1384	1362
1385	1363	=begin original
1386	1364
1387	1365	And an even parity bit can be determined like this:
1388	1366
1389	1367	=end original
1390	1368
1391	1369	そして偶数パリティビットは以下のようにして決定できます:
1392	1370
1393	1371	my $evenparity = unpack( '%1b*', $mask );
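The following sketch runs the different C<%> forms over one sample string, so the results can be compared. The data bytes are arbitrary.

```perl
use strict;
use warnings;

my $data = "\x01\x10\xFF";

my $sum    = unpack( '%32C*', $data );   # 1 + 16 + 255
my $sum8   = unpack( '%8C*',  $data );   # the same sum, kept to 8 bits
my $bits   = unpack( '%32b*', $data );   # number of set bits: 1 + 1 + 8
my $parity = unpack( '%1b*',  $data );   # the bit count, kept to 1 bit

print "$sum $sum8 $bits $parity\n";      # prints 272 16 10 0
```

The 8-bit variant shows why such sums are weak as checksums: very different inputs collapse to the same small value.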
1394	1372
1395	1373	=head2 Unicode
1396	1374
1397	1375	=begin original
1398	1376
1399	1377	Unicode is a character set that can represent most characters in most of
1400	1378	the world's languages, providing room for over one million different
1401	1379	characters. Unicode 3.1 specifies 94,140 characters: The Basic Latin
1402	1380	characters are assigned to the numbers 0 - 127. The Latin-1 Supplement with
1403	1381	characters that are used in several European languages is in the next
1404	1382	range, up to 255. After some more Latin extensions we find the character
1405	1383	sets from languages using non-Roman alphabets, interspersed with a
1406	1384	variety of symbol sets such as currency symbols, Zapf Dingbats or Braille.
1407		(You might want to visit L<https://www.unicode.org/> for a look at some of
	1385	(You might want to visit L<http://www.unicode.org/> for a look at some of
1408	1386	them - my personal favourites are Telugu and Kannada.)
1409	1387
1410	1388	=end original
1411	1389
1412	1390	Unicode は世界中のほとんどの言語のほとんどの文字を表現できる文字集合で、
1413	1391	100 万以上の異なった文字のための空間を提供しています。
1414	1392	Unicode 3.1 は 94,140 文字を定義しています: 基本ラテン文字は番号
1415	1393	0 - 127 に割り当てられています。
1416	1394	いくつかのヨーロッパ言語で使われるラテン 1 補助が次の範囲で、255 までです。
1417	1395	いくつかのラテン拡張の後、非ローマアルファベットを使う言語の文字集合
1418	1396	および、通貨記号、Zapf Dingbats、点字のような様々な記号集合が
1419	1397	散らばっています。
1420		(これらのいくつかを見るために L<https://www.unicode.org/> を訪れるのも
	1398	(これらのいくつかを見るために L<http://www.unicode.org/> を訪れるのも
1421	1399	良いでしょう - 私の個人的なお気に入りは Telugu と Kannada です。)
1422	1400
1423	1401	=begin original
1424	1402
1425	1403	The Unicode character set associates characters with integers. Encoding
1426	1404	these numbers in an equal number of bytes would more than double the
1427	1405	requirements for storing texts written in Latin alphabets.
1428	1406	The UTF-8 encoding avoids this by storing the most common (from a western
1429	1407	point of view) characters in a single byte while encoding the rarer
1430	1408	ones in three or more bytes.
1431	1409
1432	1410	=end original
1433	1411
1434	1412	Unicode 文字集合は文字と整数を結び付けます。
1435	1413	これらの数値を同じバイト数でエンコードすると、ラテンアルファベットで
1436	1414	書かれたテキストを保管するのに 2 倍以上のバイト数が必要になります。
1437	1415	UTF-8 エンコーディングは(西洋からの視点において)もっとも共通の文字を
1438	1416	1 バイトに格納し、より稀なものを 3 バイト以上にエンコードすることで
1439	1417	これを回避しています。
1440	1418
1441	1419	=begin original
1442	1420
1443	1421	Perl uses UTF-8, internally, for most Unicode strings.
1444	1422
1445	1423	=end original
1446	1424
1447	1425	Perl はほとんどの Unicode 文字列に対して内部的に UTF-8 を使います。
1448	1426
1449	1427	=begin original
1450	1428
1451	1429	So what has this got to do with C<pack>? Well, if you want to compose a
1452	1430	Unicode string (that is internally encoded as UTF-8), you can do so by
1453	1431	using template code C<U>. As an example, let's produce the Euro currency
1454	1432	symbol (code number 0x20AC):
1455	1433
1456	1434	=end original
1457	1435
1458	1436	それで、これで C<pack> は何ができるのでしょう?
1459	1437	えっと、もし Unicode 文字列 (これは内部では UTF-8 で
1460	1438	エンコードされています) を構成したいなら、テンプレートコード C<U> を
1461	1439	使うことでできます。
1462	1440	例として、ユーロ通貨記号 (コード番号 0x20AC) を生成してみましょう:
1463	1441
1464	1442	$UTF8{Euro} = pack( 'U', 0x20AC );
1465	1443	# Equivalent to: $UTF8{Euro} = "\x{20ac}";
1466	1444
1467	1445	=begin original
1468	1446
1469	1447	Inspecting C<$UTF8{Euro}> shows that it contains 3 bytes:
1470	1448	"\xe2\x82\xac". However, it contains only 1 character, number 0x20AC.
1471	1449	The round trip can be completed with C<unpack>:
1472	1450
1473	1451	=end original
1474	1452
1475	1453	C<$UTF8{Euro}> を検査すると、3 バイトであることがわかります:
1476	1454	"\xe2\x82\xac" です。
1477	1455	しかし、これは番号 0x20AC の 1 文字だけを含んでいます。
1478	1456	往復は C<unpack> を使って完了します:
1479	1457
1480	1458	$Unicode{Euro} = unpack( 'U', $UTF8{Euro} );
1481	1459
1482	1460	=begin original
1483	1461
1484	1462	Unpacking using the C<U> template code also works on UTF-8 encoded byte
1485	1463	strings.
1486	1464
1487	1465	=end original
1488	1466
1489	1467	C<U> テンプレートコードを使った unpack は
1490	1468	UTF-8 エンコードされたバイト文字列に対しても動作します。
1491	1469
1492	1470	=begin original
1493	1471
1494	1472	Usually you'll want to pack or unpack UTF-8 strings:
1495	1473
1496	1474	=end original
1497	1475
1498	1476	普通は UTF-8 文字列を pack または unpack したいでしょう:
1499	1477
1500	1478	# pack and unpack the Hebrew alphabet
1501	1479	my $alefbet = pack( 'U*', 0x05d0..0x05ea );
1502	1480	my @hebrew = unpack( 'U*', $utf );
1503	1481
1504	1482	=begin original
1505	1483
1506	1484	Please note: in the general case, you're better off using
1507		L<C<Encode::decode('UTF-8', $utf)>|Encode/decode> to decode a UTF-8
	1485	Encode::decode_utf8 to decode a UTF-8 encoded byte string to a Perl
1508		encoded byte string to a Perl Unicode string, and
	1486	Unicode string, and Encode::encode_utf8 to encode a Perl Unicode string
1509		L<C<Encode::encode('UTF-8', $str)>|Encode/encode> to encode a Perl Unicode
	1487	to UTF-8 bytes. These functions provide means of handling invalid byte
1510		string to UTF-8 bytes. These functions provide means of handling invalid byte
1511	1488	sequences and generally have a friendlier interface.
1512	1489
1513	1490	=end original
1514	1491
1515	1492	注意: 一般的な場合には、UTF-8 エンコードされたバイト文字列を Perl の
1516		Unicode 文字列にデコードするには
	1493	Unicode 文字列にデコードするには Encode::decode_utf8 を使い、
1517		L<C<Encode::decode('UTF-8', $utf)>|Encode/decode> を使い、
1518	1494	Perl の Unicode 文字列を UTF-8 のバイト文字列にエンコードするには
1519		L<C<Encode::encode('UTF-8', $str)>|Encode/encode> を使った方がよいです。
	1495	Encode::encode_utf8 を使った方がよいです。
1520	1496	これらの関数は不正なバイト列を扱う手段を提供し、一般的により親切な
1521	1497	インターフェースを持ちます。
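The difference between characters and UTF-8 bytes can be made visible with the core C<Encode> module. This is a minimal sketch; the variable names are chosen for the example.

```perl
use strict;
use warnings;
use Encode ();

# One character, built from its code point.
my $euro = pack( 'U', 0x20AC );

# Encoding to UTF-8 yields the three bytes mentioned in the text.
my $bytes = Encode::encode( 'UTF-8', $euro );
printf "%d character, %d bytes: %s\n",
    length( $euro ), length( $bytes ), unpack( 'H*', $bytes );
# prints 1 character, 3 bytes: e282ac

# Decoding restores the character string; 'U*' lists its code points.
my $chars = Encode::decode( 'UTF-8', $bytes );
my @code_points = unpack( 'U*', $chars );

# The Hebrew alphabet example, round-tripped through pack and unpack.
my $alefbet = pack( 'U*', 0x05d0 .. 0x05ea );
my @hebrew  = unpack( 'U*', $alefbet );
```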
1522	1498
1523	1499	=head2 Another Portable Binary Encoding
1524	1500
1525	1501	(その他の移植性のあるバイナリエンコーディング)
1526	1502
1527	1503	=begin original
1528	1504
1529	1505	The pack code C<w> has been added to support a portable binary data
1530	1506	encoding scheme that goes way beyond simple integers. (Details can
1531		be found at L<https://github.com/mworks-project/mw_scarab/blob/master/Scarab-0.1.00d19/doc/binary-serialization.txt>,
	1507	be found at L<http://Casbah.org/>, the Scarab project.) A BER (Binary Encoded
1532		the Scarab project.) A BER (Binary Encoded
1533	1508	Representation) compressed unsigned integer stores base 128
1534	1509	digits, most significant digit first, with as few digits as possible.
1535	1510	Bit eight (the high bit) is set on each byte except the last. There
1536	1511	is no size limit to BER encoding, but Perl won't go to extremes.
1537	1512
1538	1513	=end original
1539	1514
1540	1515	pack コード C<w> は、単純な整数とは程遠い、移植性のある
1541	1516	バイナリデータエンコーディングスキームに対応するために追加されました。
1542		(詳細については Scarab プロジェクト
	1517	(詳細については Scarab プロジェクト L<http://Casbah.org/> にあります。)
1543		L<https://github.com/mworks-project/mw_scarab/blob/master/Scarab-0.1.00d19/doc/binary-serialization.txt>
1544		にあります。)
1545	1518	BER (Binary Encoded Representation) 圧縮符号なし整数は 128 を基数として、
1546	1519	最上位ビットを最初にして、可能な限り少ない桁になるように保管します。
1547	1520	ビット 8 (最上位ビット) は、最後以外のバイトでセットされます。
1548	1521	BER エンコーディングにはサイズ制限がありませんが、Perl は極端なことは
1549	1522	しません。
1550	1523
1551	1524	my $berbuf = pack( 'w*', 1, 128, 128+1, 128*128+127 );
1552	1525
1553	1526	=begin original
1554	1527
1555	1528	A hex dump of C<$berbuf>, with spaces inserted at the right places,
1556	1529	shows 01 8100 8101 81807F. Since the last byte is always less than
1557	1530	128, C<unpack> knows where to stop.
1558	1531
1559	1532	=end original
1560	1533
1561	1534	C<$berbuf> を、適切な位置に空白を入れつつ 16 進ダンプを取ると、
1562	1535	01 8100 8101 81807F となります。
1563	1536	最後のバイトは常に 128 より小さくなるので、C<unpack> は停止する位置が
1564	1537	わかります。
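The hex dump is easy to reproduce. This sketch packs the four sample numbers with C<w*> (the C<*> applies the code to every value) and then unpacks them again.

```perl
use strict;
use warnings;

my $berbuf = pack( 'w*', 1, 128, 128+1, 128*128+127 );

# Base-128 digits, high bit set on every byte except the last of each number.
print unpack( 'H*', $berbuf ), "\n";   # prints 018100810181807f

# unpack finds the end of each number by its cleared high bit.
my @numbers = unpack( 'w*', $berbuf );
print "@numbers\n";                    # prints 1 128 129 16511
```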
1565	1538
1566	1539	=head1 Template Grouping
1567	1540
1568	1541	(テンプレートのグループ化)
1569	1542
1570	1543	=begin original
1571	1544
1572	1545	Prior to Perl 5.8, repetitions of templates had to be made by
1573	1546	C<x>-multiplication of template strings. Now there is a better way as
1574	1547	we may use the pack codes C<(> and C<)> combined with a repeat count.
1575	1548	The C<unpack> template from the Stack Frame example can simply
1576	1549	be written like this:
1577	1550
1578	1551	=end original
1579	1552
1580	1553	Perl 5.8 以前では、テンプレートの繰り返しはテンプレート文字列を C<x> 回
1581	1554	繰り返すことで作る必要がありました。
1582	1555	今では、pack コード C<(> と C<)> に繰り返し数を組み合わせて使うという
1583	1556	よりよい方法があります。
1584	1557	スタックフレームの例の C<unpack> テンプレートは単に以下のように書けます:
1585	1558
1586	1559	unpack( 'v2 (vXXCC)5 v5', $frame )
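As a quick check that the grouped template is equivalent to the C<x>-multiplied one, the sketch below applies both to the same frame. The frame simply holds the numbers 1 to 12 and is only a stand-in for real data; groups need Perl 5.8 or later.

```perl
use strict;
use warnings;

# Stand-in frame: twelve little-endian shorts holding 1 .. 12.
my $frame = pack( 'v12', 1 .. 12 );

my @multiplied = unpack( 'v2' . ('vXXCC' x 5) . 'v5', $frame );
my @grouped    = unpack( 'v2 (vXXCC)5 v5', $frame );

print scalar( @grouped ), "\n";                              # prints 22
print "@multiplied" eq "@grouped" ? "same\n" : "differ\n";   # prints same
```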
1587	1560
1588	1561	=begin original
1589	1562
1590	1563	Let's explore this feature a little more. We'll begin with the equivalent of
1591	1564
1592	1565	=end original
1593	1566
1594	1567	この機能についてもうすこしだけ探求してみましょう。
1595	1568	以下と等価なものから始めます:
1596	1569
1597	1570	join( '', map( substr( $_, 0, 1 ), @str ) )
1598	1571
1599	1572	=begin original
1600	1573
1601	1574	which returns a string consisting of the first character from each string.
1602	1575	Using pack, we can write
1603	1576
1604	1577	=end original
1605	1578
1606	1579	これは、それぞれの文字列の最初の文字からなる文字列を返します。
1607	1580	pack を使うと、以下のように書けます:
1608	1581
1609	1582	pack( '(A)'.@str, @str )
1610	1583
1611	1584	=begin original
1612	1585
1613	1586	or, because a repeat count C<*> means "repeat as often as required",
1614	1587	simply
1615	1588
1616	1589	=end original
1617	1590
1618	1591	あるいは、繰り返し数 C<*> は「必要なだけ繰り返す」ことを意味するので、
1619	1592	単に以下のようになります:
1620	1593
1621	1594	pack( '(A)*', @str )
1622	1595
1623	1596	=begin original
1624	1597
1625	1598	(Note that the template C<A*> would only have packed C<$str[0]> in full
1626	1599	length.)
1627	1600
1628	1601	=end original
1629	1602
1630	1603	(テンプレート C<A*> は C<$str[0]> を完全な長さで pack するだけという
1631	1604	ことに注意してください。)
1632	1605
1633	1606	=begin original
1634	1607
1635	1608	To pack dates stored as triplets ( day, month, year ) in an array C<@dates>
1636	1609	into a sequence of byte, byte, short integer we can write
1637	1610
1638	1611	=end original
1639	1612
1640	1613	配列 C<@dates> に 3 つ組 (日、月、年) として保管されている日付を
1641	1614	バイト、バイト、short に pack するには、以下のように書きます
1642	1615
1643	1616	$pd = pack( '(CCS)*', map( @$_, @dates ) );
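Both group examples can be run on small sample data. The word list and the dates below are invented for this sketch.

```perl
use strict;
use warnings;

# First character of each string.
my @str      = qw( alpha beta gamma );
my $initials = pack( '(A)*', @str );
print "$initials\n";                   # prints abg

# Without the group, 'A*' consumes only the first string, in full.
my $first_only = pack( 'A*', @str );
print "$first_only\n";                 # prints alpha

# Dates as ( day, month, year ) triplets: byte, byte, 16-bit short each.
my @dates = ( [ 24, 12, 2023 ], [ 1, 1, 2024 ] );
my $pd    = pack( '(CCS)*', map( @$_, @dates ) );
my @flat  = unpack( '(CCS)*', $pd );
print length( $pd ), ": @flat\n";      # prints 8: 24 12 2023 1 1 2024
```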
1644	1617
1645	1618	=begin original
1646	1619
1647	1620	To swap pairs of characters in a string (with even length) one could use
1648	1621	several techniques. First, let's use C<x> and C<X> to skip forward and back:
1649	1622
1650	1623	=end original
1651	1624
1652	1625	(偶数長の)文字列の中で文字の組を入れ替えるには、いくつかの
1653	1626	技が使えます。
1654	1627	まず、読み飛ばして戻ってくるために C<x> と C<X> を使いましょう:
1655	1628
1656	1629	$s = pack( '(A)*', unpack( '(xAXXAx)*', $s ) );
1657	1630
1658	1631	=begin original
1659	1632
1660	1633	We can also use C<@> to jump to an offset, with 0 being the position where
1661	1634	we were when the last C<(> was encountered:
1662	1635
1663	1636	=end original
1664	1637
1665	1638	また、オフセットに飛ぶために C<@> も使えます; ここで 0 は最後に C<(> に
1666	1639	遭遇した位置になります:
1667	1640
1668	1641	$s = pack( '(A)*', unpack( '(@1A @0A @2)*', $s ) );
1669	1642
1670	1643	=begin original
1671	1644
1672	1645	Finally, there is also an entirely different approach by unpacking big
1673	1646	endian shorts and packing them in the reverse byte order:
1674	1647
1675	1648	=end original
1676	1649
1677	1650	最後に、ビッグエンディアンの short として unpack して、逆のバイト順で
1678	1651	pack するという、全く異なった手法もあります:
1679	1652
1680	1653	$s = pack( '(v)*', unpack( '(n)*', $s ) );
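The three techniques can be compared side by side. In this sketch each template group carries a C<*> repeat count so that it is applied along the whole string; the input C<'abcdef'> is an arbitrary even-length sample.

```perl
use strict;
use warnings;

my $s = 'abcdef';   # even length, as the text requires

# 1. Skip forward and back with x and X.
my $by_skip = pack( '(A)*', unpack( '(xAXXAx)*', $s ) );

# 2. Jump to offsets relative to the start of the group with @.
my $by_offset = pack( '(A)*', unpack( '(@1A @0A @2)*', $s ) );

# 3. Read big-endian shorts and write them back little-endian.
my $by_endian = pack( '(v)*', unpack( '(n)*', $s ) );

print "$by_skip $by_offset $by_endian\n";   # prints badcfe badcfe badcfe
```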
1681	1654
1682	1655	=head1 Lengths and Widths
1683	1656
1684	1657	(長さと幅)
1685	1658
1686	1659	=head2 String Lengths
1687	1660
1688	1661	(文字列の長さ)
1689	1662
1690	1663	=begin original
1691	1664
1692	1665	In the previous section we've seen a network message that was constructed
1693	1666	by prefixing the binary message length to the actual message. You'll find
1694	1667	that packing a length followed by so many bytes of data is a
1695	1668	frequently used recipe since appending a null byte won't work
1696	1669	if a null byte may be part of the data. Here is an example where both
1697	1670	techniques are used: after two null terminated strings with source and
1698	1671	destination address, a Short Message (to a mobile phone) is sent after
1699	1672	a length byte:
1700	1673
1701	1674	=end original
1702	1675
1703	1676	前の章で、実際のメッセージの前にメッセージの長さをバイナリで前置することで
1704	1677	構成されたネットワークメッセージを見ました。
1705	1678	NUL バイトを追加するという方法は、データの一部として NUL バイトが
1706	1679	含まれているときには動作しないので、引き続くデータの長さを pack するという
1707	1680	方法はよく見られます。
1708	1681	以下は両方の技術を使った例です: 送り元と送り先のアドレスを示す 2 つの
1709	1682	NUL 終端文字列の後、(携帯電話への)ショートメッセージがその長さの後に
1710	1683	送られます:
1711	1684
1712	1685	my $msg = pack( 'Z*Z*CA*', $src, $dst, length( $sm ), $sm );
1713	1686
1714	1687	=begin original
1715	1688
1716	1689	Unpacking this message can be done with the same template:
1717	1690
1718	1691	=end original
1719	1692
1720	1693	このメッセージを unpack するには同じテンプレートで可能です:
1721	1694
1722	1695	( $src, $dst, $len, $sm ) = unpack( 'Z*Z*CA*', $msg );
1723	1696
1724	1697	=begin original
1725	1698
1726	1699	There's a subtle trap lurking in the offing: Adding another field after
1727	1700	the Short Message (in variable C<$sm>) is all right when packing, but this
1728	1701	cannot be unpacked naively:
1729	1702
1730	1703	=end original
1731	1704
1732	1705	遠くに微妙な罠が顔を覗かせています: (変数 C<$sm> に入っている)
1733	1706	ショートメッセージの後にフィールドを追加すると、pack は問題ありませんが、
1734	1707	単純には unpack できません:
1735	1708
1736	1709	# pack a message
1737	1710	my $msg = pack( 'Z*Z*CA*C', $src, $dst, length( $sm ), $sm, $prio );
1738	1711
1739	1712	# unpack fails - $prio remains undefined!
1740	1713	( $src, $dst, $len, $sm, $prio ) = unpack( 'Z*Z*CA*C', $msg );
1741	1714
1742	1715	=begin original
1743	1716
1744	1717	The pack code C<A*> gobbles up all remaining bytes, and C<$prio> remains
1745	1718	undefined! Before we let disappointment dampen the morale: Perl's got
1746	1719	the trump card to make this trick too, just a little further up the sleeve.
1747	1720	Watch this:
1748	1721
1749	1722	=end original
1750	1723
1751	1724	pack コード C<A*> は残り全てのバイトを読み込んでしまい、C<$prio> が
1752	1725	未定義のままになってしまうのです!
1753	1726	がっかりして士気をくじかれる前に: Perl はこのようなトリックに対しても
1754	1727	切り札を持っています; もう少し袖をまくってください。
1755	1728	これを見てください:
1756	1729
1757	1730	# pack a message: ASCIIZ, ASCIIZ, length/string, byte
1758	1731	my $msg = pack( 'Z* Z* C/A* C', $src, $dst, $sm, $prio );
1759	1732
1760	1733	# unpack
1761	1734	( $src, $dst, $sm, $prio ) = unpack( 'Z* Z* C/A* C', $msg );
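To see both layouts side by side, here is a minimal, self-contained sketch. The sample addresses, message text and priority are made up for illustration; only the templates come from the text above. Note that both C<pack> calls should produce the same bytes, so the difference shows up only on the C<unpack> side.

```perl
use strict;
use warnings;

# Hypothetical sample values, not taken from any real protocol.
my ( $src, $dst, $sm, $prio ) = ( '+4312345', '+4398765', 'Hello, world', 3 );

# Explicit length byte followed by A*: on unpack, A* swallows the priority.
my $naive = pack( 'Z* Z* C A* C', $src, $dst, length($sm), $sm, $prio );
my ( $s1, $d1, $len1, $sm1, $prio1 ) = unpack( 'Z* Z* C A* C', $naive );

# Length-prefixed string: C/A* takes exactly as many bytes as the count says.
my $msg = pack( 'Z* Z* C/A* C', $src, $dst, $sm, $prio );
my ( $s2, $d2, $sm2, $prio2 ) = unpack( 'Z* Z* C/A* C', $msg );

print defined $prio1 ? "naive prio: $prio1\n" : "naive prio: undef\n";
print "slash prio: $prio2, text: $sm2\n";
```

In the naive case the priority byte ends up glued to the end of the message text, which is easy to overlook when the byte is not printable.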
1762	1735
1763	1736	=begin original
1764	1737
1765	1738	Combining two pack codes with a slash (C</>) associates them with a single
1766	1739	value from the argument list. In C<pack>, the length of the argument is
1767	1740	taken and packed according to the first code while the argument itself
1768	1741	is added after being converted with the template code after the slash.
1769	1742	This saves us the trouble of inserting the C<length> call, but it is
1770	1743	in C<unpack> where we really score: The value of the length byte marks the
1771	1744	end of the string to be taken from the buffer. Since this combination
1772	1745	doesn't make sense unless the second pack code is C<a>, C<A> or C<Z*>,
1773	1746	Perl won't let you use it with anything else.
1774	1747
1775	1748	=end original
1776	1749
1777	1750	二つの pack コードをスラッシュ (C</>) で繋ぐことで、引数リストの 1 つの
1778	1751	値と結び付けられます。
1779	1752	C<pack> では、引数の長さが取られて最初のコードに従って pack される一方、
1780	1753	引数自体はスラッシュの後のテンプレートコードによって変換された後
1781	1754	追加されます。
1782	1755	これは C<length> 呼び出しを挿入することによるトラブルを救いますが、
1783	1756	本当に効果があるのは C<unpack> においてです: 長さを示すバイトの値は
1784	1757	バッファから取られる文字列の末尾をマークします。
1785	1758	この組み合わせは 2 つ目の pack コードが C<a>, C<A>, C<Z*> のいずれかである
1786	1759	場合にしか意味がないので、Perl はそれ以外を許しません。
1787	1760
1788	1761	=begin original
1789	1762
1790	1763	The pack code preceding C</> may be anything that's fit to represent a
1791	1764	number: All the numeric binary pack codes, and even text codes such as
1792	1765	C<A4> or C<Z*>:
1793	1766
1794	1767	=end original
1795	1768
1796	1769	C</> の前に置く pack コードは、数値を表現するのに適したものであれば
1797	1770	なんでも使えます:
1798	1771	全ての数値バイナリ pack コードおよび、C<A4> や C<Z*> のような
1799	1772	テキストコードにも対応します:
1800	1773
1801	1774	   # pack/unpack a string preceded by its length in ASCII
1802	1775	   my $buf = pack( 'A4/A*', "Humpty-Dumpty" );
1803	1776	   # unpack $buf: '13  Humpty-Dumpty'
1804	1777	   my $txt = unpack( 'A4/A*', $buf );
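As a small illustration of how the code before the slash shapes the buffer, the following sketch packs the same string once with a textual count (C<A4>) and once with a binary 16-bit count (C<n>). The variable names are arbitrary; the choice of C<n> for the binary variant is just one possibility.

```perl
use strict;
use warnings;

# Count stored as text, space-padded to four characters.
my $buf = pack( 'A4/A*', 'Humpty-Dumpty' );
my $txt = unpack( 'A4/A*', $buf );

# Count stored as a big-endian 16-bit integer instead.
my $bin     = pack( 'n/a*', 'Humpty-Dumpty' );
my $bin_txt = unpack( 'n/a*', $bin );

print "[$buf]\n";
print unpack( 'H4', $bin ), " followed by $bin_txt\n";
```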
1805	1778
1806	1779	=begin original
1807	1780
1808	1781	C</> is not implemented in Perls before 5.6, so if your code is required to
1809		work on ancient Perls you'll need to C<unpack( 'Z* Z* C')> to get the length,
	1782	work on older Perls you'll need to C<unpack( 'Z* Z* C')> to get the length,
1810	1783	then use it to make a new unpack string. For example
1811	1784
1812	1785	=end original
1813	1786
1814	1787	C</> は 5.6 以前の Perl には実装されていないので、もし、より古い Perl で
1815	1788	動作することが要求される場合は、長さを得るために C<unpack( 'Z* Z* C')> を
1816	1789	使って、それから新しい unpack 文字列を作ってそれを使う必要があります。
1817	1790	例えば:
1818	1791
1819		# pack a message: ASCIIZ, ASCIIZ, length, string, byte
	1792	# pack a message: ASCIIZ, ASCIIZ, length, string, byte (5.005 compatible)
1820		# (5.005 compatible)
1821	1793	my $msg = pack( 'Z* Z* C A* C', $src, $dst, length $sm, $sm, $prio );
1822	1794
1823	1795	# unpack
1824	1796	( undef, undef, $len) = unpack( 'Z* Z* C', $msg );
1825	1797	($src, $dst, $sm, $prio) = unpack ( "Z* Z* x A$len C", $msg );
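The two-step approach can be tried out with the following sketch, which assumes nothing beyond the templates shown above. The sample values are hypothetical. The first C<unpack> only fetches the length byte; the second interpolates it into the template and skips the length byte with C<x>.

```perl
use strict;
use warnings;

# Hypothetical sample values.
my ( $src, $dst, $sm, $prio ) = ( 'alice', 'bob', 'See you at noon', 1 );

# Pack with an explicit length byte, without using '/'.
my $msg = pack( 'Z* Z* C A* C', $src, $dst, length $sm, $sm, $prio );

# Step 1: fetch only the length byte.
my ( undef, undef, $len ) = unpack( 'Z* Z* C', $msg );

# Step 2: build the real template from it; 'x' skips the length byte.
my @got = unpack( "Z* Z* x A$len C", $msg );

print "length byte: $len\n";
print join( '|', @got ), "\n";
```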
1826	1798
1827	1799	=begin original
1828	1800
1829	1801	But that second C<unpack> is rushing ahead. It isn't using a simple literal
1830	1802	string for the template. So maybe we should introduce...
1831	1803
1832	1804	=end original
1833	1805
1834	1806	しかしこの 2 番目の C<unpack> は先走りました。
1835	1807	これはテンプレートとして単純なリテラル文字列を使っていません。
1836	1808	それでは説明するべきでしょう…
1837	1809
1838	1810	=head2 Dynamic Templates
1839	1811
1840	1812	(動的テンプレート)
1841	1813
1842	1814	=begin original
1843	1815
1844	1816	So far, we've seen literals used as templates. If the list of pack
1845	1817	items doesn't have fixed length, an expression constructing the
1846	1818	template is required (whenever, for some reason, C<()*> cannot be used).
1847	1819	Here's an example: To store named string values in a way that can be
1848	1820	conveniently parsed by a C program, we create a sequence of names and
1849	1821	null terminated ASCII strings, with C<=> between the name and the value,
1850	1822	followed by an additional delimiting null byte. Here's how:
1851	1823
1852	1824	=end original
1853	1825
1854	1826	これまでは、テンプレートとして使われるリテラルを見てきました。
1855	1827	pack するアイテムのリストが固定長でない場合(何らかの理由で C<()*> が
1856	1828	使えないなら)、テンプレートを構成する式が必要です。
1857	1829	以下は例です: C プログラムで使いやすい形で名前付き文字列値を保管するために、
1858	1830	一続きの名前と NUL 終端された ASCII 文字列を作ります;
1859	1831	名前と値の間には C<=> を置いて、最後に追加のデリミタとなる NUL バイトを
1860	1832	置きます。
1861	1833	以下のようにします:
1862	1834
1863	1835	   my $env = pack( '(A*A*Z*)' . keys( %Env ) . 'C',
1864	1836	                   map( { ( $_, '=', $Env{$_} ) } keys( %Env ) ), 0 );
1865	1837
1866	1838	=begin original
1867	1839
1868	1840	Let's examine the cogs of this byte mill, one by one. There's the C<map>
1869	1841	call, creating the items we intend to stuff into the C<$env> buffer:
1870	1842	to each key (in C<$_>) it adds the C<=> separator and the hash entry value.
1871	1843	Each triplet is packed with the template code sequence C<A*A*Z*> that
1872	1844	is repeated according to the number of keys. (Yes, that's what the C<keys>
1873	1845	function returns in scalar context.) To get the very last null byte,
1874	1846	we add a C<0> at the end of the C<pack> list, to be packed with C<C>.
1875	1847	(Attentive readers may have noticed that we could have omitted the 0.)
1876	1848
1877	1849	=end original
1878	1850
1879	1851	このバイト処理機の要素を一つ一つ調査してみましょう。
1880	1852	C<map> 呼び出しは、C<$env> バッファに入れることを想定している内容の
1881	1853	アイテムを作成します:
1882	1854	(C<$_> の) それぞれのキーについて、C<=> セパレータとハッシュエントリの値を
1883	1855	追加します。
1884	1856	それぞれの 3 つ組は、キーの数
1885	1857	(はい、これは C<keys> 関数がスカラコンテキストで返すものです。)
1886	1858	に従って繰り返されるテンプレートコードの
1887	1859	並び C<A*A*Z*> で pack されます。
1888	1860	まさに最後の NUL バイトを得るために、C<pack> リストの最後に C<C> で
1889	1861	pack するための C<0> を追加します。
1890	1862	(注意深い読者なら、この 0 は省略できることに気付いたかもしれません。)
1891	1863
1892	1864	=begin original
1893	1865
1894	1866	For the reverse operation, we'll have to determine the number of items
1895	1867	in the buffer before we can let C<unpack> rip it apart:
1896	1868
1897	1869	=end original
1898	1870
1899	1871	逆操作のために、C<unpack> に分解させる前にバッファにあるアイテムの数を
1900	1872	決定する必要があります:
1901	1873
1902	1874	my $n = $env =~ tr/\0// - 1;
1903	1875	my %env = map( split( /=/, $_ ), unpack( "(Z*)$n", $env ) );
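Putting both directions together, a round trip might look like the sketch below. The contents of C<%Env> are made up, and the keys are sorted only so that the resulting buffer is predictable. As a small deviation from the code above, C<split> is given a limit of 2 so that a value containing C<=> would survive; that is an addition of this sketch, not part of the original recipe.

```perl
use strict;
use warnings;

# Hypothetical data; sorted keys make the buffer layout predictable.
my %Env   = ( HOME => '/root', SHELL => '/bin/sh' );
my @names = sort keys %Env;

# Build the template from the number of entries: '(A*A*Z*)2C' here.
my $template = '(A*A*Z*)' . @names . 'C';
my $env = pack( $template, ( map { ( $_, '=', $Env{$_} ) } @names ), 0 );

# Reverse: one null byte per entry plus the final delimiter.
my $n    = ( $env =~ tr/\0// ) - 1;
my %copy = map { split( /=/, $_, 2 ) } unpack( "(Z*)$n", $env );

print "template: $template, entries: $n\n";
print "$_ => $copy{$_}\n" for sort keys %copy;
```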
1904	1876
1905	1877	=begin original
1906	1878
1907	1879	The C<tr> counts the null bytes. The C<unpack> call returns a list of
1908	1880	name-value pairs each of which is taken apart in the C<map> block.
1909	1881
1910	1882	=end original
1911	1883
1912	1884	C<tr> はヌルバイトを数えます。
1913	1885	C<unpack> 呼び出しは名前-値の組のリストを返し、そのそれぞれが
1914	1886	C<map> ブロックで分割されます。
1915	1887
1916	1888	=head2 Counting Repetitions
1917	1889
1918	1890	(繰り返しを数える)
1919	1891
1920	1892	=begin original
1921	1893
1922	1894	Rather than storing a sentinel at the end of a data item (or a list of items),
1923	1895	we could precede the data with a count. Again, we pack keys and values of
1924	1896	a hash, preceding each with an unsigned short length count, and up front
1925	1897	we store the number of pairs:
1926	1898
1927	1899	=end original
1928	1900
1929	1901	データアイテム(あるいはアイテムのリスト)の最後に見張りをおくのではなく、
1930	1902	データの数を先においておくこともできます。
1931	1903	再び、ハッシュのキーと値を pack します; それぞれの前には符号なし short で
1932	1904	長さが置かれ、先頭には組の数を保管します:
1933	1905
1934	1906	   my $env = pack( 'S(S/A* S/A*)*', scalar keys( %Env ), %Env );
1935	1907
1936	1908	=begin original
1937	1909
1938	1910	This simplifies the reverse operation as the number of repetitions can be
1939	1911	unpacked with the C</> code:
1940	1912
1941	1913	=end original
1942	1914
1943	1915	繰り返し数は C</> コードで unpack できるので、逆操作は単純になります:
1944	1916
1945	1917	my %env = unpack( 'S/(S/A* S/A*)', $env );
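The counted variant can be checked with a short round trip. This is a sketch with made-up hash contents; it relies on C<S> being exactly 16 bits wide, so the expected buffer length can be worked out by hand (2 bytes for the pair count, plus 2 bytes of length in front of each key and each value).

```perl
use strict;
use warnings;

# Hypothetical data.
my %Env = ( HOME => '/root', SHELL => '/bin/sh' );

# pack needs an explicit group repeat ('*'); the pair count goes up front.
my $env = pack( 'S (S/A* S/A*)*', scalar( keys %Env ), %Env );

# unpack takes the group's repeat count from the leading S via '/'.
my %copy = unpack( 'S/(S/A* S/A*)', $env );

printf "%d bytes, %d pairs\n", length($env), scalar( keys %copy );
```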
1946	1918
1947	1919	=begin original
1948	1920
1949	1921	Note that this is one of the rare cases where you cannot use the same
1950	1922	template for C<pack> and C<unpack> because C<pack> can't determine
1951	1923	a repeat count for a C<()>-group.
1952	1924
1953	1925	=end original
1954	1926
1955	1927	これは、C<pack> は C<()> グループの繰り返し数を決定できないので、
1956	1928	C<pack> と C<unpack> で同じテンプレートが使えない珍しい場合であることに
1957	1929	注意してください。
1958	1930
1959		=head2 Intel HEX
1960
1961		=begin original
1962
1963		Intel HEX is a file format for representing binary data, mostly for
1964		programming various chips, as a text file. (See
1965		L<https://en.wikipedia.org/wiki/.hex> for a detailed description, and
1966		L<https://en.wikipedia.org/wiki/SREC_(file_format)> for the Motorola
1967		S-record format, which can be unravelled using the same technique.)
1968		Each line begins with a colon (':') and is followed by a sequence of
1969		hexadecimal characters, specifying a byte count I<n> (8 bit),
1970		an address (16 bit, big endian), a record type (8 bit), I<n> data bytes
1971		and a checksum (8 bit) computed as the least significant byte of the two's
1972		complement sum of the preceding bytes. Example: C<:0300300002337A1E>.
1973
1974		=end original
1975
1976		Intel HEX は、バイナリデータをテキストファイルとして表現するための
1977		ファイル形式で、主に様々なチップへの書き込みに使われます。
1978		(詳細な記述については L<https://en.wikipedia.org/wiki/.hex> を、
1979		同じテクニックを使って展開できる Motorola S-record 形式については
1980		L<https://en.wikipedia.org/wiki/SREC_(file_format)> を参照してください。)
1981		それぞれの行はコロン (':') で始まり、バイトカウント I<n> (8 ビット)、
1982		アドレス (16 ビット、ビッグエンディアン)、レコード型 (8 ビット)、
1983		I<n> バイトのデータ、そこまでのバイト列の合計の最下位バイトの 2 の補数で
1984		表されるチェックサム (8 ビット)、からなる 16 進文字の並びが続きます。
1985		例: C<:0300300002337A1E>。
1986
1987		=begin original
1988
1989		The first step of processing such a line is the conversion, to binary,
1990		of the hexadecimal data, to obtain the four fields, while checking the
1991		checksum. No surprise here: we'll start with a simple C<pack> call to
1992		convert everything to binary:
1993
1994		=end original
1995
1996		このような行を処理するための最初のステップは、四つのフィールドを得るために
1997		16 進データをバイナリに変換して、チェックサムを調べることです。
1998		ここには驚きはありません: 全てをバイナリに変換するための単純な
1999		C<pack> 呼び出しから始めます:
2000
2001		my $binrec = pack( 'H*', substr( $hexrec, 1 ) );
2002
2003		=begin original
2004
2005		The resulting byte sequence is most convenient for checking the checksum.
2006		Don't slow your program down with a for loop adding the C<ord> values
2007		of this string's bytes - the C<unpack> code C<%> is the thing to use
2008		for computing the 8-bit sum of all bytes, which must be equal to zero:
2009
2010		=end original
2011
2012		結果のバイト並びはチェックサムを調べるのに最も便利です。
2013		この文字列のバイトの C<ord> の値を加算するループでプログラムの速度を
2014		落とすようなことをしないでください - C<unpack> の C<%> コードは
2015		全てのバイトの 8 ビットの合計を計算するためのもので、これは 0 でなければ
2016		なりません:
2017
2018		die unless unpack( "%8C*", $binrec ) == 0;
2019
2020		=begin original
2021
2022		Finally, let's get those four fields. By now, you shouldn't have any
2023		problems with the first three fields - but how can we use the byte count
2024		of the data in the first field as a length for the data field? Here
2025		the codes C<x> and C<X> come to the rescue, as they permit jumping
2026		back and forth in the string to unpack.
2027
2028		=end original
2029
2030		最後に、四つのフィールドを取り出しましょう。
2031		ここまでで、最初の三つのフィールドを取り出すのには何の問題もないはずです -
2032		しかし最初のフィールドにあるデータのバイトカウントをデータフィールドの
2033		長さに使うにはどうすればいいでしょう?
2034		ここで C<x> と C<X> が助けにやってきます; これらは unpack する文字列の中を
2035		前後に移動できるようにしてくれます。
2036
2037		   my( $addr, $type, $data ) = unpack( "x n C X4 C x3 /a", $binrec );
2038
2039		=begin original
2040
2041		Code C<x> skips a byte, since we don't need the count yet. Code C<n> takes
2042		care of the 16-bit big-endian integer address, and C<C> unpacks the
2043		record type. Being at offset 4, where the data begins, we need the count.
2044		C<X4> brings us back to square one, which is the byte at offset 0.
2045		Now we pick up the count, and zoom forth to offset 4, where we are
2046		now fully furnished to extract the exact number of data bytes, leaving
2047		the trailing checksum byte alone.
2048
2049		=end original
2050
2051		C<x> コードは、まだカウントは必要ではないので 1 バイト飛ばします。
2052		C<n> コードは 16 ビットビッグエンディアン整数アドレスを取得し、
2053		C<C> はレコード型を unpack します。
2054		データが始まる位置であるオフセット 4 に来て、カウントが必要です。
2055		C<X4> は 1 マス目、つまりオフセット 0 のバイトに戻ります。
2056		ここでカウントを取り出してオフセット 4 まで進むと、正確な数のデータバイトを
2057		取り出す準備が整い、末尾のチェックサムバイトはそのまま残されます。
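The three steps (hex to binary, checksum test, field extraction) can be strung together as in the following sketch, which uses the example record quoted earlier in this section. The variable names and the output format are choices of this sketch.

```perl
use strict;
use warnings;

# The example record from the text; the leading colon is skipped with substr.
my $hexrec = ':0300300002337A1E';
my $binrec = pack( 'H*', substr( $hexrec, 1 ) );

# All bytes, checksum included, must add up to zero modulo 256.
die "bad checksum\n" unless unpack( '%8C*', $binrec ) == 0;

# Skip the count, read address and type, go back for the count,
# move forward to offset 4 and take exactly that many data bytes.
my ( $addr, $type, $data ) = unpack( 'x n C X4 C x3 /a', $binrec );

printf "addr=%04X type=%02X data=%s\n",
    $addr, $type, uc( unpack( 'H*', $data ) );
```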
2058
2059	1931	=head1 Packing and Unpacking C Structures
2060	1932
2061	1933	(C の構造体を pack/unpack する)
2062	1934
2063	1935	=begin original
2064	1936
2065	1937	In previous sections we have seen how to pack numbers and character
2066	1938	strings. If it were not for a couple of snags we could conclude this
2067	1939	section right away with the terse remark that C structures don't
2068	1940	contain anything else, and therefore you already know all there is to it.
2069	1941	Sorry, no: read on, please.
2070	1942
2071	1943	=end original
2072	1944
2073	1945	前のセクションで、数値と文字列を pack する方法を見ました。
2074	1946	もしここに障害がないなら、「C 構造体には他に何もなく、従ってあなたは
2075		既に C 構造体を pack/unpack するための全てを知っています」
	1947	既に C 構造体を pack/unpack するための全てを知っています。」
2076	1948	という簡潔な見解と共にこの章をすぐに締めくくることができます。
2077	1949	すみません、そうではありません: どうか読み進めてください。
2078	1950
2079	1951	=begin original
2080	1952
2081	1953	If you have to deal with a lot of C structures, and don't want to
2082	1954	hack all your template strings manually, you'll probably want to have
2083	1955	a look at the CPAN module C<Convert::Binary::C>. Not only can it parse
2084	1956	your C source directly, but it also has built-in support for all the
2085	1957	odds and ends described further on in this section.
2086	1958
2087	1959	=end original
2088	1960
2089	1961	もし大量の C 構造体を扱う必要があって、全てのテンプレート文字列を手動で
2090	1962	ハックしたくないなら、おそらく一度 CPAN モジュール C<Convert::Binary::C> を
2091	1963	見たほうが良いでしょう。
2092	1964	C ソースを直接パースできるだけでなく、この章でさらに記述される全ての
2093	1965	雑務に対する組み込みのサポートがあります。
2094	1966
2095	1967	=head2 The Alignment Pit
2096	1968
2097	1969	(アライメントの落とし穴)
2098	1970
2099	1971	=begin original
2100	1972
2101	1973	In the consideration of speed against memory requirements the balance
2102	1974	has been tilted in favor of faster execution. This has influenced the
2103	1975	way C compilers allocate memory for structures: On architectures
2104	1976	where a 16-bit or 32-bit operand can be moved faster between places in
2105	1977	memory, or to or from a CPU register, if it is aligned at an even or
2106	1978	multiple-of-four or even at a multiple-of eight address, a C compiler
2107	1979	will give you this speed benefit by stuffing extra bytes into structures.
2108	1980	If you don't cross the C shoreline this is not likely to cause you any
2109	1981	grief (although you should care when you design large data structures,
2110	1982	or you want your code to be portable between architectures (you do want
2111	1983	that, don't you?)).
2112	1984
2113	1985	=end original
2114	1986
2115	1987	速度とメモリ消費のバランスは、より速く実行できる方に傾いています。
2116	1988	これは C コンパイラが構造体のためにメモリを割り当てる方法に影響します:
2117	1989	偶数、4 の倍数、あるいは 8 の倍数のアドレスにアライメントされていれば、
2118	1990	16 ビットや 32 ビットのオペランドや CPU レジスタへの出し入れが速くなる
2119	1991	アーキテクチャでは、C コンパイラは構造体に追加のバイトを入れることで
2120	1992	この速度メリットを受けるようにします。
2121	1993	もし C の海岸線を越えないのなら、これがなんらかの面倒を引き起こすことは
2122	1994	ありそうにありません (しかし、大きなデータ構造を設計したり、
2123	1995	アーキテクチャ間で移植性のあるコードがほしい場合(そうしたくないですか?)、
2124		気にするべきです)。
	1996	気にするべきです。)。
2125	1997
2126	1998	=begin original
2127	1999
2128	2000	To see how this affects C<pack> and C<unpack>, we'll compare these two
2129	2001	C structures:
2130	2002
2131	2003	=end original
2132	2004
2133	2005	これが C<pack> と C<unpack> にどのように影響を与えるかを見るために、
2134	2006	これら 2 つの C 構造体を比較してみます:
2135	2007
2136	2008	typedef struct {
2137	2009	char c1;
2138	2010	short s;
2139	2011	char c2;
2140	2012	long l;
2141	2013	} gappy_t;
2142	2014
2143	2015	typedef struct {
2144	2016	long l;
2145	2017	short s;
2146	2018	char c1;
2147	2019	char c2;
2148	2020	} dense_t;
2149	2021
2150	2022	=begin original
2151	2023
2152	2024	Typically, a C compiler allocates 12 bytes to a C<gappy_t> variable, but
2153	2025	requires only 8 bytes for a C<dense_t>. After investigating this further,
2154	2026	we can draw memory maps, showing where the extra 4 bytes are hidden:
2155	2027
2156	2028	=end original
2157	2029
2158	2030	典型的には、C コンパイラは C<gappy_t> 変数には 12 バイトを割り当てますが、
2159	2031	C<dense_t> には 8 バイトしか割り当てません。
2160	2032	これをさらに調査した後、余分な 4 バイトが隠れていることが分かる
2161	2033	メモリマップが書けます:
2162	2034
2163	2035	   0           +4          +8          +12
2164	2036	   +--+--+--+--+--+--+--+--+--+--+--+--+
2165	2037	   |c1|xx|  s  |c2|xx|xx|xx|     l     |    xx = fill byte
2166	2038	   +--+--+--+--+--+--+--+--+--+--+--+--+
2167	2039	   gappy_t
2168	2040
2169	2041	   0           +4          +8
2170	2042	   +--+--+--+--+--+--+--+--+
2171	2043	   |     l     |  s  |c1|c2|
2172	2044	   +--+--+--+--+--+--+--+--+
2173	2045	   dense_t
2174	2046
2175	2047	=begin original
2176	2048
2177	2049	And that's where the first quirk strikes: C<pack> and C<unpack>
2178	2050	templates have to be stuffed with C<x> codes to get those extra fill bytes.
2179	2051
2180	2052	=end original
2181	2053
2182	2054	そしてこれが最初の思いがけない一撃の理由です:
2183	2055	C<pack> と C<unpack> のテンプレートは、これらの余分に埋めるバイトのために
2184	2056	C<x> コードを詰める必要があります。
2185	2057
2186	2058	=begin original
2187	2059
2188	2060	The natural question: "Why can't Perl compensate for the gaps?" warrants
2189	2061	an answer. One good reason is that C compilers might provide (non-ANSI)
2190	2062	extensions permitting all sorts of fancy control over the way structures
2191	2063	are aligned, even at the level of an individual structure field. And, if
2192	2064	this were not enough, there is an insidious thing called C<union> where
2193	2065	the amount of fill bytes cannot be derived from the alignment of the next
2194	2066	item alone.
2195	2067
2196	2068	=end original
2197	2069
2198	2070	自然な質問: 「なぜ Perl は隙間を埋め合わせられないの?」には答えるのが
2199	2071	当然です。
2200	2072	一つのよい理由は、個々の構造体フィールドのレベルでさえ、構造体の
2201	2073	アライメント方法のあらゆる種類の制御方法を許している(非 ANSI の)拡張を
2202	2074	提供しているCコンパイラがあるからです。
2203	2075	そして、もしこれが十分でないなら、埋めるバイト数が次のアイテムの
2204	2076	アライメントだけでは決定されない、C<union> と呼ばれる
2205	2077	陰険なものがあります。
2206	2078
2207	2079	=begin original
2208	2080
2209	2081	OK, so let's bite the bullet. Here's one way to get the alignment right
2210	2082	by inserting template codes C<x>, which don't take a corresponding item
2211	2083	from the list:
2212	2084
2213	2085	=end original
2214	2086
2215	2087	よし、では困難に耐えましょう。
2216	2088	これは、リストから対応する要素を使わないテンプレートコード C<x> を
2217	2089	挿入することでアライメントを正しくする一つの方法です:
2218	2090
2219	2091	my $gappy = pack( 'cxs cxxx l!', $c1, $s, $c2, $l );
2220	2092
2221	2093	=begin original
2222	2094
2223	2095	Note the C<!> after C<l>: We want to make sure that we pack a long
2224	2096	integer as it is compiled by our C compiler. And even now, it will only
2225	2097	work for the platforms where the compiler aligns things as above.
2226	2098	And somebody somewhere has a platform where it doesn't.
2227	2099	[Probably a Cray, where C<short>s, C<int>s and C<long>s are all 8 bytes. :-)]
2228	2100
2229	2101	=end original
2230	2102
2231	2103	C<l> の後の C<!> に注意してください: long 整数を C コンパイラで
2232	2104	コンパイルされるのと同じ形になるのを確実にしたいです。
2233	2105	そしてこの時点でも、これはコンパイラが上述のようにアライメントする
2234	2106	プラットフォームでのみ動作します。
2235	2107	そしてどこかの誰かはそうでないプラットフォームを使っています。
2236	2108	[おそらくは、Cray です; これは C<short>, C<int>, C<long> が全て
2237	2109	8 バイトです。:-)]
2238	2110
2239	2111	=begin original
2240	2112
2241	2113	Counting bytes and watching alignments in lengthy structures is bound to
2242	2114	be a drag. Isn't there a way we can create the template with a simple
2243	2115	program? Here's a C program that does the trick:
2244	2116
2245	2117	=end original
2246	2118
2247	2119	とても長い構造体のバイト数を数えてアライメントを監視するのは面倒なことです。
2248	2120	単純なプログラムでテンプレートを作る方法はないでしょうか?
2249	2121	以下は技を使った C プログラムです:
2250	2122
2251	2123	   #include <stdio.h>
2252	2124	   #include <stddef.h>
2253	2125
2254	2126	   typedef struct {
2255	2127	     char     fc1;
2256	2128	     short    fs;
2257	2129	     char     fc2;
2258	2130	     long     fl;
2259	2131	   } gappy_t;
2260	2132
2261	2133	   #define Pt(type,field,tchar) \
2262	2134	     printf( "@%d%s ", (int)offsetof(type,field), # tchar );
2263	2135
2264	2136	   int main() {
2265	2137	     Pt( gappy_t, fc1, c  );
2266	2138	     Pt( gappy_t, fs,  s! );
2267	2139	     Pt( gappy_t, fc2, c  );
2268	2140	     Pt( gappy_t, fl,  l! );
2269	2141	     printf( "\n" );
2270	2142	   }
2271	2143
2272	2144	=begin original
2273	2145
2274	2146	The output line can be used as a template in a C<pack> or C<unpack> call:
2275	2147
2276	2148	=end original
2277	2149
2278	2150	出力行は C<pack> や C<unpack> 呼び出しのテンプレートとして使えます。
2279	2151
2280	2152	my $gappy = pack( '@0c @2s! @4c @8l!', $c1, $s, $c2, $l );
2281	2153
2282	2154	=begin original
2283	2155
2284	2156	Gee, yet another template code - as if we hadn't plenty. But
2285	2157	C<@> saves our day by enabling us to specify the offset from the beginning
2286	2158	of the pack buffer to the next item: This is just the value
2287	2159	the C<offsetof> macro (defined in C<E<lt>stddef.hE<gt>>) returns when
2288	2160	given a C<struct> type and one of its field names ("member-designator" in
2289	2161	C standardese).
2290	2162
2291	2163	=end original
2292	2164
2293	2165	げー、新しいテンプレートコードです - まだ十分ではありませんでした。
2294	2166	しかし C<@> は次のアイテムのための pack バッファの先頭からのオフセットを
2295	2167	指定できるようにすることで手間を省きます:
2296	2168	これは単に、C<struct> 型とそのフィールド名(C 標準での「メンバ指定子」)を
2297	2169	与えたときに (C<E<lt>stddef.hE<gt>> で定義されている)C<offsetof> マクロが
2298	2170	返す値です。
2299	2171
2300	2172	=begin original
2301	2173
2302	2174	Neither using offsets nor adding C<x>'s to bridge the gaps is satisfactory.
2303	2175	(Just imagine what happens if the structure changes.) What we really need
2304	2176	is a way of saying "skip as many bytes as required to the next multiple of N".
2305		In fluent templates, you say this with C<x!N> where N is replaced by the
	2177	In fluent Templatese, you say this with C<x!N> where N is replaced by the
2306	2178	appropriate value. Here's the next version of our struct packaging:
2307	2179
2308	2180	=end original
2309	2181
2310	2182	オフセットを使ったり、隙間を渡すために C<x> を追加することでは
2311	2183	十分ではありません。
2312	2184	(構造体が変更されたときに何が起こるかを単に想像してみてください。)
2313	2185	本当に必要なものは、「次の N の倍数のバイトになるまでスキップする」と
2314	2186	書く方法です。
2315	2187	雄弁なテンプレートでは、これは C<x!N> とできます (ここで N は適切な値
2316	2188	に置き換えられます)。
2317	2189	これは構造体のパッケージ化の次のバージョンです:
2318	2190
2319	2191	my $gappy = pack( 'c x!2 s c x!4 l!', $c1, $s, $c2, $l );
2320	2192
2321	2193	=begin original
2322	2194
2323	2195	That's certainly better, but we still have to know how long all the
2324	2196	integers are, and portability is far away. Rather than C<2>,
2325	2197	for instance, we want to say "however long a short is". But this can be
2326	2198	done by enclosing the appropriate pack code in brackets: C<[s]>. So, here's
2327	2199	the very best we can do:
2328	2200
2329	2201	=end original
2330	2202
2331	2203	これは確実により良いものですが、未だに全ての整数の長さを知る必要があり、
2332	2204	移植性とはかけ離れています。
2333	2205	例えば、C<2> の代わりに、「とにかく short の長さ」と書きたいです。
2334	2206	しかし、これは適切な pack コードを大かっこで囲むこと (C<[s]>) で
2335	2207	可能です。
2336	2208	それで、これはできる限り最良のものです:
2337	2209
2338	2210	my $gappy = pack( 'c x![s] s c x![l!] l!', $c1, $s, $c2, $l );
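To compare the three ways of bridging the gaps, the sketch below packs the same four values with hand-counted C<x> codes, with C<@> offsets, and with C<x!> alignment. It deliberately uses the fixed-size codes C<s> (16 bits) and C<l> (32 bits) instead of C<s!> and C<l!>, so that the buffer is 12 bytes on any platform; with the native-size codes the offsets would depend on the local C compiler. The values are arbitrary.

```perl
use strict;
use warnings;

# Arbitrary sample values for c1, s, c2 and l.
my @v = ( 1, 2, 3, 4 );

my $by_hand   = pack( 'c x s c xxx l',       @v );   # fill bytes counted by hand
my $by_offset = pack( '@0 c @2 s @4 c @8 l', @v );   # absolute offsets
my $by_align  = pack( 'c x![s] s c x![l] l', @v );   # align to the size of s and l

printf "%d %d %d bytes\n",
    length($by_hand), length($by_offset), length($by_align);
print "identical\n" if $by_hand eq $by_offset && $by_offset eq $by_align;
```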
2339	2211
2340	2212	=head2 Dealing with Endian-ness
2341	2213
2342	2214	(エンディアンを扱う)
2343	2215
2344	2216	=begin original
2345	2217
2346	2218	Now, imagine that we want to pack the data for a machine with a
2347	2219	different byte-order. First, we'll have to figure out how big the data
2348	2220	types on the target machine really are. Let's assume that the longs are
2349	2221	32 bits wide and the shorts are 16 bits wide. You can then rewrite the
2350	2222	template as:
2351	2223
2352	2224	=end original
2353	2225
2354	2226	ここで、異なるバイト順のマシンのためのデータを pack したいとします。
2355	2227	まず、ターゲットマシンでの実際のデータ型の大きさを知る必要があります。
2356	2228	long は 32 ビット幅で short が 16 ビット幅と仮定しましょう。
2357	2229	ここでテンプレートは以下のように書き換えられます:
2358	2230
2359	2231	my $gappy = pack( 'c x![s] s c x![l] l', $c1, $s, $c2, $l );
2360	2232
2361	2233	=begin original
2362	2234
2363	2235	If the target machine is little-endian, we could write:
2364	2236
2365	2237	=end original
2366	2238
2367	2239	もしターゲットマシンがリトルエンディアンなら、以下のように書けます:
2368	2240
2369	2241	my $gappy = pack( 'c x![s] s< c x![l] l<', $c1, $s, $c2, $l );
2370	2242
2371	2243	=begin original
2372	2244
2373	2245	This forces the short and the long members to be little-endian, and is
2374	2246	just fine if you don't have too many struct members. But we could also
2375	2247	use the byte-order modifier on a group and write the following:
2376	2248
2377	2249	=end original
2378	2250
2379	2251	これは short と long のメンバをリトルエンディアンに強制し、もし
2380	2252	あまり多くの構造体メンバがない場合は十分です。
2381	2253	しかし、グループにバイト順修飾子を使うことも出来、以下のように書けます:
2382	2254
2383	2255	my $gappy = pack( '( c x![s] s c x![l] l )<', $c1, $s, $c2, $l );
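A quick way to convince yourself that the per-code and the group modifier do the same thing is to dump the packed bytes in hex. The sketch below uses arbitrary values whose bytes are easy to tell apart; the big-endian variant with C<< > >> is included only for contrast. Byte-order modifiers need Perl 5.10 or later.

```perl
use strict;
use warnings;

# Arbitrary values: every byte of the short and the long is distinct.
my @v = ( 1, 0x0203, 4, 0x05060708 );

my $le_each  = pack( 'c x![s] s< c x![l] l<',    @v );
my $le_group = pack( '( c x![s] s c x![l] l )<', @v );
my $be_group = pack( '( c x![s] s c x![l] l )>', @v );

print 'little-endian: ', unpack( 'H*', $le_group ), "\n";
print 'big-endian:    ', unpack( 'H*', $be_group ), "\n";
```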
2384	2256
2385	2257	=begin original
2386	2258
2387	2259	This is not as short as before, but it makes it more obvious that we
2388	2260	intend to have little-endian byte-order for a whole group, not only
2389	2261	for individual template codes. It can also be more readable and easier
2390	2262	to maintain.
2391	2263
2392	2264	=end original
2393	2265
2394	2266	これは以前ほど短くありませんが、個々のテンプレートコードだけでなく、グループ全体に
2395	2267	リトルエンディアンのバイト順を意図していることがより明らかです。
2396	2268	これはまたより読みやすく、管理もより簡単です。
2397	2269
2398	2270	=head2 Alignment, Take 2
2399	2271
2400	2272	(アライメント、第二幕)
2401	2273
2402	2274	=begin original
2403	2275
2404	2276	I'm afraid that we're not quite through with the alignment catch yet. The
2405	2277	hydra raises another ugly head when you pack arrays of structures:
2406	2278
2407	2279	=end original
2408	2280
2409	2281	残念ながら、アライメントの罠についてはまだ話が
2410	2282	終わっていません。
2411	2283	構造体の配列を pack しようとすると、ヒドラはまた別の醜い頭をもたげてきます。
2412	2284
2413	2285	typedef struct {
2414	2286	short count;
2415	2287	char glyph;
2416	2288	} cell_t;
2417	2289
2418	2290	typedef cell_t buffer_t[BUFLEN];
2419	2291
2420	2292	=begin original
2421	2293
2422	2294	Where's the catch? Padding is neither required before the first field C<count>,
2423	2295	nor between this and the next field C<glyph>, so why can't we simply pack
2424	2296	like this:
2425	2297
2426	2298	=end original
2427	2299
2428	2300	どこに罠があるのでしょう?
2429	2301	最初のフィールド C<count> の前や、これと次のフィールド C<glyph> の間に
2430	2302	パッディングは不要です; それならなぜ以下のように簡単に
2431	2303	pack できないのでしょう:
2432	2304
2433	2305	# something goes wrong here:
2434	2306	pack( 's!a' x @buffer,
2435	2307	map{ ( $_->{count}, $_->{glyph} ) } @buffer );
2436	2308
2437	2309	=begin original
2438	2310
2439	2311	This packs C<3*@buffer> bytes, but it turns out that the size of
2440	2312	C<buffer_t> is four times C<BUFLEN>! The moral of the story is that
2441	2313	the required alignment of a structure or array is propagated to the
2442	2314	next higher level where we have to consider padding I<at the end>
2443	2315	of each component as well. Thus the correct template is:
2444	2316
2445	2317	=end original
2446	2318
2447	2319	これは C<3*@buffer> バイトに pack しますが、C<buffer_t> のサイズは
2448	2320	C<BUFLEN> の 4 倍になるのです!
2449	2321	このお話の教訓は、構造体や配列で必要なアライメントは、それぞれの要素の
2450	2322	I<最後に> パッディングを考慮する必要がある場所で、より高いレベルに
2451	2323	伝播するということです。
2452	2324	従って、正しいテンプレートは:
2453	2325
2454	2326	pack( 's!ax' x @buffer,
2455	2327	map{ ( $_->{count}, $_->{glyph} ) } @buffer );
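The effect of the trailing fill byte can be measured directly. In this sketch the contents of C<@buffer> are made up, and the size of a native short is queried at run time rather than assumed. On a platform with 2-byte shorts the first template should give 3 bytes per cell and the second 4.

```perl
use strict;
use warnings;

# Hypothetical cells.
my @buffer = (
    { count => 3, glyph => 'a' },
    { count => 1, glyph => 'b' },
    { count => 7, glyph => 'c' },
);
my @fields = map { ( $_->{count}, $_->{glyph} ) } @buffer;

my $wrong = pack( 's!a'  x @buffer, @fields );   # no padding after glyph
my $right = pack( 's!ax' x @buffer, @fields );   # one fill byte per cell

my $short = length pack( 's!', 0 );              # size of a native short
printf "short is %d bytes: %d bytes without padding, %d with\n",
    $short, length($wrong), length($right);

# The padded layout unpacks cleanly with a repeated group.
my @back = unpack( '(s!ax)*', $right );
print "@back\n";
```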
2456	2328
2457	2329	=head2 Alignment, Take 3
2458	2330
2459	2331	(アライメント、第三幕)
2460	2332
2461	2333	=begin original
2462	2334
2463	2335	And even if you take all the above into account, ANSI still lets this:
2464	2336
2465	2337	=end original
2466	2338
2467	2339	上記のことを全て頭に入れたとしても、ANSI は以下のような場合:
2468	2340
2469	2341	typedef struct {
2470	2342	char foo[2];
2471	2343	} foo_t;
2472	2344
2473	2345	=begin original
2474	2346
2475	2347	vary in size. The alignment constraint of the structure can be greater than
2476	2348	any of its elements. [And if you think that this doesn't affect anything
2477	2349	common, dismember the next cellphone that you see. Many have ARM cores, and
2478	2350	the ARM structure rules make C<sizeof (foo_t)> == 4]
2479	2351
2480	2352	=end original
2481	2353
2482	2354	サイズは様々であるとしています。
2483	2355	構造のアライメント制約は、それぞれの要素よりも大きいかもしれません。
2484	2356	[そしてもしこれが一般的には何も影響を与えないと考えているなら、
2485	2357	次に見た携帯電話を分解してみてください。
2486	2358	多くは ARM コアを使っていて、ARM 構造体ルールでは
2487	2359	C<sizeof (foo_t)> == 4 となります]
2488	2360
2489	2361	=head2 Pointers for How to Use Them
2490	2362
2491	2363	(ポインタをどう扱うかのポインタ)
2492	2364
2493	2365	=begin original
2494	2366
2495	2367	The title of this section indicates the second problem you may run into
2496	2368	sooner or later when you pack C structures. If the function you intend
2497	2369	to call expects a, say, C<void *> value, you I<cannot> simply take
2498	2370	a reference to a Perl variable. (Although that value certainly is a
2499	2371	memory address, it's not the address where the variable's contents are
2500	2372	stored.)
2501	2373
2502	2374	=end original
2503	2375
2504	2376	この章のタイトルは、C の構造体を pack するときに遅かれ早かれ出会うことになる
2505	2377	2 番目の問題を指し示しています。
2506	2378	呼び出そうとしている関数が、例えば、C<void *> の値を想定している場合、
2507	2379	単純に Perl の変数のリファレンスを使うことは I<できません>。
2508	2380	(確かに値はメモリアドレスですが、値の内容が保持されているアドレスでは
2509	2381	ないからです。)
2510	2382
2511	2383	=begin original
2512	2384
2513	2385	Template code C<P> promises to pack a "pointer to a fixed length string".
2514	2386	Isn't this what we want? Let's try:
2515	2387
2516	2388	=end original
2517	2389
2518	2390	テンプレートコード C<P> は、「固定長文字列へのポインタ」を pack することを
2519	2391	約束します。
2520	2392	これが望みのものではないですか?
2521	2393	試してみましょう:
2522	2394
2523	2395	# allocate some storage and pack a pointer to it
2524	2396	my $memory = "\x00" x $size;
2525	2397	my $memptr = pack( 'P', $memory );
2526	2398
2527	2399	=begin original
2528	2400
2529	2401	But wait: doesn't C<pack> just return a sequence of bytes? How can we pass this
2530	2402	string of bytes to some C code expecting a pointer which is, after all,
2531	2403	nothing but a number? The answer is simple: We have to obtain the numeric
2532	2404	address from the bytes returned by C<pack>.
2533	2405
2534	2406	=end original
2535	2407
2536	2408	ちょっと待った: C<pack> は単にバイトシーケンスを返すのでは?
2537	2409	どうやってこのバイトの文字列を、ポインタ、つまり結局は数値でしかないものを
2538	2410	想定している C のコードに渡せるのでしょう?
2539	2411	答えは単純です: C<pack> で返されたバイト列から数値のアドレスを得なければ
2540	2412	なりません。
2541	2413
2542	2414	my $ptr = unpack( 'L!', $memptr );
2543	2415
2544	2416	=begin original
2545	2417
2546	2418	Obviously this assumes that it is possible to typecast a pointer
2547	2419	to an unsigned long and vice versa, which frequently works but should not
2548	2420	be taken as a universal law. - Now that we have this pointer the next question
2549	2421	is: How can we put it to good use? We need a call to some C function
2550	2422	where a pointer is expected. The read(2) system call comes to mind:
2551	2423
2552	2424	=end original
2553	2425
2554	2426	明らかに、これはポインタから unsigned long への、およびその逆の型キャストが
2555	2427	可能であることを仮定しています; これはしばしば動作しますが、普遍的な
2556	2428	原則として扱うべきではありません。
2557	2429	- ここでこのポインタを得ましたが、次の質問は: これをうまく使うには
2558	2430	どうするのがよいでしょう?
2559	2431	ポインタを想定している C 関数を呼び出す必要があります。
2560	2432	read(2) システムコールが心に浮かびます:
2561	2433
2562	2434	ssize_t read(int fd, void *buf, size_t count);
2563	2435
2564	2436	=begin original
2565	2437
2566	2438	After reading L<perlfunc> explaining how to use C<syscall> we can write
2567	2439	this Perl function copying a file to standard output:
2568	2440
2569	2441	=end original
2570	2442
2571	2443	L<perlfunc> にある C<syscall> の使い方の説明を読んだ後、ファイルを
2572	2444	標準出力にコピーする Perl 関数を書けます:
2573	2445
2574		   require 'syscall.ph'; # run h2ph to generate this file
	2446	require 'syscall.ph';
2575	2447	   sub cat($){
2576	2448	       my $path = shift();
2577	2449	       my $size = -s $path;
2578	2450	       my $memory = "\x00" x $size;  # allocate some memory
2579	2451	       my $ptr = unpack( 'L!', pack( 'P', $memory ) );
2580	2452	       open( F, $path ) || die( "$path: cannot open ($!)\n" );
2581	2453	       my $fd = fileno(F);
2582	2454	       my $res = syscall( &SYS_read, $fd, $ptr, $size );
2583	2455	       print $memory;
2584	2456	       close( F );
2585	2457	   }
2586	2458
2587	2459	=begin original
2588	2460
2589	2461	This is neither a specimen of simplicity nor a paragon of portability but
2590	2462	it illustrates the point: We are able to sneak behind the scenes and
2591	2463	access Perl's otherwise well-guarded memory! (Important note: Perl's
2592	2464	C<syscall> does I<not> require you to construct pointers in this roundabout
2593	2465	way. You simply pass a string variable, and Perl forwards the address.)
2594	2466
2595	2467	=end original
2596	2468
2597	2469	これは単純さの見本でもなければ移植性の模範でもありませんが、要点を
2598	2470	示しています: 舞台裏に忍び込んで、その他の点では良く守られている Perl の
2599	2471	メモリにアクセスできます!
2600	2472	(重要な注意: Perl の C<syscall> は、この回りくどい方法でポインタを
2601	2473	構成する必要は I<ありません> 。
2602	2474	単に文字列変数を渡せば、Perl がアドレスを転送します。)
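
The note above can be sketched as follows. This is a minimal illustration, not the tutorial's own code: C<read_head> is a hypothetical helper, and the fallback to C<sysread> is an assumption added so the sketch still runs where C<syscall.ph> has not been generated with h2ph. The string buffer is handed to C<syscall> directly, with no C<pack>/C<unpack> detour.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the tutorial): read up to $size bytes
# from the start of $path into a preallocated buffer.
sub read_head {
    my ( $path, $size ) = @_;
    $size = int $size;    # syscall passes numeric arguments as C integers
    open( my $fh, '<', $path ) || die( "$path: cannot open ($!)\n" );
    binmode $fh;
    my $buf = "\x00" x $size;    # allocate the buffer up front

    # syscall.ph must be generated with h2ph; without it we use sysread.
    my $have_ph = eval { require 'syscall.ph'; defined &SYS_read };
    my $res = $have_ph
        ? syscall( &SYS_read, fileno($fh), $buf, $size )    # string passed as is
        : sysread( $fh, $buf, $size );
    close($fh);
    die( "$path: read failed ($!)\n" ) if !defined $res || $res < 0;
    return substr( $buf, 0, $res );    # keep only the bytes actually read
}
```

Note that C<syscall> treats a numeric argument as an integer and anything else as a pointer to the string value, which is why the size is forced to a number before the call.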
2603	2475
2604	2476	=begin original
2605	2477
2606	2478	How does C<unpack> with C<P> work? Imagine some pointer in the buffer
2607	2479	about to be unpacked: If it isn't the null pointer (which will smartly
2608	2480	produce the C<undef> value) we have a start address - but then what?
2609	2481	Perl has no way of knowing how long this "fixed length string" is, so
2610	2482	it's up to you to specify the actual size as an explicit length after C<P>.
2611	2483
2612	2484	=end original
2613	2485
2614	2486	C<unpack> では C<P> はどのように動作するのでしょう?
2615	2487	unpack されようとしているバッファにあるポインタを想像します:
2616	2488	もしそれが(賢く C<undef> 値を生成する)ヌルポインタでない場合、開始アドレスを
2617	2489	得ることになります - でも、それで?
2618	2490	Perl はこの「固定長文字列」の長さを知る方法がないので、C<P> の後ろに
2619	2491	明示的な長さとして実際の大きさを指定する必要があります。
2620	2492
2621	2493	my $mem = "abcdefghijklmn";
2622	2494	print unpack( 'P5', pack( 'P', $mem ) ); # prints "abcde"
2623	2495
2624	2496	=begin original
2625	2497
2626	2498	As a consequence, C<pack> ignores any number or C<*> after C<P>.
2627	2499
2628	2500	=end original
2629	2501
2630	2502	結果として、C<pack> は C<P> の後の数値や C<*> を無視します。
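
A short sketch of the length rule for C<P>, assuming nothing beyond core Perl. The variable names are illustrative. The packed pointer carries no length of its own, so every C<unpack> has to say how many bytes to fetch, and it must never ask for more than the variable actually holds.

```perl
use strict;
use warnings;

my $mem = "abcdefghijklmn";
my $ptr = pack( 'P', $mem );     # pointer-sized byte string, no length inside

# The count after P in unpack is the number of bytes to fetch.
my $first5 = unpack( 'P5', $ptr );
my $whole  = unpack( 'P' . length($mem), $ptr );

# Per the text above, a count after P in pack is ignored,
# so this packs a single pointer as well.
my $ptr9 = pack( 'P9', $mem );

# A null pointer unpacks to undef instead of dereferencing address 0.
my $null = unpack( 'P5', pack( 'P', undef ) );
```

Asking for more bytes than C<length($mem)> would read past the end of the string's storage, so the explicit length is the caller's responsibility.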
2631	2503
2632	2504	=begin original
2633	2505
2634	2506	Now that we have seen C<P> at work, we might as well give C<p> a whirl.
2635	2507	Why do we need a second template code for packing pointers at all? The
2636	2508	answer lies behind the simple fact that an C<unpack> with C<p> promises
2637	2509	a null-terminated string starting at the address taken from the buffer,
2638	2510	and that implies a length for the data item to be returned:
2639	2511
2640	2512	=end original
2641	2513
2642	2514	ここで C<P> の動作は見たので、同様に C<p> を試してみます。
2643	2515	とにかく、なぜポインタを pack するのに 2 番目のテンプレートコードが
2644	2516	必要なのでしょう?
2645	2517	答えは、
2646	2518	C<unpack> の C<p> はバッファから取った NUL 終端された文字列がその
2647	2519	アドレスから始まっていることを約束していて、返されるデータアイテムの
2648	2520	長さを暗示しているという単純な事実の後ろに横たわっています:
2649	2521
2650	2522	my $buf = pack( 'p', "abc\x00efhijklmn" );
2651	2523	print unpack( 'p', $buf ); # prints "abc"
2652	2524
2653	2525	=begin original
2654	2526
2655	2527	This is apt to be confusing, however: because the length is implied
2656	2528	by the string itself, a number after pack code C<p> is a repeat
2657	2529	count, not a length as after C<P>.
2658	2530
2659	2531	=end original
2660	2532
2661	2533	それでもこれは混乱しがちです: 長さが文字列の長さを暗示しているので、
2662	2534	C<p> の後の数値は繰り返し数であって、C<P> の後のように長さではありません。
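
The repeat-count behaviour of C<p> can be sketched like this. The strings and variable names are made up for the illustration, and C<$Config{ptrsize}> from the core Config module is used only to show the size of the packed buffer.

```perl
use strict;
use warnings;
use Config;

# Named variables keep the strings alive while the pointers are in use.
my ( $s1, $s2 ) = ( "abc", "defg\x00hidden" );

# 'p2' packs two pointers, one per argument: the 2 is a repeat count.
my $buf = pack( 'p2', $s1, $s2 );
my $pointer_bytes = length($buf);    # two pointers' worth of bytes

# Each pointer is followed up to the first NUL byte.
my @back = unpack( 'p2', $buf );     # ("abc", "defg")
```

The second string shows why no length is needed: everything after the embedded NUL is invisible to C<unpack> with C<p>.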
2663	2535
2664	2536	=begin original
2665	2537
2666	2538	Using C<pack(..., $x)> with C<P> or C<p> to get the address where C<$x> is
2667	2539	actually stored must be used with circumspection. Perl's internal machinery
2668	2540	considers the relation between a variable and that address as its very own
2669	2541	private matter and doesn't really care that we have obtained a copy. Therefore:
2670	2542
2671	2543	=end original
2672	2544
2673	2545	C<$x> が実際に保管されているアドレスを得るために C<pack(..., $x)> で
2674	2546	C<P> や C<p> を使うことは慎重に行われなければなりません。
2675	2547	Perl の内部機構は変数とそのアドレスの関係をとてもプライベートな問題と
2676	2548	考え、私たちがコピーを得たことを実際には気にしません。
2677	2549	従って:
2678	2550
2679	2551	=over 4
2680	2552
2681	2553	=item *
2682	2554
2683	2555	=begin original
2684	2556
2685	2557	Do not use C<pack> with C<p> or C<P> to obtain the address of a variable
2686	2558	that's bound to go out of scope (thereby freeing its memory) before you
2687	2559	are done using the memory at that address.
2688	2560
2689	2561	=end original
2690	2562
2691	2563	その変数のアドレスのメモリを使い終わる前にスコープから出る(従って
2692	2564	メモリが開放される)ような変数のアドレスを得るために
2693	2565	C<pack> の C<p> や C<P> を使わないでください。
2694	2566
2695	2567	=item *
2696	2568
2697	2569	=begin original
2698	2570
2699	2571	Be very careful with Perl operations that change the value of the
2700	2572	variable. Appending something to the variable, for instance, might require
2701	2573	reallocation of its storage, leaving you with a pointer into no-man's land.
2702	2574
2703	2575	=end original
2704	2576
2705	2577	変数の値を変更する Perl 操作にとても注意してください。
2706	2578	例えば、値に何かを追加すると、その保管場所を再配置することになって、
2707	2579	ポインタを誰もいないところにしたままにすることに
2708	2580	なるかもしれません。
2709	2581
2710	2582	=item *
2711	2583
2712	2584	=begin original
2713	2585
2714	2586	Don't think that you can get the address of a Perl variable
2715	2587	when it is stored as an integer or double number! C<pack('P', $x)> will
2716	2588	force the variable's internal representation to string, just as if you
2717	2589	had written something like C<$x .= ''>.
2718	2590
2719	2591	=end original
2720	2592
2721	2593	整数や倍精度実数として保管されている Perl 変数のアドレスを取れるとは
2722	2594	考えないでください!
2723	2595	C<pack('P', $x)> は、ちょうど C<$x .= ''> のようなものを書いたのと同様に、
2724	2596	変数の内部表現を文字列に強制します。
2725	2597
2726	2598	=back
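
One defensive pattern that follows from the cautions above, shown as a sketch with illustrative names: keep the variable in scope for as long as the pointer is used, and pack a fresh pointer after any operation that may have changed the variable's storage.

```perl
use strict;
use warnings;

my $x   = "hello";
my $ptr = pack( 'P', $x );    # valid only while $x keeps its current storage

# Appending may force Perl to reallocate the string buffer, so the old
# pointer must not be used after this line.
$x .= 'x' x 65536;

# Obtain a fresh pointer instead of reusing the stale one.
$ptr = pack( 'P', $x );
my $head = unpack( 'P5', $ptr );
```

Whether a particular append really moves the buffer depends on the allocator, which is exactly why the old pointer cannot be trusted.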
2727	2599
2728	2600	=begin original
2729	2601
2730	2602	It's safe, however, to P- or p-pack a string literal, because Perl simply
2731	2603	allocates an anonymous variable.
2732	2604
2733	2605	=end original
2734	2606
2735	2607	しかし、文字列リテラルを P または p で pack することは安全です;
2736	2608	なぜなら Perl は単に無名変数を割り当てるからです。
2737	2609
2738	2610	=head1 Pack Recipes
2739	2611
2740	2612	(pack レシピ)
2741	2613
2742	2614	=begin original
2743	2615
2744	2616	Here is a collection of (possibly) useful canned recipes for C<pack>
2745	2617	and C<unpack>:
2746	2618
2747	2619	=end original
2748	2620
2749	2621	以下に C<pack> と C<unpack> に関する、(多分)役に立つレシピをまとめます:
2750	2622
2751	2623	# Convert IP address for socket functions
2752	2624	pack( "C4", split /\./, "123.4.5.6" );
2753	2625
2754	2626	# Count the bits in a chunk of memory (e.g. a select vector)
2755	2627	unpack( '%32b*', $mask );
2756	2628
2757	2629	# Determine the endianness of your system
2758	2630	$is_little_endian = unpack( 'c', pack( 's', 1 ) );
2759	2631	$is_big_endian = unpack( 'xc', pack( 's', 1 ) );
2760	2632
2761	2633	# Determine the number of bits in a native integer
2762	2634	$bits = unpack( '%32b*', pack( 'I!', ~0 ) ); # pack first, then count the 1 bits
2763	2635
2764	2636	# Prepare argument for the nanosleep system call
2765	2637	my $timespec = pack( 'L!L!', $secs, $nanosecs );
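
A few of the recipes above, run on concrete data. This is a sketch: the sample address and bit positions are arbitrary, and the comparison with C<inet_aton> from the core Socket module is added only as a cross-check.

```perl
use strict;
use warnings;
use Socket qw(inet_aton);

# IP address: four unsigned bytes in network order.
my $packed = pack( 'C4', split /\./, '123.4.5.6' );
my $same   = $packed eq inet_aton('123.4.5.6');
my $dotted = join '.', unpack( 'C4', $packed );

# Bit counting: build a small bit vector and count the 1 bits.
my $mask = '';
vec( $mask, $_, 1 ) = 1 for 0, 3, 9;
my $bits_set = unpack( '%32b*', $mask );

# Endianness: exactly one of the two flags is 1.
my $is_little_endian = unpack( 'c',  pack( 's', 1 ) );
my $is_big_endian    = unpack( 'xc', pack( 's', 1 ) );
```

The endianness test works because a 16-bit 1 is stored as the bytes 01 00 on a little-endian machine and 00 01 on a big-endian one.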
2766	2638
2767	2639	=begin original
2768	2640
2769	2641	For a simple memory dump we unpack some bytes into just as
2770	2642	many pairs of hex digits, and use C<map> to handle the traditional
2771	2643	spacing - 16 bytes to a line:
2772	2644
2773	2645	=end original
2774	2646
2775	2647	単純なメモリダンプのために、バイト列を 16 進数の組に unpack し、
2776	2648	C<map> を使って伝統的な表現 - 1 行に 16 バイト - に加工します:
2777	2649
2778	2650	my $i;
2779	2651	print map( ++$i % 16 ? "$_ " : "$_\n",
2780	2652	unpack( 'H2' x length( $mem ), $mem ) ),
2781	2653	length( $mem ) % 16 ? "\n" : '';
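
The same dump can be wrapped in a function that returns the text instead of printing it, which makes it easier to reuse. C<hexdump> is a hypothetical name chosen for this sketch.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the dump above: returns the hex dump of $mem
# as a string, 16 bytes to a line.
sub hexdump {
    my ($mem) = @_;
    my $i = 0;

    # One 'H2' per byte gives one two-digit hex string per byte.
    my @cells = map( ++$i % 16 ? "$_ " : "$_\n",
        unpack( 'H2' x length($mem), $mem ) );

    # Terminate a final partial line.
    push @cells, "\n" if length($mem) % 16;
    return join '', @cells;
}

print hexdump("ABC");    # prints "41 42 43 " followed by a newline
```

Every byte except the last one on a full line is followed by a space, so a partial line ends with a trailing space before its newline.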
2782	2654
2783	2655	=head1 Funnies Section
2784	2656
2785	2657	(ネタ部門)
2786	2658
2787	2659	# Pulling digits out of nowhere...
2788	2660	print unpack( 'C', pack( 'x' ) ),
2789	2661	unpack( '%B*', pack( 'A' ) ),
2790	2662	unpack( 'H', pack( 'A' ) ),
2791	2663	unpack( 'A', unpack( 'C', pack( 'A' ) ) ), "\n";
2792	2664
2793	2665	# One for the road ;-)
2794	2666	my $advice = pack( 'all u can in a van' );
2795	2667
2796	2668	=head1 Authors
2797	2669
2798	2670	(著者)
2799	2671
	2672	=begin original
	2673
2800	2674	Simon Cozens and Wolfgang Laun.
2801	2675
	2676	=end original
	2677
	2678	Simon Cozens と Wolfgang Laun。
	2679
2802	2680	=begin meta
2803	2681
2804		Translate: ~~SHIRA~~K~~ATA K~~entaro <argrath@ub32.org> (5.8.8-)
	2682	Translate: Kentaro Shirakata <argrath@ub32.org> (5.8.8-)
2805		Status: completed
2806	2683
2807	2684	=end meta
