=encoding euc-jp

=head1 NAME

=begin original

HTML::PullParser - Alternative HTML::Parser interface

=end original

HTML::PullParser - 代替 HTML::Parser インタフェース

(訳注: (TBR)がついている段落は「みんなの自動翻訳＠TexTra」による
機械翻訳です。)

=head1 SYNOPSIS

 use HTML::PullParser;

 $p = HTML::PullParser->new(file => "index.html",
                            start => 'event, tagname, @attr',
                            end   => 'event, tagname',
                            ignore_elements => [qw(script style)],
                           ) || die "Can't open: $!";
 while (my $token = $p->get_token) {
     #...do something with $token
 }

=head1 DESCRIPTION

=begin original

The HTML::PullParser is an alternative interface to the HTML::Parser class.
It basically turns the HTML::Parser inside out.  You associate a file
(or any IO::Handle object or string) with the parser at construction time and
then repeatedly call $parser->get_token to obtain the tags and text
found in the parsed document.

=end original

HTML::PullParserは、HTML::Parserクラスの代替インタフェースです。
これは基本的にHTML::Parserを裏返しにします。
ファイル(または任意のIO::Handleオブジェクトまたは文字列)を構築時にパーサーに関連付け、$parser->get_tokenを繰り返し呼び出して、解析された文書内で見つかったタグとテキストを取得します。
(TBR)

=begin original

The following methods are provided:

=end original

次のメソッドが用意されています。
(TBR)

=over 4

=item $p = HTML::PullParser->new( file => $file, %options )

=item $p = HTML::PullParser->new( doc => \$doc, %options )

=begin original

A C<HTML::PullParser> can be made to parse from either a file or a
literal document based on whether the C<file> or C<doc> option is
passed to the parser's constructor.

=end original

C<HTML::PullParser>は、C<file>またはC<doc>オプションがパーサーのコンストラクタに渡されたかどうかに基づいて、ファイルまたはリテラル文書のいずれかから解析するように設定できます。
(TBR)

=begin original

The C<file> passed in can either be a file name or a file handle
object.  If a file name is passed, and it can't be opened for reading,
then the constructor will return an undefined value and $!  will tell
you why it failed.  Otherwise the argument is taken to be some object
that the C<HTML::PullParser> can read() from when it needs more data.
The stream will be read() until EOF, but not closed.

=end original

渡されるC<file>は、ファイル名またはファイルハンドルオブジェクトのいずれかです。
ファイル名が渡され、読み込み用に開くことができない場合、コンストラクタは未定義の値を返し、$!は失敗した理由を示します。
それ以外の場合、引数は、C<HTML::PullParser>がさらにデータを必要とするときにread()できるオブジェクトと見なされます。
ストリームはEOFまでread()されますが、閉じられません。
(TBR)

=begin original

A C<doc> can be passed plain or as a reference
to a scalar.  If a reference is passed then the value of this scalar
should not be changed before all tokens have been extracted.

=end original

C<doc>は、プレーンまたはスカラーへの参照として渡すことができます。
参照が渡される場合、このスカラーの値は、すべてのトークンが抽出される前に変更されるべきではありません。
(TBR)

=begin original

Next the information to be returned for the different token types must
be set up.  This is done by simply associating an argspec (as defined
in L<HTML::Parser>) with the events you have an interest in.  For
instance, if you want C<start> tokens to be reported as the string
C<'S'> followed by the tagname and the attributes you might pass an
C<start>-option like this:

=end original

次に、異なるトークンタイプに対して返される情報を設定する必要があります。
これは、(L<HTML::Parser>で定義されているように)argspecを目的のイベントに関連付けるだけで実行されます。
たとえば、C<start>トークンを文字列C<'S'>の後にタグ名と属性を付けて報告する場合は、次のようにC<start>-オプションを渡します。
(TBR)

   $p = HTML::PullParser->new(
          doc   => $document_to_parse,
          start => '"S", tagname, @attr',
          end   => '"E", tagname',
        );

=begin original

At last other C<HTML::Parser> options, like C<ignore_tags>, and
C<unbroken_text>, can be passed in.  Note that you should not use the
I<event>_h options to set up parser handlers.  That would confuse the
inner logic of C<HTML::PullParser>.

=end original

最後に、C<ignore_tags>やC<unbroken_text>などの他のC<HTML::Parser>オプションを渡すことができます。
I<event>_hオプションを使用してパーサーハンドラを設定しないでください。
これはC<HTML::PullParser>の内部ロジックを混乱させます。
(TBR)

=item $token = $p->get_token

=begin original

This method will return the next I<token> found in the HTML document,
or C<undef> at the end of the document.  The token is returned as an
array reference.  The content of this array match the argspec set up
during C<HTML::PullParser> construction.

=end original

このメソッドは、文書内で見つかった次のI<token>、または文書の最後のC<undef>を返します。
トークンは配列参照として返されます。
この配列の内容は、C<HTML::PullParser>の構築時に設定されたargspecと一致します。
(TBR)

=item $p->unget_token( @tokens )

=begin original

If you find out you have read too many tokens you can push them back,
so that they are returned again the next time $p->get_token is called.

=end original

読み取ったトークンが多すぎることがわかった場合は、それらをプッシュバックして、次に$p->get_tokenが呼び出されたときに再び返されるようにすることができます。
(TBR)

=back

=head1 EXAMPLES

=begin original

The 'eg/hform' script shows how we might parse the form section of
HTML::Documents using HTML::PullParser.

=end original

'eg/hform'スクリプトは、HTML::PullParserを使用してHTML::Documentsのフォームセクションを解析する方法を示しています。
(TBR)

=head1 SEE ALSO

L<HTML::Parser>, L<HTML::TokeParser>

=head1 COPYRIGHT

Copyright 1998-2001 Gisle Aas.

This library is free software; you can redistribute it and/or
modify it under the same terms as Perl itself.

=begin meta

Translate: SHIRAKATA Kentaro <argrath@ub32.org>
Status: in progress

=end meta

=cut