NAME

       XML::Handler::YAWriter - Yet another Perl SAX XML Writer

SYNOPSIS

         use XML::Parser::PerlSAX;
         use XML::Handler::YAWriter;

         my $ya = XML::Handler::YAWriter->new( %options );
         my $perlsax = XML::Parser::PerlSAX->new( 'Handler' => $ya );

DESCRIPTION

       YAWriter implements Yet Another XML::Handler::Writer. I wrote it
       because I needed a flexible escaping technique and wanted some kind of
       pretty printing. If an instance of YAWriter is created without any
       options, the default behavior is to produce an array of strings
       containing the XML in:

         @{$ya->{Strings}}

   Options
       Options are given in the usual 'key' => 'value' idiom.

       Output IO::File
           This option tells YAWriter to use an already open file for output,
           instead of using $ya->{Strings} to store the array of strings. The
           only method the object needs to implement is print, so anything
           with a print method can receive the stream of strings from
           YAWriter.

       AsFile string
           This option will cause start_document to open the named file and
           end_document to close it. Use the literal dash "-" if you want to
           print to standard output.

       AsPipe string
           This option will cause start_document to open a pipe and
           end_document to close it. The pipe is a normal shell command.
           Secure shell comes in handy but has a 2GB limit on most systems.

       AsArray boolean
           This option will force storage of the XML in $ya->{Strings}, even
           if the Output option is given.

       AsString boolean
           This option will cause end_document to return the complete XML
           document in a single string. Most SAX drivers return the value of
           end_document as a result of their parse method. As this may not
           work with some combinations of SAX drivers and filters, a join of
           $ya->{Strings} in the controlling method is preferred.

       Encoding string
           This changes the encoding from the default UTF-8 to anything you
           like. Make sure the data you pass in is already in this encoding,
           or provide an Escape hash to tell YAWriter how to recode it.

       Escape hash
           The Escape hash defines substitutions that have to be done to any
           string, with the exception of the processing_instruction and
           doctype_decl methods, where I think that escaping of target and
           data would cause more trouble than necessary.

           The default value for Escape is

               $XML::Handler::YAWriter::escape = {
                       '&'  => '&amp;',
                       '<'  => '&lt;',
                       '>'  => '&gt;',
                       '"'  => '&quot;',
                       '--' => '&#45;&#45;'
                       };

           YAWriter will use an evaluated sub to make the recoding based on a
           given Escape hash reasonably fast. Future versions may use XS to
           improve this performance bottleneck.

       Pretty hash
           Hash of string => boolean pairs that selects the kinds of pretty
           printing to apply. Defaults to undef. Possible keys:

           AddHiddenNewline boolean
               Add hidden newline before ">"

           AddHiddenAttrTab boolean
               Add hidden tabulation for attributes

           CatchEmptyElement boolean
               Catch empty Elements, apply "/>" compression

           CatchWhiteSpace boolean
               Catch whitespace with comments

           CompactAttrIndent
               Places Attributes on the same line as the Element

           IsSGML boolean
               This option will cause start_document, processing_instruction
               and doctype_decl to appear as SGML. The SGML is still well-
               formed of course, if your SAX events are well-formed.

           NoComments boolean
               Suppress Comments

           NoDTD boolean
               Suppress DTD

           NoPI boolean
               Suppress Processing Instructions

           NoProlog boolean
               Suppress <?xml ... ?> Prolog

           NoWhiteSpace boolean
               Suppress WhiteSpace to clean documents of prior pretty
               printing.

           PrettyWhiteIndent boolean
               Add visible indent before any eventstring

           PrettyWhiteNewline boolean
               Add visible newlines before any eventstring

           SAX1 boolean (not yet implemented)
               Output only SAX1 compliant eventstrings

   Notes:
       Correct handling of start_document and end_document is required!

       The YAWriter object initialises its structures during start_document
       and does its cleanup during end_document. If you forget to call
       start_document, any other method will break during the run. The most
       likely place is the encode method, which will try to eval undef as a
       subroutine. If you forget to call end_document, you should not use a
       single instance of YAWriter more than once.

       For small documents AsArray may be the fastest method and AsString the
       easiest one to receive the output of YAWriter. But AsString and AsArray
       keep the whole document in memory, so they may run out of memory with
       very large or unbounded SAX streams. The only method YAWriter calls on
       a given Output object is the print method, so it is easy to use a
       self-written Output object to improve streaming.
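
       As a minimal sketch of such a self-written Output object, the class
       below implements nothing but print. The class name ByteCounter and its
       fields are hypothetical and not part of YAWriter; the sink is driven by
       hand here so the example runs without the module installed.

```perl
use strict;
use warnings;

# Hypothetical sink class (not part of YAWriter). It implements only
# print, the single method the Output option is documented to need.
package ByteCounter;

sub new {
    my ($class) = @_;
    return bless { bytes => 0, chunks => 0 }, $class;
}

# Accept any number of strings per call and keep only running totals,
# so memory use stays constant however long the stream is.
sub print {
    my ($self, @strings) = @_;
    for my $s (@strings) {
        $self->{bytes}  += length $s;
        $self->{chunks} += 1;
    }
    return 1;
}

package main;

my $sink = ByteCounter->new;

# With the module installed, the sink would be handed over like this:
#   my $ya = XML::Handler::YAWriter->new( 'Output' => $sink );
# Here print is called by hand to show the contract.
$sink->print('<doc>', 'text');
$sink->print('</doc>');

printf "%d chunks, %d bytes\n", $sink->{chunks}, $sink->{bytes};
```

       Because the sink keeps counters instead of the strings themselves, it
       does not grow with the size of the document.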

       A single instance of XML::Handler::YAWriter is able to produce more
       than one file in a single run. Be sure to provide a fresh IO::File as
       Output before you call start_document, and close this file after
       calling end_document. Alternatively, provide a filename in AsFile, so
       that start_document and end_document open and close the filehandle
       themselves.

       Automatic recoding between 8bit and 16bit does not work correctly in
       any Perl!

       I have Perl-5.00563 at home and here I can specify "use utf8;" in the
       right places to make recoding work. But I dislike saying "use 5.00555;"
       because many systems run 5.00503.

       If you use some 8bit character set internally and want to use national
       characters, either set Encoding to your character set, for example
       ISO-8859-1, or provide an Escape hash similar to the following:

           $ya->{'Escape'} = {
                           '&'  => '&amp;',
                           '<'  => '&lt;',
                           '>'  => '&gt;',
                           '"'  => '&quot;',
                           '--' => '&#45;&#45;',
                           'oe' => '&ouml;',
                           'ae' => '&auml;',
                           'ue' => '&uuml;',
                           'Oe' => '&Ouml;',
                           'Ae' => '&Auml;',
                           'Ue' => '&Uuml;',
                           'ss' => '&szlig;',
                           };
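
       To illustrate how an Escape hash can drive the substitution, here is a
       rough sketch that turns such a table into a single-pass encoder. It is
       not the module's own code: it uses a closure over a precompiled regex
       rather than the evaluated sub that YAWriter builds, and the name
       make_encoder is made up for this example.

```perl
use strict;
use warnings;

# The documented default escape table.
my %escape = (
    '&'  => '&amp;',
    '<'  => '&lt;',
    '>'  => '&gt;',
    '"'  => '&quot;',
    '--' => '&#45;&#45;',
);

# Hypothetical helper: build one alternation from all keys and return a
# closure that applies it. Longer keys are sorted first so that a
# multi-character key wins over any single-character prefix.
sub make_encoder {
    my ($table) = @_;
    my $pattern = join '|',
        map  { quotemeta($_) }
        sort { length($b) <=> length($a) || $a cmp $b }
        keys %$table;
    my $re = qr/($pattern)/;
    return sub {
        my ($text) = @_;
        # One pass over the input, so an inserted "&amp;" is never
        # escaped a second time.
        $text =~ s/$re/$table->{$1}/g;
        return $text;
    };
}

my $encode = make_encoder(\%escape);
print $encode->('a < b && "c" -- d'), "\n";
```

       Doing all replacements in one substitution avoids the ordering problem
       of escaping "&" after other entities have already been inserted.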

       You may abuse YAWriter to clean whitespace from XML documents. Take a
       look at test.pl, doing just that with an XML::Edifact message, without
       querying the DTD. This may work in 99% of the cases where you want to
       get rid of ignorable whitespace caused by the various forms of pretty
       printing.

           my $ya = XML::Handler::YAWriter->new(
               'Output' => IO::File->new( ">-" ),
               'Pretty' => {
                   'NoWhiteSpace'=>1,
                   'NoComments'=>1,
                   'AddHiddenNewline'=>1,
                   'AddHiddenAttrTab'=>1,
               } );

       YAWriter implements every method XML::Parser::PerlSAX wants. This goes
       beyond the Java SAX 1.0 specification. I have in mind using
       Pretty=>SAX1=>1 to disable this extension when YAWriter is abused as a
       SAX proxy.

AUTHOR

       Michael Koehne, Kraehe@Copyleft.De

THANKS

       "Derksen, Eduard (Enno), CSCIO" <enno@att.com> helped me with the
       Escape hash and gave quite a lot of useful comments.

SEE ALSO

       perl and XML::Parser::PerlSAX


