NAME

       XML::Easy::Text - XML parsing and serialisation

SYNOPSIS

               use XML::Easy::Text qw(
                       xml10_read_content_object xml10_read_element
                       xml10_read_document xml10_read_extparsedent_object
               );

               $content = xml10_read_content_object($text);
               $element = xml10_read_element($text);
               $element = xml10_read_document($text);
               $content = xml10_read_extparsedent_object($text);

               use XML::Easy::Text qw(
                       xml10_write_content xml10_write_element
                       xml10_write_document xml10_write_extparsedent
               );

               $text = xml10_write_content($content);
               $text = xml10_write_element($element);
               $text = xml10_write_document($element, "UTF-8");
               $text = xml10_write_extparsedent($content, "UTF-8");

DESCRIPTION

       This module supplies functions that parse and serialise XML data
       according to the XML 1.0 specification.

       This module is oriented towards the use of XML to represent data for
       interchange purposes, rather than the use of XML as markup of
       principally textual data.  It does not perform any schema processing,
       and does not interpret DTDs or any other kind of schema.  It adheres
       strictly to the XML specification, in all its awkward details, except
       for the aforementioned DTDs.

       XML data in memory is represented using a tree of XML::Easy::Content
       and XML::Easy::Element objects.  Such a tree encapsulates all the
       structure and data content of an XML element or document, without any
       irrelevant detail resulting from the textual syntax.  These node trees
       are readily manipulated by the functions in XML::Easy::NodeBasics.

       The functions of this module are implemented in C for performance, with
       a pure Perl backup version (which has good performance compared to
       other pure Perl parsers) for systems that can't handle XS modules.

FUNCTIONS

       All functions "die" on error.

   Parsing
       These functions take textual XML and extract the abstract XML content.
       In the terminology of the XML specification, they constitute a
       non-validating processor: they check for well-formedness of the XML,
       but not for adherence of the content to any schema.

       The inputs (to be parsed) for these functions are always character
       strings.  XML text is frequently encoded using UTF-8, or some other
       Unicode encoding, so that it can contain characters from the full
       Unicode repertoire.  In that case, something must perform UTF-8
       decoding (or decoding of some other character encoding) to convert the
       octets of a file to the characters on which these functions operate.  A
       Perl I/O layer can do the job (see perlio), or it can be performed
       explicitly using the "decode" function in the Encode module.
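
       As a minimal sketch of the explicit approach, assuming the input
       arrives as UTF-8 octets: the sample document and the variable names
       below are illustrative only, and the accessor methods used are those
       of XML::Easy::Element.

```perl
use strict;
use warnings;
use Encode qw(decode);
use XML::Easy::Text qw(xml10_read_document);

# UTF-8 octets, as they might be read from a file in binary mode.
# Illustrative sample: "\xc3\xa9" is the UTF-8 encoding of U+00E9.
my $octets = "<greeting lang=\"fr\">caf\xc3\xa9</greeting>";

# Decode the octets to a character string before parsing.
my $text = decode("UTF-8", $octets);

# Parse the character string; this dies if the text is not well-formed.
my $root = xml10_read_document($text);

# The content is held as characters, not octets.
my $twine = $root->content_twine;
printf "%s has %d characters of content\n",
    $root->type_name, length($twine->[0]);
```

       Opening the file with a ":encoding(UTF-8)" I/O layer would achieve
       the same decoding without the explicit "decode" call.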

       xml10_read_content_object(TEXT)
           TEXT must be a character string.  It is parsed against the content
           production of the XML 1.0 grammar; i.e., as a sequence of the kind
           of matter that can appear between the start-tag and end-tag of an
           element.  Returns a reference to an XML::Easy::Content object.

           Normally one would not want to use this function directly, but
           prefer the higher-level "xml10_read_document" function.  This
           function exists for the construction of custom XML parsers in
           situations that don't match the full XML grammar.

       xml10_read_content_twine(TEXT)
           Performs the same parsing job as "xml10_read_content_object", but
           returns the resulting content chunk in the form of twine (see
           "Twine" in XML::Easy::NodeBasics) rather than a content object.

           The returned array must not be subsequently modified.  If possible,
           it will be marked as read-only in order to prevent modification.
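
           The shape of a twine array can be sketched as follows.  This
           assumes the layout described in XML::Easy::NodeBasics (strings
           alternating with element objects, beginning and ending with a
           string); the input text is an illustrative sample.

```perl
use strict;
use warnings;
use XML::Easy::Text qw(xml10_read_content_twine);

# Mixed content: character data, one empty element, more character data.
my $twine = xml10_read_content_twine("one<two/>three");

# Read the array without modifying it: plain strings are character
# data, references are XML::Easy::Element objects.
my @strings  = grep { !ref($_) } @$twine;
my @elements = grep { ref($_) } @$twine;

print "strings: @strings\n";
print "element types: ", join(", ", map { $_->type_name } @elements), "\n";
```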

       xml10_read_content(TEXT)
           Deprecated alias for "xml10_read_content_twine".

       xml10_read_element(TEXT)
           TEXT must be a character string.  It is parsed against the element
           production of the XML 1.0 grammar; i.e., as an item bracketed by
           tags and containing content that may recursively include other
           elements.  Returns a reference to an XML::Easy::Element object.

           Normally one would not want to use this function directly, but
           prefer the higher-level "xml10_read_document" function.  This
           function exists for the construction of custom XML parsers in
           situations that don't match the full XML grammar.

       xml10_read_document(TEXT)
           TEXT must be a character string.  It is parsed against the document
           production of the XML 1.0 grammar; i.e., as a root element
           (possibly containing subelements) optionally preceded and followed
           by non-content matter, possibly headed by an XML declaration.  (A
           document type declaration is not accepted; this module does not
           process schemata.)  Returns a reference to an XML::Easy::Element
           object which represents the root element.  Nothing is returned
           relating to the XML declaration or other non-content matter.

           This is the most likely function to use to process incoming XML
           data.  Beware that the encoding declaration in the XML declaration,
           if any, does not affect the interpretation of the input as a
           sequence of characters.

       xml10_read_extparsedent_object(TEXT)
           TEXT must be a character string.  It is parsed against the
           extParsedEnt production of the XML 1.0 grammar; i.e., as a sequence
           of content (containing character data and subelements), possibly
           headed by a text declaration (which is similar to, but not the same
           as, an XML declaration).  Returns a reference to an
           XML::Easy::Content object.

           This is a relatively obscure part of the XML grammar, used when a
           subpart of a document is stored in a separate file.  You're more
           likely to require the "xml10_read_document" function.

       xml10_read_extparsedent_twine(TEXT)
           Performs the same parsing job as "xml10_read_extparsedent_object",
           but returns the resulting content chunk in the form of twine (see
           "Twine" in XML::Easy::NodeBasics) rather than a content object.

           The returned array must not be subsequently modified.  If possible,
           it will be marked as read-only in order to prevent modification.

       xml10_read_extparsedent(TEXT)
           Deprecated alias for "xml10_read_extparsedent_twine".

   Serialisation
       These functions take abstract XML data and serialise it as textual XML.
       They do not perform indentation, default attribute suppression, or any
       other schema-dependent processing.

       The outputs of these functions are always character strings.  XML text
       is frequently encoded using UTF-8, or some other Unicode encoding, so
       that it can contain characters from the full Unicode repertoire.  In
       that case, something must perform UTF-8 encoding (or encoding in some
       other character encoding) to convert the characters generated by these
       functions to the octets of a file.  A Perl I/O layer can do the job
       (see perlio), or it can be performed explicitly using the "encode"
       function in the Encode module.
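
       A minimal sketch of the explicit approach, assuming UTF-8 output is
       wanted.  The element is built with "xml_element" from
       XML::Easy::NodeBasics; the element name, attribute, and variable
       names are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(encode);
use XML::Easy::NodeBasics qw(xml_element);
use XML::Easy::Text qw(xml10_write_document);

# Build a small element containing a non-ASCII character (U+00E9).
my $root = xml_element("greeting", { lang => "fr" }, "caf\x{e9}");

# Serialise to a character string.  Naming the encoding only adds an
# encoding declaration; it does not encode the returned string.
my $text = xml10_write_document($root, "UTF-8");

# Encode the characters to octets, ready to be written to a file
# opened in binary mode.
my $octets = encode("UTF-8", $text);
```

       The encoding named in the declaration and the encoding actually
       applied to the string are chosen independently, so it is up to the
       caller to keep them consistent.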

       xml10_write_content(CONTENT)
           CONTENT must be a reference to either an XML::Easy::Content object
           or a twine array (see "Twine" in XML::Easy::NodeBasics).  The XML
           1.0 textual representation of that content is returned.
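
           As a small illustration, a twine array holding a single string
           can be passed directly.  The sample string is illustrative; the
           sketch relies only on the serialised text parsing back to the
           same content.

```perl
use strict;
use warnings;
use XML::Easy::Text qw(xml10_write_content xml10_read_content_twine);

# A twine array containing just one string of character data.
my $text = xml10_write_content(["a & b"]);

# The serialised form differs from the raw string because the
# ampersand cannot appear literally in XML character data.
print "serialised: $text\n";

# Parsing the serialised text recovers the original characters.
my $parsed = xml10_read_content_twine($text);
print "parsed: $parsed->[0]\n";
```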

       xml10_write_element(ELEMENT)
           ELEMENT must be a reference to an XML::Easy::Element object.  The
           XML 1.0 textual representation of that element is returned.

       xml10_write_document(ELEMENT[, ENCODING])
           ELEMENT must be a reference to an XML::Easy::Element object.  The
           XML 1.0 textual form of a document with that element as the root
           element is returned.  The document includes an XML declaration.  If
           ENCODING is supplied, it must be a valid character encoding name,
           and the XML declaration specifies it in an encoding declaration.
           (The returned string consists of unencoded characters regardless of
           the encoding specified.)

       xml10_write_extparsedent(CONTENT[, ENCODING])
           CONTENT must be a reference to either an XML::Easy::Content object
           or a twine array (see "Twine" in XML::Easy::NodeBasics).  The XML
           1.0 textual form of an external parsed entity encapsulating that
           content is returned.  If ENCODING is supplied, it must be a valid
           character encoding name, and the returned entity includes a text
           declaration that specifies the encoding name in an encoding
           declaration.  (The returned string consists of unencoded characters
           regardless of the encoding specified.)

SEE ALSO

       XML::Easy::NodeBasics, XML::Easy::Syntax,
       <http://www.w3.org/TR/REC-xml/>

AUTHOR

       Andrew Main (Zefram) <zefram@fysh.org>

COPYRIGHT

       Copyright (C) 2008, 2009 PhotoBox Ltd

       Copyright (C) 2009, 2010, 2011 Andrew Main (Zefram) <zefram@fysh.org>

LICENSE

       This module is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.


