
       XML::RSSLite - lightweight, "relaxed" RSS (and XML-ish) parser


         use XML::RSSLite;

         # . . . obtain the feed text in $content . . .

         parseRSS(\%result, \$content);

         print "=== Channel ===\n",
               "Title: $result{'title'}\n",
               "Desc:  $result{'description'}\n",
               "Link:  $result{'link'}\n\n";

         # Each item is a hash reference with the same three basic keys.
         foreach $item (@{$result{'item'}}) {
           print "  --- Item ---\n",
                 "  Title: $item->{'title'}\n",
                 "  Desc:  $item->{'description'}\n",
                 "  Link:  $item->{'link'}\n\n";
         }



       This module attempts to extract the maximum amount of content from
       available documents, and is less concerned with XML compliance than
       alternatives. Rather than rely on XML::Parser, it uses heuristics and
       good old-fashioned Perl regular expressions. It stores the data in a
       simple hash structure, and "aliases" certain tags so that when done,
       you can count on having the minimal data necessary for re-constructing
       a valid RSS file. This means you get the basic title, description, and
       link for a channel and its items.

       This module extracts more usable links by parsing "scriptingNews" and
       "weblog" formats in addition to RDF & RSS. It also "sanitizes" the
       output for best results. The munging includes:

       Remove HTML tags to leave plain text
       Remove characters other than 0-9~!@#$%^&*()-+=a-zA-Z[];',.:"<>?\s
       Remove leading whitespace from URIs
       Use <url> tags when <link> is empty
       Use misplaced URLs in <title> when <link> is empty
       Extract links from <a href=...> if required
       Limit links to ftp and http(s)
       Join relative item URLs (beginning with / or #) to the site base
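
       A few of the munging steps above can be sketched in plain Perl. This
       is only an illustration under stated assumptions, not the module's
       own code: the helper names clean_text and clean_link are hypothetical,
       and the way relative links are joined to the base (scheme and host
       for "/", the whole base for "#") is a guess at reasonable behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce a text field to plain text.
sub clean_text {
    my ($text) = @_;
    $text =~ s/<[^>]*>//g;    # remove HTML tags
    # remove characters outside the documented whitelist
    $text =~ s/[^0-9~!\@#\$%^&*()\-+=a-zA-Z\[\];',.:"<>?\s]//g;
    return $text;
}

# Hypothetical helper: tidy a link and resolve it against $base.
# Returns undef for anything that is not ftp or http(s).
sub clean_link {
    my ($link, $base) = @_;
    $link =~ s/^\s+//;                        # remove leading whitespace
    if ($link =~ m{^/}) {
        # assumption: "/path" is joined to the scheme and host of the base
        (my $root = $base) =~ s{^(\w+://[^/]+).*}{$1}s;
        $link = $root . $link;
    }
    elsif ($link =~ /^#/) {
        # assumption: "#fragment" is appended to the base as given
        $link = $base . $link;
    }
    return $link =~ m{^(?:ftp|https?)://}i ? $link : undef;
}

my $base = 'http://example.com/blog/';
print clean_text('<b>Hello</b> {World}|'), "\n";
print clean_link('  /news/1.html', $base), "\n";
print clean_link('#top', $base), "\n";
print defined clean_link('javascript:alert(1)', $base) ? "kept\n" : "dropped\n";
```

       Note that the whitelist contains no "/", so in this sketch it is
       applied to text fields only and never to links.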

       parseRSS($outHashRef, $inScalarRef)
           $inScalarRef is a reference to a scalar containing the document to
           be parsed; its contents will effectively be destroyed. $outHashRef
           is a reference to the hash within which to store the parsed
           content.
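
       Because the input scalar is consumed during parsing, a caller that
       still needs the raw document can pass a copy instead. The following
       is a minimal sketch assuming XML::RSSLite is installed; the inline
       feed and the variable names are made up for illustration.

```perl
use strict;
use warnings;
use XML::RSSLite;

# Hypothetical inline feed; real input would come from a file or a fetch.
my $content = <<'RSS';
<rss version="0.91">
  <channel>
    <title>Example Feed</title>
    <description>An illustrative channel</description>
    <link>http://example.com/</link>
    <item>
      <title>First item</title>
      <link>http://example.com/1</link>
    </item>
  </channel>
</rss>
RSS

# parseRSS consumes the scalar it is given, so hand it a copy.
my $scratch = $content;
my %result;
parseRSS(\%result, \$scratch);

my $title = defined $result{'title'} ? $result{'title'} : '(none)';
print "Channel title: $title\n";
print "Original still holds ", length($content), " characters\n";
```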

       parseXML(\%parsedTree, \$parseThis, 'topTag', $comments);
           parsedTree - required
               Reference to hash to store the parsed document within.

           parseThis  - required
               Reference to scalar containing the document to parse.

           topTag     - optional
               Tag to consider the root node; leaving this undefined is not
               recommended.

           comments   - optional
               false will remove comments from parseThis
               true will not remove comments from parseThis
               array reference is true, comments are stored here

       This is not a conforming parser. It does not handle the following:


             <foo bar=">">


             <foo><bar> <bar></bar> <bar></bar> </bar></foo>


             <![CDATA[ ]]>
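
       The first case above shows why regular-expression parsing struggles
       with such input. The sketch below uses a deliberately naive tag
       pattern, which is an assumption for illustration and not the
       module's own regex, to show a quoted ">" ending the tag too early.

```perl
use strict;
use warnings;

# Naive tag pattern (illustrative only): the attribute text is taken to
# run up to the first ">", whether or not that ">" sits inside quotes.
my $naive = qr/<(\w+)([^>]*)>/;

my $doc = '<foo bar=">">text</foo>';
my ($tag, $attrs, $rest) = ('', '', '');
if ($doc =~ /^$naive(.*)$/s) {
    ($tag, $attrs, $rest) = ($1, $2, $3);
}

print "tag:   [$tag]\n";
print "attrs: [$attrs]\n";    # cut short at the quoted ">"
print "rest:  [$rest]\n";     # the tail of the attribute leaks into the content
```

       A conforming parser tracks quoting state while scanning a tag, which
       a single character class such as [^>]* cannot do.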



       It's non-validating, without a DTD the following cannot be properly

           This may or may not be arriving in some future release.


       perl(1), "XML::RSS", "XML::SAX::PurePerl", "XML::Parser::Lite",


       Jerrad Pierce

       Scott Thomason


       Portions Copyright (c) 2002,2003,2009 Jerrad Pierce, (c) 2000 Scott
       Thomason.  All rights reserved. This program is free software; you can
       redistribute it and/or modify it under the same terms as Perl itself.
