685 <p> |
686 The <code>letters</code> relation stays rock steady and <code>relpipe-tr-cut 'numbers'</code> does not affect it in any way. |
687 </p> |
688 |
689 |
|
690 <h2>Read an Atom feed using XQuery and relpipe-in-xml</h2> |
|
691 |
|
692 <p> |
|
693 The Atom Syndication Format is a standard for publishing web feeds, a.k.a. web syndication. |
|
694 These feeds are usually consumed by a <em>feed reader</em> that aggregates news from many websites and displays it in a uniform format. |
|
695 An Atom feed is an XML document with a list of recent news items, containing their titles, URLs and short annotations. |
|
696 It also contains some metadata (website author, title etc.). |
|
697 </p> |
|
698 <p> |
|
699 Using this simple XQuery<m:podČarou>see <a href="https://en.wikibooks.org/wiki/XQuery">XQuery</a> at Wikibooks</m:podČarou> |
|
700 <em>FLWOR Expression</em> |
|
701 we convert the Atom feed into the XML serialization of relational data: |
|
702 </p> |
|
703 |
|
704 <m:pre jazyk="xq" src="examples/atom.xq" odkaz="ano"/> |
|
705 |
|
706 <p> |
|
707 This operation is similar to <a href="https://www.postgresql.org/docs/current/functions-xml.html">xmltable</a> used in SQL databases. |
|
708 It converts an XML tree structure to the relational form. |
|
709 In our case, the output is still XML, but in a format that can be read by <code>relpipe-in-xml</code>. |
|
710 Putting it all together in a single shell script: |
|
711 </p> |
|
712 |
|
713 <m:pre jazyk="bash" src="examples/atom.sh"/> |
|
714 |
|
715 <p>The script generates a table with web news:</p> |
|
716 |
|
717 <m:pre jazyk="text" src="examples/atom.txt"/> |
|
718 |
|
719 <p> |
|
720 For frequent use, we can create a script or function called <code>relpipe-in-atom</code> |
|
721 that reads Atom XML on STDIN and generates relational data on STDOUT. |
|
722 And then do any of these: |
|
723 </p> |
|
724 |
|
725 <m:pre jazyk="bash"><![CDATA[wget … | relpipe-in-atom | relpipe-out-tabular |
|
726 wget … | relpipe-in-atom | relpipe-out-csv |
|
727 wget … | relpipe-in-atom | relpipe-out-gui |
|
728 wget … | relpipe-in-atom | relpipe-out-nullbyte | while read_nullbyte published title url; do echo "$title"; done |
|
729 wget … | relpipe-in-atom | relpipe-out-csv | csv2rec | … |
|
730 ]]></m:pre> |
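<p>
The loop above relies on a <code>read_nullbyte</code> helper that is not defined in this section.
A minimal sketch of such a helper follows, assuming Bash (it uses <code>read -d ''</code>, which is not POSIX);
the definition used elsewhere may differ in details.
It reads one NUL-terminated value per variable name given as an argument and fails at the end of input, which terminates the <code>while</code> loop.
</p>

```shell
#!/usr/bin/env bash
# Sketch only: reads one record from NUL-separated input (Bash-specific).
# Each argument is the name of a shell variable to fill with the next value.
read_nullbyte() {
    # An empty IFS keeps leading and trailing whitespace in the values.
    local IFS=
    while [ $# -gt 0 ]; do
        # -d '' stops at the NUL byte, -r keeps backslashes literal.
        # A failed read (end of input) ends the caller's while loop.
        read -r -d '' "$1" || return 1
        shift
    done
}
```

<p>
With this helper, each iteration of <code>while read_nullbyte published title url</code> fills the three variables with the attributes of one record.
</p>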
|
731 |
|
732 <p> |
|
733 There are several implementations of XQuery. |
|
734 <a href="http://galax.sourceforge.net/">Galax</a> is one of them. |
|
735 <a href="http://xqilla.sourceforge.net/">XQilla</a> or |
|
736 <a href="http://basex.org/basex/xquery/">BaseX</a> are other options (and support newer versions of the standard). |
|
737 There are also XSLT processors like <a href="http://xmlsoft.org/XSLT/xsltproc2.html">xsltproc</a>. |
|
738 BaseX can be used instead of Galax – we just replace |
|
739 <code>galax-run -context-item /dev/stdin</code> with <code>basex -i /dev/stdin</code>. |
|
740 </p> |
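<p>
The processor swap described above can be sketched as a <code>relpipe-in-atom</code> function, assuming Bash.
The variable names <code>ATOM_XQ_PROCESSOR</code> and <code>ATOM_XQUERY</code> are hypothetical, as are the default path <code>atom.xq</code>
and the assumption that the query file is passed as the last argument to both processors; adjust them to match the actual script.
</p>

```shell
#!/usr/bin/env bash
# Sketch only: Atom XML on STDIN, relational data on STDOUT.
# ATOM_XQ_PROCESSOR and ATOM_XQUERY are hypothetical variable names.
relpipe-in-atom() {
    # Assumed location of the XQuery script; override with ATOM_XQUERY.
    local query="${ATOM_XQUERY:-atom.xq}"
    local -a processor
    case "${ATOM_XQ_PROCESSOR:-galax}" in
        galax) processor=(galax-run -context-item /dev/stdin) ;;
        basex) processor=(basex -i /dev/stdin) ;;
        *)
            echo "unknown XQuery processor: $ATOM_XQ_PROCESSOR" > /dev/stderr
            return 1
            ;;
    esac
    # The processor emits the XML serialization, relpipe-in-xml parses it.
    "${processor[@]}" "$query" | relpipe-in-xml
}
```

<p>
Usage would then look like <code>wget … | ATOM_XQ_PROCESSOR=basex relpipe-in-atom | relpipe-out-tabular</code>.
</p>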
|
741 |
|
742 <p> |
|
743 Reading Atom feeds in a terminal might not be the best way to get news from a website, |
|
744 but this simple example teaches us how to convert arbitrary XML to relational data. |
|
745 And of course, we can generate multiple relations from a single XML using a single XQuery script. |
|
746 XQuery can also be used for operations like JOIN or UNION and for filtering and other transformations |
|
747 as will be shown in further examples. |
|
748 </p> |
|
749 |
|
750 |
690 </text> |
751 </text> |
691 |
752 |
692 </stránka> |
753 </stránka> |