<stránka
	xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
	xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">

	<nadpis>Reading an Atom feed using XQuery</nadpis>
	<perex>converting arbitrary XML into relational data using XQuery</perex>
	<m:pořadí-příkladu>01100</m:pořadí-příkladu>

	<text xmlns="http://www.w3.org/1999/xhtml">

		<p>
			The Atom Syndication Format is a standard for publishing web feeds, a.k.a. web syndication.
			These feeds are usually consumed by a <em>feed reader</em> that aggregates news from many websites and displays them in a uniform format.
			An Atom feed is an XML document with a list of recent news items containing their titles, URLs and short annotations.
			It also contains some metadata (website author, title etc.).
		</p>
		<p>
			Using this simple XQuery<m:podČarou>see <a href="https://en.wikibooks.org/wiki/XQuery">XQuery</a> at Wikibooks</m:podČarou>
			<em>FLWOR Expression</em>
			we convert the Atom feed into the XML serialization of relational data:
		</p>

		<m:pre jazyk="xq" src="examples/atom.xq" odkaz="ano"/>

		<p>
			This operation is similar to <a href="https://www.postgresql.org/docs/current/functions-xml.html">xmltable</a> used in SQL databases.
			It converts an XML tree structure to the relational form.
			In our case, the output is still XML, but in a format that can be read by <code>relpipe-in-xml</code>.
			Everything put together in a single shell script:
		</p>

		<m:pre jazyk="bash" src="examples/atom.sh"/>

		<p>The script generates a table with web news:</p>

		<m:pre jazyk="text" src="examples/atom.txt"/>

		<p>
			For frequent use, we can create a script or function called <code>relpipe-in-atom</code>
			that reads Atom XML on STDIN and generates relational data on STDOUT.
			Then we can do any of these:
		</p>

		<m:pre jazyk="bash"><![CDATA[wget … | relpipe-in-atom | relpipe-out-tabular
wget … | relpipe-in-atom | relpipe-out-csv
wget … | relpipe-in-atom | relpipe-out-gui
wget … | relpipe-in-atom | relpipe-out-nullbyte | while read_nullbyte published title url; do echo "$title"; done
wget … | relpipe-in-atom | relpipe-out-csv | csv2rec | …
]]></m:pre>
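
		<p>
			The two shell helpers that the commands above rely on could be sketched as follows.
			This is a minimal sketch, not necessarily the author's implementation: it assumes that <code>relpipe-in-atom</code>
			simply wraps the Galax invocation mentioned in this article together with <code>relpipe-in-xml</code>,
			that <code>galax-run</code> takes the query file after its options,
			and that the query lives in <code>examples/atom.xq</code>.
			The <code>RELPIPE_ATOM_XQ</code> variable is a hypothetical name introduced here for illustration,
			and <code>read_nullbyte</code> is only one possible definition of the helper used in the loop.
		</p>

```bash
#!/usr/bin/env bash

# Path to the XQuery script. RELPIPE_ATOM_XQ is a hypothetical variable
# introduced for this sketch; the default is the article's example file.
: "${RELPIPE_ATOM_XQ:=examples/atom.xq}"

# Reads Atom XML on STDIN and writes relational data on STDOUT.
# Assumption: galax-run accepts the query file after its options.
relpipe-in-atom() {
    galax-run -context-item /dev/stdin "$RELPIPE_ATOM_XQ" | relpipe-in-xml
}

# Reads one NUL-terminated value from STDIN into each named variable.
# Returns non-zero at the end of input, which ends a "while" loop.
read_nullbyte() {
    local _rn_name
    for _rn_name in "$@"; do
        IFS= read -r -d '' "$_rn_name" || return 1
    done
}
```

		<p>
			With both functions sourced into the current Bash session, the pipelines above can be run as written.
			Both rely on Bash features (hyphenated function names, <code>read -d</code>), so they will not work in a plain POSIX shell.
		</p>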

		<p>
			There are several implementations of XQuery.
			<a href="http://galax.sourceforge.net/">Galax</a> is one of them.
			<a href="http://xqilla.sourceforge.net/">XQilla</a> and
			<a href="http://basex.org/basex/xquery/">BaseX</a> are others (and support newer versions of the standard).
			There are also XSLT processors like <a href="http://xmlsoft.org/XSLT/xsltproc2.html">xsltproc</a>.
			BaseX can be used instead of Galax: we just replace
			<code>galax-run -context-item /dev/stdin</code> with <code>basex -i /dev/stdin</code>.
		</p>

		<p>
			Reading Atom feeds in a terminal might not be the best way to get news from a website,
			but this simple example teaches us how to convert arbitrary XML to relational data.
			And of course, we can generate multiple relations from a single XML document using a single XQuery script.
			XQuery can also be used for operations like JOIN or UNION and for filtering and other transformations,
			as will be shown in further examples.
		</p>

	</text>

</stránka>