--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/relpipe-data/examples-cli-stdin.xml Tue Feb 05 19:18:28 2019 +0100
@@ -0,0 +1,77 @@
+<stránka
+ xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
+ xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
+
+ <nadpis>Reading STDIN</nadpis>
+ <perex>generating relational data from values on standard input</perex>
+ <m:pořadí-příkladu>00200</m:pořadí-příkladu>
+
+ <text xmlns="http://www.w3.org/1999/xhtml">
+
+ <p>
+ The number of <abbr title="Command-line interface">CLI</abbr> arguments is limited and they are all passed to the process at once.
+ So there is an option to pass the values on STDIN instead of as CLI arguments.
+ Values on STDIN are expected to be separated by the null-byte.
+ We can generate such data e.g. using <code>echo</code> and <code>tr</code> (or using <code>printf</code> or other commands):
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[echo -e "1\nHello\ntrue\n2\nWorld\nfalse" \
+ | tr \\n \\0 \
+ | relpipe-in-cli generate-from-stdin relation_from_stdin 3 \
+ a integer \
+ b string \
+ c boolean \
+ | relpipe-out-tabular]]></m:pre>
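+
+ <p>
+ The <code>printf</code> variant can be sketched as follows. This is only a minimal sketch: it relies on the shell's <code>printf</code> reusing its format string for every argument, so that each value ends with a null byte, and it feeds the same values to the same command as the example with <code>echo</code> and <code>tr</code>:
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[# printf repeats the format '%s\0' for each argument,
+# so every value is followed by a null byte
+printf '%s\0' 1 Hello true 2 World false \
+ | relpipe-in-cli generate-from-stdin relation_from_stdin 3 \
+ a integer \
+ b string \
+ c boolean \
+ | relpipe-out-tabular]]></m:pre>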
+
+ <p>
+ The output is the same as above.
+ We can use this approach to convert various formats to relational data.
+ There is a lot of data already in the form of null-separated values, e.g. the process arguments:
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[cat /proc/$(pidof mc)/cmdline \
+ | relpipe-in-cli generate-from-stdin mc_args 1 a string \
+ | relpipe-out-tabular
+]]></m:pre>
+
+ <p>If we have <code>mc /etc/ /tmp/</code> running in some other terminal, the output will be:</p>
+
+ <pre><![CDATA[mc_args:
+ ╭────────────╮
+ │ a (string) │
+ ├────────────┤
+ │ mc │
+ │ /etc/ │
+ │ /tmp/ │
+ ╰────────────╯
+Record count: 3]]></pre>
+
+ <p>
+ Also the <code>find</code> command can produce data separated by the null-byte:
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[find /etc/ -name '*ssh*_*' -print0 \
+ | relpipe-in-cli generate-from-stdin files 1 file_name string \
+ | relpipe-out-tabular]]></m:pre>
+
+ <p>This will display something like:</p>
+
+ <pre><![CDATA[files:
+ ╭───────────────────────────────────╮
+ │ file_name (string) │
+ ├───────────────────────────────────┤
+ │ /etc/ssh/ssh_host_ecdsa_key │
+ │ /etc/ssh/sshd_config │
+ │ /etc/ssh/ssh_host_ed25519_key.pub │
+ │ /etc/ssh/ssh_host_ecdsa_key.pub │
+ │ /etc/ssh/ssh_host_rsa_key │
+ │ /etc/ssh/ssh_config │
+ │ /etc/ssh/ssh_host_ed25519_key │
+ │ /etc/ssh/ssh_import_id │
+ │ /etc/ssh/ssh_host_rsa_key.pub │
+ ╰───────────────────────────────────╯
+Record count: 9]]></pre>
+
+ </text>
+
+</stránka>
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/relpipe-data/examples-filesystem-file.xml Tue Feb 05 19:18:28 2019 +0100
@@ -0,0 +1,118 @@
+<stránka
+ xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
+ xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
+
+ <nadpis>Reading file metadata using relpipe-in-filesystem</nadpis>
+ <perex>accessing file metadata like path, type, size or owner</perex>
+ <m:pořadí-příkladu>01200</m:pořadí-příkladu>
+
+ <text xmlns="http://www.w3.org/1999/xhtml">
+
+ <p>
+ Our filesystems contain valuable information and using proper tools we can extract it.
+ Using <code>relpipe-in-filesystem</code> we can gather metadata of our files and process it in a relational way.
+ This tool does not traverse our filesystem (remember the rule: <em>do one thing and do it well</em>);
+ instead, it eats a list of file paths separated by <code>\0</code>.
+ It is typically used together with the <code>find</code> command, but we can also create such a list by hand using e.g. the <code>printf</code> command or <code>tr \\n \\0</code>.
+ </p>
+
+ <m:pre jazyk="bash">find /etc/ssh/ -print0 | relpipe-in-filesystem | relpipe-out-tabular</m:pre>
+
+ <p>
+ In the basic scenario, it behaves like <code>ls -l</code>, just more modular and machine-readable:
+ </p>
+
+ <pre><![CDATA[filesystem:
+ ╭───────────────────────────────────┬───────────────┬────────────────┬────────────────┬────────────────╮
+ │ path (string) │ type (string) │ size (integer) │ owner (string) │ group (string) │
+ ├───────────────────────────────────┼───────────────┼────────────────┼────────────────┼────────────────┤
+ │ /etc/ssh/ │ d │ 0 │ root │ root │
+ │ /etc/ssh/moduli │ f │ 553122 │ root │ root │
+ │ /etc/ssh/ssh_host_ecdsa_key │ f │ 227 │ root │ root │
+ │ /etc/ssh/sshd_config │ f │ 3262 │ root │ root │
+ │ /etc/ssh/ssh_host_ed25519_key.pub │ f │ 91 │ root │ root │
+ │ /etc/ssh/ssh_host_ecdsa_key.pub │ f │ 171 │ root │ root │
+ │ /etc/ssh/ssh_host_rsa_key │ f │ 1679 │ root │ root │
+ │ /etc/ssh/ssh_config │ f │ 1580 │ root │ root │
+ │ /etc/ssh/ssh_host_ed25519_key │ f │ 399 │ root │ root │
+ │ /etc/ssh/ssh_import_id │ f │ 338 │ root │ root │
+ │ /etc/ssh/ssh_host_rsa_key.pub │ f │ 391 │ root │ root │
+ ╰───────────────────────────────────┴───────────────┴────────────────┴────────────────┴────────────────╯
+Record count: 11]]></pre>
+
+ <p>
+ We can specify desired attributes and also their aliases:
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[find /etc/ssh/ -print0 \
+ | relpipe-in-filesystem \
+ --file path --as artefact \
+ --file size \
+ --file owner --as dear_owner \
+ | relpipe-out-tabular]]></m:pre>
+
+ <p>And we will get a subset with renamed attributes:</p>
+
+ <pre><![CDATA[filesystem:
+ ╭───────────────────────────────────┬────────────────┬─────────────────────╮
+ │ artefact (string) │ size (integer) │ dear_owner (string) │
+ ├───────────────────────────────────┼────────────────┼─────────────────────┤
+ │ /etc/ssh/ │ 0 │ root │
+ │ /etc/ssh/moduli │ 553122 │ root │
+ │ /etc/ssh/ssh_host_ecdsa_key │ 227 │ root │
+ │ /etc/ssh/sshd_config │ 3262 │ root │
+ │ /etc/ssh/ssh_host_ed25519_key.pub │ 91 │ root │
+ │ /etc/ssh/ssh_host_ecdsa_key.pub │ 171 │ root │
+ │ /etc/ssh/ssh_host_rsa_key │ 1679 │ root │
+ │ /etc/ssh/ssh_config │ 1580 │ root │
+ │ /etc/ssh/ssh_host_ed25519_key │ 399 │ root │
+ │ /etc/ssh/ssh_import_id │ 338 │ root │
+ │ /etc/ssh/ssh_host_rsa_key.pub │ 391 │ root │
+ ╰───────────────────────────────────┴────────────────┴─────────────────────╯
+Record count: 11]]></pre>
+
+ <p>
+ We can also choose which path format fits our needs best:
+ </p>
+
+
+ <m:pre jazyk="bash"><![CDATA[find ../../etc/ssh/ -print0 \
+ | relpipe-in-filesystem \
+ --file path \
+ --file path_absolute \
+ --file path_canonical \
+ --file name \
+ | relpipe-out-tabular]]></m:pre>
+
+ <p>The <code>path</code> attribute contains exactly the same value as was given on the input. The other formats are derived:</p>
+
+ <pre><![CDATA[filesystem:
+ ╭────────────────────────────────────────┬───────────────────────────────────────────────────┬───────────────────────────────────┬──────────────────────────╮
+ │ path (string) │ path_absolute (string) │ path_canonical (string) │ name (string) │
+ ├────────────────────────────────────────┼───────────────────────────────────────────────────┼───────────────────────────────────┼──────────────────────────┤
+ │ ../../etc/ssh/ │ /home/hack/../../etc/ssh/ │ /etc/ssh │ │
+ │ ../../etc/ssh/moduli │ /home/hack/../../etc/ssh/moduli │ /etc/ssh/moduli │ moduli │
+ │ ../../etc/ssh/ssh_host_ecdsa_key │ /home/hack/../../etc/ssh/ssh_host_ecdsa_key │ /etc/ssh/ssh_host_ecdsa_key │ ssh_host_ecdsa_key │
+ │ ../../etc/ssh/sshd_config │ /home/hack/../../etc/ssh/sshd_config │ /etc/ssh/sshd_config │ sshd_config │
+ │ ../../etc/ssh/ssh_host_ed25519_key.pub │ /home/hack/../../etc/ssh/ssh_host_ed25519_key.pub │ /etc/ssh/ssh_host_ed25519_key.pub │ ssh_host_ed25519_key.pub │
+ │ ../../etc/ssh/ssh_host_ecdsa_key.pub │ /home/hack/../../etc/ssh/ssh_host_ecdsa_key.pub │ /etc/ssh/ssh_host_ecdsa_key.pub │ ssh_host_ecdsa_key.pub │
+ │ ../../etc/ssh/ssh_host_rsa_key │ /home/hack/../../etc/ssh/ssh_host_rsa_key │ /etc/ssh/ssh_host_rsa_key │ ssh_host_rsa_key │
+ │ ../../etc/ssh/ssh_config │ /home/hack/../../etc/ssh/ssh_config │ /etc/ssh/ssh_config │ ssh_config │
+ │ ../../etc/ssh/ssh_host_ed25519_key │ /home/hack/../../etc/ssh/ssh_host_ed25519_key │ /etc/ssh/ssh_host_ed25519_key │ ssh_host_ed25519_key │
+ │ ../../etc/ssh/ssh_import_id │ /home/hack/../../etc/ssh/ssh_import_id │ /etc/ssh/ssh_import_id │ ssh_import_id │
+ │ ../../etc/ssh/ssh_host_rsa_key.pub │ /home/hack/../../etc/ssh/ssh_host_rsa_key.pub │ /etc/ssh/ssh_host_rsa_key.pub │ ssh_host_rsa_key.pub │
+ ╰────────────────────────────────────────┴───────────────────────────────────────────────────┴───────────────────────────────────┴──────────────────────────╯
+Record count: 11]]></pre>
+
+ <p>
+ We can also <em>select</em> symlink targets or their types.
+ If some file is missing or is inaccessible due to permissions, only <code>path</code> is printed for it.
+ </p>
+
+ <p>
+ Tip: if we are looking for files in the current directory and want to omit the leading <code>.</code>, we just call <code>find -printf '%P\0'</code> instead of <code>find -print0</code>.
+ </p>
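+
+ <p>
+ The difference between the two invocations can be inspected in a scratch directory. This is a sketch assuming GNU <code>find</code>; the file names are made up, and <code>-mindepth 1</code> is an addition not mentioned in the tip, used here to skip the starting directory itself, for which <code>%P</code> would otherwise produce an empty path:
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[cd "$(mktemp -d)" && touch a.txt b.txt
+
+# paths as find sees them, prefixed with the starting point "./"
+find . -print0 | tr \\0 \\n
+
+# %P strips the starting point from each path;
+# -mindepth 1 leaves out the starting directory itself
+find . -mindepth 1 -printf '%P\0' | tr \\0 \\n]]></m:pre>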
+
+ </text>
+
+</stránka>
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/relpipe-data/examples-filesystem-xattr.xml Tue Feb 05 19:18:28 2019 +0100
@@ -0,0 +1,67 @@
+<stránka
+ xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
+ xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
+
+ <nadpis>Reading extended attributes using relpipe-in-filesystem</nadpis>
+ <perex>accessing xattr of given files e.g. xdg.origin.url</perex>
+ <m:pořadí-příkladu>01300</m:pořadí-příkladu>
+
+ <text xmlns="http://www.w3.org/1999/xhtml">
+
+
+ <p>
+ Extended attributes (xattr) are additional <em>key=value</em> pairs that can be attached to our files.
+ They are not stored inside the files, but on the filesystem.
+ Thus they are independent of particular file format (which might not support metadata)
+ and we can use them e.g. for tagging, cataloguing or adding some notes to our files.
+ Some tools like GNU Wget use extended attributes to store metadata like the original URL from which the file was downloaded.
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[wget --recursive --level=1 https://relational-pipes.globalcode.info/
+find -type f -printf '%P\0' \
+ | relpipe-in-filesystem --file path --file size --xattr xdg.origin.url \
+ | relpipe-out-tabular
+]]></m:pre>
+
+ <p>And now we know where the files on our disk came from:</p>
+
+ <pre><![CDATA[filesystem:
+ ╭───────────────────────────┬────────────────┬────────────────────────────────────────────────────────────────────╮
+ │ path (string) │ size (integer) │ xdg.origin.url (string) │
+ ├───────────────────────────┼────────────────┼────────────────────────────────────────────────────────────────────┤
+ │ index.html │ 12159 │ https://relational-pipes.globalcode.info/v_0/ │
+ │ v_0/atom.xml │ 4613 │ https://relational-pipes.globalcode.info/v_0/atom.xml │
+ │ v_0/rss.xml │ 4926 │ https://relational-pipes.globalcode.info/v_0/rss.xml │
+ │ v_0/js/skript.js │ 2126 │ https://relational-pipes.globalcode.info/v_0/js/skript.js │
+ │ v_0/css/styl.css │ 2988 │ https://relational-pipes.globalcode.info/v_0/css/styl.css │
+ │ v_0/css/relpipe.css │ 1095 │ https://relational-pipes.globalcode.info/v_0/css/relpipe.css │
+ │ v_0/css/syntaxe.css │ 3584 │ https://relational-pipes.globalcode.info/v_0/css/syntaxe.css │
+ │ v_0/index.xhtml │ 12159 │ https://relational-pipes.globalcode.info/v_0/index.xhtml │
+ │ v_0/grafika/logo.png │ 3298 │ https://relational-pipes.globalcode.info/v_0/grafika/logo.png │
+ │ v_0/principles.xhtml │ 17171 │ https://relational-pipes.globalcode.info/v_0/principles.xhtml │
+ │ v_0/roadmap.xhtml │ 11097 │ https://relational-pipes.globalcode.info/v_0/roadmap.xhtml │
+ │ v_0/faq.xhtml │ 11080 │ https://relational-pipes.globalcode.info/v_0/faq.xhtml │
+ │ v_0/specification.xhtml │ 12983 │ https://relational-pipes.globalcode.info/v_0/specification.xhtml │
+ │ v_0/implementation.xhtml │ 10810 │ https://relational-pipes.globalcode.info/v_0/implementation.xhtml │
+ │ v_0/examples.xhtml │ 76958 │ https://relational-pipes.globalcode.info/v_0/examples.xhtml │
+ │ v_0/license.xhtml │ 65580 │ https://relational-pipes.globalcode.info/v_0/license.xhtml │
+ │ v_0/screenshots.xhtml │ 5708 │ https://relational-pipes.globalcode.info/v_0/screenshots.xhtml │
+ │ v_0/download.xhtml │ 5204 │ https://relational-pipes.globalcode.info/v_0/download.xhtml │
+ │ v_0/contact.xhtml │ 4940 │ https://relational-pipes.globalcode.info/v_0/contact.xhtml │
+ │ v_0/classic-example.xhtml │ 9539 │ https://relational-pipes.globalcode.info/v_0/classic-example.xhtml │
+ ╰───────────────────────────┴────────────────┴────────────────────────────────────────────────────────────────────╯
+Record count: 20]]></pre>
+
+ <p>
+ If we like the BeOS/Haiku style, we can create empty files with some attributes attached and use our filesystem as a simple database
+ and query it using relational tools.
+ It will lack indexing, but for basic scenarios like <em>address book</em> it will be fast enough
+ and we can feel a bit of BeOS/Haiku atmosphere in our contemporary GNU/Linux systems.
+ But be careful with that because some editors delete and recreate files while saving them, which destroys the xattrs.
+ Tools like <code>rsync</code> or <code>tar</code> with the <code>--xattrs</code> option will back up our attributes safely.
+ </p>
+
+
+ </text>
+
+</stránka>
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/relpipe-data/examples-grep-cut-fstab.xml Tue Feb 05 19:18:28 2019 +0100
@@ -0,0 +1,170 @@
+<stránka
+ xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
+ xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
+
+ <nadpis>Doing projection and restriction using cut and grep</nadpis>
+ <perex>SELECT mount_point FROM fstab WHERE type IN ('btrfs', 'xfs')</perex>
+ <m:pořadí-příkladu>01000</m:pořadí-příkladu>
+
+ <text xmlns="http://www.w3.org/1999/xhtml">
+
+ <p>
+ When reading classic pipelines involving the <code>grep</code> and <code>cut</code> commands,
+ we may notice some similarity with simple SQL queries that look like this:
+ </p>
+
+ <m:pre jazyk="SQL">SELECT "some", "cut", "fields" FROM stdin WHERE grep_matches(whole_line);</m:pre>
+
+ <p>
+ And that is true: <code>grep</code> does restriction<m:podČarou>
+ <a href="https://en.wikipedia.org/wiki/Selection_(relational_algebra)">selecting</a> only certain records from the original relation according to their match with given conditions</m:podČarou>
+ and <code>cut</code> does projection<m:podČarou>a limited subset of what <a href="https://en.wikipedia.org/wiki/Projection_(relational_algebra)">projection</a> means</m:podČarou>.
+ Now we can do these relational operations using our relational tools called <code>relpipe-tr-grep</code> and <code>relpipe-tr-cut</code>.
+ </p>
+
+ <p>
+ Assume that we need only <code>mount_point</code> fields from our <code>fstab</code> where <code>type</code> is <code>btrfs</code> or <code>xfs</code>
+ and we want to do something (a shell script block) with these directory paths.
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
+ | relpipe-tr-grep 'fstab' 'type' '^(btrfs|xfs)$' \
+ | relpipe-tr-cut 'fstab' 'mount_point' \
+ | relpipe-out-nullbyte \
+ | while read -r -d '' m; do
+ echo "$m";
+ done]]></m:pre>
+
+ <p>
+ The <code>relpipe-tr-cut</code> tool has a syntax similar to its <em>grep</em> and <em>sed</em> siblings and also uses the power of regular expressions.
+ In this case it modifies the <code>fstab</code> relation on the fly and drops all its attributes except <code>mount_point</code>.
+ </p>
+
+ <p>
+ Then we pass the data to the Bash <code>while</code> loop.
+ In such a simple scenario (just <code>echo</code>), we could use <code>xargs</code> as in the examples above,
+ but with this syntax, we can write a whole block of shell commands for each record/value and do more complex actions with them.
+ </p>
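+
+ <p>
+ A sketch of such a multi-command block. The mount points are supplied by <code>printf</code> here, so the loop can be tried without parsing any <code>fstab</code>; the variable names and the use of <code>basename</code> are only illustrative:
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[printf '%s\0' /home /mnt/private \
+	| while IFS= read -r -d '' m; do
+		# IFS= keeps leading and trailing whitespace of the value intact
+		name="$(basename "$m")";
+		echo "$m -> $name";
+	done]]></m:pre>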
+
+ <h2>More projections with relpipe-tr-cut</h2>
+
+ <p>
+ Assume that we have a simple relation containing numbers:
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[seq 0 8 \
+ | tr \\n \\0 \
+ | relpipe-in-cli generate-from-stdin numbers 3 a integer b integer c integer \
+ > numbers.rp]]></m:pre>
+
+ <p>and a second one containing letters:</p>
+
+ <m:pre jazyk="bash"><![CDATA[relpipe-in-cli generate letters 2 a string b string A B C D > letters.rp]]></m:pre>
+
+ <p>We saved them into two files and now combine them into a single file. We will work with them as if they were a single stream of relations:</p>
+
+ <m:pre jazyk="bash"><![CDATA[cat numbers.rp letters.rp > both.rp;
+cat both.rp | relpipe-out-tabular]]></m:pre>
+
+ <p>This will print:</p>
+
+ <pre><![CDATA[numbers:
+ ╭─────────────┬─────────────┬─────────────╮
+ │ a (integer) │ b (integer) │ c (integer) │
+ ├─────────────┼─────────────┼─────────────┤
+ │ 0 │ 1 │ 2 │
+ │ 3 │ 4 │ 5 │
+ │ 6 │ 7 │ 8 │
+ ╰─────────────┴─────────────┴─────────────╯
+Record count: 3
+letters:
+ ╭─────────────┬─────────────╮
+ │ a (string) │ b (string) │
+ ├─────────────┼─────────────┤
+ │ A │ B │
+ │ C │ D │
+ ╰─────────────┴─────────────╯
+Record count: 2]]></pre>
+
+ <p>We can drop the <code>a</code> attribute from the <code>numbers</code> relation:</p>
+
+ <m:pre jazyk="bash">cat both.rp | relpipe-tr-cut 'numbers' 'b|c' | relpipe-out-tabular</m:pre>
+
+ <p>and leave the <code>letters</code> relation unaffected:</p>
+
+ <pre><![CDATA[numbers:
+ ╭─────────────┬─────────────╮
+ │ b (integer) │ c (integer) │
+ ├─────────────┼─────────────┤
+ │ 1 │ 2 │
+ │ 4 │ 5 │
+ │ 7 │ 8 │
+ ╰─────────────┴─────────────╯
+Record count: 3
+letters:
+ ╭─────────────┬─────────────╮
+ │ a (string) │ b (string) │
+ ├─────────────┼─────────────┤
+ │ A │ B │
+ │ C │ D │
+ ╰─────────────┴─────────────╯
+Record count: 2]]></pre>
+
+ <p>Or we can remove <code>a</code> from both relations, i.e. keep only the attributes whose names match the <code>'b|c'</code> regex:</p>
+
+ <m:pre jazyk="bash">cat both.rp | relpipe-tr-cut '.*' 'b|c' | relpipe-out-tabular</m:pre>
+
+ <p>Instead of <code>'.*'</code> we could use <code>'numbers|letters'</code> and in this case it will give the same result:</p>
+
+ <pre><![CDATA[numbers:
+ ╭─────────────┬─────────────╮
+ │ b (integer) │ c (integer) │
+ ├─────────────┼─────────────┤
+ │ 1 │ 2 │
+ │ 4 │ 5 │
+ │ 7 │ 8 │
+ ╰─────────────┴─────────────╯
+Record count: 3
+letters:
+ ╭─────────────╮
+ │ b (string) │
+ ├─────────────┤
+ │ B │
+ │ D │
+ ╰─────────────╯
+Record count: 2]]></pre>
+
+ <p>So far, we have only been reducing the attributes. But we can also duplicate them or change their order:</p>
+
+ <m:pre jazyk="bash">cat both.rp | relpipe-tr-cut 'numbers' 'b|a|c' 'b' 'a' 'a' | relpipe-out-tabular</m:pre>
+
+ <p>
+ N.B. the order within <code>'b|a|c'</code> does not matter: if such a regex matches, the original order of the attributes is preserved.
+ But if we use multiple regexes to specify the attributes, their order and count matter:
+ </p>
+
+ <pre><![CDATA[numbers:
+ ╭─────────────┬─────────────┬─────────────┬─────────────┬─────────────┬─────────────╮
+ │ a (integer) │ b (integer) │ c (integer) │ b (integer) │ a (integer) │ a (integer) │
+ ├─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┤
+ │ 0 │ 1 │ 2 │ 1 │ 0 │ 0 │
+ │ 3 │ 4 │ 5 │ 4 │ 3 │ 3 │
+ │ 6 │ 7 │ 8 │ 7 │ 6 │ 6 │
+ ╰─────────────┴─────────────┴─────────────┴─────────────┴─────────────┴─────────────╯
+Record count: 3
+letters:
+ ╭─────────────┬─────────────╮
+ │ a (string) │ b (string) │
+ ├─────────────┼─────────────┤
+ │ A │ B │
+ │ C │ D │
+ ╰─────────────┴─────────────╯
+Record count: 2]]></pre>
+
+ <p>
+ The <code>letters</code> relation stays rock steady and <code>relpipe-tr-cut 'numbers'</code> does not affect it in any way.
+ </p>
+
+ </text>
+
+</stránka>
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/relpipe-data/examples-grep-fstab.xml Tue Feb 05 19:18:28 2019 +0100
@@ -0,0 +1,44 @@
+<stránka
+ xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
+ xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
+
+ <nadpis>Filtering /etc/fstab using relpipe-tr-grep</nadpis>
+ <perex>list only records with desired filesystem types</perex>
+ <m:pořadí-příkladu>00900</m:pořadí-příkladu>
+
+ <text xmlns="http://www.w3.org/1999/xhtml">
+
+ <p>
+ If we are interested only in certain records in some relation, we can filter it using <code>relpipe-tr-grep</code>.
+ If we want to list e.g. only Btrfs and XFS file systems from our <code>fstab</code> (see above), we will run:
+ </p>
+
+
+ <m:pre jazyk="bash"><![CDATA[relpipe-in-fstab | relpipe-tr-grep 'fstab' 'type' 'btrfs|xfs' | relpipe-out-tabular]]></m:pre>
+
+ <p>and we will get the following filtered result:</p>
+ <pre><![CDATA[fstab:
+ ╭─────────────────┬──────────────────────────────────────┬──────────────────────┬───────────────┬──────────────────┬────────────────┬────────────────╮
+ │ scheme (string) │ device (string) │ mount_point (string) │ type (string) │ options (string) │ dump (integer) │ pass (integer) │
+ ├─────────────────┼──────────────────────────────────────┼──────────────────────┼───────────────┼──────────────────┼────────────────┼────────────────┤
+ │ UUID │ a2b5f230-a795-4f6f-a39b-9b57686c86d5 │ /home │ btrfs │ relatime │ 0 │ 2 │
+ │ │ /dev/mapper/sdf_crypt │ /mnt/private │ xfs │ relatime │ 0 │ 2 │
+ ╰─────────────────┴──────────────────────────────────────┴──────────────────────┴───────────────┴──────────────────┴────────────────┴────────────────╯
+Record count: 2]]></pre>
+
+ <p>
+ Command arguments are similar to <code>relpipe-tr-sed</code>.
+ Everything is a regular expression.
+ Only relations matching the regex will be filtered, others will flow through the pipeline unmodified.
+ If the attribute regex matches more attribute names, filtering will be done with logical OR
+ i.e. the record is included if at least one of those attributes matches the search regex.
+ </p>
+
+ <p>
+ If we need an exact match of the whole attribute value, we have to use something like <code>'^(btrfs|xfs)$'</code>,
+ otherwise mere substring-match is enough to include the record.
+ </p>
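+
+ <p>
+ Note that the anchors bind more tightly than the alternation, so a pattern like <code>^btrfs|xfs$</code> means <em>starts with btrfs</em> or <em>ends with xfs</em>, whereas <code>^(btrfs|xfs)$</code> matches the whole value.
+ The difference can be checked with plain <code>grep -E</code>; this is only a sketch, under the assumption that the regex dialect of <code>relpipe-tr-grep</code> treats anchors and groups the same way as POSIX extended regular expressions, and the sample values are made up:
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[# without the group: "starts with btrfs" OR "ends with xfs"
+# (also lets btrfs-old and myxfs through)
+printf '%s\n' btrfs xfs btrfs-old myxfs ext4 | grep -E '^btrfs|xfs$'
+
+# with the group: the whole value must be btrfs or xfs
+printf '%s\n' btrfs xfs btrfs-old myxfs ext4 | grep -E '^(btrfs|xfs)$']]></m:pre>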
+
+ </text>
+
+</stránka>
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/relpipe-data/examples-hello-world.xml Tue Feb 05 19:18:28 2019 +0100
@@ -0,0 +1,83 @@
+<stránka
+ xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
+ xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
+
+ <nadpis>Hello World!</nadpis>
+ <perex>generating relational data from CLI arguments</perex>
+ <m:pořadí-příkladu>00100</m:pořadí-příkladu>
+
+ <text xmlns="http://www.w3.org/1999/xhtml">
+
+ <p>
+ Let's start with an obligatory Hello World example.
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[relpipe-in-cli generate "relation_from_cli" 3 \
+ "a" "integer" \
+ "b" "string" \
+ "c" "boolean" \
+ "1" "Hello" "true" \
+ "2" "World!" "false"]]></m:pre>
+
+ <p>
+ This command generates relational data.
+ In order to see them, we need to convert them to some other format.
+ For now, we will use the "tabular" format and pipe the relational data to <code>relpipe-out-tabular</code>.
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[relpipe-in-cli generate "relation_from_cli" 3 \
+ "a" "integer" \
+ "b" "string" \
+ "c" "boolean" \
+ "1" "Hello" "true" \
+ "2" "World!" "false" \
+ | relpipe-out-tabular]]></m:pre>
+
+ <p>Output:</p>
+
+ <pre><![CDATA[relation_from_cli:
+ ╭─────────────┬────────────┬─────────────╮
+ │ a (integer) │ b (string) │ c (boolean) │
+ ├─────────────┼────────────┼─────────────┤
+ │ 1 │ Hello │ true │
+ │ 2 │ World! │ false │
+ ╰─────────────┴────────────┴─────────────╯
+Record count: 2
+]]></pre>
+
+ <p>
+ The syntax is simple as we see above. We specify the name of the relation, number of attributes,
+ and then their definitions (names and types),
+ followed by the data.
+ </p>
+
+ <p>
+ A single stream may contain multiple relations:
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[(relpipe-in-cli generate a 1 x string hello; \
+ relpipe-in-cli generate b 1 y string world) \
+ | relpipe-out-tabular]]></m:pre>
+
+ <p>
+ Thus we can combine various commands or files and pass the result to a single relational output filter (<code>relpipe-out-tabular</code> in this case) and get:
+ </p>
+
+ <pre><![CDATA[a:
+ ╭────────────╮
+ │ x (string) │
+ ├────────────┤
+ │ hello │
+ ╰────────────╯
+Record count: 1
+b:
+ ╭────────────╮
+ │ y (string) │
+ ├────────────┤
+ │ world │
+ ╰────────────╯
+Record count: 1]]></pre>
+
+ </text>
+
+</stránka>
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/relpipe-data/examples-in-fstab.xml Tue Feb 05 19:18:28 2019 +0100
@@ -0,0 +1,47 @@
+<stránka
+ xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
+ xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
+
+ <nadpis>Reading fstab</nadpis>
+ <perex>converting /etc/fstab or /etc/mtab to relational data</perex>
+ <m:pořadí-příkladu>00200</m:pořadí-příkladu>
+
+ <text xmlns="http://www.w3.org/1999/xhtml">
+
+ <p>
+ Using the <code>relpipe-in-fstab</code> command we can convert <code>/etc/fstab</code> or <code>/etc/mtab</code> to relational data
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[relpipe-in-fstab | relpipe-out-tabular]]></m:pre>
+
+ <p>
+ and see them as a nice table:
+ </p>
+
+ <pre><![CDATA[fstab:
+ ╭─────────────────┬──────────────────────────────────────┬──────────────────────┬───────────────┬───────────────────────────────────────┬────────────────┬────────────────╮
+ │ scheme (string) │ device (string) │ mount_point (string) │ type (string) │ options (string) │ dump (integer) │ pass (integer) │
+ ├─────────────────┼──────────────────────────────────────┼──────────────────────┼───────────────┼───────────────────────────────────────┼────────────────┼────────────────┤
+ │ UUID │ 29758270-fd25-4a6c-a7bb-9a18302816af │ / │ ext4 │ relatime,user_xattr,errors=remount-ro │ 0 │ 1 │
+ │ │ /dev/sr0 │ /media/cdrom0 │ udf,iso9660 │ user,noauto │ 0 │ 0 │
+ │ │ /dev/sde │ /mnt/data │ ext4 │ relatime,user_xattr,errors=remount-ro │ 0 │ 2 │
+ │ UUID │ a2b5f230-a795-4f6f-a39b-9b57686c86d5 │ /home │ btrfs │ relatime │ 0 │ 2 │
+ │ │ /dev/mapper/sdf_crypt │ /mnt/private │ xfs │ relatime │ 0 │ 2 │
+ ╰─────────────────┴──────────────────────────────────────┴──────────────────────┴───────────────┴───────────────────────────────────────┴────────────────┴────────────────╯
+Record count: 5]]></pre>
+
+ <p>We can do the same with a remote <code>fstab</code> or <code>mtab</code>, just by adding <code>ssh</code> to the pipeline:</p>
+
+ <m:pre jazyk="bash"><![CDATA[ssh example.com cat /etc/mtab | relpipe-in-fstab | relpipe-out-tabular]]></m:pre>
+
+ <p>
+ The <code>cat</code> runs remotely. The <code>relpipe-in-fstab</code> and <code>relpipe-out-tabular</code> run on our machine.
+ </p>
+
+ <p>
+ N.B. <code>relpipe-in-fstab</code> reads <code>/etc/fstab</code> if executed on a TTY. Otherwise, it reads STDIN.
+ </p>
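+
+ <p>
+ A shell script can make the same kind of decision with the <code>-t</code> test. The following sketch is not the code of <code>relpipe-in-fstab</code>; the function name <code>input_source</code> is hypothetical and it merely reports which input such a tool would choose, assuming the decision depends on whether STDIN is a terminal:
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[input_source() {
+	# [ -t 0 ] succeeds when file descriptor 0 (STDIN) is a terminal
+	if [ -t 0 ]; then echo "/etc/fstab"; else echo "STDIN"; fi
+}
+
+input_source                  # typed in an interactive terminal: /etc/fstab
+cat /etc/mtab | input_source  # STDIN is a pipe here: STDIN]]></m:pre>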
+
+ </text>
+
+</stránka>
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/relpipe-data/examples-out-bash.xml Tue Feb 05 19:18:28 2019 +0100
@@ -0,0 +1,65 @@
+<stránka
+ xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
+ xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
+
+ <nadpis>Writing an output filter in Bash</nadpis>
+ <perex>processing relational data in GNU Bash or some other shell</perex>
+ <m:pořadí-příkladu>00600</m:pořadí-příkladu>
+
+ <text xmlns="http://www.w3.org/1999/xhtml">
+
+ <p>
+ In the previous example we created an output filter in Perl.
+ We converted a relation to values separated by <code>\0</code> and then passed it through <code>xargs</code> to a Perl <em>one-liner</em> (or a <em>multi-liner</em> in this case).
+ But we can write such an output filter in pure Bash without <code>xargs</code> and <code>perl</code>.
+ Of course, it is still limited to a single relation (or it can process multiple relations of the same type and do something like an implicit <code>UNION ALL</code>).
+ </p>
+
+ <p>
+ We will define a function that will help us with reading the <code>\0</code>-separated values and putting them into shell variables:
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[read_nullbyte() { for v in "$@"; do export "$v"; read -r -d '' "$v"; done }]]></m:pre>
+
+ <!--
+ This version will not require the last \0:
+ read_zero() { for v in "$@"; do export "$v"; read -r -d '' "$v" || [ ! -z "${!v}" ]; done }
+ at least in case when the last value is not missing.
+ Other values might be null/missing: \0\0 is OK.
+ -->
+
+ <p>
+ Currently, there is no known way to do this without a custom function (just with the <code>read</code> built-in command of Bash and its parameters).
+ But it is just a single-line function, so not a big deal.
+ </p>
+
+ <p>
+ And then we just read the values, put them in shell variables and process them in a loop with a shell block of code:
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
+ | relpipe-out-nullbyte \
+ | while read_nullbyte scheme device mount_point fs_type options dump pass; do
+ echo "Device ${scheme:+$scheme=}$device is mounted" \
+ "at $mount_point and contains $fs_type.";
+ done]]></m:pre>
+
+ <p>
+ Which will print:
+ </p>
+
+ <pre><![CDATA[Device UUID=29758270-fd25-4a6c-a7bb-9a18302816af is mounted at / and contains ext4.
+Device /dev/sr0 is mounted at /media/cdrom0 and contains udf,iso9660.
+Device /dev/sde is mounted at /mnt/data and contains ext4.
+Device UUID=a2b5f230-a795-4f6f-a39b-9b57686c86d5 is mounted at /home and contains btrfs.
+Device /dev/mapper/sdf_crypt is mounted at /mnt/private and contains xfs.]]></pre>
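+
+ <p>
+ The same loop can be tried without any <code>fstab</code> at hand. In this sketch the null-separated values are produced by <code>printf</code> instead of <code>relpipe-out-nullbyte</code>, and the attribute names <code>id</code> and <code>word</code> are made up:
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[read_nullbyte() { for v in "$@"; do export "$v"; read -r -d '' "$v"; done }
+
+# two records with two attribute values each, separated by \0
+printf '%s\0' 1 Hello 2 World \
+	| while read_nullbyte id word; do
+		echo "$id: $word";
+	done]]></m:pre>
+
+ <p>
+ The loop ends when <code>read</code> hits the end of the input, because the function returns the exit status of its last <code>read</code>.
+ </p>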
+
+ <p>
+ Using this method, we can convert any single relation to any format (preferably a text one, but <code>printf</code> can also produce binary data).
+ This is good for ad-hoc conversions and single-relation data.
+ More powerful tools can be written in C++ and other languages like Java, Python, Guile etc. (when particular libraries are available).
+ </p>
+
+ </text>
+
+</stránka>
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/relpipe-data/examples-out-fstab.xml Tue Feb 05 19:18:28 2019 +0100
@@ -0,0 +1,88 @@
+<stránka
+ xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
+ xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
+
+ <nadpis>Formatting fstab</nadpis>
+ <perex>implementing a simple relpipe-out-fstab filter using -in-fstab, -out-nullbyte, xargs and Perl</perex>
+ <m:pořadí-příkladu>00300</m:pořadí-příkladu>
+
+ <text xmlns="http://www.w3.org/1999/xhtml">
+
+ <p>
+ As we have seen before, we can convert <code>/etc/fstab</code> (or <code>mtab</code>)
+ to e.g. XML or a nice and colorful table using <m:name/>.
+ But we can also convert these data back to the <code>fstab</code> format. And do it with proper indentation/padding.
+ Fstab has a simple format where values are separated by one or more whitespace characters.
+ But without proper indentation, these files look a bit obfuscated and hard to read (however, they are valid).
+ </p>
+
+ <m:pre jazyk="text" src="examples/relpipe-out-fstab.txt"/>
+
+ <p>
+ So let's build a pipeline that reformats the <code>fstab</code> and makes it more readable.
+ </p>
+
+ <m:pre jazyk="bash">relpipe-in-fstab | relpipe-out-fstab > reformatted-fstab.txt</m:pre>
+
+ <p>
+ We can hack together a script called <code>relpipe-out-fstab</code> that accepts relational data and produces <code>fstab</code> data.
+ Later this will probably be implemented as a regular tool, but for now, it is just an example of an ad-hoc shell script:
+ </p>
+
+ <m:pre jazyk="bash" src="examples/relpipe-out-fstab.sh" odkaz="ano"/>
+
+ <p>
+ In the first part, we prepend a single record (<code>relpipe-in-cli</code>) before the data coming from STDIN (<code>cat</code>).
+ Then, we use <code>relpipe-out-nullbyte</code> to convert relational data to values separated by a null-byte.
+ This command processes only attribute values (skips relation and attribute names).
+ Then we use <code>xargs</code> to read the null-separated values and execute a Perl command for each record, passing it the same number of arguments as we have attributes: <code>--max-args=7</code>.
+ Perl does the actual formatting: it adds padding and does some minor tuning (merges two attributes and replaces empty values with <em>none</em>).
+ </p>
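+
+ <p>
+ The grouping done by <code>xargs</code> can be tried in isolation.
+ The following is only a sketch under assumptions: the values <code>a</code> to <code>f</code> are made up, a record has three values instead of seven,
+ and <code>printf</code> stands in for both <code>relpipe-out-nullbyte</code> and the Perl formatter:
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[# six null-separated values, i.e. two records of three attributes each
+printf '%s\0' a b c d e f \
+ | xargs -0 --max-args=3 printf '%s|%s|%s\n']]></m:pre>
+
+ <p>Each invocation of <code>printf</code> receives the values of one record, so this prints:</p>
+
+ <pre><![CDATA[a|b|c
+d|e|f]]></pre>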
+
+ <p>This is the formatted version of the <code>fstab</code> above:</p>
+
+ <m:pre jazyk="text" src="examples/relpipe-out-fstab.formatted.txt"/>
+
+ <p>
+ Using the following command, we can verify that the files differ only in comments and whitespace:
+ </p>
+
+ <pre>relpipe-in-fstab | relpipe-out-fstab | diff -w /etc/fstab -</pre>
+
+ <p>
+ Another check (should print the same hashes):
+ </p>
+
+ <pre><![CDATA[relpipe-in-fstab | sha512sum
+relpipe-in-fstab | relpipe-out-fstab | relpipe-in-fstab | sha512sum]]></pre>
+
+ <p>
+ A regular implementation of <code>relpipe-out-fstab</code> will probably keep the comments
+ (it also needs one more attribute and a small change in <code>relpipe-in-fstab</code>).
+ </p>
+
+ <p>
+ For mere <code>fstab</code> reformatting, this approach is a bit over-engineered.
+ We could skip the whole relational thing and just do something like this:
+ </p>
+
+ <m:pre jazyk="bash">cat /etc/fstab | grep -v '^#' | sed -E 's/\s+/\n/g' | tr \\n \\0 | xargs -0 -n7 ...</m:pre>
+
+ <p>
+ plus prepend the comment (or do everything in Perl).
+ But this example is intended as a demonstration of how we can
+ 1) prepend some additional data before the data from STDIN
+ 2) use <m:name/> and traditional tools like <code>xargs</code> or <code>perl</code> together.
+ And BTW we have implemented a (simple but working) <em>relpipe output filter</em> – and did it without any serious programming, just by putting some existing commands together :-)
+ </p>
+
+ <blockquote>
+ <p>
+ There is more Unix-nature in one line of shell script than there is in ten thousand lines of C.
+ <m:podČarou>see <a href="http://www.catb.org/~esr/writings/unix-koans/ten-thousand.html">Master Foo and the Ten Thousand Lines</a></m:podČarou>
+ </p>
+ </blockquote>
+
+ </text>
+
+</stránka>
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/relpipe-data/examples-out-xml.xml Tue Feb 05 19:18:28 2019 +0100
@@ -0,0 +1,36 @@
+<stránka
+ xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
+ xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
+
+ <nadpis>Generating XML</nadpis>
+ <perex>converting relational data to XML</perex>
+ <m:pořadí-příkladu>00400</m:pořadí-příkladu>
+
+ <text xmlns="http://www.w3.org/1999/xhtml">
+
+ <p>
+ Relational data can be converted to various formats, and one of them is XML.
+ This is a good option for further processing e.g. using XSLT transformation or passing the XML data to some other tool.
+ Just use <code>relpipe-out-xml</code> instead of <code>relpipe-out-tabular</code> and the rest of the pipeline remains unchanged:
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[ssh example.com cat /etc/mtab | relpipe-in-fstab | relpipe-out-xml]]></m:pre>
+
+ <p>
+ Will produce XML like this:
+ </p>
+
+ <m:pre jazyk="xml" src="examples/relpipe-out-fstab.xml"/>
+
+ <p>
+ Thanks to XSLT, this XML can be easily converted e.g. to an XHTML table (<code>table|tr|td</code>) or other format.
+ One could also convert such data to a (La)TeX table.
+ </p>
+
+ <p>
+ n.b. the format is not final and will change in future versions (XML namespace, more metadata etc.).
+ </p>
+
+ </text>
+
+</stránka>
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/relpipe-data/examples-rename-groups-backreferences.xml Tue Feb 05 19:18:28 2019 +0100
@@ -0,0 +1,44 @@
+<stránka
+ xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
+ xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
+
+ <nadpis>Using relpipe-tr-sed with groups and backreferences</nadpis>
+ <perex>sed-like substitution with regex groups and backreferences</perex>
+ <m:pořadí-příkladu>00800</m:pořadí-příkladu>
+
+ <text xmlns="http://www.w3.org/1999/xhtml">
+
+ <p>
+ This tool also supports regex groups and backreferences. Thus we can use parts of the matched string in our replacement string:
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[relpipe-in-cli generate r 1 a string "some string xxx_123 some zzz_456 other" \
+ | relpipe-tr-sed 'r' 'a' '([a-z]{3})_([0-9]+)' '$2:$1' \
+ | relpipe-out-tabular]]></m:pre>
+
+ <p>Which would convert this:</p>
+ <pre><![CDATA[r:
+ ╭────────────────────────────────────────╮
+ │ a (string) │
+ ├────────────────────────────────────────┤
+ │ some string xxx_123 some zzz_456 other │
+ ╰────────────────────────────────────────╯
+Record count: 1]]></pre>
+
+ <p>into this:</p>
+ <pre><![CDATA[r:
+ ╭────────────────────────────────────────╮
+ │ a (string) │
+ ├────────────────────────────────────────┤
+ │ some string 123:xxx some 456:zzz other │
+ ╰────────────────────────────────────────╯
+Record count: 1]]></pre>
+
+ <p>
+ If there were any other relations or attributes in the stream, they would be unaffected by this transformation,
+ because we specified <code>'r' 'a'</code> instead of some wider regular expression that would match more relations or attributes.
+ </p>
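+
+ <p>
+ For comparison, the same substitution can be sketched with the classic <code>sed</code> on a plain text line.
+ This is only an illustration and assumes a <code>sed</code> that supports the <code>-E</code> option;
+ note that <code>sed</code> writes backreferences as <code>\1</code> and <code>\2</code> and needs the <code>g</code> flag to replace every match:
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[echo "some string xxx_123 some zzz_456 other" \
+ | sed -E 's/([a-z]{3})_([0-9]+)/\2:\1/g']]></m:pre>
+
+ <p>It prints:</p>
+
+ <pre><![CDATA[some string 123:xxx some 456:zzz other]]></pre>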
+
+ </text>
+
+</stránka>
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/relpipe-data/examples-rename-vg-fstab.xml Tue Feb 05 19:18:28 2019 +0100
@@ -0,0 +1,46 @@
+<stránka
+ xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
+ xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
+
+ <nadpis>Renaming VG in /etc/fstab using relpipe-tr-sed</nadpis>
+ <perex>sed-like substitutions in the relational stream</perex>
+ <m:pořadí-příkladu>00700</m:pořadí-příkladu>
+
+ <text xmlns="http://www.w3.org/1999/xhtml">
+
+ <p>
+ Assume that we have an <code>/etc/fstab</code> with many lines defining the mount-points (directories) of particular devices (disks) and we are using LVM.
+ If we rename a volume group (VG), we have to change all of them. The lines look like this one:
+ </p>
+
+ <pre>/dev/alpha/photos /mnt/photos/ btrfs noauto,noatime,nodiratime 0 0</pre>
+
+ <p>
+ We want to change all lines from <code>alpha</code> to <code>beta</code> (the new VG name).
+ This can be done by the power of regular expressions<m:podČarou>see <a href="https://en.wikibooks.org/wiki/Regular_Expressions/Simple_Regular_Expressions">Regular Expressions</a> at Wikibooks</m:podČarou> and this pipeline:
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
+ | relpipe-tr-sed 'fstab' 'device' '^/dev/alpha/' '/dev/beta/' \
+ | relpipe-out-fstab]]></m:pre>
+
+ <p>
+ The <code>relpipe-tr-sed</code> tool works only with the given relation (<code>fstab</code>) and the given attribute (<code>device</code>)
+ and leaves other relations and attributes in the stream untouched.
+ So it will not replace strings in unwanted places (if there are any random matches).
+ </p>
+
+ <p>
+ Even the relation names and attribute names are specified as a regular expression, so we can (purposefully) modify multiple relations or attributes.
+ For example, we can put zeroes in both <code>dump</code> and <code>pass</code> attributes:
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[relpipe-in-fstab | relpipe-tr-sed 'fstab' 'dump|pass' '.+' '0' | relpipe-out-fstab]]></m:pre>
+
+ <p>
+ n.b. the data types must be respected: we cannot, for example, put <code>abc</code> in the <code>pass</code> attribute because it is declared as <code>integer</code>.
+ </p>
+
+ </text>
+
+</stránka>
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/relpipe-data/examples-validator.xml Tue Feb 05 19:18:28 2019 +0100
@@ -0,0 +1,40 @@
+<stránka
+ xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
+ xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
+
+ <nadpis>Validating relational data</nadpis>
+ <perex>check whether data are in the relpipe format</perex>
+ <m:pořadí-příkladu>00500</m:pořadí-příkladu>
+
+ <text xmlns="http://www.w3.org/1999/xhtml">
+
+ <p>
+ Just a passthrough command, so these pipelines should produce the same hash:
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[
+relpipe-in-fstab | relpipe-tr-validator | sha512sum
+relpipe-in-fstab | sha512sum]]></m:pre>
+
+ <p>
+ This tool can be used for testing whether a file contains valid relational data:
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[
+if relpipe-tr-validator < "some-file.rp" &> /dev/null; then
+ echo "valid relational data";
+else
+ echo "garbage";
+fi]]></m:pre>
+
+ <p>or as a one-liner:</p>
+
+ <m:pre jazyk="bash"><![CDATA[relpipe-tr-validator < "some-file.rp" &> /dev/null && echo "ok" || echo "error"]]></m:pre>
+
+ <p>
+ If an error is found, it is reported on STDERR. So just omit the <code>&amp;</code> in order to see the error message.
+ </p>
+
+ </text>
+
+</stránka>
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/relpipe-data/examples-xquery-atom.xml Tue Feb 05 19:18:28 2019 +0100
@@ -0,0 +1,71 @@
+<stránka
+ xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
+ xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
+
+ <nadpis>Reading an Atom feed using XQuery</nadpis>
+ <perex>converting arbitrary XML into relational data using XQuery</perex>
+ <m:pořadí-příkladu>01100</m:pořadí-příkladu>
+
+ <text xmlns="http://www.w3.org/1999/xhtml">
+
+ <p>
+ Atom Syndication Format is a standard for publishing web feeds, a.k.a. web syndication.
+ These feeds are usually consumed by a <em>feed reader</em> that aggregates news from many websites and displays them in a uniform format.
+ The Atom feed is an XML document with a list of recent news containing their titles, URLs and short annotations.
+ It also contains some metadata (website author, title etc.).
+ </p>
+ <p>
+ Using this simple XQuery<m:podČarou>see <a href="https://en.wikibooks.org/wiki/XQuery">XQuery</a> at Wikibooks</m:podČarou>
+ <em>FLWOR Expression</em>
+ we convert the Atom feed into the XML serialization of relational data:
+ </p>
+
+ <m:pre jazyk="xq" src="examples/atom.xq" odkaz="ano"/>
+
+ <p>
+ This is a similar operation to <a href="https://www.postgresql.org/docs/current/functions-xml.html">xmltable</a> used in SQL databases.
+ It converts an XML tree structure to the relational form.
+ In our case, the output is still XML, but in a format that can be read by <code>relpipe-in-xml</code>.
+ All put together in a single shell script:
+ </p>
+
+ <m:pre jazyk="bash" src="examples/atom.sh"/>
+
+ <p>Will generate a table with web news:</p>
+
+ <m:pre jazyk="text" src="examples/atom.txt"/>
+
+ <p>
+ For frequent usage we can create a script or function called <code>relpipe-in-atom</code>
+ that reads Atom XML on STDIN and generates relational data on STDOUT.
+ And then do any of these:
+ </p>
+
+ <m:pre jazyk="bash"><![CDATA[wget … | relpipe-in-atom | relpipe-out-tabular
+wget … | relpipe-in-atom | relpipe-out-csv
+wget … | relpipe-in-atom | relpipe-out-gui
+wget … | relpipe-in-atom | relpipe-out-nullbyte | while read_nullbyte published title url; do echo "$title"; done
+wget … | relpipe-in-atom | relpipe-out-csv | csv2rec | …
+]]></m:pre>
+
+ <p>
+ There are several implementations of XQuery.
+ <a href="http://galax.sourceforge.net/">Galax</a> is one of them.
+ <a href="http://xqilla.sourceforge.net/">XQilla</a> or
+ <a href="http://basex.org/basex/xquery/">BaseX</a> are others (and support newer versions of the standard).
+ There are also XSLT processors like <a href="http://xmlsoft.org/XSLT/xsltproc2.html">xsltproc</a>.
+ BaseX can be used instead of Galax – we just replace
+ <code>galax-run -context-item /dev/stdin</code> with <code>basex -i /dev/stdin</code>.
+ </p>
+
+ <p>
+ Reading Atom feeds in a terminal might not be the best way to get news from a website,
+ but this simple example teaches us how to convert arbitrary XML to relational data.
+ And of course, we can generate multiple relations from a single XML using a single XQuery script.
+ XQuery can also be used for operations like JOIN or UNION and for filtering and other transformations
+ as will be shown in further examples.
+ </p>
+
+ </text>
+
+</stránka>
--- a/relpipe-data/examples.xml Sun Jan 27 16:16:41 2019 +0100
+++ b/relpipe-data/examples.xml Tue Feb 05 19:18:28 2019 +0100
@@ -14,851 +14,18 @@
But they should also work in other shells.
</p>
- <h2>relpipe-in-cli: Hello Wordl!</h2>
-
- <p>
- Let's start with an obligatory Hello World example.
- </p>
-
- <m:pre jazyk="bash"><![CDATA[relpipe-in-cli generate "relation_from_cli" 3 \
- "a" "integer" \
- "b" "string" \
- "c" "boolean" \
- "1" "Hello" "true" \
- "2" "World!" "false"]]></m:pre>
-
- <p>
- This command generates relational data.
- In order to see them, we need to convert them to some other format.
- For now, we will use the "tabular" format and pipe relational data to the <code>relpipe-out-tabular</code>.
- </p>
-
- <m:pre jazyk="bash"><![CDATA[relpipe-in-cli generate "relation_from_cli" 3 \
- "a" "integer" \
- "b" "string" \
- "c" "boolean" \
- "1" "Hello" "true" \
- "2" "World!" "false" \
- | relpipe-out-tabular]]></m:pre>
-
- <p>Output:</p>
-
- <pre><![CDATA[relation_from_cli:
- ╭─────────────┬────────────┬─────────────╮
- │ a (integer) │ b (string) │ c (boolean) │
- ├─────────────┼────────────┼─────────────┤
- │ 1 │ Hello │ true │
- │ 2 │ World! │ false │
- ╰─────────────┴────────────┴─────────────╯
-Record count: 2
-]]></pre>
-
- <p>
- The syntax is simple as we see above. We specify the name of the relation, number of attributes,
- and then their definitions (names and types),
- followed by the data.
- </p>
-
- <p>
- A single stream may contain multiple relations:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[(relpipe-in-cli generate a 1 x string hello; \
- relpipe-in-cli generate b 1 y string world) \
- | relpipe-out-tabular]]></m:pre>
-
- <p>
- Thus we can combine various commands or files and pass the result to a single relational output filter (<code>relpipe-out-tabular</code> in this case) and get:
- </p>
-
- <pre><![CDATA[a:
- ╭────────────╮
- │ x (string) │
- ├────────────┤
- │ hello │
- ╰────────────╯
-Record count: 1
-b:
- ╭────────────╮
- │ y (string) │
- ├────────────┤
- │ world │
- ╰────────────╯
-Record count: 1]]></pre>
-
- <h2>relpipe-in-cli: STDIN</h2>
-
- <p>
- The number of <abbr title="Command-line interface">CLI</abbr> arguments is limited and they are passed at once to the process.
- So there is option to pass the values from STDIN instead of CLI arguments.
- Values on STDIN are expected to be separated by the null-byte.
- We can generate such data e.g. using <code>echo</code> and <code>tr</code> (or using <code>printf</code> or other commands):
- </p>
-
- <m:pre jazyk="bash"><![CDATA[echo -e "1\nHello\ntrue\n2\nWorld\nfalse" \
- | tr \\n \\0 \
- | relpipe-in-cli generate-from-stdin relation_from_stdin 3 \
- a integer \
- b string \
- c boolean \
- | relpipe-out-tabular]]></m:pre>
-
- <p>
- The output is same as above.
- We can use this approach to convert various formats to relational data.
- There are lot of data already in the form of null-separated values e.g. the process arguments:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[cat /proc/$(pidof mc)/cmdline \
- | relpipe-in-cli generate-from-stdin mc_args 1 a string \
- | relpipe-out-tabular
-]]></m:pre>
-
- <p>If we have <code>mc /etc/ /tmp/</code> running in some other terminal, the output will be:</p>
-
- <pre><![CDATA[mc_args:
- ╭────────────╮
- │ a (string) │
- ├────────────┤
- │ mc │
- │ /etc/ │
- │ /tmp/ │
- ╰────────────╯
-Record count: 3]]></pre>
-
- <p>
- Also the <code>find</code> command can produce data separated by the null-byte:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[find /etc/ -name '*ssh*_*' -print0 \
- | relpipe-in-cli generate-from-stdin files 1 file_name string \
- | relpipe-out-tabular]]></m:pre>
-
- <p>Will display something like this:</p>
-
- <pre><![CDATA[files:
- ╭───────────────────────────────────╮
- │ file_name (string) │
- ├───────────────────────────────────┤
- │ /etc/ssh/ssh_host_ecdsa_key │
- │ /etc/ssh/sshd_config │
- │ /etc/ssh/ssh_host_ed25519_key.pub │
- │ /etc/ssh/ssh_host_ecdsa_key.pub │
- │ /etc/ssh/ssh_host_rsa_key │
- │ /etc/ssh/ssh_config │
- │ /etc/ssh/ssh_host_ed25519_key │
- │ /etc/ssh/ssh_import_id │
- │ /etc/ssh/ssh_host_rsa_key.pub │
- ╰───────────────────────────────────╯
-Record count: 9]]></pre>
-
-
- <h2>relpipe-in-fstab</h2>
-
- <p>
- Using command <code>relpipe-in-fstab</code> we can convert the <code>/etc/fstab</code> or <code>/etc/mtab</code> to relational data
- </p>
-
- <m:pre jazyk="bash"><![CDATA[relpipe-in-fstab | relpipe-out-tabular]]></m:pre>
-
- <p>
- and see them as a nice table:
- </p>
-
- <pre><![CDATA[fstab:
- ╭─────────────────┬──────────────────────────────────────┬──────────────────────┬───────────────┬───────────────────────────────────────┬────────────────┬────────────────╮
- │ scheme (string) │ device (string) │ mount_point (string) │ type (string) │ options (string) │ dump (integer) │ pass (integer) │
- ├─────────────────┼──────────────────────────────────────┼──────────────────────┼───────────────┼───────────────────────────────────────┼────────────────┼────────────────┤
- │ UUID │ 29758270-fd25-4a6c-a7bb-9a18302816af │ / │ ext4 │ relatime,user_xattr,errors=remount-ro │ 0 │ 1 │
- │ │ /dev/sr0 │ /media/cdrom0 │ udf,iso9660 │ user,noauto │ 0 │ 0 │
- │ │ /dev/sde │ /mnt/data │ ext4 │ relatime,user_xattr,errors=remount-ro │ 0 │ 2 │
- │ UUID │ a2b5f230-a795-4f6f-a39b-9b57686c86d5 │ /home │ btrfs │ relatime │ 0 │ 2 │
- │ │ /dev/mapper/sdf_crypt │ /mnt/private │ xfs │ relatime │ 0 │ 2 │
- ╰─────────────────┴──────────────────────────────────────┴──────────────────────┴───────────────┴───────────────────────────────────────┴────────────────┴────────────────╯
-Record count: 5]]></pre>
-
- <p>And we can do the same also with a remote <code>fstab</code> or <code>mtab</code>; just by adding <code>ssh</code> to the pipeline:</p>
-
- <m:pre jazyk="bash"><![CDATA[ssh example.com cat /etc/mtab | relpipe-in-fstab | relpipe-out-tabular]]></m:pre>
-
- <p>
- The <code>cat</code> runs remotely. The <code>relpipe-in-fstab</code> and <code>relpipe-out-tabular</code> run on our machine.
- </p>
-
- <p>
- n.b. the <code>relpipe-in-fstab</code> reads the <code>/etc/fstab</code> if executed on TTY. Otherwise, it reads the STDIN.
- </p>
-
- <h2>relpipe-out-xml</h2>
-
- <p>
- Relational data can be converted to various formats and one of them is the XML.
- This is a good option for further processing e.g. using XSLT transformation or passing the XML data to some other tool.
- Just use <code>relpipe-out-xml</code> instead of <code>relpipe-out-tabular</code> and the rest of the pipeline remains unchanged:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[ssh example.com cat /etc/mtab | relpipe-in-fstab | relpipe-out-xml]]></m:pre>
-
- <p>
- Will produce XML like this:
- </p>
-
- <m:pre jazyk="xml" src="examples/relpipe-out-fstab.xml"/>
-
- <p>
- Thanks to XSLT, this XML can be easily converted e.g. to an XHTML table (<code>table|tr|td</code>) or other format.
- Someone can convert such data to a (La)TeX table.
- </p>
-
- <p>
- n.b. the format is not final and will change i future versions (XML namespace, more metadata etc.).
- </p>
-
-
- <h2>relpipe-tr-validator</h2>
-
- <p>
- Just a passthrough command, so these pipelines should produce the same hash:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[
-relpipe-in-fstab | relpipe-tr-validator | sha512sum
-relpipe-in-fstab | sha512sum]]></m:pre>
-
- <p>
- This tool can be used for testing whether a file contains valid relational data:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[
-if relpipe-tr-validator < "some-file.rp" &> /dev/null; then
- echo "valid relational data";
-else
- echo "garbage";
-fi]]></m:pre>
-
- <p>or as a one-liner:</p>
-
- <m:pre jazyk="bash"><![CDATA[relpipe-tr-validator < "some-file.rp" &> /dev/null && echo "ok" || echo "error"]]></m:pre>
-
- <p>
- If an error is found, it is reported on STDERR. So just omit the <code>&</code> in order to see the error message.
- </p>
-
-
- <h2>/etc/fstab formatting using -in-fstab, -out-nullbyte, xargs and Perl</h2>
-
- <p>
- As we have seen before, we can convert <code>/etc/fstab</code> (or <code>mtab</code>)
- to e.g. an XML or a nice and colorful table using <m:name/>.
- But we can also convert these data back to the <code>fstab</code> format. And do it with proper indentation/padding.
- Fstab has a simple format where values are separated by one or more whitespace characters.
- But without proper indentation, these files look a bit obfuscated and hard to read (however, they are valid).
- </p>
-
- <m:pre jazyk="text" src="examples/relpipe-out-fstab.txt"/>
-
- <p>
- So let's build a pipeline that reformats the <code>fstab</code> and makes it more readable.
- </p>
-
- <m:pre jazyk="bash">relpipe-in-fstab | relpipe-out-fstab > reformatted-fstab.txt</m:pre>
-
- <p>
- We can hack together a script called <code>relpipe-out-fstab</code> that accepts relational data and produces <code>fstab</code> data.
- Later this will be probably implemented as a regular tool, but for now, it is just an example of a ad-hoc shell script:
- </p>
-
- <m:pre jazyk="bash" src="examples/relpipe-out-fstab.sh" odkaz="ano"/>
-
- <p>
- In the first part, we prepend a single record (<code>relpipe-in-cli</code>) before the data coming from STDIN (<code>cat</code>).
- Then, we use <code>relpipe-out-nullbyte</code> to convert relational data to values separated by a null-byte.
- This command processes only attribute values (skips relation and attribute names).
- Then we used <code>xargs</code> to read the null-separated values and execute a Perl command for each record (pass to it a same number of arguments, as we have attributes: <code>--max-args=7</code>).
- Perl does the actual formatting: adds padding and does some little tunning (merges two attributes and replaces empty values with <em>none</em>).
- </p>
-
- <p>This is formatted version of the <code>fstab</code> above:</p>
-
- <m:pre jazyk="text" src="examples/relpipe-out-fstab.formatted.txt"/>
-
- <p>
- And using following command we can verify, that the files differ only in comments and whitespace:
- </p>
-
- <pre>relpipe-in-fstab | relpipe-out-fstab | diff -w /etc/fstab -</pre>
-
- <p>
- Another check (should print same hashes):
- </p>
-
- <pre><![CDATA[relpipe-in-fstab | sha512sum
-relpipe-in-fstab | relpipe-out-fstab | relpipe-in-fstab | sha512sum]]></pre>
-
- <p>
- Regular implementation of <code>relpipe-out-fstab</code> will probably keep the comments
- (it needs also one more attribute and small change in <code>relpipe-in-fstab</code>).
- </p>
-
- <p>
- For just mere <code>fstab</code> reformatting, this approach is a bit overengineering.
- We could skip the whole relational thing and do just something like this:
- </p>
-
- <m:pre jazyk="bash">cat /etc/fstab | grep -v '^#' | sed -E 's/\s+/\n/g' | tr \\n \\0 | xargs -0 -n7 ...</m:pre>
-
- <p>
- plus prepend the comment (or do everything in Perl).
- But this example is intended as a demostration, how we can
- 1) prepend some additional data before the data from STDIN
- 2) use <m:name/> and traditional tools like <code>xargs</code> or <code>perl</code> together.
- And BTW we have implemented a (simple but working) <em>relpipe output filter</em> – and did it without any serious programming, just put some existing commands together :-)
- </p>
-
- <blockquote>
- <p>
- There is more Unix-nature in one line of shell script than there is in ten thousand lines of C.
- <m:podČarou>see <a href="http://www.catb.org/~esr/writings/unix-koans/ten-thousand.html">Master Foo and the Ten Thousand Lines</a></m:podČarou>
- </p>
- </blockquote>
-
- <h2>Writing an output filter in Bash</h2>
-
- <p>
- In previous example we created an output filter in Perl.
- We converted a relation to values separated by <code>\0</code> and then passed it through <code>xargs</code> to a perl <em>one-liner</em> (or a <em>multi-liner</em> in this case).
- But we can write such output filter in pure Bash without <code>xargs</code> and <code>perl</code>.
- Of course, it is still limited to a single relation (or it can process multiple relations of same type and do something like implicit <code>UNION ALL</code>).
- </p>
-
- <p>
- We will define a function that will help us with reading the <code>\0</code>-separated values and putting them into shell variables:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[read_nullbyte() { for v in "$@"; do export "$v"; read -r -d '' "$v"; done }]]></m:pre>
-
- <!--
- This version will not require the last \0:
- read_zero() { for v in "$@"; do export "$v"; read -r -d '' "$v" || [ ! -z "${!v}" ]; done }
- at least in case when the last value is not missing.
- Other values might be null/missing: \0\0 is OK.
- -->
-
- <p>
- Currently, there is no known way how to do this without a custom function (just with <code>read</code> built-in command of Bash and its parameters).
- But it is just a single line function, so not a big deal.
- </p>
-
- <p>
- And then we just read the values, put them in shell variables and process them in a cycle in a shell block of code:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
- | relpipe-out-nullbyte \
- | while read_nullbyte scheme device mount_point fs_type options dump pass; do
- echo "Device ${scheme:+$scheme=}$device is mounted" \
- "at $mount_point and contains $fs_type.";
- done]]></m:pre>
-
- <p>
- Which will print:
- </p>
-
- <pre><![CDATA[Device UUID=29758270-fd25-4a6c-a7bb-9a18302816af is mounted at / and contains ext4.
-Device /dev/sr0 is mounted at /media/cdrom0 and contains udf,iso9660.
-Device /dev/sde is mounted at /mnt/data and contains ext4.
-Device UUID=a2b5f230-a795-4f6f-a39b-9b57686c86d5 is mounted at /home and contains btrfs.
-Device /dev/mapper/sdf_crypt is mounted at /mnt/private and contains xfs.]]></pre>
-
- <p>
- Using this method, we can convert any single relation to any format (preferably some text one, but <code>printf</code> can produce also binary data).
- This is good for ad-hoc conversions and single-relation data.
- More powerful tools can be written in C++ and other languages like Java, Python, Guile etc. (when particular libraries are available).
- </p>
-
- <h2>Rename VG in /etc/fstab using relpipe-tr-sed</h2>
-
- <p>
- Assume that we have an <code>/etc/fstab</code> with many lines defining the mount-points (directories) of particular devices (disks) and we are using LVM.
- If we rename a volume group (VG), we have to change all of them. The lines look like this one:
- </p>
-
- <pre>/dev/alpha/photos /mnt/photos/ btrfs noauto,noatime,nodiratime 0 0</pre>
-
- <p>
- We want to change all lines from <code>alpha</code> to <code>beta</code> (the new VG name).
- This can be done by the power of regular expressions<m:podČarou>see <a href="https://en.wikibooks.org/wiki/Regular_Expressions/Simple_Regular_Expressions">Regular Expressions</a> at Wikibooks</m:podČarou> and this pipeline:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
- | relpipe-tr-sed 'fstab' 'device' '^/dev/alpha/' '/dev/beta/' \
- | relpipe-out-fstab]]></m:pre>
-
- <p>
- The <code>relpipe-tr-sed</code> tool works only with given relation (<code>fstab</code>) and given attribute (<code>device</code>)
- and it would leave untouched other relations and attributes in the stream.
- So it would not replace the strings on unwanted places (if there are any random matches).
- </p>
-
- <p>
- Even the relation names and attribute names are specified as a regular expression, so we can (purposefully) modify multiple relations or attributes.
- For example we can put zeroes in both <code>dump</code> and <code>pass</code> attributes:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[relpipe-in-fstab | relpipe-tr-sed 'fstab' 'dump|pass' '.+' '0' | relpipe-out-fstab]]></m:pre>
-
- <p>
- n.b. the data types must be respected, we can not e.g. put <code>abc</code> in the <code>pass</code> attribute because it is declared as <code>integer</code>.
- </p>
-
- <h2>Using relpipe-tr-sed with groups and backreferences</h2>
-
- <p>
- This tool also support regex groups and backreferences. Thus we can use parts of the matched string in our replacement string:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[relpipe-in-cli generate r 1 a string "some string xxx_123 some zzz_456 other" \
- | relpipe-tr-sed 'r' 'a' '([a-z]{3})_([0-9]+)' '$2:$1' \
- | relpipe-out-tabular]]></m:pre>
-
- <p>Which would convert this:</p>
- <pre><![CDATA[r:
- ╭────────────────────────────────────────╮
- │ a (string) │
- ├────────────────────────────────────────┤
- │ some string xxx_123 some zzz_456 other │
- ╰────────────────────────────────────────╯
-Record count: 1]]></pre>
-
- <p>into this:</p>
- <pre><![CDATA[r:
- ╭────────────────────────────────────────╮
- │ a (string) │
- ├────────────────────────────────────────┤
- │ some string 123:xxx some 456:zzz other │
- ╰────────────────────────────────────────╯
-Record count: 1]]></pre>
-
- <p>
- If there were any other relations or attributes in the stream, they would be unaffected by this transformation,
- becase we specified <code>'r' 'a'</code> instead of some wider regular expression that would match more relations or attributes.
- </p>
-
- <h2>Filter /etc/fstab using relpipe-tr-grep</h2>
-
- <p>
- If we are interested only in certain records in some relation, we can filter it using <code>relpipe-tr-grep</code>.
- If we want to list e.g. only Btrfs and XFS file systems from our <code>fstab</code> (see above), we will run:
- </p>
-
-
- <m:pre jazyk="bash"><![CDATA[relpipe-in-fstab | relpipe-tr-grep 'fstab' 'type' 'btrfs|xfs' | relpipe-out-tabular]]></m:pre>
-
- <p>and we will get following filtered result:</p>
- <pre><![CDATA[fstab:
- ╭─────────────────┬──────────────────────────────────────┬──────────────────────┬───────────────┬──────────────────┬────────────────┬────────────────╮
- │ scheme (string) │ device (string) │ mount_point (string) │ type (string) │ options (string) │ dump (integer) │ pass (integer) │
- ├─────────────────┼──────────────────────────────────────┼──────────────────────┼───────────────┼──────────────────┼────────────────┼────────────────┤
- │ UUID │ a2b5f230-a795-4f6f-a39b-9b57686c86d5 │ /home │ btrfs │ relatime │ 0 │ 2 │
- │ │ /dev/mapper/sdf_crypt │ /mnt/private │ xfs │ relatime │ 0 │ 2 │
- ╰─────────────────┴──────────────────────────────────────┴──────────────────────┴───────────────┴──────────────────┴────────────────┴────────────────╯
-Record count: 2]]></pre>
-
- <p>
- Command arguments are similar to <code>relpipe-tr-sed</code>.
- Everything is a regular expression.
- Only relations matching the regex will be filtered, others will flow through the pipeline unmodified.
- If the attribute regex matches more attribute names, filtering will be done with logical OR
- i.e. the record is included if at least one of that attributes matches the search regex.
- </p>
-
- <p>
- If we need exact match of the whole attribute, we have to use something like <code>'^btrfs|xfs$'</code>,
- otherwise mere substring-match is enough to include the record.
- </p>
-
- <h2>SELECT mount_point FROM fstab WHERE type IN ('btrfs', 'xfs')</h2>
-
- <p>
- While reading classic pipelines involving <code>grep</code> and <code>cut</code> commands
- we must notice that there is some similarity with simple SQL queries looking like:
- </p>
-
- <m:pre jazyk="SQL">SELECT "some", "cut", "fields" FROM stdin WHERE grep_matches(whole_line);</m:pre>
-
- <p>
- And that is true: <code>grep</code> does restriction<m:podČarou>
- <a href="https://en.wikipedia.org/wiki/Selection_(relational_algebra)">selecting</a> only certain records from the original relation according to their match with given conditions</m:podČarou>
- and <code>cut</code> does projection<m:podČarou>limited subset of what <a href="https://en.wikipedia.org/wiki/Projection_(relational_algebra)">projection</a> means</m:podČarou>.
- Now we can do these relational operations using our relational tools called <code>relpipe-tr-grep</code> and <code>relpipe-tr-cut</code>.
- </p>
-
- <p>
- Assume that we need only <code>mount_point</code> fields from our <code>fstab</code> where <code>type</code> is <code>btrfs</code> or <code>xfs</code>
- and we want to do something (a shell script block) with these directory paths.
- </p>
-
- <m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
- | relpipe-tr-grep 'fstab' 'type' '^(btrfs|xfs)$' \
- | relpipe-tr-cut 'fstab' 'mount_point' \
- | relpipe-out-nullbyte \
- | while read -r -d '' m; do
- echo "$m";
- done]]></m:pre>
-
- <p>
- The <code>relpipe-tr-cut</code> tool has a syntax similar to its <em>grep</em> and <em>sed</em> siblings and also uses the power of regular expressions.
- In this case it modifies the <code>fstab</code> relation on the fly and drops all its attributes except <code>mount_point</code>.
- </p>
-
- <p>
- Then we pass the data to the Bash <code>while</code> loop.
- In such a simple scenario (just <code>echo</code>), we could use <code>xargs</code> as in the examples above,
- but with this syntax we can write a whole block of shell commands for each record/value and do more complex actions with them.
- </p>
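-
- <p>
- The loop itself can be tried without any relational tools.
- In this sketch, <code>printf</code> stands in for the <code>relpipe-out-nullbyte</code> output,
- the paths are made up and <code>for_each_mount_point</code> is a hypothetical helper name.
- </p>
-
```shell
# Hypothetical helper: run a block of commands for each NUL-terminated value on STDIN.
for_each_mount_point() {
  local m
  # IFS= and -r keep leading/trailing whitespace and backslashes intact.
  while IFS= read -r -d '' m; do
    # Any block of commands can go here; we just print the value and its length.
    echo "mount point: $m (length ${#m})"
  done
}

# Stand-in for the relational pipeline: three NUL-terminated paths.
printf '%s\0' '/mnt/data' '/mnt/private' '/mnt/with space' | for_each_mount_point
```
-
- <p>
- Because the values are terminated by the null byte, paths containing spaces or newlines arrive in one piece.
- </p>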
-
- <h2>More projections with relpipe-tr-cut</h2>
-
- <p>
- Assume that we have a simple relation containing numbers:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[seq 0 8 \
- | tr \\n \\0 \
- | relpipe-in-cli generate-from-stdin numbers 3 a integer b integer c integer \
- > numbers.rp]]></m:pre>
-
- <p>and a second one containing letters:</p>
-
- <m:pre jazyk="bash"><![CDATA[relpipe-in-cli generate letters 2 a string b string A B C D > letters.rp]]></m:pre>
-
- <p>We saved them into two files and now combine them into a single file. We will work with them as if they were a single stream of relations:</p>
-
- <m:pre jazyk="bash"><![CDATA[cat numbers.rp letters.rp > both.rp;
-cat both.rp | relpipe-out-tabular]]></m:pre>
-
- <p>Will print:</p>
-
- <pre><![CDATA[numbers:
- ╭─────────────┬─────────────┬─────────────╮
- │ a (integer) │ b (integer) │ c (integer) │
- ├─────────────┼─────────────┼─────────────┤
- │ 0 │ 1 │ 2 │
- │ 3 │ 4 │ 5 │
- │ 6 │ 7 │ 8 │
- ╰─────────────┴─────────────┴─────────────╯
-Record count: 3
-letters:
- ╭─────────────┬─────────────╮
- │ a (string) │ b (string) │
- ├─────────────┼─────────────┤
- │ A │ B │
- │ C │ D │
- ╰─────────────┴─────────────╯
-Record count: 2]]></pre>
-
- <p>We can drop the <code>a</code> attribute from the <code>numbers</code> relation:</p>
-
- <m:pre jazyk="bash">cat both.rp | relpipe-tr-cut 'numbers' 'b|c' | relpipe-out-tabular</m:pre>
-
- <p>and leave the <code>letters</code> relation unaffected:</p>
-
- <pre><![CDATA[numbers:
- ╭─────────────┬─────────────╮
- │ b (integer) │ c (integer) │
- ├─────────────┼─────────────┤
- │ 1 │ 2 │
- │ 4 │ 5 │
- │ 7 │ 8 │
- ╰─────────────┴─────────────╯
-Record count: 3
-letters:
- ╭─────────────┬─────────────╮
- │ a (string) │ b (string) │
- ├─────────────┼─────────────┤
- │ A │ B │
- │ C │ D │
- ╰─────────────┴─────────────╯
-Record count: 2]]></pre>
-
- <p>Or we can remove <code>a</code> from both relations, i.e. keep only the attributes whose names match the <code>'b|c'</code> regex:</p>
-
- <m:pre jazyk="bash">cat both.rp | relpipe-tr-cut '.*' 'b|c' | relpipe-out-tabular</m:pre>
-
- <p>Instead of <code>'.*'</code> we could use <code>'numbers|letters'</code> and in this case it will give the same result:</p>
-
- <pre><![CDATA[numbers:
- ╭─────────────┬─────────────╮
- │ b (integer) │ c (integer) │
- ├─────────────┼─────────────┤
- │ 1 │ 2 │
- │ 4 │ 5 │
- │ 7 │ 8 │
- ╰─────────────┴─────────────╯
-Record count: 3
-letters:
- ╭─────────────╮
- │ b (string) │
- ├─────────────┤
- │ B │
- │ D │
- ╰─────────────╯
-Record count: 2]]></pre>
-
- <p>So far we have only been reducing the attributes, but we can also duplicate them or change their order:</p>
-
- <m:pre jazyk="bash">cat both.rp | relpipe-tr-cut 'numbers' 'b|a|c' 'b' 'a' 'a' | relpipe-out-tabular</m:pre>
-
- <p>
- Note that the order inside <code>'b|a|c'</code> does not matter: if such a regex matches, the original order of the attributes is preserved.
- But if we use multiple regexes to specify attributes, their order and count matter:
- </p>
-
- <pre><![CDATA[numbers:
- ╭─────────────┬─────────────┬─────────────┬─────────────┬─────────────┬─────────────╮
- │ a (integer) │ b (integer) │ c (integer) │ b (integer) │ a (integer) │ a (integer) │
- ├─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┤
- │ 0 │ 1 │ 2 │ 1 │ 0 │ 0 │
- │ 3 │ 4 │ 5 │ 4 │ 3 │ 3 │
- │ 6 │ 7 │ 8 │ 7 │ 6 │ 6 │
- ╰─────────────┴─────────────┴─────────────┴─────────────┴─────────────┴─────────────╯
-Record count: 3
-letters:
- ╭─────────────┬─────────────╮
- │ a (string) │ b (string) │
- ├─────────────┼─────────────┤
- │ A │ B │
- │ C │ D │
- ╰─────────────┴─────────────╯
-Record count: 2]]></pre>
-
- <p>
- The <code>letters</code> relation stays rock steady and <code>relpipe-tr-cut 'numbers'</code> does not affect it in any way.
- </p>
-
-
- <h2>Read an Atom feed using XQuery and relpipe-in-xml</h2>
-
- <p>
- The Atom Syndication Format is a standard for publishing web feeds, a.k.a. web syndication.
- These feeds are usually consumed by a <em>feed reader</em> that aggregates news from many websites and displays them in a uniform format.
- An Atom feed is an XML document with a list of recent news items containing their titles, URLs and short annotations.
- It also contains some metadata (website author, title etc.).
- </p>
- <p>
- Using this simple XQuery<m:podČarou>see <a href="https://en.wikibooks.org/wiki/XQuery">XQuery</a> at Wikibooks</m:podČarou>
- <em>FLWOR Expression</em>
- we convert the Atom feed into the XML serialization of relational data:
- </p>
-
- <m:pre jazyk="xq" src="examples/atom.xq" odkaz="ano"/>
-
- <p>
- This operation is similar to <a href="https://www.postgresql.org/docs/current/functions-xml.html">xmltable</a> used in SQL databases.
- It converts an XML tree structure to the relational form.
- In our case, the output is still XML, but in a format that can be read by <code>relpipe-in-xml</code>.
- All put together in a single shell script:
- </p>
-
- <m:pre jazyk="bash" src="examples/atom.sh"/>
-
- <p>Will generate a table with web news:</p>
-
- <m:pre jazyk="text" src="examples/atom.txt"/>
-
- <p>
- For frequent usage we can create a script or function called <code>relpipe-in-atom</code>
- that reads Atom XML on STDIN and generates relational data on STDOUT.
- And then do any of these:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[wget … | relpipe-in-atom | relpipe-out-tabular
-wget … | relpipe-in-atom | relpipe-out-csv
-wget … | relpipe-in-atom | relpipe-out-gui
-wget … | relpipe-in-atom | relpipe-out-nullbyte | while read_nullbyte published title url; do echo "$title"; done
-wget … | relpipe-in-atom | relpipe-out-csv | csv2rec | …
-]]></m:pre>
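-
- <p>
- The <code>read_nullbyte</code> function used above is a small Bash helper, not a standard command.
- One possible way to write it is sketched below; the actual definition may differ.
- Here <code>printf</code> stands in for the <code>relpipe-out-nullbyte</code> output and the feed values are made up.
- </p>
-
```shell
# Sketch of a helper: read one NUL-terminated value into each named variable.
# Returns non-zero at end of input, so it can drive a while loop.
read_nullbyte() {
  local name
  for name in "$@"; do
    IFS= read -r -d '' "$name" || return 1
  done
}

# Stand-in for relpipe-out-nullbyte: two records with three attributes each.
printf '%s\0' \
  '2019-01-01' 'First post' 'https://example.org/1' \
  '2019-02-01' 'Second post' 'https://example.org/2' \
  | while read_nullbyte published title url; do
      echo "$published: $title"
    done
```
-
- <p>
- Each call consumes as many values as there are variable names, so one loop iteration corresponds to one record.
- </p>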
-
- <p>
- There are several implementations of XQuery.
- <a href="http://galax.sourceforge.net/">Galax</a> is one of them.
- <a href="http://xqilla.sourceforge.net/">XQilla</a> or
- <a href="http://basex.org/basex/xquery/">BaseX</a> are others (and support newer versions of the standard).
- There are also XSLT processors like <a href="http://xmlsoft.org/XSLT/xsltproc2.html">xsltproc</a>.
- BaseX can be used instead of Galax – we just replace
- <code>galax-run -context-item /dev/stdin</code> with <code>basex -i /dev/stdin</code>.
- </p>
-
- <p>
- Reading Atom feeds in a terminal might not be the best way to get news from a website,
- but this simple example teaches us how to convert arbitrary XML to relational data.
- And of course, we can generate multiple relations from a single XML document using a single XQuery script.
- XQuery can also be used for operations like JOIN or UNION and for filtering and other transformations,
- as will be shown in further examples.
- </p>
-
- <h2>Read files metadata using relpipe-in-filesystem</h2>
-
- <p>
- Our filesystems contain valuable information and with the proper tools we can extract it.
- Using <code>relpipe-in-filesystem</code> we can gather the metadata of our files and process it in a relational way.
- This tool does not traverse our filesystem (remember the rule: <em>do one thing and do it well</em>);
- instead, it eats a list of file paths separated by <code>\0</code>.
- It is typically used together with the <code>find</code> command, but we can also create such a list by hand, e.g. using the <code>printf</code> command or <code>tr \\n \\0</code>.
- </p>
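-
- <p>
- Building such a list by hand might look like the following sketch.
- The two helper functions and the paths are made up for illustration; the files do not have to exist, because only the byte stream is produced here.
- </p>
-
```shell
# Two ways to build a NUL-separated list of paths by hand (illustrative paths).
list_with_printf() { printf '%s\0' /etc/hostname /etc/hosts; }
list_with_tr()     { printf '%s\n' /etc/hostname /etc/hosts | tr '\n' '\0'; }

# Count the NUL terminators to check that both lists hold two entries.
list_with_printf | tr -cd '\0' | wc -c
list_with_tr     | tr -cd '\0' | wc -c
```
-
- <p>
- The <code>tr</code> variant is only safe when no path contains a newline; <code>printf</code> and <code>find -print0</code> do not have this limitation.
- </p>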
-
- <m:pre jazyk="bash">find /etc/ssh/ -print0 | relpipe-in-filesystem | relpipe-out-tabular</m:pre>
-
- <p>
- In the basic scenario, it behaves like <code>ls -l</code>, just more modular and machine-readable:
- </p>
-
- <pre><![CDATA[filesystem:
- ╭───────────────────────────────────┬───────────────┬────────────────┬────────────────┬────────────────╮
- │ path (string) │ type (string) │ size (integer) │ owner (string) │ group (string) │
- ├───────────────────────────────────┼───────────────┼────────────────┼────────────────┼────────────────┤
- │ /etc/ssh/ │ d │ 0 │ root │ root │
- │ /etc/ssh/moduli │ f │ 553122 │ root │ root │
- │ /etc/ssh/ssh_host_ecdsa_key │ f │ 227 │ root │ root │
- │ /etc/ssh/sshd_config │ f │ 3262 │ root │ root │
- │ /etc/ssh/ssh_host_ed25519_key.pub │ f │ 91 │ root │ root │
- │ /etc/ssh/ssh_host_ecdsa_key.pub │ f │ 171 │ root │ root │
- │ /etc/ssh/ssh_host_rsa_key │ f │ 1679 │ root │ root │
- │ /etc/ssh/ssh_config │ f │ 1580 │ root │ root │
- │ /etc/ssh/ssh_host_ed25519_key │ f │ 399 │ root │ root │
- │ /etc/ssh/ssh_import_id │ f │ 338 │ root │ root │
- │ /etc/ssh/ssh_host_rsa_key.pub │ f │ 391 │ root │ root │
- ╰───────────────────────────────────┴───────────────┴────────────────┴────────────────┴────────────────╯
-Record count: 11]]></pre>
-
- <p>
- We can specify desired attributes and also their aliases:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[find /etc/ssh/ -print0 \
- | relpipe-in-filesystem \
- --file path --as artefact \
- --file size \
- --file owner --as dear_owner \
- | relpipe-out-tabular]]></m:pre>
-
- <p>And we will get a subset with renamed attributes:</p>
-
- <pre><![CDATA[filesystem:
- ╭───────────────────────────────────┬────────────────┬─────────────────────╮
- │ artefact (string) │ size (integer) │ dear_owner (string) │
- ├───────────────────────────────────┼────────────────┼─────────────────────┤
- │ /etc/ssh/ │ 0 │ root │
- │ /etc/ssh/moduli │ 553122 │ root │
- │ /etc/ssh/ssh_host_ecdsa_key │ 227 │ root │
- │ /etc/ssh/sshd_config │ 3262 │ root │
- │ /etc/ssh/ssh_host_ed25519_key.pub │ 91 │ root │
- │ /etc/ssh/ssh_host_ecdsa_key.pub │ 171 │ root │
- │ /etc/ssh/ssh_host_rsa_key │ 1679 │ root │
- │ /etc/ssh/ssh_config │ 1580 │ root │
- │ /etc/ssh/ssh_host_ed25519_key │ 399 │ root │
- │ /etc/ssh/ssh_import_id │ 338 │ root │
- │ /etc/ssh/ssh_host_rsa_key.pub │ 391 │ root │
- ╰───────────────────────────────────┴────────────────┴─────────────────────╯
-Record count: 11]]></pre>
-
- <p>
- We can also choose which path format fits our needs best:
- </p>
-
-
- <m:pre jazyk="bash"><![CDATA[find ../../etc/ssh/ -print0 \
- | relpipe-in-filesystem \
- --file path \
- --file path_absolute \
- --file path_canonical \
- --file name \
- | relpipe-out-tabular]]></m:pre>
-
- <p>The <code>path</code> attribute contains exactly the value that was given on input. The other formats are derived from it:</p>
-
- <pre><![CDATA[filesystem:
- ╭────────────────────────────────────────┬───────────────────────────────────────────────────┬───────────────────────────────────┬──────────────────────────╮
- │ path (string) │ path_absolute (string) │ path_canonical (string) │ name (string) │
- ├────────────────────────────────────────┼───────────────────────────────────────────────────┼───────────────────────────────────┼──────────────────────────┤
- │ ../../etc/ssh/ │ /home/hack/../../etc/ssh/ │ /etc/ssh │ │
- │ ../../etc/ssh/moduli │ /home/hack/../../etc/ssh/moduli │ /etc/ssh/moduli │ moduli │
- │ ../../etc/ssh/ssh_host_ecdsa_key │ /home/hack/../../etc/ssh/ssh_host_ecdsa_key │ /etc/ssh/ssh_host_ecdsa_key │ ssh_host_ecdsa_key │
- │ ../../etc/ssh/sshd_config │ /home/hack/../../etc/ssh/sshd_config │ /etc/ssh/sshd_config │ sshd_config │
- │ ../../etc/ssh/ssh_host_ed25519_key.pub │ /home/hack/../../etc/ssh/ssh_host_ed25519_key.pub │ /etc/ssh/ssh_host_ed25519_key.pub │ ssh_host_ed25519_key.pub │
- │ ../../etc/ssh/ssh_host_ecdsa_key.pub │ /home/hack/../../etc/ssh/ssh_host_ecdsa_key.pub │ /etc/ssh/ssh_host_ecdsa_key.pub │ ssh_host_ecdsa_key.pub │
- │ ../../etc/ssh/ssh_host_rsa_key │ /home/hack/../../etc/ssh/ssh_host_rsa_key │ /etc/ssh/ssh_host_rsa_key │ ssh_host_rsa_key │
- │ ../../etc/ssh/ssh_config │ /home/hack/../../etc/ssh/ssh_config │ /etc/ssh/ssh_config │ ssh_config │
- │ ../../etc/ssh/ssh_host_ed25519_key │ /home/hack/../../etc/ssh/ssh_host_ed25519_key │ /etc/ssh/ssh_host_ed25519_key │ ssh_host_ed25519_key │
- │ ../../etc/ssh/ssh_import_id │ /home/hack/../../etc/ssh/ssh_import_id │ /etc/ssh/ssh_import_id │ ssh_import_id │
- │ ../../etc/ssh/ssh_host_rsa_key.pub │ /home/hack/../../etc/ssh/ssh_host_rsa_key.pub │ /etc/ssh/ssh_host_rsa_key.pub │ ssh_host_rsa_key.pub │
- ╰────────────────────────────────────────┴───────────────────────────────────────────────────┴───────────────────────────────────┴──────────────────────────╯
-Record count: 11]]></pre>
-
- <p>
- We can also <em>select</em> symlink targets or their types.
- If some file is missing or is inaccessible due to permissions, only <code>path</code> is printed for it.
- </p>
-
- <p>
- Tip: if we are looking for files in the current directory and want to omit the leading „.“, we can call <code>find -printf '%P\0'</code> instead of <code>find -print0</code>.
- </p>
-
-
- <h2>Using relpipe-in-filesystem to read extended attributes</h2>
-
- <p>
- Extended attributes (xattr) are additional <em>key=value</em> pairs that can be attached to our files.
- They are not stored inside the files, but on the filesystem.
- Thus they are independent of particular file format (which might not support metadata)
- and we can use them e.g. for tagging, cataloguing or adding some notes to our files.
- Some tools like GNU Wget use extended attributes to store metadata like the original URL from which the file was downloaded.
- </p>
-
- <m:pre jazyk="bash"><![CDATA[wget --recursive --level=1 https://relational-pipes.globalcode.info/
-find -type f -printf '%P\0' \
- | relpipe-in-filesystem --file path --file size --xattr xdg.origin.url \
- | relpipe-out-tabular
-]]></m:pre>
-
- <p>And now we know where the files on our disk came from:</p>
-
- <pre><![CDATA[filesystem:
- ╭───────────────────────────┬────────────────┬────────────────────────────────────────────────────────────────────╮
- │ path (string) │ size (integer) │ xdg.origin.url (string) │
- ├───────────────────────────┼────────────────┼────────────────────────────────────────────────────────────────────┤
- │ index.html │ 12159 │ https://relational-pipes.globalcode.info/v_0/ │
- │ v_0/atom.xml │ 4613 │ https://relational-pipes.globalcode.info/v_0/atom.xml │
- │ v_0/rss.xml │ 4926 │ https://relational-pipes.globalcode.info/v_0/rss.xml │
- │ v_0/js/skript.js │ 2126 │ https://relational-pipes.globalcode.info/v_0/js/skript.js │
- │ v_0/css/styl.css │ 2988 │ https://relational-pipes.globalcode.info/v_0/css/styl.css │
- │ v_0/css/relpipe.css │ 1095 │ https://relational-pipes.globalcode.info/v_0/css/relpipe.css │
- │ v_0/css/syntaxe.css │ 3584 │ https://relational-pipes.globalcode.info/v_0/css/syntaxe.css │
- │ v_0/index.xhtml │ 12159 │ https://relational-pipes.globalcode.info/v_0/index.xhtml │
- │ v_0/grafika/logo.png │ 3298 │ https://relational-pipes.globalcode.info/v_0/grafika/logo.png │
- │ v_0/principles.xhtml │ 17171 │ https://relational-pipes.globalcode.info/v_0/principles.xhtml │
- │ v_0/roadmap.xhtml │ 11097 │ https://relational-pipes.globalcode.info/v_0/roadmap.xhtml │
- │ v_0/faq.xhtml │ 11080 │ https://relational-pipes.globalcode.info/v_0/faq.xhtml │
- │ v_0/specification.xhtml │ 12983 │ https://relational-pipes.globalcode.info/v_0/specification.xhtml │
- │ v_0/implementation.xhtml │ 10810 │ https://relational-pipes.globalcode.info/v_0/implementation.xhtml │
- │ v_0/examples.xhtml │ 76958 │ https://relational-pipes.globalcode.info/v_0/examples.xhtml │
- │ v_0/license.xhtml │ 65580 │ https://relational-pipes.globalcode.info/v_0/license.xhtml │
- │ v_0/screenshots.xhtml │ 5708 │ https://relational-pipes.globalcode.info/v_0/screenshots.xhtml │
- │ v_0/download.xhtml │ 5204 │ https://relational-pipes.globalcode.info/v_0/download.xhtml │
- │ v_0/contact.xhtml │ 4940 │ https://relational-pipes.globalcode.info/v_0/contact.xhtml │
- │ v_0/classic-example.xhtml │ 9539 │ https://relational-pipes.globalcode.info/v_0/classic-example.xhtml │
- ╰───────────────────────────┴────────────────┴────────────────────────────────────────────────────────────────────╯
-Record count: 20]]></pre>
-
- <p>
- If we like the BeOS/Haiku style, we can create empty files with some attributes attached and use our filesystem as a simple database
- and query it using relational tools.
- It will lack indexing, but for basic scenarios like <em>address book</em> it will be fast enough
- and we can feel a bit of BeOS/Haiku atmosphere in our contemporary GNU/Linux systems.
- But be careful, because some editors delete and recreate files while saving them, which destroys the xattrs.
- Tools like <code>rsync</code> or <code>tar</code> with the <code>--xattrs</code> option will preserve our attributes when making backups.
- </p>
-
+ <m:skript jazyk="bash" výstup="xhtml"><![CDATA[
+ echo "<ul>";
+ DIR=$(dirname "$XWG_STRANKA_SOUBOR");
+ DIR="$DIR/../vstup"
+ cd "$DIR";
+ # TODO: use XQuery? (but Grep and Bash are everywhere)
+ for f in examples-*.xml; do
+ grep -oP '(?<=<m:pořadí-příkladu>).*(?=</m:pořadí-příkladu>)' "$f" | tr \\n ' '
+ echo "<li><m:a href=\"${f//\.xml/}\">$(grep -oP '(?<=<nadpis>).*(?=</nadpis>)' "$f")</m:a> – $(grep -oP '(?<=<perex>).+(?=</perex>)' "$f")</li>";
+ done | sort | sed -E 's/^[0-9]+ //'
+ echo "</ul>";
+ ]]></m:skript>
</text>