--- a/relpipe-data/examples.xml Sun Jan 27 16:16:41 2019 +0100
+++ b/relpipe-data/examples.xml Tue Feb 05 19:18:28 2019 +0100
@@ -14,851 +14,18 @@
But they should also work in other shells.
</p>
- <h2>relpipe-in-cli: Hello World!</h2>
-
- <p>
- Let's start with an obligatory Hello World example.
- </p>
-
- <m:pre jazyk="bash"><![CDATA[relpipe-in-cli generate "relation_from_cli" 3 \
- "a" "integer" \
- "b" "string" \
- "c" "boolean" \
- "1" "Hello" "true" \
- "2" "World!" "false"]]></m:pre>
-
- <p>
- This command generates relational data.
- To see the data, we need to convert it to some other format.
- For now, we will use the "tabular" format and pipe the relational data to <code>relpipe-out-tabular</code>.
- </p>
-
- <m:pre jazyk="bash"><![CDATA[relpipe-in-cli generate "relation_from_cli" 3 \
- "a" "integer" \
- "b" "string" \
- "c" "boolean" \
- "1" "Hello" "true" \
- "2" "World!" "false" \
- | relpipe-out-tabular]]></m:pre>
-
- <p>Output:</p>
-
- <pre><![CDATA[relation_from_cli:
- ╭─────────────┬────────────┬─────────────╮
- │ a (integer) │ b (string) │ c (boolean) │
- ├─────────────┼────────────┼─────────────┤
- │ 1 │ Hello │ true │
- │ 2 │ World! │ false │
- ╰─────────────┴────────────┴─────────────╯
-Record count: 2
-]]></pre>
-
- <p>
- As we can see above, the syntax is simple. We specify the name of the relation, the number of attributes,
- and then their definitions (names and types),
- followed by the data.
- </p>
-
- <p>
- A single stream may contain multiple relations:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[(relpipe-in-cli generate a 1 x string hello; \
- relpipe-in-cli generate b 1 y string world) \
- | relpipe-out-tabular]]></m:pre>
-
- <p>
- Thus we can combine various commands or files and pass the result to a single relational output filter (<code>relpipe-out-tabular</code> in this case) and get:
- </p>
-
- <pre><![CDATA[a:
- ╭────────────╮
- │ x (string) │
- ├────────────┤
- │ hello │
- ╰────────────╯
-Record count: 1
-b:
- ╭────────────╮
- │ y (string) │
- ├────────────┤
- │ world │
- ╰────────────╯
-Record count: 1]]></pre>
-
- <h2>relpipe-in-cli: STDIN</h2>
-
- <p>
- The number of <abbr title="Command-line interface">CLI</abbr> arguments is limited and they are passed at once to the process.
- So there is an option to pass the values from STDIN instead of CLI arguments.
- Values on STDIN are expected to be separated by the null-byte.
- We can generate such data e.g. using <code>echo</code> and <code>tr</code> (or using <code>printf</code> or other commands):
- </p>
-
- <m:pre jazyk="bash"><![CDATA[echo -e "1\nHello\ntrue\n2\nWorld\nfalse" \
- | tr \\n \\0 \
- | relpipe-in-cli generate-from-stdin relation_from_stdin 3 \
- a integer \
- b string \
- c boolean \
- | relpipe-out-tabular]]></m:pre>
-
- <p>
- The output is the same as above.
- We can use this approach to convert various formats to relational data.
- A lot of data already comes in the form of null-separated values, e.g. the process arguments:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[cat /proc/$(pidof mc)/cmdline \
- | relpipe-in-cli generate-from-stdin mc_args 1 a string \
- | relpipe-out-tabular
-]]></m:pre>
-
- <p>If we have <code>mc /etc/ /tmp/</code> running in some other terminal, the output will be:</p>
-
- <pre><![CDATA[mc_args:
- ╭────────────╮
- │ a (string) │
- ├────────────┤
- │ mc │
- │ /etc/ │
- │ /tmp/ │
- ╰────────────╯
-Record count: 3]]></pre>
-
- <p>
- Also the <code>find</code> command can produce data separated by the null-byte:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[find /etc/ -name '*ssh*_*' -print0 \
- | relpipe-in-cli generate-from-stdin files 1 file_name string \
- | relpipe-out-tabular]]></m:pre>
-
- <p>Will display something like this:</p>
-
- <pre><![CDATA[files:
- ╭───────────────────────────────────╮
- │ file_name (string) │
- ├───────────────────────────────────┤
- │ /etc/ssh/ssh_host_ecdsa_key │
- │ /etc/ssh/sshd_config │
- │ /etc/ssh/ssh_host_ed25519_key.pub │
- │ /etc/ssh/ssh_host_ecdsa_key.pub │
- │ /etc/ssh/ssh_host_rsa_key │
- │ /etc/ssh/ssh_config │
- │ /etc/ssh/ssh_host_ed25519_key │
- │ /etc/ssh/ssh_import_id │
- │ /etc/ssh/ssh_host_rsa_key.pub │
- ╰───────────────────────────────────╯
-Record count: 9]]></pre>
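- <p>
- The null-separated input used in this section can also be produced directly by <code>printf</code>, without <code>echo</code> and <code>tr</code>.
- The following is only a minimal sketch: the helper name <code>nul_values</code> is made up for this illustration and is not part of <m:name/>.
- It prints each argument followed by a null byte and then makes the separators visible by translating them to newlines:
- </p>

```shell
# Hypothetical helper (not part of relpipe): print each argument followed by a null byte.
nul_values() { printf '%s\0' "$@"; }

# Make the otherwise invisible null bytes visible, one value per line:
nul_values 1 Hello true 2 World false | tr '\0' '\n'
```

- <p>
- The output of such a helper could be piped to <code>relpipe-in-cli generate-from-stdin</code> in the same way as the <code>echo | tr</code> pair shown earlier.
- </p>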
-
-
- <h2>relpipe-in-fstab</h2>
-
- <p>
- Using the <code>relpipe-in-fstab</code> command, we can convert <code>/etc/fstab</code> or <code>/etc/mtab</code> to relational data
- </p>
-
- <m:pre jazyk="bash"><![CDATA[relpipe-in-fstab | relpipe-out-tabular]]></m:pre>
-
- <p>
- and see them as a nice table:
- </p>
-
- <pre><![CDATA[fstab:
- ╭─────────────────┬──────────────────────────────────────┬──────────────────────┬───────────────┬───────────────────────────────────────┬────────────────┬────────────────╮
- │ scheme (string) │ device (string) │ mount_point (string) │ type (string) │ options (string) │ dump (integer) │ pass (integer) │
- ├─────────────────┼──────────────────────────────────────┼──────────────────────┼───────────────┼───────────────────────────────────────┼────────────────┼────────────────┤
- │ UUID │ 29758270-fd25-4a6c-a7bb-9a18302816af │ / │ ext4 │ relatime,user_xattr,errors=remount-ro │ 0 │ 1 │
- │ │ /dev/sr0 │ /media/cdrom0 │ udf,iso9660 │ user,noauto │ 0 │ 0 │
- │ │ /dev/sde │ /mnt/data │ ext4 │ relatime,user_xattr,errors=remount-ro │ 0 │ 2 │
- │ UUID │ a2b5f230-a795-4f6f-a39b-9b57686c86d5 │ /home │ btrfs │ relatime │ 0 │ 2 │
- │ │ /dev/mapper/sdf_crypt │ /mnt/private │ xfs │ relatime │ 0 │ 2 │
- ╰─────────────────┴──────────────────────────────────────┴──────────────────────┴───────────────┴───────────────────────────────────────┴────────────────┴────────────────╯
-Record count: 5]]></pre>
-
- <p>We can do the same with a remote <code>fstab</code> or <code>mtab</code>, just by adding <code>ssh</code> to the pipeline:</p>
-
- <m:pre jazyk="bash"><![CDATA[ssh example.com cat /etc/mtab | relpipe-in-fstab | relpipe-out-tabular]]></m:pre>
-
- <p>
- The <code>cat</code> runs remotely. The <code>relpipe-in-fstab</code> and <code>relpipe-out-tabular</code> run on our machine.
- </p>
-
- <p>
- n.b. <code>relpipe-in-fstab</code> reads <code>/etc/fstab</code> if executed on a TTY. Otherwise, it reads STDIN.
- </p>
-
- <h2>relpipe-out-xml</h2>
-
- <p>
- Relational data can be converted to various formats and one of them is XML.
- This is a good option for further processing e.g. using XSLT transformation or passing the XML data to some other tool.
- Just use <code>relpipe-out-xml</code> instead of <code>relpipe-out-tabular</code> and the rest of the pipeline remains unchanged:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[ssh example.com cat /etc/mtab | relpipe-in-fstab | relpipe-out-xml]]></m:pre>
-
- <p>
- Will produce XML like this:
- </p>
-
- <m:pre jazyk="xml" src="examples/relpipe-out-fstab.xml"/>
-
- <p>
- Thanks to XSLT, this XML can be easily converted e.g. to an XHTML table (<code>table|tr|td</code>) or other format.
- Someone can convert such data to a (La)TeX table.
- </p>
-
- <p>
- n.b. the format is not final and will change in future versions (XML namespace, more metadata etc.).
- </p>
-
-
- <h2>relpipe-tr-validator</h2>
-
- <p>
- Just a passthrough command, so these pipelines should produce the same hash:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[
-relpipe-in-fstab | relpipe-tr-validator | sha512sum
-relpipe-in-fstab | sha512sum]]></m:pre>
-
- <p>
- This tool can be used for testing whether a file contains valid relational data:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[
-if relpipe-tr-validator < "some-file.rp" &> /dev/null; then
- echo "valid relational data";
-else
- echo "garbage";
-fi]]></m:pre>
-
- <p>or as a one-liner:</p>
-
- <m:pre jazyk="bash"><![CDATA[relpipe-tr-validator < "some-file.rp" &> /dev/null && echo "ok" || echo "error"]]></m:pre>
-
- <p>
- If an error is found, it is reported on STDERR. So just omit the <code>&amp;</code> in order to see the error message.
- </p>
-
-
- <h2>/etc/fstab formatting using -in-fstab, -out-nullbyte, xargs and Perl</h2>
-
- <p>
- As we have seen before, we can convert <code>/etc/fstab</code> (or <code>mtab</code>)
- to e.g. an XML or a nice and colorful table using <m:name/>.
- But we can also convert these data back to the <code>fstab</code> format. And do it with proper indentation/padding.
- Fstab has a simple format where values are separated by one or more whitespace characters.
- But without proper indentation, these files look a bit obfuscated and hard to read (however, they are valid).
- </p>
-
- <m:pre jazyk="text" src="examples/relpipe-out-fstab.txt"/>
-
- <p>
- So let's build a pipeline that reformats the <code>fstab</code> and makes it more readable.
- </p>
-
- <m:pre jazyk="bash">relpipe-in-fstab | relpipe-out-fstab > reformatted-fstab.txt</m:pre>
-
- <p>
- We can hack together a script called <code>relpipe-out-fstab</code> that accepts relational data and produces <code>fstab</code> data.
- Later this will probably be implemented as a regular tool, but for now, it is just an example of an ad-hoc shell script:
- </p>
-
- <m:pre jazyk="bash" src="examples/relpipe-out-fstab.sh" odkaz="ano"/>
-
- <p>
- In the first part, we prepend a single record (<code>relpipe-in-cli</code>) before the data coming from STDIN (<code>cat</code>).
- Then, we use <code>relpipe-out-nullbyte</code> to convert relational data to values separated by a null-byte.
- This command processes only attribute values (skips relation and attribute names).
- Then we use <code>xargs</code> to read the null-separated values and execute a Perl command for each record (passing it as many arguments as we have attributes: <code>--max-args=7</code>).
- Perl does the actual formatting: adds padding and does some little tuning (merges two attributes and replaces empty values with <em>none</em>).
- </p>
-
- <p>This is the formatted version of the <code>fstab</code> above:</p>
-
- <m:pre jazyk="text" src="examples/relpipe-out-fstab.formatted.txt"/>
-
- <p>
- Using the following command, we can verify that the files differ only in comments and whitespace:
- </p>
-
- <pre>relpipe-in-fstab | relpipe-out-fstab | diff -w /etc/fstab -</pre>
-
- <p>
- Another check (should print the same hashes):
- </p>
-
- <pre><![CDATA[relpipe-in-fstab | sha512sum
-relpipe-in-fstab | relpipe-out-fstab | relpipe-in-fstab | sha512sum]]></pre>
-
- <p>
- A regular implementation of <code>relpipe-out-fstab</code> will probably keep the comments
- (it also needs one more attribute and a small change in <code>relpipe-in-fstab</code>).
- </p>
-
- <p>
- For mere <code>fstab</code> reformatting, this approach is a bit over-engineered.
- We could skip the whole relational thing and do just something like this:
- </p>
-
- <m:pre jazyk="bash">cat /etc/fstab | grep -v '^#' | sed -E 's/\s+/\n/g' | tr \\n \\0 | xargs -0 -n7 ...</m:pre>
-
- <p>
- plus prepend the comment (or do everything in Perl).
- But this example is intended as a demonstration of how we can
- 1) prepend some additional data before the data from STDIN
- 2) use <m:name/> and traditional tools like <code>xargs</code> or <code>perl</code> together.
- And BTW we have implemented a (simple but working) <em>relpipe output filter</em> – and did it without any serious programming, just by putting some existing commands together :-)
- </p>
-
- <blockquote>
- <p>
- There is more Unix-nature in one line of shell script than there is in ten thousand lines of C.
- <m:podČarou>see <a href="http://www.catb.org/~esr/writings/unix-koans/ten-thousand.html">Master Foo and the Ten Thousand Lines</a></m:podČarou>
- </p>
- </blockquote>
-
- <h2>Writing an output filter in Bash</h2>
-
- <p>
- In the previous example we created an output filter in Perl.
- We converted a relation to values separated by <code>\0</code> and then passed it through <code>xargs</code> to a Perl <em>one-liner</em> (or a <em>multi-liner</em> in this case).
- But we can write such an output filter in pure Bash without <code>xargs</code> and <code>perl</code>.
- Of course, it is still limited to a single relation (or it can process multiple relations of the same type and do something like an implicit <code>UNION ALL</code>).
- </p>
-
- <p>
- We will define a function that will help us with reading the <code>\0</code>-separated values and putting them into shell variables:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[read_nullbyte() { for v in "$@"; do export "$v"; read -r -d '' "$v"; done }]]></m:pre>
-
- <!--
- This version will not require the last \0:
- read_zero() { for v in "$@"; do export "$v"; read -r -d '' "$v" || [ ! -z "${!v}" ]; done }
- at least in case when the last value is not missing.
- Other values might be null/missing: \0\0 is OK.
- -->
-
- <p>
- Currently, there is no known way to do this without a custom function (i.e. using just the <code>read</code> built-in command of Bash and its parameters).
- But it is just a single-line function, so not a big deal.
- </p>
-
- <p>
- And then we just read the values, put them in shell variables and process them in a loop, in a block of shell code:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
- | relpipe-out-nullbyte \
- | while read_nullbyte scheme device mount_point fs_type options dump pass; do
- echo "Device ${scheme:+$scheme=}$device is mounted" \
- "at $mount_point and contains $fs_type.";
- done]]></m:pre>
-
- <p>
- Which will print:
- </p>
-
- <pre><![CDATA[Device UUID=29758270-fd25-4a6c-a7bb-9a18302816af is mounted at / and contains ext4.
-Device /dev/sr0 is mounted at /media/cdrom0 and contains udf,iso9660.
-Device /dev/sde is mounted at /mnt/data and contains ext4.
-Device UUID=a2b5f230-a795-4f6f-a39b-9b57686c86d5 is mounted at /home and contains btrfs.
-Device /dev/mapper/sdf_crypt is mounted at /mnt/private and contains xfs.]]></pre>
-
- <p>
- Using this method, we can convert any single relation to any format (preferably a text one, but <code>printf</code> can also produce binary data).
- This is good for ad-hoc conversions and single-relation data.
- More powerful tools can be written in C++ and other languages like Java, Python, Guile etc. (when particular libraries are available).
- </p>
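- <p>
- To try the <code>read_nullbyte</code> loop without any <m:name/> tools installed, the input can be simulated with <code>printf</code>.
- This is only a sketch: the function <code>sample_records</code> and its two four-value records are made up for this illustration,
- and it needs Bash because of <code>read -d ''</code>. The empty string stands for a missing <code>scheme</code> value:
- </p>

```shell
#!/usr/bin/env bash
# The function from the text: reads one null-terminated value into each named variable.
read_nullbyte() { for v in "$@"; do export "$v"; read -r -d '' "$v"; done }

# Hypothetical stand-in for `relpipe-in-fstab | relpipe-out-nullbyte`:
# two made-up records of four values each; "" is a missing scheme.
sample_records() {
	printf '%s\0' \
		UUID 1234     /             ext4 \
		""   /dev/sr0 /media/cdrom0 udf
}

# One loop iteration per record; the loop ends when read hits the end of input.
sample_records | while read_nullbyte scheme device mount_point fs_type; do
	echo "${scheme:+$scheme=}$device at $mount_point ($fs_type)"
done
```

- <p>
- The number of variable names passed to <code>read_nullbyte</code> must match the number of attributes in the relation, otherwise the values of consecutive records get shifted.
- </p>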
-
- <h2>Rename VG in /etc/fstab using relpipe-tr-sed</h2>
-
- <p>
- Assume that we have an <code>/etc/fstab</code> with many lines defining the mount-points (directories) of particular devices (disks) and we are using LVM.
- If we rename a volume group (VG), we have to change all of them. The lines look like this one:
- </p>
-
- <pre>/dev/alpha/photos /mnt/photos/ btrfs noauto,noatime,nodiratime 0 0</pre>
-
- <p>
- We want to change all lines from <code>alpha</code> to <code>beta</code> (the new VG name).
- This can be done by the power of regular expressions<m:podČarou>see <a href="https://en.wikibooks.org/wiki/Regular_Expressions/Simple_Regular_Expressions">Regular Expressions</a> at Wikibooks</m:podČarou> and this pipeline:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
- | relpipe-tr-sed 'fstab' 'device' '^/dev/alpha/' '/dev/beta/' \
- | relpipe-out-fstab]]></m:pre>
-
- <p>
- The <code>relpipe-tr-sed</code> tool works only with the given relation (<code>fstab</code>) and the given attribute (<code>device</code>)
- and leaves other relations and attributes in the stream untouched.
- So it will not replace strings in unwanted places (if there are any random matches).
- </p>
-
- <p>
- Even the relation names and attribute names are specified as a regular expression, so we can (purposefully) modify multiple relations or attributes.
- For example we can put zeroes in both <code>dump</code> and <code>pass</code> attributes:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[relpipe-in-fstab | relpipe-tr-sed 'fstab' 'dump|pass' '.+' '0' | relpipe-out-fstab]]></m:pre>
-
- <p>
- n.b. the data types must be respected: we cannot e.g. put <code>abc</code> in the <code>pass</code> attribute because it is declared as <code>integer</code>.
- </p>
-
- <h2>Using relpipe-tr-sed with groups and backreferences</h2>
-
- <p>
- This tool also supports regex groups and backreferences. Thus we can use parts of the matched string in our replacement string:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[relpipe-in-cli generate r 1 a string "some string xxx_123 some zzz_456 other" \
- | relpipe-tr-sed 'r' 'a' '([a-z]{3})_([0-9]+)' '$2:$1' \
- | relpipe-out-tabular]]></m:pre>
-
- <p>Which would convert this:</p>
- <pre><![CDATA[r:
- ╭────────────────────────────────────────╮
- │ a (string) │
- ├────────────────────────────────────────┤
- │ some string xxx_123 some zzz_456 other │
- ╰────────────────────────────────────────╯
-Record count: 1]]></pre>
-
- <p>into this:</p>
- <pre><![CDATA[r:
- ╭────────────────────────────────────────╮
- │ a (string) │
- ├────────────────────────────────────────┤
- │ some string 123:xxx some 456:zzz other │
- ╰────────────────────────────────────────╯
-Record count: 1]]></pre>
-
- <p>
- If there were any other relations or attributes in the stream, they would be unaffected by this transformation,
- because we specified <code>'r' 'a'</code> instead of some wider regular expression that would match more relations or attributes.
- </p>
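- <p>
- For comparison, the same group swap can be tried on a plain text line with the classic <code>sed</code>.
- This is only an analogy, not how <code>relpipe-tr-sed</code> is implemented: <code>sed</code> knows nothing about relations or attributes,
- it writes backreferences as <code>\1</code> and <code>\2</code> instead of <code>$1</code> and <code>$2</code>, and it needs the <code>g</code> flag to replace every match on the line.
- The function name <code>swap_groups</code> is made up for this illustration:
- </p>

```shell
# Hypothetical helper: swap the "letters_digits" parts using plain sed (extended regex).
swap_groups() { sed -E 's/([a-z]{3})_([0-9]+)/\2:\1/g'; }

echo "some string xxx_123 some zzz_456 other" | swap_groups
# prints: some string 123:xxx some 456:zzz other
```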
-
- <h2>Filter /etc/fstab using relpipe-tr-grep</h2>
-
- <p>
- If we are interested only in certain records in some relation, we can filter it using <code>relpipe-tr-grep</code>.
- If we want to list e.g. only Btrfs and XFS file systems from our <code>fstab</code> (see above), we will run:
- </p>
-
-
- <m:pre jazyk="bash"><![CDATA[relpipe-in-fstab | relpipe-tr-grep 'fstab' 'type' 'btrfs|xfs' | relpipe-out-tabular]]></m:pre>
-
- <p>and we will get the following filtered result:</p>
- <pre><![CDATA[fstab:
- ╭─────────────────┬──────────────────────────────────────┬──────────────────────┬───────────────┬──────────────────┬────────────────┬────────────────╮
- │ scheme (string) │ device (string) │ mount_point (string) │ type (string) │ options (string) │ dump (integer) │ pass (integer) │
- ├─────────────────┼──────────────────────────────────────┼──────────────────────┼───────────────┼──────────────────┼────────────────┼────────────────┤
- │ UUID │ a2b5f230-a795-4f6f-a39b-9b57686c86d5 │ /home │ btrfs │ relatime │ 0 │ 2 │
- │ │ /dev/mapper/sdf_crypt │ /mnt/private │ xfs │ relatime │ 0 │ 2 │
- ╰─────────────────┴──────────────────────────────────────┴──────────────────────┴───────────────┴──────────────────┴────────────────┴────────────────╯
-Record count: 2]]></pre>
-
- <p>
- Command arguments are similar to <code>relpipe-tr-sed</code>.
- Everything is a regular expression.
- Only relations matching the regex will be filtered; others will flow through the pipeline unmodified.
- If the attribute regex matches several attribute names, filtering will be done with logical OR,
- i.e. the record is included if at least one of those attributes matches the search regex.
- </p>
-
- <p>
- If we need an exact match of the whole attribute, we have to use something like <code>'^(btrfs|xfs)$'</code>
- (the parentheses make both anchors apply to both alternatives); otherwise, a mere substring match is enough to include the record.
- </p>
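- <p>
- Anchors and alternation interact in a way that is easy to overlook: without parentheses, <code>^</code> binds only to the first alternative and <code>$</code> only to the last one.
- The following sketch shows the difference using plain <code>grep -E</code> on a list of made-up values; it assumes that the regex dialect of <code>relpipe-tr-grep</code> treats anchors and groups the same way.
- The function names are hypothetical:
- </p>

```shell
# Made-up sample values, one per line.
fs_types() { printf '%s\n' btrfs xfs btrfs-old myxfs ext4; }

# Without a group: "starts with btrfs" OR "ends with xfs".
loose_match() { grep -E '^btrfs|xfs$'; }

# With a group: the whole value must be exactly btrfs or xfs.
exact_match() { grep -E '^(btrfs|xfs)$'; }

echo "loose:"; fs_types | loose_match   # btrfs, xfs, btrfs-old, myxfs
echo "exact:"; fs_types | exact_match   # btrfs, xfs
```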
-
- <h2>SELECT mount_point FROM fstab WHERE type IN ('btrfs', 'xfs')</h2>
-
- <p>
- While reading classic pipelines involving <code>grep</code> and <code>cut</code> commands
- we must notice that there is some similarity with simple SQL queries looking like:
- </p>
-
- <m:pre jazyk="SQL">SELECT "some", "cut", "fields" FROM stdin WHERE grep_matches(whole_line);</m:pre>
-
- <p>
- And that is true: <code>grep</code> does restriction<m:podČarou>
- <a href="https://en.wikipedia.org/wiki/Selection_(relational_algebra)">selecting</a> only certain records from the original relation according to their match with given conditions</m:podČarou>
- and <code>cut</code> does projection<m:podČarou>limited subset of what <a href="https://en.wikipedia.org/wiki/Projection_(relational_algebra)">projection</a> means</m:podČarou>.
- Now we can do these relational operations using our relational tools called <code>relpipe-tr-grep</code> and <code>relpipe-tr-cut</code>.
- </p>
-
- <p>
- Assume that we need only <code>mount_point</code> fields from our <code>fstab</code> where <code>type</code> is <code>btrfs</code> or <code>xfs</code>
- and we want to do something (a shell script block) with these directory paths.
- </p>
-
- <m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
- | relpipe-tr-grep 'fstab' 'type' '^(btrfs|xfs)$' \
- | relpipe-tr-cut 'fstab' 'mount_point' \
- | relpipe-out-nullbyte \
- | while read -r -d '' m; do
- echo "$m";
- done]]></m:pre>
-
- <p>
- The <code>relpipe-tr-cut</code> tool has similar syntax to its <em>grep</em> and <em>sed</em> siblings and also uses the power of regular expressions.
- In this case it modifies on-the-fly the <code>fstab</code> relation and drops all its attributes except the <code>mount_point</code> one.
- </p>
-
- <p>
- Then we pass the data to the Bash <code>while</code> cycle.
- In such a simple scenario (just <code>echo</code>), we could use <code>xargs</code> as in the examples above,
- but with this syntax, we can write a whole block of shell commands for each record/value and do more complex actions with them.
- </p>
-
- <h2>More projections with relpipe-tr-cut</h2>
-
- <p>
- Assume that we have a simple relation containing numbers:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[seq 0 8 \
- | tr \\n \\0 \
- | relpipe-in-cli generate-from-stdin numbers 3 a integer b integer c integer \
- > numbers.rp]]></m:pre>
-
- <p>and a second one containing letters:</p>
-
- <m:pre jazyk="bash"><![CDATA[relpipe-in-cli generate letters 2 a string b string A B C D > letters.rp]]></m:pre>
-
- <p>We saved them into two files and then combined them into a single file. We will work with them as if they were a single stream of relations:</p>
-
- <m:pre jazyk="bash"><![CDATA[cat numbers.rp letters.rp > both.rp;
-cat both.rp | relpipe-out-tabular]]></m:pre>
-
- <p>Will print:</p>
-
- <pre><![CDATA[numbers:
- ╭─────────────┬─────────────┬─────────────╮
- │ a (integer) │ b (integer) │ c (integer) │
- ├─────────────┼─────────────┼─────────────┤
- │ 0 │ 1 │ 2 │
- │ 3 │ 4 │ 5 │
- │ 6 │ 7 │ 8 │
- ╰─────────────┴─────────────┴─────────────╯
-Record count: 3
-letters:
- ╭─────────────┬─────────────╮
- │ a (string) │ b (string) │
- ├─────────────┼─────────────┤
- │ A │ B │
- │ C │ D │
- ╰─────────────┴─────────────╯
-Record count: 2]]></pre>
-
- <p>We can drop the <code>a</code> attribute from the <code>numbers</code> relation:</p>
-
- <m:pre jazyk="bash">cat both.rp | relpipe-tr-cut 'numbers' 'b|c' | relpipe-out-tabular</m:pre>
-
- <p>and leave the <code>letters</code> relation unaffected:</p>
-
- <pre><![CDATA[numbers:
- ╭─────────────┬─────────────╮
- │ b (integer) │ c (integer) │
- ├─────────────┼─────────────┤
- │ 1 │ 2 │
- │ 4 │ 5 │
- │ 7 │ 8 │
- ╰─────────────┴─────────────╯
-Record count: 3
-letters:
- ╭─────────────┬─────────────╮
- │ a (string) │ b (string) │
- ├─────────────┼─────────────┤
- │ A │ B │
- │ C │ D │
- ╰─────────────┴─────────────╯
-Record count: 2]]></pre>
-
- <p>Or we can remove <code>a</code> from both relations, i.e. keep only the attributes whose names match the <code>'b|c'</code> regex:</p>
-
- <m:pre jazyk="bash">cat both.rp | relpipe-tr-cut '.*' 'b|c' | relpipe-out-tabular</m:pre>
-
- <p>Instead of <code>'.*'</code> we could use <code>'numbers|letters'</code> and in this case it will give the same result:</p>
-
- <pre><![CDATA[numbers:
- ╭─────────────┬─────────────╮
- │ b (integer) │ c (integer) │
- ├─────────────┼─────────────┤
- │ 1 │ 2 │
- │ 4 │ 5 │
- │ 7 │ 8 │
- ╰─────────────┴─────────────╯
-Record count: 3
-letters:
- ╭─────────────╮
- │ b (string) │
- ├─────────────┤
- │ B │
- │ D │
- ╰─────────────╯
-Record count: 2]]></pre>
-
- <p>All the time, we are reducing the attributes. But we can also multiply them or change their order:</p>
-
- <m:pre jazyk="bash">cat both.rp | relpipe-tr-cut 'numbers' 'b|a|c' 'b' 'a' 'a' | relpipe-out-tabular</m:pre>
-
- <p>
- n.b. the order in <code>'b|a|c'</code> does not matter and if such a regex matches, it preserves the original order of the attributes;
- but if we use multiple regexes to specify attributes, their order and count matter:
- </p>
-
- <pre><![CDATA[numbers:
- ╭─────────────┬─────────────┬─────────────┬─────────────┬─────────────┬─────────────╮
- │ a (integer) │ b (integer) │ c (integer) │ b (integer) │ a (integer) │ a (integer) │
- ├─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┤
- │ 0 │ 1 │ 2 │ 1 │ 0 │ 0 │
- │ 3 │ 4 │ 5 │ 4 │ 3 │ 3 │
- │ 6 │ 7 │ 8 │ 7 │ 6 │ 6 │
- ╰─────────────┴─────────────┴─────────────┴─────────────┴─────────────┴─────────────╯
-Record count: 3
-letters:
- ╭─────────────┬─────────────╮
- │ a (string) │ b (string) │
- ├─────────────┼─────────────┤
- │ A │ B │
- │ C │ D │
- ╰─────────────┴─────────────╯
-Record count: 2]]></pre>
-
- <p>
- The <code>letters</code> relation stays rock steady and <code>relpipe-tr-cut 'numbers'</code> does not affect it in any way.
- </p>
-
-
- <h2>Read an Atom feed using XQuery and relpipe-in-xml</h2>
-
- <p>
- Atom Syndication Format is a standard for publishing web feeds, a.k.a. web syndication.
- These feeds are usually consumed by a <em>feed reader</em> that aggregates news from many websites and displays them in a uniform format.
- The Atom feed is an XML with a list of recent news containing their titles, URLs and short annotations.
- It also contains some metadata (website author, title etc.).
- </p>
- <p>
- Using this simple XQuery<m:podČarou>see <a href="https://en.wikibooks.org/wiki/XQuery">XQuery</a> at Wikibooks</m:podČarou>
- <em>FLWOR Expression</em>
- we convert the Atom feed into the XML serialization of relational data:
- </p>
-
- <m:pre jazyk="xq" src="examples/atom.xq" odkaz="ano"/>
-
- <p>
- This operation is similar to <a href="https://www.postgresql.org/docs/current/functions-xml.html">xmltable</a> used in SQL databases.
- It converts an XML tree structure to the relational form.
- In our case, the output is still XML, but in a format that can be read by <code>relpipe-in-xml</code>.
- All put together in a single shell script:
- </p>
-
- <m:pre jazyk="bash" src="examples/atom.sh"/>
-
- <p>Will generate a table with web news:</p>
-
- <m:pre jazyk="text" src="examples/atom.txt"/>
-
- <p>
- For frequent usage we can create a script or function called <code>relpipe-in-atom</code>
- that reads Atom XML on STDIN and generates relational data on STDOUT.
- And then do any of these:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[wget … | relpipe-in-atom | relpipe-out-tabular
-wget … | relpipe-in-atom | relpipe-out-csv
-wget … | relpipe-in-atom | relpipe-out-gui
-wget … | relpipe-in-atom | relpipe-out-nullbyte | while read_nullbyte published title url; do echo "$title"; done
-wget … | relpipe-in-atom | relpipe-out-csv | csv2rec | …
-]]></m:pre>
-
- <p>
- There are several implementations of XQuery.
- <a href="http://galax.sourceforge.net/">Galax</a> is one of them.
- <a href="http://xqilla.sourceforge.net/">XQilla</a> or
- <a href="http://basex.org/basex/xquery/">BaseX</a> are others (and support newer versions of the standard).
- There are also XSLT processors like <a href="http://xmlsoft.org/XSLT/xsltproc2.html">xsltproc</a>.
- BaseX can be used instead of Galax – we just replace
- <code>galax-run -context-item /dev/stdin</code> with <code>basex -i /dev/stdin</code>.
- </p>
-
- <p>
- Reading Atom feeds in a terminal might not be the best way to get news from a website,
- but this simple example teaches us how to convert arbitrary XML to relational data.
- And of course, we can generate multiple relations from a single XML using a single XQuery script.
- XQuery can also be used for operations like JOIN or UNION and for filtering and other transformations
- as will be shown in further examples.
- </p>
-
- <h2>Read files metadata using relpipe-in-filesystem</h2>
-
- <p>
- Our filesystems contain valuable information and using proper tools we can extract it.
- Using <code>relpipe-in-filesystem</code> we can gather metadata of our files and process them in a relational way.
- This tool does not traverse our filesystem (remember the rule: <em>do one thing and do it well</em>);
- instead, it eats a list of file paths separated by <code>\0</code>.
- It is typically used together with the <code>find</code> command, but we can also create such a list by hand using e.g. the <code>printf</code> command or <code>tr \\n \\0</code>.
- </p>
-
- <m:pre jazyk="bash">find /etc/ssh/ -print0 | relpipe-in-filesystem | relpipe-out-tabular</m:pre>
-
- <p>
- In the basic scenario, it behaves like <code>ls -l</code>, just more modular and machine-readable:
- </p>
-
- <pre><![CDATA[filesystem:
- ╭───────────────────────────────────┬───────────────┬────────────────┬────────────────┬────────────────╮
- │ path (string) │ type (string) │ size (integer) │ owner (string) │ group (string) │
- ├───────────────────────────────────┼───────────────┼────────────────┼────────────────┼────────────────┤
- │ /etc/ssh/ │ d │ 0 │ root │ root │
- │ /etc/ssh/moduli │ f │ 553122 │ root │ root │
- │ /etc/ssh/ssh_host_ecdsa_key │ f │ 227 │ root │ root │
- │ /etc/ssh/sshd_config │ f │ 3262 │ root │ root │
- │ /etc/ssh/ssh_host_ed25519_key.pub │ f │ 91 │ root │ root │
- │ /etc/ssh/ssh_host_ecdsa_key.pub │ f │ 171 │ root │ root │
- │ /etc/ssh/ssh_host_rsa_key │ f │ 1679 │ root │ root │
- │ /etc/ssh/ssh_config │ f │ 1580 │ root │ root │
- │ /etc/ssh/ssh_host_ed25519_key │ f │ 399 │ root │ root │
- │ /etc/ssh/ssh_import_id │ f │ 338 │ root │ root │
- │ /etc/ssh/ssh_host_rsa_key.pub │ f │ 391 │ root │ root │
- ╰───────────────────────────────────┴───────────────┴────────────────┴────────────────┴────────────────╯
-Record count: 11]]></pre>
-
- <p>
- We can specify desired attributes and also their aliases:
- </p>
-
- <m:pre jazyk="bash"><![CDATA[find /etc/ssh/ -print0 \
- | relpipe-in-filesystem \
- --file path --as artefact \
- --file size \
- --file owner --as dear_owner \
- | relpipe-out-tabular]]></m:pre>
-
- <p>And we will get a subset with renamed attributes:</p>
-
- <pre><![CDATA[filesystem:
- ╭───────────────────────────────────┬────────────────┬─────────────────────╮
- │ artefact (string) │ size (integer) │ dear_owner (string) │
- ├───────────────────────────────────┼────────────────┼─────────────────────┤
- │ /etc/ssh/ │ 0 │ root │
- │ /etc/ssh/moduli │ 553122 │ root │
- │ /etc/ssh/ssh_host_ecdsa_key │ 227 │ root │
- │ /etc/ssh/sshd_config │ 3262 │ root │
- │ /etc/ssh/ssh_host_ed25519_key.pub │ 91 │ root │
- │ /etc/ssh/ssh_host_ecdsa_key.pub │ 171 │ root │
- │ /etc/ssh/ssh_host_rsa_key │ 1679 │ root │
- │ /etc/ssh/ssh_config │ 1580 │ root │
- │ /etc/ssh/ssh_host_ed25519_key │ 399 │ root │
- │ /etc/ssh/ssh_import_id │ 338 │ root │
- │ /etc/ssh/ssh_host_rsa_key.pub │ 391 │ root │
- ╰───────────────────────────────────┴────────────────┴─────────────────────╯
-Record count: 11]]></pre>
-
- <p>
- We can also choose which path format fits our needs best:
- </p>
-
-
- <m:pre jazyk="bash"><![CDATA[find ../../etc/ssh/ -print0 \
- | relpipe-in-filesystem \
- --file path \
- --file path_absolute \
- --file path_canonical \
- --file name \
- | relpipe-out-tabular]]></m:pre>
-
- <p>The <code>path</code> attribute contains exactly the same value as was given on input. Other formats are derived:</p>
-
- <pre><![CDATA[filesystem:
- ╭────────────────────────────────────────┬───────────────────────────────────────────────────┬───────────────────────────────────┬──────────────────────────╮
- │ path (string) │ path_absolute (string) │ path_canonical (string) │ name (string) │
- ├────────────────────────────────────────┼───────────────────────────────────────────────────┼───────────────────────────────────┼──────────────────────────┤
- │ ../../etc/ssh/ │ /home/hack/../../etc/ssh/ │ /etc/ssh │ │
- │ ../../etc/ssh/moduli │ /home/hack/../../etc/ssh/moduli │ /etc/ssh/moduli │ moduli │
- │ ../../etc/ssh/ssh_host_ecdsa_key │ /home/hack/../../etc/ssh/ssh_host_ecdsa_key │ /etc/ssh/ssh_host_ecdsa_key │ ssh_host_ecdsa_key │
- │ ../../etc/ssh/sshd_config │ /home/hack/../../etc/ssh/sshd_config │ /etc/ssh/sshd_config │ sshd_config │
- │ ../../etc/ssh/ssh_host_ed25519_key.pub │ /home/hack/../../etc/ssh/ssh_host_ed25519_key.pub │ /etc/ssh/ssh_host_ed25519_key.pub │ ssh_host_ed25519_key.pub │
- │ ../../etc/ssh/ssh_host_ecdsa_key.pub │ /home/hack/../../etc/ssh/ssh_host_ecdsa_key.pub │ /etc/ssh/ssh_host_ecdsa_key.pub │ ssh_host_ecdsa_key.pub │
- │ ../../etc/ssh/ssh_host_rsa_key │ /home/hack/../../etc/ssh/ssh_host_rsa_key │ /etc/ssh/ssh_host_rsa_key │ ssh_host_rsa_key │
- │ ../../etc/ssh/ssh_config │ /home/hack/../../etc/ssh/ssh_config │ /etc/ssh/ssh_config │ ssh_config │
- │ ../../etc/ssh/ssh_host_ed25519_key │ /home/hack/../../etc/ssh/ssh_host_ed25519_key │ /etc/ssh/ssh_host_ed25519_key │ ssh_host_ed25519_key │
- │ ../../etc/ssh/ssh_import_id │ /home/hack/../../etc/ssh/ssh_import_id │ /etc/ssh/ssh_import_id │ ssh_import_id │
- │ ../../etc/ssh/ssh_host_rsa_key.pub │ /home/hack/../../etc/ssh/ssh_host_rsa_key.pub │ /etc/ssh/ssh_host_rsa_key.pub │ ssh_host_rsa_key.pub │
- ╰────────────────────────────────────────┴───────────────────────────────────────────────────┴───────────────────────────────────┴──────────────────────────╯
-Record count: 11]]></pre>
-
- <p>
- We can also <em>select</em> symlink targets or their types.
- If some file is missing or is inaccessible due to permissions, only <code>path</code> is printed for it.
- </p>
-
- <p>
- Tip: if we are looking for files in the current directory and want to omit the leading „.“, we just call <code>find -printf '%P\0'</code> instead of <code>find -print0</code>.
- </p>
-
-
- <h2>Using relpipe-in-filesystem to read extended attributes</h2>
-
- <p>
- Extended attributes (xattr) are additional <em>key=value</em> pairs that can be attached to our files.
- They are not stored inside the files, but on the filesystem.
- Thus they are independent of the particular file format (which might not support metadata)
- and we can use them e.g. for tagging, cataloguing or adding some notes to our files.
- Some tools like GNU Wget use extended attributes to store metadata like the original URL from which the file was downloaded.
- </p>
-
- <m:pre jazyk="bash"><![CDATA[wget --recursive --level=1 https://relational-pipes.globalcode.info/
-find -type f -printf '%P\0' \
- | relpipe-in-filesystem --file path --file size --xattr xdg.origin.url \
- | relpipe-out-tabular
-]]></m:pre>
-
- <p>And now we know where the files on our disk came from:</p>
-
- <pre><![CDATA[filesystem:
- ╭───────────────────────────┬────────────────┬────────────────────────────────────────────────────────────────────╮
- │ path (string) │ size (integer) │ xdg.origin.url (string) │
- ├───────────────────────────┼────────────────┼────────────────────────────────────────────────────────────────────┤
- │ index.html │ 12159 │ https://relational-pipes.globalcode.info/v_0/ │
- │ v_0/atom.xml │ 4613 │ https://relational-pipes.globalcode.info/v_0/atom.xml │
- │ v_0/rss.xml │ 4926 │ https://relational-pipes.globalcode.info/v_0/rss.xml │
- │ v_0/js/skript.js │ 2126 │ https://relational-pipes.globalcode.info/v_0/js/skript.js │
- │ v_0/css/styl.css │ 2988 │ https://relational-pipes.globalcode.info/v_0/css/styl.css │
- │ v_0/css/relpipe.css │ 1095 │ https://relational-pipes.globalcode.info/v_0/css/relpipe.css │
- │ v_0/css/syntaxe.css │ 3584 │ https://relational-pipes.globalcode.info/v_0/css/syntaxe.css │
- │ v_0/index.xhtml │ 12159 │ https://relational-pipes.globalcode.info/v_0/index.xhtml │
- │ v_0/grafika/logo.png │ 3298 │ https://relational-pipes.globalcode.info/v_0/grafika/logo.png │
- │ v_0/principles.xhtml │ 17171 │ https://relational-pipes.globalcode.info/v_0/principles.xhtml │
- │ v_0/roadmap.xhtml │ 11097 │ https://relational-pipes.globalcode.info/v_0/roadmap.xhtml │
- │ v_0/faq.xhtml │ 11080 │ https://relational-pipes.globalcode.info/v_0/faq.xhtml │
- │ v_0/specification.xhtml │ 12983 │ https://relational-pipes.globalcode.info/v_0/specification.xhtml │
- │ v_0/implementation.xhtml │ 10810 │ https://relational-pipes.globalcode.info/v_0/implementation.xhtml │
- │ v_0/examples.xhtml │ 76958 │ https://relational-pipes.globalcode.info/v_0/examples.xhtml │
- │ v_0/license.xhtml │ 65580 │ https://relational-pipes.globalcode.info/v_0/license.xhtml │
- │ v_0/screenshots.xhtml │ 5708 │ https://relational-pipes.globalcode.info/v_0/screenshots.xhtml │
- │ v_0/download.xhtml │ 5204 │ https://relational-pipes.globalcode.info/v_0/download.xhtml │
- │ v_0/contact.xhtml │ 4940 │ https://relational-pipes.globalcode.info/v_0/contact.xhtml │
- │ v_0/classic-example.xhtml │ 9539 │ https://relational-pipes.globalcode.info/v_0/classic-example.xhtml │
- ╰───────────────────────────┴────────────────┴────────────────────────────────────────────────────────────────────╯
-Record count: 20]]></pre>
-
- <p>
- If we like the BeOS/Haiku style, we can create empty files with some attributes attached and use our filesystem as a simple database
- and query it using relational tools.
- It will lack indexing, but for basic scenarios like an <em>address book</em> it will be fast enough
- and we can feel a bit of the BeOS/Haiku atmosphere in our contemporary GNU/Linux systems.
- But be careful: some editors delete and recreate files while saving them, which destroys the xattrs.
- Tools like <code>rsync</code> or <code>tar</code> with the <code>--xattrs</code> option will back up our attributes safely.
- </p>
-
+ <m:skript jazyk="bash" výstup="xhtml"><![CDATA[
+ echo "<ul>";
+ DIR=$(dirname "$XWG_STRANKA_SOUBOR");
+ DIR="$DIR/../vstup"
+ cd "$DIR";
+ # TODO: use XQuery? (but Grep and Bash are everywhere)
+ for f in examples-*.xml; do
+ grep -oP '(?<=<m:pořadí-příkladu>).*(?=</m:pořadí-příkladu>)' "$f" | tr \\n ' '
+ echo "<li><m:a href=\"${f%.xml}\">$(grep -oP '(?<=<nadpis>).*(?=</nadpis>)' "$f")</m:a> – $(grep -oP '(?<=<perex>).+(?=</perex>)' "$f")</li>";
+ done | sort | sed -E 's/^[0-9]+ //'
+ echo "</ul>";
+ ]]></m:skript>
</text>