# HG changeset patch # User František Kučera # Date 1549390708 -3600 # Node ID d4f401b5f90cca3b15cff0fdcfc14213f55b961d # Parent 9c1d0c5ed599115fe4a244753f70c26e766b1273 examples: move each example to a separate page + add generated list of examples diff -r 9c1d0c5ed599 -r d4f401b5f90c relpipe-data/examples-cli-stdin.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/relpipe-data/examples-cli-stdin.xml Tue Feb 05 19:18:28 2019 +0100 @@ -0,0 +1,77 @@ + + + Reading STDIN + generating relational data from values on standard input + 00200 + + + +

+ The number of CLI arguments is limited and they are passed at once to the process. + So there is an option to pass the values from STDIN instead of CLI arguments. + Values on STDIN are expected to be separated by the null-byte. + We can generate such data e.g. using echo and tr (or using printf or other commands): +

+ + + +

+ The output is the same as above. + We can use this approach to convert various formats to relational data. + A lot of data is already in the form of null-separated values, e.g. the process arguments: +

+ + + +

If we have mc /etc/ /tmp/ running in some other terminal, the output will be:

+ +
+ +

+ Also the find command can produce data separated by the null-byte: +

+ + + +

Will display something like this:

+ +
+ +
+ +
diff -r 9c1d0c5ed599 -r d4f401b5f90c relpipe-data/examples-filesystem-file.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/relpipe-data/examples-filesystem-file.xml Tue Feb 05 19:18:28 2019 +0100 @@ -0,0 +1,118 @@ + + + Reading files metadata using relpipe-in-filesystem + accessing file metadata like path, type, size or owner + 01200 + + + +

+ Our filesystems contain valuable information and using proper tools we can extract it. + Using relpipe-in-filesystem we can gather metadata of our files and process them in a relational way. + This tool does not traverse our filesystem (remember the rule: do one thing and do it well), + instead, it eats a list of file paths separated by \0. + It is typically used together with the find command, but we can also create such a list by hand using e.g. the printf command or tr \\n \\0. +
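As a minimal sketch of building such a list by hand: the two file paths below are arbitrary examples and the helper function names are made up for this illustration. Both helpers should produce the same null-separated bytes.

```shell
# Two ways to build a \0-separated list of paths by hand.
# The paths are arbitrary examples; any paths would do.
list_via_printf() {
    printf '%s\0' /etc/hostname /etc/hosts
}

list_via_tr() {
    # Only safe when no path contains a newline character.
    printf '%s\n' /etc/hostname /etc/hosts | tr '\n' '\0'
}

# Make the separators visible by turning them back into newlines.
list_via_printf | tr '\0' '\n'
```

The tr variant is convenient for existing newline-separated lists, but printf (or find -print0) is the safer choice because file names may legally contain newlines.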

+ + find /etc/ssh/ -print0 | relpipe-in-filesystem | relpipe-out-tabular + +

+ In the basic scenario, it behaves like ls -l, just more modular and machine-readable: +

+ +
+ +

+ We can specify desired attributes and also their aliases: +

+ + + +

And we will get a subset with renamed attributes:

+ +
+ +

+ We can also choose which path format fits our needs best: +

+ + + + +

The path attribute contains the exact same value as was on input. Other formats are derived:

+ +
+ +

+ We can also select symlink targets or their types. + If some file is missing or is inaccessible due to permissions, only path is printed for it. +

+ +

+ Tip: if we are looking for files in the current directory and want to omit the „.“ we just call: find -printf '%P\0' instead of find -print0. +
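The difference can be seen in a throwaway directory. This is only a sketch: it assumes GNU find (for -printf), the file names are arbitrary, and the -mindepth 1 option is an addition that skips the starting directory itself, which would otherwise show up as an empty path.

```shell
# Create a scratch directory with two files (names are arbitrary).
demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b.txt"

# Paths as printed by -print0: each one starts with "./"
paths_with_dot() {
    (cd "$demo_dir" && find . -mindepth 1 -print0 | tr '\0' '\n' | sort)
}

# Paths as printed by -printf '%P\0': the leading "./" is gone
paths_without_dot() {
    (cd "$demo_dir" && find . -mindepth 1 -printf '%P\0' | tr '\0' '\n' | sort)
}

paths_with_dot
paths_without_dot
```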

+ +
+ +
diff -r 9c1d0c5ed599 -r d4f401b5f90c relpipe-data/examples-filesystem-xattr.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/relpipe-data/examples-filesystem-xattr.xml Tue Feb 05 19:18:28 2019 +0100 @@ -0,0 +1,67 @@ + + + Reading extended attributes using relpipe-in-filesystem + accessing xattr of given files e.g. xdg.origin.url + 01300 + + + + +

+ Extended attributes (xattr) are additional key=value pairs that can be attached to our files. + They are not stored inside the files, but on the filesystem. + Thus they are independent of particular file format (which might not support metadata) + and we can use them e.g. for tagging, cataloguing or adding some notes to our files. + Some tools like GNU Wget use extended attributes to store metadata like the original URL from which the file was downloaded. +

+ + + +

And now we know, where the files on our disk came from:

+ +
+ +

+ If we like the BeOS/Haiku style, we can create empty files with some attributes attached and use our filesystem as a simple database + and query it using relational tools. + It will lack indexing, but for basic scenarios like an address book it will be fast enough + and we can feel a bit of BeOS/Haiku atmosphere in our contemporary GNU/Linux systems. + But be careful with that because some editors delete and recreate files while saving them, which destroys the xattrs. + Tools like rsync or tar with the --xattrs option will back up our attributes securely. +

+ + +
+ +
diff -r 9c1d0c5ed599 -r d4f401b5f90c relpipe-data/examples-grep-cut-fstab.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/relpipe-data/examples-grep-cut-fstab.xml Tue Feb 05 19:18:28 2019 +0100 @@ -0,0 +1,170 @@ + + + Doing projection and restriction using cut and grep + SELECT mount_point FROM fstab WHERE type IN ('btrfs', 'xfs') + 01000 + + + +

+ While reading classic pipelines involving grep and cut commands + we must notice that there is some similarity with simple SQL queries looking like: +

+ + SELECT "some", "cut", "fields" FROM stdin WHERE grep_matches(whole_line); + +

+ And that is true: grep does restriction (selecting only certain records from the original relation according to their match with given conditions) + and cut does projection (a limited subset of what projection means). + Now we can do these relational operations using our relational tools called relpipe-tr-grep and relpipe-tr-cut. +
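The classic form of this idea can be sketched on a few fstab-like lines. The sample data are assumptions: the first line is modelled on the fstab line from the VG renaming example, the other two are made up, and the fields are separated by single spaces so that cut can address them.

```shell
# Sample data (partly made up) standing in for /etc/fstab.
sample_fstab() {
    printf '%s\n' \
        '/dev/alpha/photos /mnt/photos/ btrfs noauto,noatime,nodiratime 0 0' \
        '/dev/alpha/music /mnt/music/ xfs noauto 0 0' \
        '/dev/sda1 /boot ext4 defaults 0 2'
}

# Restriction: grep keeps only the lines (records) mentioning btrfs or xfs.
# Projection: cut keeps only the second field (the mount point).
sample_fstab | grep -E ' (btrfs|xfs) ' | cut -d ' ' -f 2
```

Here grep matches against the whole line, so any other field containing the same text would also pass. The relational tools avoid this by matching a named attribute.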

+ +

+ Assume that we need only mount_point fields from our fstab where type is btrfs or xfs + and we want to do something (a shell script block) with these directory paths. +

+ + + +

+ The relpipe-tr-cut tool has similar syntax to its grep and sed siblings and also uses the power of regular expressions. + In this case it modifies the fstab relation on the fly and drops all its attributes except mount_point. +

+ +

+ Then we pass the data to the Bash while cycle. + In such a simple scenario (just echo), we could use xargs as in examples above, + but in this syntax, we can write a whole block of shell commands for each record/value and do more complex actions with them. +

+ +

More projections with relpipe-tr-cut

+ +

+ Assume that we have a simple relation containing numbers: +

+ + numbers.rp]]> + +

and second one containing letters:

+ + letters.rp]]> + +

We saved them into two files and then combined them into a single file. We will work with them as if they were a single stream of relations:

+ + both.rp; +cat both.rp | relpipe-out-tabular]]> + +

Will print:

+ +
+ +

We can put away the a attribute from the numbers relation:

+ + cat both.rp | relpipe-tr-cut 'numbers' 'b|c' | relpipe-out-tabular + +

and leave the letters relation unaffected:

+ +
+ +

Or we can remove a from both relations, i.e. keep only the attributes whose names match the 'b|c' regex:

+ + cat both.rp | relpipe-tr-cut '.*' 'b|c' | relpipe-out-tabular + +

Instead of '.*' we could use 'numbers|letters' and in this case it will give the same result:

+ +
+ +

So far, we have only been reducing the attributes. But we can also duplicate them or change their order:

+ + cat both.rp | relpipe-tr-cut 'numbers' 'b|a|c' 'b' 'a' 'a' | relpipe-out-tabular + +

+ n.b. the order in 'b|a|c' does not matter and if such regex matches, it preserves the original order of the attributes; + but if we use multiple regexes to specify attributes, their order and count matters: +

+ +
+ +

+ The letters relation stays rock steady and relpipe-tr-cut 'numbers' does not affect it in any way. +

+ +
+ +
diff -r 9c1d0c5ed599 -r d4f401b5f90c relpipe-data/examples-grep-fstab.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/relpipe-data/examples-grep-fstab.xml Tue Feb 05 19:18:28 2019 +0100 @@ -0,0 +1,44 @@ + + + Filtering /etc/fstab using relpipe-tr-grep + list only records with desired filesystem types + 00900 + + + +

+ If we are interested only in certain records in some relation, we can filter it using relpipe-tr-grep. + If we want to list e.g. only Btrfs and XFS file systems from our fstab (see above), we will run: +

+ + + + +

and we will get the following filtered result:

+
+ +

+ Command arguments are similar to relpipe-tr-sed. + Everything is a regular expression. + Only relations matching the regex will be filtered, others will flow through the pipeline unmodified. + If the attribute regex matches several attribute names, filtering will be done with logical OR + i.e. the record is included if at least one of those attributes matches the search regex. +

+ +

+ If we need an exact match of the whole attribute, we have to use something like '^(btrfs|xfs)$', + otherwise a mere substring match is enough to include the record. +
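The effect of anchoring can be tried with plain grep -E, used here only as a stand-in: it is an assumption that relpipe-tr-grep treats anchors and alternation the same way. Note that without a group the anchors bind to the individual alternatives, so '^btrfs|xfs$' means "starts with btrfs" or "ends with xfs". The helper name and the sample values are made up.

```shell
# Print "yes" if the value matches the regex, "no" otherwise.
# Usage: matches REGEX VALUE   (a helper made up for this illustration)
matches() {
    if printf '%s\n' "$2" | grep -Eq -- "$1"; then echo yes; else echo no; fi
}

matches 'btrfs|xfs'     'btrfs-backup'   # substring match is enough
matches '^btrfs|xfs$'   'btrfs-backup'   # still matches: starts with btrfs
matches '^(btrfs|xfs)$' 'btrfs-backup'   # whole value must be btrfs or xfs
matches '^(btrfs|xfs)$' 'xfs'
```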

+ +
+ +
diff -r 9c1d0c5ed599 -r d4f401b5f90c relpipe-data/examples-hello-world.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/relpipe-data/examples-hello-world.xml Tue Feb 05 19:18:28 2019 +0100 @@ -0,0 +1,83 @@ + + + Hello Wordl! + generating relational data from CLI arguments + 00100 + + + +

+ Let's start with an obligatory Hello World example. +

+ + + +

+ This command generates relational data. + In order to see them, we need to convert them to some other format. + For now, we will use the "tabular" format and pipe relational data to the relpipe-out-tabular. +

+ + + +

Output:

+ +
+ +

+ The syntax is simple as we see above. We specify the name of the relation, number of attributes, + and then their definitions (names and types), + followed by the data. +

+ +

+ A single stream may contain multiple relations: +

+ + + +

+ Thus we can combine various commands or files and pass the result to a single relational output filter (relpipe-out-tabular in this case) and get: +

+ +
+ +
+ +
diff -r 9c1d0c5ed599 -r d4f401b5f90c relpipe-data/examples-in-fstab.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/relpipe-data/examples-in-fstab.xml Tue Feb 05 19:18:28 2019 +0100 @@ -0,0 +1,47 @@ + + + Reading fstab + converting /etc/fstab or /etc/mtab to relational data + 00200 + + + +

+ Using the relpipe-in-fstab command we can convert /etc/fstab or /etc/mtab to relational data +

+ + + +

+ and see them as a nice table: +

+ +
+ +

And we can do the same also with a remote fstab or mtab; just by adding ssh to the pipeline:

+ + + +

+ The cat runs remotely. The relpipe-in-fstab and relpipe-out-tabular run on our machine. +

+ +

+ n.b. the relpipe-in-fstab reads the /etc/fstab if executed on TTY. Otherwise, it reads the STDIN. +

+ +
+ +
diff -r 9c1d0c5ed599 -r d4f401b5f90c relpipe-data/examples-out-bash.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/relpipe-data/examples-out-bash.xml Tue Feb 05 19:18:28 2019 +0100 @@ -0,0 +1,65 @@ + + + Writing an output filter in Bash + processing relational data in GNU Bash or some other shell + 00600 + + + +

+ In the previous example we created an output filter in Perl. + We converted a relation to values separated by \0 and then passed it through xargs to a perl one-liner (or a multi-liner in this case). + But we can write such an output filter in pure Bash without xargs and perl. + Of course, it is still limited to a single relation (or it can process multiple relations of the same type and do something like implicit UNION ALL). +

+ +

+ We will define a function that will help us with reading the \0-separated values and putting them into shell variables: +

+ + + + + +

+ Currently, there is no known way to do this without a custom function (just with the read built-in command of Bash and its parameters). + But it is just a single-line function, so not a big deal. +
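Such a helper could look like the sketch below. It is not necessarily the function used in this example: the name read_nullbyte, its body and the sample data are assumptions, and it relies on the Bash-specific read -d '' to stop at the null byte.

```shell
# Read one \0-terminated value from STDIN into each named variable.
# Returns non-zero when the input runs out, which ends a while loop.
read_nullbyte() {
    local name
    for name in "$@"; do
        IFS= read -r -d '' "$name" || return 1
    done
}

# Usage with made-up data: two records of three values each.
printf '%s\0' 1 Hello true 2 World false |
    while read_nullbyte id text flag; do
        printf 'id=%s text=%s flag=%s\n' "$id" "$text" "$flag"
    done
```

The number of variable names passed to the helper must equal the number of attributes in the relation, otherwise the records get out of step.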

+ +

+ And then we just read the values, put them in shell variables and process them in a cycle in a shell block of code: +

+ + + +

+ Which will print: +

+ +
+ +

+ Using this method, we can convert any single relation to any format (preferably some text one, but printf can also produce binary data). + This is good for ad-hoc conversions and single-relation data. + More powerful tools can be written in C++ and other languages like Java, Python, Guile etc. (when particular libraries are available). +

+ +
+ +
diff -r 9c1d0c5ed599 -r d4f401b5f90c relpipe-data/examples-out-fstab.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/relpipe-data/examples-out-fstab.xml Tue Feb 05 19:18:28 2019 +0100 @@ -0,0 +1,88 @@ + + + Formatting fstab + implementing a simple relpipe-out-fstab filter using -in-fstab, -out-nullbyte, xargs and Perl + 00300 + + + +

+ As we have seen before, we can convert /etc/fstab (or mtab) + to e.g. an XML or a nice and colorful table using . + But we can also convert these data back to the fstab format. And do it with proper indentation/padding. + Fstab has a simple format where values are separated by one or more whitespace characters. + But without proper indentation, these files look a bit obfuscated and hard to read (however, they are valid). +

+ + + +

+ So let's build a pipeline that reformats the fstab and makes it more readable. +

+ + relpipe-in-fstab | relpipe-out-fstab > reformatted-fstab.txt + +

+ We can hack together a script called relpipe-out-fstab that accepts relational data and produces fstab data. + Later this will probably be implemented as a regular tool, but for now, it is just an example of an ad-hoc shell script: +

+ + + +

+ In the first part, we prepend a single record (relpipe-in-cli) before the data coming from STDIN (cat). + Then, we use relpipe-out-nullbyte to convert relational data to values separated by a null-byte. + This command processes only attribute values (skips relation and attribute names). + Then we use xargs to read the null-separated values and execute a Perl command for each record (passing it the same number of arguments as we have attributes: --max-args=7). + Perl does the actual formatting: it adds padding and does some fine-tuning (merges two attributes and replaces empty values with none). +
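The xargs part of such a pipeline can be sketched without Perl, using the shell printf for the padding. This is an illustration only, not the script from this example: the function name, the attribute order (scheme, device, mount_point, type, options, dump, pass), the column widths and the sample record are all assumptions.

```shell
# Format groups of seven \0-separated values as padded fstab lines.
format_fstab_records() {
    xargs -0 -n 7 sh -c '
        # $1=scheme $2=device $3=mount_point $4=type $5=options $6=dump $7=pass
        # (assumed order); scheme and device are merged into one column.
        if [ -n "$1" ]; then device="$1=$2"; else device="$2"; fi
        printf "%-24s %-16s %-8s %-28s %s %s\n" \
            "$device" "$3" "$4" "${5:-none}" "$6" "$7"
    ' sh
}

# One made-up record; real input would come from relpipe-out-nullbyte.
printf '%s\0' UUID 1234-ABCD /boot ext4 defaults 0 2 | format_fstab_records
```

Here -n 7 plays the role of --max-args=7: every seven values become the arguments of one formatting call.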

+ +

This is the formatted version of the fstab above:

+ + + +

+ And using the following command we can verify that the files differ only in comments and whitespace: +

+ +
relpipe-in-fstab | relpipe-out-fstab | diff -w /etc/fstab -
+ +

+ Another check (should print the same hashes): +

+ +
+ +

+ Regular implementation of relpipe-out-fstab will probably keep the comments + (it needs also one more attribute and small change in relpipe-in-fstab). +

+ +

+ For mere fstab reformatting, this approach is a bit over-engineered. + We could skip the whole relational thing and do just something like this: +

+ + cat /etc/fstab | grep -v '^#' | sed -E 's/\s+/\n/g' | tr \\n \\0 | xargs -0 -n7 ... + +

+ plus prepend the comment (or do everything in Perl). + But this example is intended as a demonstration of how we can + 1) prepend some additional data before the data from STDIN + 2) use and traditional tools like xargs or perl together. + And BTW we have implemented a (simple but working) relpipe output filter – and did it without any serious programming, just by putting some existing commands together :-) +

+ +
+

+ There is more Unix-nature in one line of shell script than there is in ten thousand lines of C. + see Master Foo and the Ten Thousand Lines +

+
+ +
+ +
diff -r 9c1d0c5ed599 -r d4f401b5f90c relpipe-data/examples-out-xml.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/relpipe-data/examples-out-xml.xml Tue Feb 05 19:18:28 2019 +0100 @@ -0,0 +1,36 @@ + + + Generating XML + converting relational data to XML + 00400 + + + +

+ Relational data can be converted to various formats and one of them is XML. + This is a good option for further processing e.g. using XSLT transformation or passing the XML data to some other tool. + Just use relpipe-out-xml instead of relpipe-out-tabular and the rest of the pipeline remains unchanged: +

+ + + +

+ Will produce XML like this: +

+ + + +

+ Thanks to XSLT, this XML can be easily converted e.g. to an XHTML table (table|tr|td) or other format. + Someone can convert such data to a (La)TeX table. +

+ +

+ n.b. the format is not final and will change in future versions (XML namespace, more metadata etc.). +

+ +
+ +
diff -r 9c1d0c5ed599 -r d4f401b5f90c relpipe-data/examples-rename-groups-backreferences.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/relpipe-data/examples-rename-groups-backreferences.xml Tue Feb 05 19:18:28 2019 +0100 @@ -0,0 +1,44 @@ + + + Using relpipe-tr-sed with groups and backreferences + sed-like substitution with regex groups and backreferences + 00800 + + + +

+ This tool also supports regex groups and backreferences. Thus we can use parts of the matched string in our replacement string: +

+ + + +

Which would convert this:

+
+ +

into this:

+
+ +

+ If there were any other relations or attributes in the stream, they would be unaffected by this transformation, + because we specified 'r' 'a' instead of some wider regular expression that would match more relations or attributes. +
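The idea of groups and backreferences can also be tried in classic sed. This is a rough illustration only: the helper name, the sample value and the pattern are made up, and the backreference syntax accepted by relpipe-tr-sed may differ from the \1 and \2 used by sed.

```shell
# Swap two captured groups: "name-number" becomes "number:name".
swap_parts() {
    sed -E 's/^([a-z]+)-([0-9]+)$/\2:\1/'
}

echo 'alpha-123' | swap_parts
```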

+ +
+ +
diff -r 9c1d0c5ed599 -r d4f401b5f90c relpipe-data/examples-rename-vg-fstab.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/relpipe-data/examples-rename-vg-fstab.xml Tue Feb 05 19:18:28 2019 +0100 @@ -0,0 +1,46 @@ + + + Renaming VG in /etc/fstab using relpipe-tr-sed + sed-like substitutions in the relational stream + 00700 + + + +

+ Assume that we have an /etc/fstab with many lines defining the mount-points (directories) of particular devices (disks) and we are using LVM. + If we rename a volume group (VG), we have to change all of them. The lines look like this one: +

+ +
/dev/alpha/photos    /mnt/photos/    btrfs    noauto,noatime,nodiratime    0  0
+ +

+ We want to change all lines from alpha to beta (the new VG name). + This can be done by the power of regular expressions (see Regular Expressions at Wikibooks) and this pipeline: +

+ + + +

+ The relpipe-tr-sed tool works only with the given relation (fstab) and the given attribute (device) + and it would leave other relations and attributes in the stream untouched. + So it would not replace the strings in unwanted places (if there are any random matches). +
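The risk of such random matches is easy to reproduce with a plain text substitution. In this sketch the sample line is made up (its mount point happens to contain alpha too), the function names are hypothetical, and awk is used only to imitate a substitution limited to the first field.

```shell
# A made-up fstab line where "alpha" occurs in two fields.
line='/dev/alpha/photos /mnt/alpha-archive/ btrfs noauto 0 0'

# Whole-line substitution: the mount point is renamed as well.
rename_everywhere() {
    printf '%s\n' "$line" | sed 's/alpha/beta/g'
}

# Substitution limited to the device field (field 1).
rename_device_only() {
    printf '%s\n' "$line" | awk '{ sub("^/dev/alpha/", "/dev/beta/", $1); print }'
}

rename_everywhere
rename_device_only
```

Limiting the substitution to one field is what naming the relation and the attribute achieves in the relational pipeline, without having to count columns.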

+ +

+ Even the relation names and attribute names are specified as a regular expression, so we can (purposefully) modify multiple relations or attributes. + For example we can put zeroes in both dump and pass attributes: +

+ + + +

+ n.b. the data types must be respected, we can not e.g. put abc in the pass attribute because it is declared as integer. +

+ +
+ +
diff -r 9c1d0c5ed599 -r d4f401b5f90c relpipe-data/examples-validator.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/relpipe-data/examples-validator.xml Tue Feb 05 19:18:28 2019 +0100 @@ -0,0 +1,40 @@ + + + Validating relational data + check whether data are in the relpipe format + 00500 + + + +

+ Just a passthrough command, so these pipelines should produce the same hash: +

+ + + +

+ This tool can be used for testing whether a file contains valid relational data: +

+ + /dev/null; then + echo "valid relational data"; +else + echo "garbage"; +fi]]> + +

or as a one-liner:

+ + /dev/null && echo "ok" || echo "error"]]> + +

+ If an error is found, it is reported on STDERR. So just omit the & in order to see the error message. +

+ +
+ +
diff -r 9c1d0c5ed599 -r d4f401b5f90c relpipe-data/examples-xquery-atom.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/relpipe-data/examples-xquery-atom.xml Tue Feb 05 19:18:28 2019 +0100 @@ -0,0 +1,71 @@ + + + Reading an Atom feed using XQuery + converting arbitrary XML into relational data using XQuery + 01100 + + + +

+ Atom Syndication Format is a standard for publishing web feeds a.k.a. web syndication. + These feeds are usually consumed by a feed reader that aggregates news from many websites and displays them in a uniform format. + The Atom feed is an XML with a list of recent news containing their titles, URLs and short annotations. + It also contains some metadata (website author, title etc.). +

+

+ Using this simple XQuery (see XQuery at Wikibooks) + FLWOR Expression + we convert the Atom feed into the XML serialization of relational data: +

+ + + +

+ This is a similar operation to xmltable used in SQL databases. + It converts an XML tree structure to the relational form. + In our case, the output is still XML, but in a format that can be read by relpipe-in-xml. + All put together in a single shell script: +

+ + + +

Will generate a table with web news:

+ + + +

+ For frequent usage we can create a script or function called relpipe-in-atom + that reads Atom XML on STDIN and generates relational data on STDOUT. + And then do any of these: +

+ + + +

+ There are several implementations of XQuery. + Galax is one of them. + XQilla and + BaseX are others (and support newer versions of the standard). + There are also XSLT processors like xsltproc. + BaseX can be used instead of Galax – we just replace + galax-run -context-item /dev/stdin with basex -i /dev/stdin. +

+ +

+ Reading Atom feeds in a terminal might not be the best way to get news from a website, + but this simple example teaches us how to convert arbitrary XML to relational data. + And of course, we can generate multiple relations from a single XML using a single XQuery script. + XQuery can also be used for operations like JOIN or UNION and for filtering and other transformations + as will be shown in further examples. +

+ +
+ +
diff -r 9c1d0c5ed599 -r d4f401b5f90c relpipe-data/examples.xml --- a/relpipe-data/examples.xml Sun Jan 27 16:16:41 2019 +0100 +++ b/relpipe-data/examples.xml Tue Feb 05 19:18:28 2019 +0100 @@ -14,851 +14,18 @@ But they should also work in other shells.

-

relpipe-in-cli: Hello Wordl!

- -

- Let's start with an obligatory Hello World example. -

- - - -

- This command generates relational data. - In order to see them, we need to convert them to some other format. - For now, we will use the "tabular" format and pipe relational data to the relpipe-out-tabular. -

- - - -

Output:

- -
- -

- The syntax is simple as we see above. We specify the name of the relation, number of attributes, - and then their definitions (names and types), - followed by the data. -

- -

- A single stream may contain multiple relations: -

- - - -

- Thus we can combine various commands or files and pass the result to a single relational output filter (relpipe-out-tabular in this case) and get: -

- -
- -

relpipe-in-cli: STDIN

- -

- The number of CLI arguments is limited and they are passed at once to the process. - So there is option to pass the values from STDIN instead of CLI arguments. - Values on STDIN are expected to be separated by the null-byte. - We can generate such data e.g. using echo and tr (or using printf or other commands): -

- - - -

- The output is same as above. - We can use this approach to convert various formats to relational data. - There are lot of data already in the form of null-separated values e.g. the process arguments: -

- - - -

If we have mc /etc/ /tmp/ running in some other terminal, the output will be:

- -
- -

- Also the find command can produce data separated by the null-byte: -

- - - -

Will display something like this:

- -
- - -

relpipe-in-fstab

- -

- Using command relpipe-in-fstab we can convert the /etc/fstab or /etc/mtab to relational data -

- - - -

- and see them as a nice table: -

- -
- -

And we can do the same also with a remote fstab or mtab; just by adding ssh to the pipeline:

- - - -

- The cat runs remotely. The relpipe-in-fstab and relpipe-out-tabular run on our machine. -

- -

- n.b. the relpipe-in-fstab reads the /etc/fstab if executed on TTY. Otherwise, it reads the STDIN. -

- -

relpipe-out-xml

- -

- Relational data can be converted to various formats and one of them is the XML. - This is a good option for further processing e.g. using XSLT transformation or passing the XML data to some other tool. - Just use relpipe-out-xml instead of relpipe-out-tabular and the rest of the pipeline remains unchanged: -

- - - -

- Will produce XML like this: -

- - - -

- Thanks to XSLT, this XML can be easily converted e.g. to an XHTML table (table|tr|td) or other format. - Someone can convert such data to a (La)TeX table. -

- -

- n.b. the format is not final and will change i future versions (XML namespace, more metadata etc.). -

- - -

relpipe-tr-validator

- -

- Just a passthrough command, so these pipelines should produce the same hash: -

- - - -

- This tool can be used for testing whether a file contains valid relational data: -

- - /dev/null; then - echo "valid relational data"; -else - echo "garbage"; -fi]]> - -

or as a one-liner:

- - /dev/null && echo "ok" || echo "error"]]> - -

- If an error is found, it is reported on STDERR. So just omit the & in order to see the error message. -

- - -

/etc/fstab formatting using -in-fstab, -out-nullbyte, xargs and Perl

- -

- As we have seen before, we can convert /etc/fstab (or mtab) - to e.g. an XML or a nice and colorful table using . - But we can also convert these data back to the fstab format. And do it with proper indentation/padding. - Fstab has a simple format where values are separated by one or more whitespace characters. - But without proper indentation, these files look a bit obfuscated and hard to read (however, they are valid). -

- - - -

- So let's build a pipeline that reformats the fstab and makes it more readable. -

- - relpipe-in-fstab | relpipe-out-fstab > reformatted-fstab.txt - -

- We can hack together a script called relpipe-out-fstab that accepts relational data and produces fstab data. - Later this will be probably implemented as a regular tool, but for now, it is just an example of a ad-hoc shell script: -

- - - -

- In the first part, we prepend a single record (relpipe-in-cli) before the data coming from STDIN (cat). - Then, we use relpipe-out-nullbyte to convert relational data to values separated by a null-byte. - This command processes only attribute values (skips relation and attribute names). - Then we used xargs to read the null-separated values and execute a Perl command for each record (pass to it a same number of arguments, as we have attributes: --max-args=7). - Perl does the actual formatting: adds padding and does some little tunning (merges two attributes and replaces empty values with none). -

- -

This is formatted version of the fstab above:

- - - -

- And using following command we can verify, that the files differ only in comments and whitespace: -

- -
relpipe-in-fstab | relpipe-out-fstab | diff -w /etc/fstab -
- -

- Another check (should print same hashes): -

- -
- -

- Regular implementation of relpipe-out-fstab will probably keep the comments - (it needs also one more attribute and small change in relpipe-in-fstab). -

- -

- For just mere fstab reformatting, this approach is a bit overengineering. - We could skip the whole relational thing and do just something like this: -

- - cat /etc/fstab | grep -v '^#' | sed -E 's/\s+/\n/g' | tr \\n \\0 | xargs -0 -n7 ... - -

- plus prepend the comment (or do everything in Perl). - But this example is intended as a demostration, how we can - 1) prepend some additional data before the data from STDIN - 2) use and traditional tools like xargs or perl together. - And BTW we have implemented a (simple but working) relpipe output filter – and did it without any serious programming, just put some existing commands together :-) -

- -
-

- There is more Unix-nature in one line of shell script than there is in ten thousand lines of C. - see Master Foo and the Ten Thousand Lines -

-
- -

Writing an output filter in Bash

- -

- In previous example we created an output filter in Perl. - We converted a relation to values separated by \0 and then passed it through xargs to a perl one-liner (or a multi-liner in this case). - But we can write such output filter in pure Bash without xargs and perl. - Of course, it is still limited to a single relation (or it can process multiple relations of same type and do something like implicit UNION ALL). -

- -

- We will define a function that will help us with reading the \0-separated values and putting them into shell variables: -

- - - - - -

- Currently, there is no known way how to do this without a custom function (just with read built-in command of Bash and its parameters). - But it is just a single line function, so not a big deal. -

- -

- And then we just read the values, put them in shell variables and process them in a cycle in a shell block of code: -

- - - -

- Which will print: -

- -
- -

- Using this method, we can convert any single relation to any format (preferably some text one, but printf can produce also binary data). - This is good for ad-hoc conversions and single-relation data. - More powerful tools can be written in C++ and other languages like Java, Python, Guile etc. (when particular libraries are available). -

- -

Rename VG in /etc/fstab using relpipe-tr-sed

- -

- Assume that we have an /etc/fstab with many lines defining the mount-points (directories) of particular devices (disks) and we are using LVM. - If we rename a volume group (VG), we have to change all of them. The lines look like this one: -

- -
/dev/alpha/photos    /mnt/photos/    btrfs    noauto,noatime,nodiratime    0  0
- -

- We want to change all lines from alpha to beta (the new VG name). - This can be done by the power of regular expressionssee Regular Expressions at Wikibooks and this pipeline: -

- - - -

- The relpipe-tr-sed tool works only with given relation (fstab) and given attribute (device) - and it would leave untouched other relations and attributes in the stream. - So it would not replace the strings on unwanted places (if there are any random matches). -

- -

- Even the relation names and attribute names are specified as a regular expression, so we can (purposefully) modify multiple relations or attributes. - For example we can put zeroes in both dump and pass attributes: -

- - - -

- n.b. the data types must be respected, we can not e.g. put abc in the pass attribute because it is declared as integer. -

- -

Using relpipe-tr-sed with groups and backreferences

- -

- This tool also support regex groups and backreferences. Thus we can use parts of the matched string in our replacement string: -

- - - -

Which would convert this:

-
- -

into this:

-
- -

- If there were any other relations or attributes in the stream, they would be unaffected by this transformation, - becase we specified 'r' 'a' instead of some wider regular expression that would match more relations or attributes. -

- -

Filter /etc/fstab using relpipe-tr-grep

- -

- If we are interested only in certain records in some relation, we can filter it using relpipe-tr-grep. - If we want to list e.g. only Btrfs and XFS file systems from our fstab (see above), we will run: -

- - - - -

and we will get following filtered result:

-
- -

- Command arguments are similar to relpipe-tr-sed. - Everything is a regular expression. - Only relations matching the regex will be filtered, others will flow through the pipeline unmodified. - If the attribute regex matches more attribute names, filtering will be done with logical OR - i.e. the record is included if at least one of that attributes matches the search regex. -

- -

- If we need exact match of the whole attribute, we have to use something like '^btrfs|xfs$', - otherwise mere substring-match is enough to include the record. -

- -

SELECT mount_point FROM fstab WHERE type IN ('btrfs', 'xfs')

- -

- While reading classic pipelines involving grep and cut commands - we must notice that there is some similarity with simple SQL queries looking like: -

- - SELECT "some", "cut", "fields" FROM stdin WHERE grep_matches(whole_line); - -

- And that is true: grep does restriction - selecting only certain records from the original relation according to their match with given conditions - and cut does projectionlimited subset of what projection means. - Now we can do these relational operations using our relational tools called relpipe-tr-grep and relpipe-tr-cut. -

- -

- Assume that we need only mount_point fields from our fstab where type is btrfs or xfs - and we want to do something (a shell script block) with these directory paths. -

- - - -

- The relpipe-tr-cut tool has similar syntax to its grep and sed siblings and also uses the power of regular expressions. - In this case it modifies the fstab relation on the fly and drops all its attributes except the mount_point one. -

- -

- Then we pass the data to the Bash while cycle. - In such a simple scenario (just echo), we could use xargs as in examples above, - but in this syntax, we can write a whole block of shell commands for each record/value and do more complex actions with them. -

- -

More projections with relpipe-tr-cut

- -

- Assume that we have a simple relation containing numbers: -

- - numbers.rp]]> - -

and second one containing letters:

- - letters.rp]]> - -

We saved them into two files and then combined them into a single file. We will work with them as if they were a single stream of relations:

- - both.rp; -cat both.rp | relpipe-out-tabular]]> - -

Will print:

- -
- -

We can put away the a attribute from the numbers relation:

- - cat both.rp | relpipe-tr-cut 'numbers' 'b|c' | relpipe-out-tabular - -

and leave the letters relation unaffected:

- -
- -

Or we can remove a from both relations, i.e. keep only attributes whose names match the 'b|c' regex:

- - cat both.rp | relpipe-tr-cut '.*' 'b|c' | relpipe-out-tabular - -

Instead of '.*' we could use 'numbers|letters' and in this case it will give the same result:

- -
- -

All the time, we are reducing the attributes. But we can also multiply them or change their order:

- - cat both.rp | relpipe-tr-cut 'numbers' 'b|a|c' 'b' 'a' 'a' | relpipe-out-tabular - -

- n.b. the order in 'b|a|c' does not matter and if such regex matches, it preserves the original order of the attributes; - but if we use multiple regexes to specify attributes, their order and count matters: -

- -
- -

- The letters relation stays rock steady and relpipe-tr-cut 'numbers' does not affect it in any way. -

- - -

Read an Atom feed using XQuery and relpipe-in-xml

- -

- Atom Syndication Format is a standard for publishing web feeds a.k.a. web syndication. - These feeds are usually consumed by a feed reader that aggregates news from many websites and displays them in a uniform format. - The Atom feed is an XML with a list of recent news containing their titles, URLs and short annotations. - It also contains some metadata (website author, title etc.). -

-

- Using this simple XQuery (see XQuery at Wikibooks) - FLWOR Expression - we convert the Atom feed into the XML serialization of relational data: -

- - - -

- This is a similar operation to xmltable used in SQL databases. - It converts an XML tree structure to the relational form. - In our case, the output is still XML, but in a format that can be read by relpipe-in-xml. - All put together in a single shell script: -

- - - -

Will generate a table with web news:

- - - -

- For frequent usage we can create a script or function called relpipe-in-atom - that reads Atom XML on STDIN and generates relational data on STDOUT. - And then do any of these: -

- - - -

- There are several implementations of XQuery. - Galax is one of them. - XQilla or - BaseX are others (and support newer versions of the standard). - There are also XSLT processors like xsltproc. - BaseX can be used instead of Galax – we just replace - galax-run -context-item /dev/stdin with basex -i /dev/stdin. -

- -

- Reading Atom feeds in a terminal might not be the best way to get news from a website, - but this simple example teaches us how to convert arbitrary XML to relational data. - And of course, we can generate multiple relations from a single XML using a single XQuery script. - XQuery can also be used for operations like JOIN or UNION and for filtering and other transformations - as will be shown in further examples. -

- -

Read files metadata using relpipe-in-filesystem

- -

- Our filesystems contain valuable information and using proper tools we can extract it. - Using relpipe-in-filesystem we can gather metadata of our files and process them in a relational way. - This tool does not traverse our filesystem (remember the rule: do one thing and do it well), - instead, it eats a list of file paths separated by \0. - It is typically used together with the find command, but we can also create such a list by hand using e.g. the printf command or tr \\n \\0. -

- - find /etc/ssh/ -print0 | relpipe-in-filesystem | relpipe-out-tabular - -
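A hand-made list of null-separated paths, as mentioned above, might look like the following sketch. The two paths and the function names are only placeholders chosen for illustration; the files do not need to exist for the list itself to be built.

```shell
# Build the list directly: printf repeats the format for each argument.
paths_printf() {
    printf '%s\0' /etc/hostname /etc/fstab
}

# Convert a newline-separated list (only safe if no path contains a newline).
paths_tr() {
    printf '%s\n' /etc/hostname /etc/fstab | tr '\n' '\0'
}

# Both variants produce the same bytes; od makes the \0 separators visible.
paths_printf | od -c | head -n 3

# With relpipe installed, either list could be piped on, e.g.:
#   paths_printf | relpipe-in-filesystem | relpipe-out-tabular
```

The printf variant is the safer default, because the tr variant cannot represent paths that themselves contain a newline.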

- In the basic scenario, it behaves like ls -l, just more modular and machine-readable: -

- -
- -

- We can specify desired attributes and also their aliases: -

- - - -

And we will get a subset with renamed attributes:

- -
- -

- We can also choose which path format fits our needs best: -

- - - - -

The path attribute contains the exact same value as was on input. Other formats are derived:

- -
- -

- We can also select symlink targets or their types. - If some file is missing or is inaccessible due to permissions, only path is printed for it. -

- -

- Tip: if we are looking for files in the current directory and want to omit the „.“ we just call: find -printf '%P\0' instead of find -print0. -

- - -

Using relpipe-in-filesystem to read extended attributes

- -

- Extended attributes (xattr) are additional key=value pairs that can be attached to our files. - They are not stored inside the files, but on the filesystem. - Thus they are independent of particular file format (which might not support metadata) - and we can use them e.g. for tagging, cataloguing or adding some notes to our files. - Some tools like GNU Wget use extended attributes to store metadata like the original URL from which the file was downloaded. -

- - - -

And now we know where the files on our disk came from:

- -
- -

- If we like the BeOS/Haiku style, we can create empty files with some attributes attached and use our filesystem as a simple database - and query it using relational tools. - It will lack indexing, but for basic scenarios like an address book it will be fast enough - and we can feel a bit of BeOS/Haiku atmosphere in our contemporary GNU/Linux systems. - But be careful with that because some editors delete and recreate files while saving them, which destroys the xattrs. - Tools like rsync or tar with the --xattrs option will back up our attributes safely. -

- + "; + DIR=$(dirname "$XWG_STRANKA_SOUBOR"); + DIR="$DIR/../vstup" + cd "$DIR"; + # TODO: use XQuery? (but Grep and Bash are everywhere) + for f in examples-*.xml; do + grep -oP '(?<=).*(?=)' $f | tr \\n ' ' + echo "
  • $(grep -oP '(?<=).*(?=)' $f) – $(grep -oP '(?<=).+(?=)' $f)
  • "; + done | sort | sed -E 's/^[0-9]+ //' + echo ""; + ]]>