relpipe-data/examples.xml
author František Kučera <franta-hg@frantovo.cz>
Wed, 12 Dec 2018 23:27:45 +0100
branch v_0
changeset 210 f0a2916368e2
parent 209 74fecc2ba590
child 212 bf9a704dc916
permissions -rw-r--r--
small fixes and improvements

<stránka
	xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
	xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
	
	<nadpis>Examples</nadpis>
	<perex>Usage examples of Relational pipes tools</perex>
	<pořadí>40</pořadí>

	<text xmlns="http://www.w3.org/1999/xhtml">
		
		
		<p>
			All examples were tested in <a href="https://www.gnu.org/software/bash/">GNU Bash</a>,
			but they should also work in other shells.
		</p>
		
		<h3>relpipe-in-cli: Hello World!</h3>
		
		<p>
			Let's start with an obligatory Hello World example.
		</p>
		
		<m:pre jazyk="bash"><![CDATA[relpipe-in-cli generate "relation_from_cli" 3 \
	"a" "integer" \
	"b" "string" \
	"c" "boolean" \
	"1" "Hello" "true" \
	"2" "World!" "false"]]></m:pre>
	
		<p>
			This command generates relational data.
			To view them, we need to convert them to some other format.
			For now, we will use the "tabular" format and pipe the relational data to <code>relpipe-out-tabular</code>.
		</p>
		
		<m:pre jazyk="bash"><![CDATA[relpipe-in-cli generate "relation_from_cli" 3 \
		"a" "integer" \
		"b" "string" \
		"c" "boolean" \
		"1" "Hello" "true" \
		"2" "World!" "false" \
	| relpipe-out-tabular]]></m:pre>
	
		<p>Output:</p>

		<pre><![CDATA[relation_from_cli:
 ╭─────────────┬────────────┬─────────────╮
 │ a (integer) │ b (string) │ c (boolean) │
 ├─────────────┼────────────┼─────────────┤
 │           1 │ Hello      │        true │
 │           2 │ World!     │       false │
 ╰─────────────┴────────────┴─────────────╯
Record count: 2
]]></pre>

		<p>
			As we can see above, the syntax is simple: we specify the name of the relation, the number of attributes,
			and then their definitions (names and types),
			followed by the data.
		</p>

		<p>
			A single stream may contain multiple relations:
		</p>		
		
		<m:pre jazyk="bash"><![CDATA[(relpipe-in-cli generate a 1 x string hello; \
 relpipe-in-cli generate b 1 y string world) \
	| relpipe-out-tabular]]></m:pre>
			
		<p>
			Thus we can combine various commands or files and pass the result to a single relational output filter (<code>relpipe-out-tabular</code> in this case) and get:
		</p>
		
		<pre><![CDATA[a:
 ╭────────────╮
 │ x (string) │
 ├────────────┤
 │ hello      │
 ╰────────────╯
Record count: 1
b:
 ╭────────────╮
 │ y (string) │
 ├────────────┤
 │ world      │
 ╰────────────╯
Record count: 1]]></pre>
		
		<h3>relpipe-in-cli: STDIN</h3>
		
		<p>
			The number of CLI arguments is limited and they are passed to the process all at once.
			So there is an option to pass the values on STDIN instead of as CLI arguments.
			Values on STDIN are expected to be separated by a null byte.
			We can generate such data e.g. using <code>echo</code> and <code>tr</code> (or using <code>printf</code> or other commands):
		</p>
		
		<m:pre jazyk="bash"><![CDATA[echo -e "1\nHello\ntrue\n2\nWorld\nfalse" \
	| tr \\n \\0 \
	| relpipe-in-cli generate-from-stdin relation_from_stdin 3 \
		a integer \
		b string \
		c boolean \
	| relpipe-out-tabular]]></m:pre>
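
		<p>
			The <code>printf</code> variant might look like the following sketch.
			The helper name <code>nul_separated</code> is only an illustration and is not part of the Relational pipes tools.
			Unlike the <code>echo</code> and <code>tr</code> combination, it also preserves values that contain line breaks.
		</p>

		<m:pre jazyk="bash"><![CDATA[# Print every argument followed by a null byte
# (nul_separated is a hypothetical helper name).
nul_separated() {
	printf '%s\0' "$@"
}

# Inspect the stream: turn the null bytes back into line breaks.
nul_separated 1 Hello true 2 World false | tr '\0' '\n']]></m:pre>

		<p>
			The output of <code>nul_separated</code> can be piped to <code>relpipe-in-cli generate-from-stdin</code>
			in the same way as the output of <code>tr</code> in the example above.
		</p>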

		<p>
			The output is the same as above, apart from the relation name and the missing exclamation mark.
			We can use this approach to convert various formats to relational data.
			There is a lot of data already in the form of null-separated values, e.g. the process arguments:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[cat /proc/$(pidof mc)/cmdline \
	| relpipe-in-cli generate-from-stdin mc_args 1 a string \
	| relpipe-out-tabular
]]></m:pre>
	
		<p>If we have <code>mc /etc/ /tmp/</code> running in some other terminal, the output will be:</p>
		
		<pre><![CDATA[mc_args:
 ╭────────────╮
 │ a (string) │
 ├────────────┤
 │ mc         │
 │ /etc/      │
 │ /tmp/      │
 ╰────────────╯
Record count: 3]]></pre>

		<p>
			Also the <code>find</code> command can produce data separated by the null-byte:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[find /etc/ -name '*ssh*_*' -print0 \
	| relpipe-in-cli generate-from-stdin files 1 file_name string \
	| relpipe-out-tabular]]></m:pre>
	
		<p>Will display something like this:</p>
		
		<pre><![CDATA[files:
 ╭───────────────────────────────────╮
 │ file_name                (string) │
 ├───────────────────────────────────┤
 │ /etc/ssh/ssh_host_ecdsa_key       │
 │ /etc/ssh/sshd_config              │
 │ /etc/ssh/ssh_host_ed25519_key.pub │
 │ /etc/ssh/ssh_host_ecdsa_key.pub   │
 │ /etc/ssh/ssh_host_rsa_key         │
 │ /etc/ssh/ssh_config               │
 │ /etc/ssh/ssh_host_ed25519_key     │
 │ /etc/ssh/ssh_import_id            │
 │ /etc/ssh/ssh_host_rsa_key.pub     │
 ╰───────────────────────────────────╯
Record count: 9]]></pre>
		
		
		<h3>relpipe-in-fstab</h3>
		
		<p>
			Using the <code>relpipe-in-fstab</code> command, we can convert <code>/etc/fstab</code> or <code>/etc/mtab</code> to relational data
		</p>
		
		<m:pre jazyk="bash"><![CDATA[relpipe-in-fstab | relpipe-out-tabular]]></m:pre>
		
		<p>
			and see them as a nice table:
		</p>
		
		<pre><![CDATA[fstab:
 ╭─────────────────┬──────────────────────────────────────┬──────────────────────┬───────────────┬───────────────────────────────────────┬────────────────┬────────────────╮
 │ scheme (string) │ device                      (string) │ mount_point (string) │ type (string) │ options                      (string) │ dump (integer) │ pass (integer) │
 ├─────────────────┼──────────────────────────────────────┼──────────────────────┼───────────────┼───────────────────────────────────────┼────────────────┼────────────────┤
 │ UUID            │ 29758270-fd25-4a6c-a7bb-9a18302816af │ /                    │ ext4          │ relatime,user_xattr,errors=remount-ro │              0 │              1 │
 │                 │ /dev/sr0                             │ /media/cdrom0        │ udf,iso9660   │ user,noauto                           │              0 │              0 │
 │                 │ /dev/sde                             │ /mnt/data            │ ext4          │ relatime,user_xattr,errors=remount-ro │              0 │              2 │
 │ UUID            │ a2b5f230-a795-4f6f-a39b-9b57686c86d5 │ /home                │ btrfs         │ relatime                              │              0 │              2 │
 │                 │ /dev/mapper/sdf_crypt                │ /mnt/private         │ xfs           │ relatime                              │              0 │              2 │
 ╰─────────────────┴──────────────────────────────────────┴──────────────────────┴───────────────┴───────────────────────────────────────┴────────────────┴────────────────╯
Record count: 5]]></pre>

		<p>We can do the same with a remote <code>fstab</code> or <code>mtab</code> just by adding <code>ssh</code> to the pipeline:</p>

		<m:pre jazyk="bash"><![CDATA[ssh example.com cat /etc/mtab | relpipe-in-fstab | relpipe-out-tabular]]></m:pre>
		
		<p>
			The <code>cat</code> runs remotely. The <code>relpipe-in-fstab</code> and <code>relpipe-out-tabular</code> run on our machine.
		</p>
		
		<p>
			n.b. <code>relpipe-in-fstab</code> reads <code>/etc/fstab</code> if executed on a TTY. Otherwise, it reads STDIN.
		</p>
		
		<h3>relpipe-out-xml</h3>
		
		<p>
			Relational data can be converted to various formats, and one of them is XML.
			This is a good option for further processing e.g. using XSLT transformation or passing the XML data to some other tool.
			Just use <code>relpipe-out-xml</code> instead of <code>relpipe-out-tabular</code> and the rest of the pipeline remains unchanged:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[ssh example.com cat /etc/mtab | relpipe-in-fstab | relpipe-out-xml]]></m:pre>
		
		<p>
			Will produce XML like this:
		</p>
		
		<m:pre jazyk="xml"><![CDATA[<?xml version="1.0" encoding="UTF-8"?>
<pipe>
	<relation>
		<name>fstab</name>
		<record>
			<attribute>UUID</attribute>
			<attribute>29758270-fd25-4a6c-a7bb-9a18302816af</attribute>
			<attribute>/</attribute>
			<attribute>ext4</attribute>
			<attribute>relatime,user_xattr,errors=remount-ro</attribute>
			<attribute>0</attribute>
			<attribute>1</attribute>
		</record>
		<record>
			<attribute></attribute>
			<attribute>/dev/sr0</attribute>
			<attribute>/media/cdrom0</attribute>
			<attribute>udf,iso9660</attribute>
			<attribute>user,noauto</attribute>
			<attribute>0</attribute>
			<attribute>0</attribute>
		</record>
		<record>
			<attribute></attribute>
			<attribute>/dev/sde</attribute>
			<attribute>/mnt/data</attribute>
			<attribute>ext4</attribute>
			<attribute>relatime,user_xattr,errors=remount-ro</attribute>
			<attribute>0</attribute>
			<attribute>2</attribute>
		</record>
		<record>
			<attribute>UUID</attribute>
			<attribute>a2b5f230-a795-4f6f-a39b-9b57686c86d5</attribute>
			<attribute>/home</attribute>
			<attribute>btrfs</attribute>
			<attribute>relatime</attribute>
			<attribute>0</attribute>
			<attribute>2</attribute>
		</record>
		<record>
			<attribute></attribute>
			<attribute>/dev/mapper/sdf_crypt</attribute>
			<attribute>/mnt/private</attribute>
			<attribute>xfs</attribute>
			<attribute>relatime</attribute>
			<attribute>0</attribute>
			<attribute>2</attribute>
		</record>
	</relation>
</pipe>]]></m:pre>

		<p>
			Thanks to XSLT, this XML can be easily converted e.g. to an XHTML table (<code>table|tr|td</code>) or other format.
			Someone can convert such data to a (La)TeX table.
		</p>
		
		<p>
			n.b. the format is not final and will change in future versions (XML namespace, more metadata etc.).
		</p>
		
		
		<h3>relpipe-tr-validator</h3>
		
		<p>
			Just a passthrough command, so these pipelines should produce the same hash:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[
relpipe-in-fstab | relpipe-tr-validator | sha512sum
relpipe-in-fstab | sha512sum]]></m:pre>

		<p>
			This tool can be used for testing whether a file contains valid relational data:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[
if relpipe-tr-validator < "some-file.rp" &> /dev/null; then
	echo "valid relational data";
else
	echo "garbage";
fi]]></m:pre>
		
		<p>or as a one-liner:</p>
		
		<m:pre jazyk="bash"><![CDATA[relpipe-tr-validator < "some-file.rp" &> /dev/null && echo "ok" || echo "error"]]></m:pre>
		
		<p>
			If an error is found, it is reported on STDERR. So just omit the <code>&amp;</code> (i.e. redirect with <code>&gt;</code> instead of <code>&amp;&gt;</code>) in order to see the error message.
		</p>
		
		
		<h3>/etc/fstab formatting using -in-fstab, -out-nullbyte, xargs and Perl</h3>
		
		<p>
			As we have seen before, we can convert <code>/etc/fstab</code> (or <code>mtab</code>)
			to e.g. an XML or a nice and colorful table using <m:name/>.
			But we can also convert these data back to the <code>fstab</code> format. And do it with proper indentation/padding.
			Fstab has a simple format where values are separated by one or more whitespace characters.
			But without proper indentation, these files look a bit obfuscated and hard to read (however, they are valid).
		</p>
		
		<m:pre jazyk="text" src="examples/relpipe-out-fstab.txt"/>
		
		<p>
			So let's build a pipeline that reformats the <code>fstab</code> and makes it more readable.
		</p>
			
		<m:pre jazyk="bash">relpipe-in-fstab | relpipe-out-fstab &gt; reformatted-fstab.txt</m:pre>
			
		<p>
			We can hack together a script called <code>relpipe-out-fstab</code> that accepts relational data and produces <code>fstab</code> data.
			Later this will probably be implemented as a regular tool, but for now, it is just an example of an ad-hoc shell script:
		</p>
		
		<m:pre jazyk="bash" src="examples/relpipe-out-fstab.sh" odkaz="ano"/>
		
		<p>
			In the first part, we prepend a single record (<code>relpipe-in-cli</code>) before the data coming from STDIN (<code>cat</code>).
			Then, we use <code>relpipe-out-nullbyte</code> to convert relational data to values separated by a null-byte.
			This command processes only attribute values (skips relation and attribute names).
			Then we use <code>xargs</code> to read the null-separated values and execute a Perl command for each record (passing it as many arguments as we have attributes: <code>--max-args=7</code>).
			Perl does the actual formatting: it adds padding and does some minor tuning (merges two attributes and replaces empty values with <em>none</em>).
		</p>
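
		<p>
			The <code>xargs</code> step can be tried in isolation.
			The following is only a sketch of the technique, not the script linked above:
			it assumes three attributes per record instead of seven, uses made-up values
			and lets the shell's <code>printf</code> do the padding instead of Perl.
			The name <code>format_records</code> is hypothetical.
		</p>

		<m:pre jazyk="bash"><![CDATA[# Read null-separated values from STDIN and run one formatting command
# per record, i.e. per group of three values (format_records is a made-up name).
format_records() {
	xargs -0 -n 3 sh -c 'printf "%-12s %-8s %s\n" "$1" "$2" "$3"' _
}

# Two sample records with three attributes each (illustrative values only).
printf '%s\0' /dev/sda1 ext4 defaults /dev/sr0 udf noauto | format_records]]></m:pre>

		<p>
			Each record ends up on its own line with the first two values padded to a fixed width.
			The <code>_</code> only fills the <code>$0</code> slot of the inner shell, so that the values start at <code>$1</code>.
		</p>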
		
		<p>This is the formatted version of the <code>fstab</code> above:</p>
		
		<m:pre jazyk="text" src="examples/relpipe-out-fstab.formatted.txt"/>
		
		<p>
			Using the following command, we can verify that the files differ only in comments and whitespace:
		</p>
		
		<pre>relpipe-in-fstab | relpipe-out-fstab | diff -w /etc/fstab -</pre>

		<p>
			Another check (should print the same hashes):
		</p>
		
		<pre><![CDATA[relpipe-in-fstab | sha512sum 
relpipe-in-fstab | relpipe-out-fstab | relpipe-in-fstab | sha512sum]]></pre>
		
		<p>
			A regular implementation of <code>relpipe-out-fstab</code> will probably keep the comments
			(this also needs one more attribute and a small change in <code>relpipe-in-fstab</code>).
		</p>
		
		<p>
			For mere <code>fstab</code> reformatting, this approach is a bit over-engineered.
			We could skip the whole relational thing and do just something like this:
		</p>
		
		<m:pre jazyk="bash">cat /etc/fstab | grep -v '^#' | sed -E 's/\s+/\n/g' | tr \\n \\0 | xargs -0 -n7 ...</m:pre>
		
		<p>
			plus prepend the comment (or do everything in Perl).
			But this example is intended as a demonstration of how we can
			1) prepend some additional data before the data from STDIN and
			2) use <m:name/> together with traditional tools like <code>xargs</code> or <code>perl</code>.
			And BTW we have implemented a (simple but working) <em>relpipe output filter</em> – and did it without any serious programming, we just put some existing commands together :-)
		</p>
		
		<blockquote>
			<p>
				There is more Unix-nature in one line of shell script than there is in ten thousand lines of C.
				<m:podČarou>see <a href="http://www.catb.org/~esr/writings/unix-koans/ten-thousand.html">Master Foo and the Ten Thousand Lines</a></m:podČarou>
			</p>
		</blockquote>
		
		<h2>Rename VG in /etc/fstab using relpipe-tr-sed</h2>
		
		<p>
			Assume that we have an <code>/etc/fstab</code> with many lines defining the mount-points (directories) of particular devices (disks) and we are using LVM.
			If we rename a volume group (VG), we have to change all of them. The lines look like this one:
		</p>
		
		<pre>/dev/alpha/photos    /mnt/photos/    btrfs    noauto,noatime,nodiratime    0  0</pre>
		
		<p>
			We want to change all lines from <code>alpha</code> to <code>beta</code> (the new VG name).
			This can be done by the power of regular expressions<m:podČarou>see <a href="https://en.wikibooks.org/wiki/Regular_Expressions/Simple_Regular_Expressions">Regular Expressions</a> at Wikibooks</m:podČarou> and this pipeline:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
	| relpipe-tr-sed 'fstab' 'device' '^/dev/alpha/' '/dev/beta/' \
	| relpipe-out-fstab]]></m:pre>
	
		<p>
			The <code>relpipe-tr-sed</code> tool works only with the given relation (<code>fstab</code>) and the given attribute (<code>device</code>)
			and leaves other relations and attributes in the stream untouched.
			So it does not replace strings in unwanted places (if there are any random matches).
		</p>
		
		<p>
			Even the relation names and attribute names are specified as a regular expression, so we can (purposefully) modify multiple relations or attributes.
			For example, we can put zeroes in both the <code>dump</code> and <code>pass</code> attributes:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[relpipe-in-fstab | relpipe-tr-sed 'fstab' 'dump|pass' '.*' '0' | relpipe-out-fstab]]></m:pre>
		
		<p>
			n.b. the data types must be respected: we cannot, for example, put <code>abc</code> in the <code>pass</code> attribute because it is declared as <code>integer</code>.
		</p>
		
		<h2>Using relpipe-tr-sed with groups and backreferences</h2>
		
		<p>
			This tool also supports regex groups and backreferences. Thus we can use parts of the matched string in our replacement string:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[relpipe-in-cli generate r 1 a string "some string xxx_123 some zzz_456 other" \
	| relpipe-tr-sed 'r' 'a' '([a-z]{3})_([0-9]+)' '$2:$1' \
	| relpipe-out-tabular]]></m:pre>
		
		<p>Which would convert this:</p>
		<pre><![CDATA[r:
 ╭────────────────────────────────────────╮
 │ a                             (string) │
 ├────────────────────────────────────────┤
 │ some string xxx_123 some zzz_456 other │
 ╰────────────────────────────────────────╯
Record count: 1]]></pre>
		
		<p>into this:</p>
		<pre><![CDATA[r:
 ╭────────────────────────────────────────╮
 │ a                             (string) │
 ├────────────────────────────────────────┤
 │ some string 123:xxx some 456:zzz other │
 ╰────────────────────────────────────────╯
Record count: 1]]></pre>

		<p>
			If there were any other relations or attributes in the stream, they would be unaffected by this transformation,
			because we specified <code>'r' 'a'</code> instead of some wider regular expression that would match more relations or attributes.
		</p>
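
		<p>
			For comparison, here is a rough plain-text equivalent using traditional <code>sed</code>.
			This is only a sketch and <code>swap_pairs</code> is a made-up name.
			Here the groups are referenced as <code>\2</code> and <code>\1</code> instead of <code>$2</code> and <code>$1</code>,
			and the <code>g</code> flag makes <code>sed</code> replace every occurrence on a line.
			It knows nothing about relations or attributes, so it rewrites every line of its input.
		</p>

		<m:pre jazyk="bash"><![CDATA[# Turn each "abc_123" style pair into "123:abc" on every input line
# (swap_pairs is a hypothetical helper name).
swap_pairs() {
	sed -E 's/([a-z]{3})_([0-9]+)/\2:\1/g'
}

echo "some string xxx_123 some zzz_456 other" | swap_pairs
# prints: some string 123:xxx some 456:zzz other]]></m:pre>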
		
		<h2>Filter /etc/fstab using relpipe-tr-grep</h2>
		
		<p>
			If we are interested only in certain records in some relation, we can filter it using <code>relpipe-tr-grep</code>.
			If we want to list e.g. only Btrfs and XFS file systems from our <code>fstab</code> (see above), we will run:
		</p>
		
		
		<m:pre jazyk="bash"><![CDATA[relpipe-in-fstab | relpipe-tr-grep 'fstab' 'type' 'btrfs|xfs' | relpipe-out-tabular]]></m:pre>
				
		<p>and we will get the following filtered result:</p>
		<pre><![CDATA[fstab:
 ╭─────────────────┬──────────────────────────────────────┬──────────────────────┬───────────────┬──────────────────┬────────────────┬────────────────╮
 │ scheme (string) │ device                      (string) │ mount_point (string) │ type (string) │ options (string) │ dump (integer) │ pass (integer) │
 ├─────────────────┼──────────────────────────────────────┼──────────────────────┼───────────────┼──────────────────┼────────────────┼────────────────┤
 │ UUID            │ a2b5f230-a795-4f6f-a39b-9b57686c86d5 │ /home                │ btrfs         │ relatime         │              0 │              2 │
 │                 │ /dev/mapper/sdf_crypt                │ /mnt/private         │ xfs           │ relatime         │              0 │              2 │
 ╰─────────────────┴──────────────────────────────────────┴──────────────────────┴───────────────┴──────────────────┴────────────────┴────────────────╯
Record count: 2]]></pre>

		<p>
			Command arguments are similar to <code>relpipe-tr-sed</code>.
			Everything is a regular expression.
			Only relations matching the regex will be filtered, others will flow through the pipeline unmodified.
			If the attribute regex matches multiple attribute names, filtering will be done with logical OR,
			i.e. the record is included if at least one of those attributes matches the search regex.
		</p>
		
		<p>
			If we need an exact match of the whole attribute, we have to use something like <code>'^(btrfs|xfs)$'</code>,
			otherwise a mere substring match is enough to include the record.
		</p>
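
		<p>
			Grouping matters when anchors are combined with alternation.
			The sketch below uses <code>grep -E</code> on made-up values as a stand-in,
			on the assumption that the regular expressions of <code>relpipe-tr-grep</code> treat anchors and alternation the same way.
			The helper <code>sample_types</code> and the values <code>btrfs-old</code> and <code>my-xfs</code> are hypothetical.
		</p>

		<m:pre jazyk="bash"><![CDATA[# Made-up attribute values, one per line (sample_types is a hypothetical helper).
sample_types() {
	printf '%s\n' btrfs xfs btrfs-old my-xfs ext4
}

# Without a group, each anchor belongs to one alternative only:
# "starts with btrfs" OR "ends with xfs".
sample_types | grep -E '^btrfs|xfs$'
# prints: btrfs, xfs, btrfs-old and my-xfs (one per line)

# With a group, both anchors apply to the whole alternation:
# exactly "btrfs" or exactly "xfs".
sample_types | grep -E '^(btrfs|xfs)$'
# prints: btrfs and xfs (one per line)]]></m:pre>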
		
	</text>

</stránka>