relpipe-data/examples-awk-changing-structure.xml
author František Kučera <franta-hg@frantovo.cz>
Thu, 01 Aug 2019 11:59:39 +0200

<stránka
	xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
	xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
	
	<nadpis>Changing structures with AWK</nadpis>
	<perex>adding or removing attributes or dropping a relation</perex>
	<m:pořadí-příkladu>02400</m:pořadí-příkladu>

	<text xmlns="http://www.w3.org/1999/xhtml">
		
		<p>
			AWK transformations can also change the structure of the transformed relation.
			This means adding or removing attributes, or dropping the whole relation.
		</p>
		
		<h2>Adding attributes with AWK</h2>
		
		<p>
			Using <code>--output-attribute</code> we can specify the output attributes.
			If we do not want to list all of them explicitly and just want to add some new ones, we can use <code>--input-attributes-append</code> (or <code>--input-attributes-prepend</code>),
			which also preserves the input attributes:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
	| relpipe-tr-awk \
		--relation '.*' \
			--for-each '{ id = NR; record(); }' \
			--output-attribute id integer \
			--input-attributes-append \
	| relpipe-out-tabular]]></m:pre>
	
		<p>This adds one new attribute with ordinal numbers:</p>
		
		<pre><![CDATA[fstab:
 ╭──────────────┬─────────────────┬──────────────────────────────────────┬──────────────────────┬───────────────┬───────────────────────────────────────┬────────────────┬────────────────╮
 │ id (integer) │ scheme (string) │ device                      (string) │ mount_point (string) │ type (string) │ options                      (string) │ dump (integer) │ pass (integer) │
 ├──────────────┼─────────────────┼──────────────────────────────────────┼──────────────────────┼───────────────┼───────────────────────────────────────┼────────────────┼────────────────┤
 │            1 │ UUID            │ 29758270-fd25-4a6c-a7bb-9a18302816af │ /                    │ ext4          │ relatime,user_xattr,errors=remount-ro │              0 │              1 │
 │            2 │                 │ /dev/sr0                             │ /media/cdrom0        │ udf,iso9660   │ user,noauto                           │              0 │              0 │
 │            3 │                 │ /dev/sde                             │ /mnt/data            │ ext4          │ relatime,user_xattr,errors=remount-ro │              0 │              2 │
 │            4 │ UUID            │ a2b5f230-a795-4f6f-a39b-9b57686c86d5 │ /home                │ btrfs         │ relatime                              │              0 │              2 │
 │            5 │                 │ /dev/mapper/sdf_crypt                │ /mnt/private         │ xfs           │ relatime                              │              0 │              2 │
 ╰──────────────┴─────────────────┴──────────────────────────────────────┴──────────────────────┴───────────────┴───────────────────────────────────────┴────────────────┴────────────────╯
Record count: 5]]></pre>


		<h2>Removing attributes with AWK</h2>

		<p>Or we can omit all attributes except the explicitly specified ones:</p>
		
		<m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
	| relpipe-tr-awk \
		--relation '.*' \
			--for-each '{ type_big = toupper(type); record(); }' \
			--output-attribute mount_point string \
			--output-attribute type        string \
			--output-attribute type_big    string \
	| relpipe-out-tabular]]></m:pre>
		
		<p>which effectively removes unlisted attributes:</p>
		
		<pre><![CDATA[fstab:
 ╭──────────────────────┬───────────────┬───────────────────╮
 │ mount_point (string) │ type (string) │ type_big (string) │
 ├──────────────────────┼───────────────┼───────────────────┤
 │ /                    │ ext4          │ EXT4              │
 │ /media/cdrom0        │ udf,iso9660   │ UDF,ISO9660       │
 │ /mnt/data            │ ext4          │ EXT4              │
 │ /home                │ btrfs         │ BTRFS             │
 │ /mnt/private         │ xfs           │ XFS               │
 ╰──────────────────────┴───────────────┴───────────────────╯
Record count: 5]]></pre>


		<p>AWK is a powerful language, so we can use conditions, loops, etc. and write much more complex transformations.</p>
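		
		<p>
			As a minimal sketch of such a condition, the following pipeline reuses only the options shown above and adds an attribute whose value depends on the file system type.
			The attribute name <code>is_ext4</code> and its <code>yes</code>/<code>no</code> values are hypothetical and chosen just for illustration;
			the sketch assumes that a variable assigned inside an <code>if</code> statement is emitted the same way as <code>type_big</code> in the previous example:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
	| relpipe-tr-awk \
		--relation '.*' \
			--for-each '{ if (type == "ext4") is_ext4 = "yes"; else is_ext4 = "no"; record(); }' \
			--output-attribute is_ext4 string \
			--input-attributes-append \
	| relpipe-out-tabular]]></m:pre>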
		
		<h2>Dropping a relation</h2>
		
		<p>
			A relation can be “dropped”, which means that the transformation will run but no relational output will be generated for it
			(even the header will be omitted, so it differs from just eliminating all records by a condition).
			Using AWK for such a simple operation as <code>DROP</code> seems odd, but sometimes it makes sense because of intentional side effects.
		</p>
		
		<p>
			Because the AWK code is executed for each record, we can e.g. write some output to a file or to STDERR:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
	| relpipe-tr-awk \
		--relation '.*' \
			--for-each '{ printf("%s → %s\n", device, mount_point) > "/dev/stderr" }' \
			--drop]]></m:pre>
			
		<p>This prints the following text:</p>
		
		<pre><![CDATA[29758270-fd25-4a6c-a7bb-9a18302816af → /
/dev/sr0 → /media/cdrom0
/dev/sde → /mnt/data
a2b5f230-a795-4f6f-a39b-9b57686c86d5 → /home
/dev/mapper/sdf_crypt → /mnt/private]]></pre>

		<p>
			In this case, <code>relpipe-tr-awk</code> works much like an output filter (it converts relational data to another format).
			However, if there are multiple relations and some of them are not matched by <code>--relation</code>, they will be passed through and delivered to STDOUT in the relational format.
			STDERR might occasionally be polluted by warning messages, so writing such output to a dedicated file is safer.
		</p>
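		
		<p>
			The dedicated-file variant might look like the following sketch, which differs from the STDERR example only in the redirection target.
			The path <code>/tmp/fstab-mounts.txt</code> is a hypothetical name chosen for illustration.
			Under standard AWK semantics, <code>&gt;</code> truncates the file when it is first opened and later writes from the same AWK process are added to it;
			this sketch assumes that all records of the relation are processed by a single AWK process:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
	| relpipe-tr-awk \
		--relation '.*' \
			--for-each '{ printf("%s → %s\n", device, mount_point) > "/tmp/fstab-mounts.txt" }' \
			--drop]]></m:pre>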


	</text>
	
</stránka>