relpipe-data/examples-awk-changing-structure.xml
author František Kučera <franta-hg@frantovo.cz>
Tue, 28 May 2019 21:18:20 +0200
branch v_0
changeset 258 2868d772c27e
permissions -rw-r--r--
Release v0.12 – AWK
<stránka
	xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
	xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
	
	<nadpis>Changing structures with AWK</nadpis>
	<perex>adding or removing attributes or dropping a relation</perex>
	<m:pořadí-příkladu>02400</m:pořadí-příkladu>
	<text xmlns="http://www.w3.org/1999/xhtml">
		
		<p>
			AWK transformations can also change the structure of the transformed relation.
			This means adding or removing attributes, or dropping the whole relation.
		</p>
		
		<h2>Adding attributes with AWK</h2>
		
		<p>
			Using <code>--output-attribute</code> we can specify the output attributes.
			If we do not want to specify all of them explicitly and just want to add some new ones, we can use <code>--input-attributes-append</code> (or <code>--input-attributes-prepend</code>),
			which also preserves the input attributes:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
	| relpipe-tr-awk \
		--relation '.*' \
			--for-each '{ id = NR; record(); }' \
			--output-attribute id integer \
			--input-attributes-append \
	| relpipe-out-tabular]]></m:pre>
	
		<p>This adds one new attribute with ordinal numbers:</p>
		
		<pre><![CDATA[fstab:
 ╭──────────────┬─────────────────┬──────────────────────────────────────┬──────────────────────┬───────────────┬───────────────────────────────────────┬────────────────┬────────────────╮
 │ id (integer) │ scheme (string) │ device                      (string) │ mount_point (string) │ type (string) │ options                      (string) │ dump (integer) │ pass (integer) │
 ├──────────────┼─────────────────┼──────────────────────────────────────┼──────────────────────┼───────────────┼───────────────────────────────────────┼────────────────┼────────────────┤
 │            1 │ UUID            │ 29758270-fd25-4a6c-a7bb-9a18302816af │ /                    │ ext4          │ relatime,user_xattr,errors=remount-ro │              0 │              1 │
 │            2 │                 │ /dev/sr0                             │ /media/cdrom0        │ udf,iso9660   │ user,noauto                           │              0 │              0 │
 │            3 │                 │ /dev/sde                             │ /mnt/data            │ ext4          │ relatime,user_xattr,errors=remount-ro │              0 │              2 │
 │            4 │ UUID            │ a2b5f230-a795-4f6f-a39b-9b57686c86d5 │ /home                │ btrfs         │ relatime                              │              0 │              2 │
 │            5 │                 │ /dev/mapper/sdf_crypt                │ /mnt/private         │ xfs           │ relatime                              │              0 │              2 │
 ╰──────────────┴─────────────────┴──────────────────────────────────────┴──────────────────────┴───────────────┴───────────────────────────────────────┴────────────────┴────────────────╯
Record count: 5]]></pre>
		<h2>Removing attributes with AWK</h2>
		<p>Alternatively, we can omit all attributes except the explicitly specified ones:</p>
		
		<m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
	| relpipe-tr-awk \
		--relation '.*' \
			--for-each '{ type_big = toupper(type); record(); }' \
			--output-attribute mount_point string \
			--output-attribute type        string \
			--output-attribute type_big    string \
	| relpipe-out-tabular]]></m:pre>
		
		<p>This effectively removes the unlisted attributes:</p>
		
		<pre><![CDATA[fstab:
 ╭──────────────────────┬───────────────┬───────────────────╮
 │ mount_point (string) │ type (string) │ type_big (string) │
 ├──────────────────────┼───────────────┼───────────────────┤
 │ /                    │ ext4          │ EXT4              │
 │ /media/cdrom0        │ udf,iso9660   │ UDF,ISO9660       │
 │ /mnt/data            │ ext4          │ EXT4              │
 │ /home                │ btrfs         │ BTRFS             │
 │ /mnt/private         │ xfs           │ XFS               │
 ╰──────────────────────┴───────────────┴───────────────────╯
Record count: 5]]></pre>
		<p>AWK is a powerful language, so we can use conditions, for loops, etc. and write much more complex transformations.</p>
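		<p>
			The AWK constructs themselves can be tried without any relational tools.
			The following is only a minimal sketch in plain <code>awk</code>, not a <code>relpipe-tr-awk</code> invocation:
			the sample rows are made up to resemble the fstab table above, the function names are hypothetical,
			and fields are addressed as <code>$1</code>, <code>$2</code>, <code>$3</code> instead of attribute names.
			It combines a condition (only ext4 and btrfs rows) with a for loop (one output row per mount option):
		</p>

		<m:pre jazyk="bash"><![CDATA[# Hypothetical sample rows (mount_point, type, options), tab-separated;
# they mirror the fstab table above but are made up for this sketch.
sample_rows() {
	printf '%s\t%s\t%s\n' \
		'/'             'ext4'        'relatime,user_xattr,errors=remount-ro' \
		'/media/cdrom0' 'udf,iso9660' 'user,noauto' \
		'/home'         'btrfs'       'relatime'
}

# Condition: process only ext4 and btrfs rows.
# for loop: emit one output row per comma-separated option.
one_row_per_option() {
	sample_rows | awk -F '\t' '
		$2 == "ext4" || $2 == "btrfs" {
			n = split($3, opts, ",");
			for (i = 1; i <= n; i++) {
				printf("%s\t%s\n", $1, opts[i]);
			}
		}'
}

one_row_per_option]]></m:pre>

		<p>With this sample input, the cdrom row is skipped and the remaining two rows expand into four lines, one per option.</p>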
		
		<h2>Dropping a relation</h2>
		
		<p>
			A relation can be “dropped”, which means that the transformation will run but no relational output will be generated for it
			(even the header will be omitted, so it differs from just eliminating all records by a condition).
			Using AWK for such a simple operation as <code>DROP</code> may seem odd, but it can make sense when the side effects are intentional.
		</p>
		
		<p>
			Because the AWK code is executed for each record, we can, for example, write some output to a file or to STDERR:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
	| relpipe-tr-awk \
		--relation '.*' \
			--for-each '{ printf("%s → %s\n", device, mount_point) > "/dev/stderr" }' \
			--drop]]></m:pre>
			
		<p>This prints the following text:</p>
		
		<pre><![CDATA[29758270-fd25-4a6c-a7bb-9a18302816af → /
/dev/sr0 → /media/cdrom0
/dev/sde → /mnt/data
a2b5f230-a795-4f6f-a39b-9b57686c86d5 → /home
/dev/mapper/sdf_crypt → /mnt/private]]></pre>
		<p>
			In this mode, <code>relpipe-tr-awk</code> works much like an output filter (it converts relational data to another format).
			However, if there are multiple relations and some of them are not matched by <code>--relation</code>, they will be passed through and delivered to STDOUT in the relational format.
			STDERR may occasionally be polluted by warning messages, so writing such output to a dedicated file is safer.
		</p>
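		<p>
			Writing to a dedicated file uses the same AWK redirection syntax as writing to STDERR, only with a different target.
			The following is a minimal sketch in plain <code>awk</code>, so that it can be tried without any relational input:
			the sample rows, the <code>log_file</code> variable and its path are assumptions made for illustration.
			In a pipeline like the one above, the same redirection would presumably be placed in the <code>--for-each</code> code.
		</p>

		<m:pre jazyk="bash"><![CDATA[# Arbitrary log location chosen for this sketch.
log_file="${TMPDIR:-/tmp}/mounts.log"

# Hypothetical sample rows (device, mount_point), tab-separated.
printf '%s\t%s\n' \
	'/dev/sr0' '/media/cdrom0' \
	'/dev/sde' '/mnt/data' \
| awk -F '\t' -v log_file="$log_file" '
	# The target file is truncated when first opened;
	# later writes within the same awk run are appended to it.
	{ printf("%s → %s\n", $1, $2) > log_file }'

# STDOUT and STDERR stayed untouched; the text is in the file.
cat "$log_file"]]></m:pre>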
	</text>
	
</stránka>