relpipe-data/examples-awk-debugging.xml
author František Kučera <franta-hg@frantovo.cz>
Sat, 06 Jun 2020 13:22:57 +0200
branch v_0
changeset 300 b9bd0f06b4a1
parent 258 2868d772c27e
permissions -rw-r--r--
MySQL seems to work well even with the libmyodbc5a.so ODBC driver, not only with libmyodbc5w.so

<stránka
	xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
	xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
	
	<nadpis>Debugging AWK transformations</nadpis>
	<perex>discovering variable mappings and transformation internals</perex>
	<m:pořadí-příkladu>02200</m:pořadí-příkladu>

	<text xmlns="http://www.w3.org/1999/xhtml">
		
		<p>In most cases, AWK transformations should be quite straightforward, but sometimes we need to look inside the box.</p>
		
		<h2>Mapping attributes to variables</h2>
		
		<p>
			Relations have named attributes, but in a language like AWK we work with named variables.
			In most cases, the names match 1:1, but not always.
			A mapping is needed because not every valid attribute name is also a valid variable name in a particular language, so some escaping or prefixing is sometimes necessary.
			For this reason, there is the <code>--debug-variable-mapping</code> option, which prints the mappings between attributes and variables.
		</p>
		
		<m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
	| relpipe-tr-awk \
		--relation '.*' \
			--for-each '1' \
			--debug-variable-mapping \
	| relpipe-out-tabular]]></m:pre>

		<p>This option prepends an additional relation containing this metadata to the stream:</p>
		
		<pre><![CDATA[fstab.variableMapping:
 ╭────────────────────┬───────────────────╮
 │ attribute (string) │ variable (string) │
 ├────────────────────┼───────────────────┤
 │ device             │ device            │
 │ dump               │ dump              │
 │ mount_point        │ mount_point       │
 │ options            │ options           │
 │ pass               │ pass              │
 │ scheme             │ scheme            │
 │ type               │ type              │
 ╰────────────────────┴───────────────────╯
Record count: 7
fstab:
 ╭─────────────────┬──────────────────────────────────────┬──────────────────────┬───────────────┬───────────────────────────────────────┬────────────────┬────────────────╮
 │ scheme (string) │ device                      (string) │ mount_point (string) │ type (string) │ options                      (string) │ dump (integer) │ pass (integer) │
 ├─────────────────┼──────────────────────────────────────┼──────────────────────┼───────────────┼───────────────────────────────────────┼────────────────┼────────────────┤
 │ UUID            │ 29758270-fd25-4a6c-a7bb-9a18302816af │ /                    │ ext4          │ relatime,user_xattr,errors=remount-ro │              0 │              1 │
 │                 │ /dev/sr0                             │ /media/cdrom0        │ udf,iso9660   │ user,noauto                           │              0 │              0 │
 │                 │ /dev/sde                             │ /mnt/data            │ ext4          │ relatime,user_xattr,errors=remount-ro │              0 │              2 │
 │ UUID            │ a2b5f230-a795-4f6f-a39b-9b57686c86d5 │ /home                │ btrfs         │ relatime                              │              0 │              2 │
 │                 │ /dev/mapper/sdf_crypt                │ /mnt/private         │ xfs           │ relatime                              │              0 │              2 │
 ╰─────────────────┴──────────────────────────────────────┴──────────────────────┴───────────────┴───────────────────────────────────────┴────────────────┴────────────────╯
Record count: 5]]></pre>

		<p>If we are interested only in the mappings, we can combine it with the <code>--drop</code> option:</p>

		<m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
	| relpipe-tr-awk \
		--relation '.*' \
			--for-each '1' \
			--debug-variable-mapping \
			--drop \
	| relpipe-out-tabular]]></m:pre>

		<p>which skips the actual data:</p>
		
		<pre><![CDATA[fstab.variableMapping:
 ╭────────────────────┬───────────────────╮
 │ attribute (string) │ variable (string) │
 ├────────────────────┼───────────────────┤
 │ device             │ device            │
 │ dump               │ dump              │
 │ mount_point        │ mount_point       │
 │ options            │ options           │
 │ pass               │ pass              │
 │ scheme             │ scheme            │
 │ type               │ type              │
 ╰────────────────────┴───────────────────╯
Record count: 7]]></pre>

		<p>Because there were no collisions, the variables have the same names as the attributes. But in this case:</p>
				
		<m:pre jazyk="bash"><![CDATA[relpipe-in-cli generate t 3 \
		"if"      string \
		"6pack"   string \
		"spa ces" string \
	| relpipe-tr-awk \
		--relation t \
			--debug-variable-mapping \
			--drop \
	| relpipe-out-tabular]]></m:pre>
	
		<p>the mapping rules come into play:</p>
		
		<pre><![CDATA[t.variableMapping:
 ╭────────────────────┬───────────────────╮
 │ attribute (string) │ variable (string) │
 ├────────────────────┼───────────────────┤
 │ 6pack              │ _6pack            │
 │ if                 │ _if               │
 │ spa ces            │ spa_ces           │
 ╰────────────────────┴───────────────────╯
Record count: 3]]></pre>

		<p>in order to make variable names valid in AWK.</p>
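		<p>
			The following is a minimal sketch of what such mapping rules might look like; it is not the code used by <code>relpipe-tr-awk</code>, whose actual rules may differ.
			The function name <code>map_attribute_to_variable</code> and the (incomplete) list of reserved words are assumptions made for this illustration.
			It only reproduces the three mappings shown above: invalid characters become underscores, and names that start with a digit or collide with an AWK reserved word get an underscore prefix.
		</p>

		<m:pre jazyk="bash"><![CDATA[#!/usr/bin/env bash
# Hypothetical approximation of the mapping rules (an assumption for illustration);
# the real rules live in relpipe-tr-awk and may differ.
map_attribute_to_variable() {
	local name="$1"
	# Replace every character that is not allowed in an AWK identifier with an underscore.
	local variable="${name//[^A-Za-z0-9_]/_}"
	# AWK identifiers must not start with a digit.
	if [[ "$variable" =~ ^[0-9] ]]; then
		variable="_$variable"
	fi
	# AWK reserved words cannot be used as variable names (incomplete list, for illustration only).
	case "$variable" in
		BEGIN|END|if|else|while|for|do|break|continue|function|return|delete|'in'|next|exit|getline|print|printf)
			variable="_$variable"
			;;
	esac
	printf '%s\n' "$variable"
}

# Print the mapping for the attribute names used in the examples above.
for attribute in "if" "6pack" "spa ces" "device"; do
	printf '%s -> %s\n' "$attribute" "$(map_attribute_to_variable "$attribute")"
done]]></m:pre>

		<p>For a name like <code>device</code>, none of the rules apply, so the sketch returns it unchanged, as in the <code>fstab</code> example.</p>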


		<h2>Inspecting the internals of an AWK transformation</h2>
		
		<p>
			The <code>relpipe-tr-awk</code> tool calls AWK as a child process and passes the data of the given relation to it for the actual processing.
			Because it executes the <code>awk</code> program found on the <code>$PATH</code>, we can easily switch between AWK implementations.
			In the source code repository, there is a wrapper script: <code>scripts/awk</code>.
			We can modify the <code>$PATH</code> so that <code>relpipe-tr-awk</code> calls this wrapper instead.
			The script captures the CLI arguments, STDIN, STDOUT, STDERR and the exit code and saves them to files in the temp directory.
			Using GNU Screen and <em>inotifywait</em>, we can build a kind of IDE and watch what happens inside during the transformation:
		</p>
		
		<m:img src="img/awk-wrapper-debug-1.png"/>
		
		<p>
			This way we can inspect the generated AWK code and the inputs and outputs of the AWK process.
			The recommended usage is described in the <code>scripts/awk</code> script itself.
		</p>
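		<p>
			To illustrate the idea of such a wrapper, here is a minimal sketch; it is not the <code>scripts/awk</code> from the repository, which should be preferred for real use.
			The variable names <code>REAL_AWK</code> and <code>AWK_DEBUG_DIR</code> and the names of the captured files are assumptions made for this example.
			The sketch creates a wrapper named <code>awk</code> in a temporary directory, puts that directory first on the <code>$PATH</code> and runs a plain AWK command through it.
		</p>

		<m:pre jazyk="bash"><![CDATA[#!/usr/bin/env bash
# Hypothetical sketch of an AWK wrapper; NOT the scripts/awk from the repository.
# REAL_AWK, AWK_DEBUG_DIR and the file names below are made up for this example.

wrapper_dir="$(mktemp -d)"            # directory that will be put first on $PATH
export REAL_AWK="$(command -v awk)"   # remember the real AWK before changing $PATH
export AWK_DEBUG_DIR="$(mktemp -d)"   # where the wrapper stores the captured files

cat > "$wrapper_dir/awk" <<'WRAPPER'
#!/usr/bin/env bash
# Save the CLI arguments, one per line.
printf '%s\n' "$@" > "$AWK_DEBUG_DIR/args"
# Copy STDIN, STDOUT and STDERR to files while passing them through unchanged.
tee "$AWK_DEBUG_DIR/stdin" \
	| "$REAL_AWK" "$@" 2> >(tee "$AWK_DEBUG_DIR/stderr" >&2) \
	| tee "$AWK_DEBUG_DIR/stdout"
# PIPESTATUS[1] is the exit code of the real AWK (the middle command of the pipeline).
status="${PIPESTATUS[1]}"
printf '%s\n' "$status" > "$AWK_DEBUG_DIR/exit-code"
exit "$status"
WRAPPER
chmod +x "$wrapper_dir/awk"

# Any program that looks up "awk" on $PATH now runs the wrapper instead of the real AWK.
printf 'a b\nc d\n' | PATH="$wrapper_dir:$PATH" awk '{ print $2 }'

echo "captured files are in: $AWK_DEBUG_DIR" >&2]]></m:pre>

		<p>
			After the run, the directory in <code>AWK_DEBUG_DIR</code> contains the files <code>args</code>, <code>stdin</code>, <code>stdout</code>, <code>stderr</code> and <code>exit-code</code>, which can be viewed or watched for changes.
			Note that this simple sketch waits for STDIN to be closed, so it is suited only to callers that feed data through a pipe.
		</p>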

	</text>
	
</stránka>