relpipe-data/examples-guile-filtering.xml
author František Kučera <franta-hg@frantovo.cz>
Thu, 07 Feb 2019 11:52:32 +0100
branch v_0
changeset 245 4919c8098008
child 258 2868d772c27e
permissions -rw-r--r--
examples: Complex filtering with Guile

<stránka
	xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
	xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
	
	<nadpis>Complex filtering with Guile</nadpis>
	<perex>filtering records with AND, OR and functions</perex>
	<m:pořadí-příkladu>01400</m:pořadí-příkladu>

	<text xmlns="http://www.w3.org/1999/xhtml">
		
		<p>
			For simple filtering, we can use <code>relpipe-tr-grep</code>.
			But what if we need to write a complex query that contains AND and OR operators?
			What if we need to, e.g., compare numbers – not only match texts against regular expressions?
			There is a tool capable of doing this and much more: <code>relpipe-tr-guile</code>!
		</p>
		
		<p>
			<a href="https://www.gnu.org/software/guile/">Guile</a> is the GNU implementation of the Scheme language (something like Lisp and also full of parentheses).
			The <code>relpipe-tr-guile</code> tool uses GNU Guile as a library: it puts data into the Guile context, evaluates Guile expressions, then reads data back from the Guile context and generates relational output from them.
			The good news is that it is not necessary to know Lisp/Scheme to use this tool. For the first steps, it can be used just as a query language – like SQL, just a bit Polish.
		</p>
		
		<h2>Filtering numbers</h2>
		
		<p>
			We are looking for „satanistic“ icons in our filesystem – those that have size = 666 bytes.
		</p>
		
		<m:pre jazyk="bash"><![CDATA[find /usr/share/icons/ -type f -print0 \
	| relpipe-in-filesystem \
	| relpipe-tr-guile --relation 'files.*' --where '(= $size 666)' \
	| relpipe-out-tabular]]></m:pre>
	
		<p>Well, well… here we are:</p>
		
		<m:pre jazyk="text"><![CDATA[filesystem:
 ╭───────────────────────────────────────────────────────────────────────┬───────────────┬────────────────┬────────────────┬────────────────╮
 │ path                                                         (string) │ type (string) │ size (integer) │ owner (string) │ group (string) │
 ├───────────────────────────────────────────────────────────────────────┼───────────────┼────────────────┼────────────────┼────────────────┤
 │ /usr/share/icons/elementary-xfce/actions/24/tab-new.png               │ f             │            666 │ root           │ root           │
 │ /usr/share/icons/elementary-xfce/apps/16/clock.png                    │ f             │            666 │ root           │ root           │
 │ /usr/share/icons/elementary-xfce/mimes/22/x-office-spreadsheet.png    │ f             │            666 │ root           │ root           │
 │ /usr/share/icons/Tango/22x22/apps/office-calendar.png                 │ f             │            666 │ root           │ root           │
 │ /usr/share/icons/Tango/16x16/actions/process-stop.png                 │ f             │            666 │ root           │ root           │
 │ /usr/share/icons/breeze/actions/24/align-vertical-center.svg          │ f             │            666 │ root           │ root           │
 │ /usr/share/icons/breeze/devices/22/camera-photo.svg                   │ f             │            666 │ root           │ root           │
 │ /usr/share/icons/oxygen/base/48x48/actions/tab-detach.png             │ f             │            666 │ root           │ root           │
 │ /usr/share/icons/oxygen/base/32x32/actions/insert-horizontal-rule.png │ f             │            666 │ root           │ root           │
 │ /usr/share/icons/breeze-dark/actions/24/align-vertical-center.svg     │ f             │            666 │ root           │ root           │
 │ /usr/share/icons/breeze-dark/devices/22/camera-photo.svg              │ f             │            666 │ root           │ root           │
 │ /usr/share/icons/gnome/22x22/status/weather-overcast.png              │ f             │            666 │ root           │ root           │
 │ /usr/share/icons/gnome/16x16/actions/go-home.png                      │ f             │            666 │ root           │ root           │
 ╰───────────────────────────────────────────────────────────────────────┴───────────────┴────────────────┴────────────────┴────────────────╯
Record count: 13]]></m:pre>

		<p>The <code>--relation 'files.*'</code> option takes a regular expression that says which relations should be processed in Guile (here it matches the <code>filesystem</code> relation) – others are passed through unchanged.</p>
		
		<p>
			The <code>--where '(= $size 666)'</code> is our condition. 
			The Polish<m:podČarou>see <a href="https://en.wikipedia.org/wiki/Polish_notation">Polish notation</a></m:podČarou> thing means that we write <code>= $size 666</code> instead of <code>$size = 666</code>.
			It seems a bit weird but it makes sense – the <code>=</code> is a function that compares two numbers and returns a boolean value – 
			so we just call this function and pass <code>$size</code> and <code>666</code> to it as arguments.
			And because it is a function, there are <code>(</code>parentheses<code>)</code>.
		</p>
		
		<p>
			Relational attributes are mapped to Guile variables with the same name, just prefixed with <code>$</code>
			(we considered the <code>
				<abbr title="Bitcoin">₿</abbr>
			</code> symbol, but <code>$</code> still seems to be more common on keyboards in 2019).
			While a relational attribute name is an arbitrary string, Guile variable names have some limitations, so not all attributes can be mapped – those with spaces and some special characters are currently unsupported (this will be fixed in later versions by some kind of encoding/escaping).
		</p>
		
		<p>
			We can also look for 
			<code>--where '(&gt; $size 100)'</code> which means „size is greater than 100“
			or
			<code>--where '(&lt; $size 100)'</code> which means „size is smaller than 100“.
			The <code>&gt;=</code> and <code>&lt;=</code> also work as expected.
		</p>
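
		<p>
			Scheme comparison functions accept more than two arguments, so a range check can be written as a single call.
			The following is a minimal sketch, assuming that <code>relpipe-tr-guile</code> hands the expression to Guile unchanged;
			the <code>WHERE</code> shell variable is just our illustrative name and the bounds 100 and 1000 are arbitrary:
		</p>

		<m:pre jazyk="bash"><![CDATA[# (<= 100 $size 1000) is true when 100 <= size <= 1000
WHERE='(<= 100 $size 1000)'

find /usr/share/icons/ -type f -print0 \
	| relpipe-in-filesystem --file path --file size \
	| relpipe-tr-guile --relation 'files.*' --where "$WHERE" \
	| relpipe-out-tabular]]></m:pre>

		<p>The single quotes keep the shell from expanding <code>$size</code>, so Guile receives the variable name intact.</p>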
		
		<h2>Filtering strings</h2>
		
		<p>
			Scheme is a strongly typed language and we have to use the proper functions/operators for each type.
			For strings, it is the <code>string=</code> function instead of <code>=</code>:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
	| relpipe-tr-guile --relation 'fstab' --where '(string= $type "btrfs")' \
	| relpipe-out-tabular]]></m:pre>
	
		<p>The Btrfs filesystems in our <code>fstab</code>:</p>

		<m:pre jazyk="text"><![CDATA[fstab:
 ╭─────────────────┬──────────────────────────────────────┬──────────────────────┬───────────────┬──────────────────┬────────────────┬────────────────╮
 │ scheme (string) │ device                      (string) │ mount_point (string) │ type (string) │ options (string) │ dump (integer) │ pass (integer) │
 ├─────────────────┼──────────────────────────────────────┼──────────────────────┼───────────────┼──────────────────┼────────────────┼────────────────┤
 │ UUID            │ a2b5f230-a795-4f6f-a39b-9b57686c86d5 │ /home                │ btrfs         │ relatime         │              0 │              2 │
 ╰─────────────────┴──────────────────────────────────────┴──────────────────────┴───────────────┴──────────────────┴────────────────┴────────────────╯
Record count: 1]]></m:pre>

		<p>
			There is also <code>string-prefix?</code> which evaluates whether the first string is a prefix of the second string:
		</p>

		<m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
	| relpipe-tr-guile --relation 'fstab' --where '(string-prefix? "/mnt" $mount_point)' \
	| relpipe-out-tabular]]></m:pre>
		
		<p>So we can find filesystems mounted somewhere under <code>/mnt</code>:</p>

		<m:pre jazyk="text"><![CDATA[fstab:
 ╭─────────────────┬───────────────────────┬──────────────────────┬───────────────┬───────────────────────────────────────┬────────────────┬────────────────╮
 │ scheme (string) │ device       (string) │ mount_point (string) │ type (string) │ options                      (string) │ dump (integer) │ pass (integer) │
 ├─────────────────┼───────────────────────┼──────────────────────┼───────────────┼───────────────────────────────────────┼────────────────┼────────────────┤
 │                 │ /dev/sde              │ /mnt/data            │ ext4          │ relatime,user_xattr,errors=remount-ro │              0 │              2 │
 │                 │ /dev/mapper/sdf_crypt │ /mnt/private         │ xfs           │ relatime                              │              0 │              2 │
 ╰─────────────────┴───────────────────────┴──────────────────────┴───────────────┴───────────────────────────────────────┴────────────────┴────────────────╯
Record count: 2]]></m:pre>

		<p>
			There are many more functions, which can be found in the <a href="https://www.gnu.org/software/guile/manual/guile.html">Guile documentation</a>
			– like case-insensitive variants (e.g. <code>string-ci=</code>) or regular expression search (<code>string-match</code>).
		</p>
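
		<p>
			For instance, the case-insensitive variant can help when we are not sure how the filesystem type is capitalized.
			This is only a sketch built on the same <code>fstab</code> relation as above; the <code>WHERE</code> shell variable is our own illustrative name:
		</p>

		<m:pre jazyk="bash"><![CDATA[# string-ci= compares two strings while ignoring letter case
WHERE='(string-ci= $type "BTRFS")'

relpipe-in-fstab \
	| relpipe-tr-guile --relation 'fstab' --where "$WHERE" \
	| relpipe-out-tabular]]></m:pre>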


		<h2>AND and OR</h2>
		
		<p>
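			As in SQL, we can join multiple conditions together with the logical operators AND and OR.
			In Guile/Scheme, these operators are written in the same <code>(</code>fashion<code>)</code> as function calls
			(strictly speaking, <code>and</code> and <code>or</code> are special forms that stop evaluating their arguments as soon as the result is known, but they are used the same way).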
		</p>
		
		<p>
			So we can e.g. look for icons that are „satanistic“ or „Orwellian“:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[find /usr/share/icons/ -type f -print0 \
	| relpipe-in-filesystem --file path --file size \
	| relpipe-tr-guile --relation 'files.*' --where '(or (= $size 666) (= $size 1984) )' \
	| relpipe-out-tabular]]></m:pre>
	
		<p>Files with sizes 666 bytes or 1984 bytes:</p>

		<m:pre jazyk="text"><![CDATA[filesystem:
 ╭───────────────────────────────────────────────────────────────────────┬────────────────╮
 │ path                                                         (string) │ size (integer) │
 ├───────────────────────────────────────────────────────────────────────┼────────────────┤
 │ /usr/share/icons/elementary-xfce/actions/48/mail-mark-important.png   │           1984 │
 │ /usr/share/icons/elementary-xfce/actions/24/tab-new.png               │            666 │
 │ /usr/share/icons/elementary-xfce/apps/16/clock.png                    │            666 │
 │ /usr/share/icons/elementary-xfce/mimes/22/x-office-spreadsheet.png    │            666 │
 │ /usr/share/icons/Humanity-Dark/status/22/krb-no-valid-ticket.svg      │           1984 │
 │ /usr/share/icons/Tango/22x22/apps/office-calendar.png                 │            666 │
 │ /usr/share/icons/Tango/16x16/actions/process-stop.png                 │            666 │
 │ /usr/share/icons/breeze/actions/24/align-vertical-center.svg          │            666 │
 │ /usr/share/icons/breeze/devices/22/camera-photo.svg                   │            666 │
 │ /usr/share/icons/oxygen/base/48x48/actions/tab-detach.png             │            666 │
 │ /usr/share/icons/oxygen/base/32x32/actions/insert-horizontal-rule.png │            666 │
 │ /usr/share/icons/Humanity/status/22/krb-no-valid-ticket.svg           │           1984 │
 │ /usr/share/icons/breeze-dark/actions/24/align-vertical-center.svg     │            666 │
 │ /usr/share/icons/breeze-dark/devices/22/camera-photo.svg              │            666 │
 │ /usr/share/icons/gnome/48x48/status/user-busy.png                     │           1984 │
 │ /usr/share/icons/gnome/22x22/status/weather-overcast.png              │            666 │
 │ /usr/share/icons/gnome/16x16/actions/go-home.png                      │            666 │
 ╰───────────────────────────────────────────────────────────────────────┴────────────────╯
Record count: 17]]></m:pre>

		<p>Or we can look for icons that are in SVG format and (at the same time) Orwellian:</p>
		
		<m:pre jazyk="bash"><![CDATA[find /usr/share/icons/ -type f -print0 \
	| relpipe-in-filesystem --file path --file size \
	| relpipe-tr-guile \
		--relation 'files.*' \
		--where '(and (string-suffix? ".svg" $path) (= $size 1984) )' \
	| relpipe-out-tabular]]></m:pre>
	
		<p>This is quite rare; we have only two such icons:</p>

		<m:pre jazyk="text"><![CDATA[filesystem:
 ╭──────────────────────────────────────────────────────────────────┬────────────────╮
 │ path                                                    (string) │ size (integer) │
 ├──────────────────────────────────────────────────────────────────┼────────────────┤
 │ /usr/share/icons/Humanity-Dark/status/22/krb-no-valid-ticket.svg │           1984 │
 │ /usr/share/icons/Humanity/status/22/krb-no-valid-ticket.svg      │           1984 │
 ╰──────────────────────────────────────────────────────────────────┴────────────────╯
Record count: 2]]></m:pre>

		<p>
			We can nest ANDs and ORs and other functions as deeply as we need and build even very complex queries.
			Parentheses nesting is fun, isn't it?
		</p>
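
		<p>
			As a sketch of such nesting, the two previous queries can be combined to look for SVG icons that are either 666 or 1984 bytes long.
			The <code>WHERE</code> shell variable is just an illustrative way to keep the longer condition readable:
		</p>

		<m:pre jazyk="bash"><![CDATA[# AND at the top level, OR nested inside it
WHERE='(and (string-suffix? ".svg" $path) (or (= $size 666) (= $size 1984)))'

find /usr/share/icons/ -type f -print0 \
	| relpipe-in-filesystem --file path --file size \
	| relpipe-tr-guile --relation 'files.*' --where "$WHERE" \
	| relpipe-out-tabular]]></m:pre>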


		
		
		
	</text>

</stránka>