examples: Complex filtering with Guile v_0
authorFrantišek Kučera <franta-hg@frantovo.cz>
Thu, 07 Feb 2019 11:52:32 +0100
branchv_0
changeset 245 4919c8098008
parent 244 d4f401b5f90c
child 246 fde0cd94fde6
examples: Complex filtering with Guile
relpipe-data/examples-guile-filtering.xml
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/relpipe-data/examples-guile-filtering.xml	Thu Feb 07 11:52:32 2019 +0100
@@ -0,0 +1,202 @@
+<stránka
+	xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
+	xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
+	
+	<nadpis>Complex filtering with Guile</nadpis>
+	<perex>filtering records with AND, OR and functions</perex>
+	<m:pořadí-příkladu>01400</m:pořadí-příkladu>
+
+	<text xmlns="http://www.w3.org/1999/xhtml">
+		
+		<p>
+			For simple filtering, we can use <code>relpipe-tr-grep</code>.
+			But what if we need to write some complex query that contains AND and OR operators?
+			What if we need e.g. compare numbers – not only match texts against regular expressions?
+			There is a tool capable to do this and much more: <code>relpipe-tr-guile</code>!
+		</p>
+		
+		<p>
+			<a href="https://www.gnu.org/software/guile/">Guile</a> is the GNU implementation of Scheme language (something like Lisp and also full of parenthesis).
+			The <code>relpipe-tr-guile</code> uses GNU Guile as a library, puts data in the Guile context and evaluates Guile expressions and then reads data from the Guile context back and generates relational output from them.
+			Good news are that it is not necessary to know Lisp/Scheme to use this tool. For the first steps, it can be used just as a query language – like SQL, just a bit Polish.
+		</p>
+		
+		<h2>Filtering numbers</h2>
+		
+		<p>
+			We are looking for „satanistic“ icons in our filesystem – those that have size = 666 bytes.
+		</p>
+		
+		<m:pre jazyk="bash"><![CDATA[$ find /usr/share/icons/ -type f -print0 \
+	| relpipe-in-filesystem \
+	| relpipe-tr-guile --relation 'files.*' --where '(= $size 666)' \
+	| relpipe-out-tabular]]></m:pre>
+	
+		<p>Well, well… here we are:</p>
+		
+		<m:pre jazyk="text"><![CDATA[filesystem:
+ ╭───────────────────────────────────────────────────────────────────────┬───────────────┬────────────────┬────────────────┬────────────────╮
+ │ path                                                         (string) │ type (string) │ size (integer) │ owner (string) │ group (string) │
+ ├───────────────────────────────────────────────────────────────────────┼───────────────┼────────────────┼────────────────┼────────────────┤
+ │ /usr/share/icons/elementary-xfce/actions/24/tab-new.png               │ f             │            666 │ root           │ root           │
+ │ /usr/share/icons/elementary-xfce/apps/16/clock.png                    │ f             │            666 │ root           │ root           │
+ │ /usr/share/icons/elementary-xfce/mimes/22/x-office-spreadsheet.png    │ f             │            666 │ root           │ root           │
+ │ /usr/share/icons/Tango/22x22/apps/office-calendar.png                 │ f             │            666 │ root           │ root           │
+ │ /usr/share/icons/Tango/16x16/actions/process-stop.png                 │ f             │            666 │ root           │ root           │
+ │ /usr/share/icons/breeze/actions/24/align-vertical-center.svg          │ f             │            666 │ root           │ root           │
+ │ /usr/share/icons/breeze/devices/22/camera-photo.svg                   │ f             │            666 │ root           │ root           │
+ │ /usr/share/icons/oxygen/base/48x48/actions/tab-detach.png             │ f             │            666 │ root           │ root           │
+ │ /usr/share/icons/oxygen/base/32x32/actions/insert-horizontal-rule.png │ f             │            666 │ root           │ root           │
+ │ /usr/share/icons/breeze-dark/actions/24/align-vertical-center.svg     │ f             │            666 │ root           │ root           │
+ │ /usr/share/icons/breeze-dark/devices/22/camera-photo.svg              │ f             │            666 │ root           │ root           │
+ │ /usr/share/icons/gnome/22x22/status/weather-overcast.png              │ f             │            666 │ root           │ root           │
+ │ /usr/share/icons/gnome/16x16/actions/go-home.png                      │ f             │            666 │ root           │ root           │
+ ╰───────────────────────────────────────────────────────────────────────┴───────────────┴────────────────┴────────────────┴────────────────╯
+Record count: 13]]></m:pre>
+
+		<p>The <code>--relation 'files.*'</code> is a regular expression that says which relations should be processed in Guile – others are passed through unchanged.</p>
+		
+		<p>
+			The <code>--where '(= $size 666)'</code> is our condition. 
+			The Polish<m:podČarou>see <a href="https://en.wikipedia.org/wiki/Polish_notation">Polish notation</a></m:podČarou> thing means that we write <code>= $size 666</code> instead of <code>$size = 666</code>.
+			It seems a bit weird but it makes sense – the <code>=</code> is a function that compares two numbers and returns a boolean value – 
+			so we just call this function and pass <code>$size</code> and <code>666</code> arguments to it.
+			And because it is a function, there are <code>(</code>parentheses<code>)</code>.
+		</p>
+		
+		<p>
+			Relational attributes are mapped to Guile variables with same name, just prefixed with <code>$</code>.
+			(we considered <code>
+				<abbr title="Bitcoin">₿</abbr>
+			</code> symbol, but <code>$</code> seems to be still more common on keyboards in 2019)
+			While relational attribute name is an arbitrary string, Guile variable names have some limitations, thus not all attributes can be mapped – those with spaces and some special characters are currently unsupported (this will be fixed in later versions by some kind of encoding/escaping).
+		</p>
+		
+		<p>
+			We can also look for 
+			<code>--where '(&gt; $size 100)'</code> which means „size is greater than 100“
+			or
+			<code>--where '(&lt; $size 100)'</code> which means „size is smaller than 100“.
+			The <code>&gt;=</code> and <code>&lt;=</code> also work as expected.
+		</p>
+		
+		<h2>Filtering strings</h2>
+		
+		<p>
+			Scheme is strongly typed language and we have to use proper functions/operators for each type.
+			For strings, it is <code>string=</code> instead of <code>=</code> function:
+		</p>
+		
+		<m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
+	| relpipe-tr-guile --relation 'fstab' --where '(string= $type "btrfs")' \
+	| relpipe-out-tabular]]></m:pre>
+	
+		<p>The Btrfs filesystems in our <code>fstab</code>:</p>
+
+		<m:pre jazyk="text"><![CDATA[fstab:
+ ╭─────────────────┬──────────────────────────────────────┬──────────────────────┬───────────────┬──────────────────┬────────────────┬────────────────╮
+ │ scheme (string) │ device                      (string) │ mount_point (string) │ type (string) │ options (string) │ dump (integer) │ pass (integer) │
+ ├─────────────────┼──────────────────────────────────────┼──────────────────────┼───────────────┼──────────────────┼────────────────┼────────────────┤
+ │ UUID            │ a2b5f230-a795-4f6f-a39b-9b57686c86d5 │ /home                │ btrfs         │ relatime         │              0 │              2 │
+ ╰─────────────────┴──────────────────────────────────────┴──────────────────────┴───────────────┴──────────────────┴────────────────┴────────────────╯
+Record count: 1]]></m:pre>
+
+		<p>
+			There is also <code>string-prefix?</code> which evaluates whether the first string is a prefix of the second string:
+		</p>
+
+		<m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
+	| relpipe-tr-guile --relation 'fstab' --where '(string-prefix? "/mnt" $mount_point)' \
+	| relpipe-out-tabular]]></m:pre>
+		
+		<p>So we can find filesystems mounted somewhere under <code>/mnt</code>:</p>
+
+		<m:pre jazyk="bash"><![CDATA[fstab:
+ ╭─────────────────┬───────────────────────┬──────────────────────┬───────────────┬───────────────────────────────────────┬────────────────┬────────────────╮
+ │ scheme (string) │ device       (string) │ mount_point (string) │ type (string) │ options                      (string) │ dump (integer) │ pass (integer) │
+ ├─────────────────┼───────────────────────┼──────────────────────┼───────────────┼───────────────────────────────────────┼────────────────┼────────────────┤
+ │                 │ /dev/sde              │ /mnt/data            │ ext4          │ relatime,user_xattr,errors=remount-ro │              0 │              2 │
+ │                 │ /dev/mapper/sdf_crypt │ /mnt/private         │ xfs           │ relatime                              │              0 │              2 │
+ ╰─────────────────┴───────────────────────┴──────────────────────┴───────────────┴───────────────────────────────────────┴────────────────┴────────────────╯
+Record count: 2]]></m:pre>
+
+		<p>
+			There are much more functions – can be found in the <a href="https://www.gnu.org/software/guile/manual/guile.html">Guile documentation</a>
+			– like case-insensitive variants (e.g. <code>string-ci=</code>) or regular expression search (<code>string-match</code>).
+		</p>
+
+
+		<h2>AND and OR</h2>
+		
+		<p>
+			Like in SQL, we can join multiple conditions together with logical operators AND and OR.
+			In Guile/Scheme these operators are also functions – they are written in the same <code>(</code>fashion<code>)</code>.
+		</p>
+		
+		<p>
+			So we can e.g. look for icons that are „satanistic“ or „Orwellian“:
+		</p>
+		
+		<m:pre jazyk="bash"><![CDATA[find /usr/share/icons/ -type f -print0 \
+	| relpipe-in-filesystem --file path --file size \
+	| relpipe-tr-guile --relation 'files.*' --where '(or (= $size 666) (= $size 1984) )' \
+	| relpipe-out-tabular]]></m:pre>
+	
+		<p>Files with sizes 666 bytes or 1984 bytes:</p>
+
+		<m:pre jazyk="text"><![CDATA[filesystem:
+ ╭───────────────────────────────────────────────────────────────────────┬────────────────╮
+ │ path                                                         (string) │ size (integer) │
+ ├───────────────────────────────────────────────────────────────────────┼────────────────┤
+ │ /usr/share/icons/elementary-xfce/actions/48/mail-mark-important.png   │           1984 │
+ │ /usr/share/icons/elementary-xfce/actions/24/tab-new.png               │            666 │
+ │ /usr/share/icons/elementary-xfce/apps/16/clock.png                    │            666 │
+ │ /usr/share/icons/elementary-xfce/mimes/22/x-office-spreadsheet.png    │            666 │
+ │ /usr/share/icons/Humanity-Dark/status/22/krb-no-valid-ticket.svg      │           1984 │
+ │ /usr/share/icons/Tango/22x22/apps/office-calendar.png                 │            666 │
+ │ /usr/share/icons/Tango/16x16/actions/process-stop.png                 │            666 │
+ │ /usr/share/icons/breeze/actions/24/align-vertical-center.svg          │            666 │
+ │ /usr/share/icons/breeze/devices/22/camera-photo.svg                   │            666 │
+ │ /usr/share/icons/oxygen/base/48x48/actions/tab-detach.png             │            666 │
+ │ /usr/share/icons/oxygen/base/32x32/actions/insert-horizontal-rule.png │            666 │
+ │ /usr/share/icons/Humanity/status/22/krb-no-valid-ticket.svg           │           1984 │
+ │ /usr/share/icons/breeze-dark/actions/24/align-vertical-center.svg     │            666 │
+ │ /usr/share/icons/breeze-dark/devices/22/camera-photo.svg              │            666 │
+ │ /usr/share/icons/gnome/48x48/status/user-busy.png                     │           1984 │
+ │ /usr/share/icons/gnome/22x22/status/weather-overcast.png              │            666 │
+ │ /usr/share/icons/gnome/16x16/actions/go-home.png                      │            666 │
+ ╰───────────────────────────────────────────────────────────────────────┴────────────────╯
+Record count: 17]]></m:pre>
+
+		<p>Or we can look for icons that are in SVG format and (at the same time) Orwellian:</p>
+		
+		<m:pre jazyk="bash"><![CDATA[find /usr/share/icons/ -type f -print0 \
+	| relpipe-in-filesystem --file path --file size \
+	| relpipe-tr-guile \
+		--relation 'files.*' \
+		--where '(and (string-suffix? ".svg" $path) (= $size 1984) )' \
+	| relpipe-out-tabular]]></m:pre>
+	
+		<p>Which is quite rare and we have only two such icons:</p>
+
+		<m:pre jazyk="text"><![CDATA[filesystem:
+ ╭──────────────────────────────────────────────────────────────────┬────────────────╮
+ │ path                                                    (string) │ size (integer) │
+ ├──────────────────────────────────────────────────────────────────┼────────────────┤
+ │ /usr/share/icons/Humanity-Dark/status/22/krb-no-valid-ticket.svg │           1984 │
+ │ /usr/share/icons/Humanity/status/22/krb-no-valid-ticket.svg      │           1984 │
+ ╰──────────────────────────────────────────────────────────────────┴────────────────╯
+Record count: 2]]></m:pre>
+
+		<p>
+			We can nest ANDs and ORs and other functions as deep as we need and build even very complex queries.
+			Prentheses nesting is fun, isn't it?
+		</p>
+
+
+		
+		
+		
+	</text>
+
+</stránka>