# HG changeset patch # User František Kučera <franta-hg@frantovo.cz> # Date 1549536752 -3600 # Node ID 4919c8098008a9935e4777312b38aee2b20b2313 # Parent d4f401b5f90cca3b15cff0fdcfc14213f55b961d examples: Complex filtering with Guile diff -r d4f401b5f90c -r 4919c8098008 relpipe-data/examples-guile-filtering.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/relpipe-data/examples-guile-filtering.xml Thu Feb 07 11:52:32 2019 +0100 @@ -0,0 +1,202 @@ +<stránka + xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana" + xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro"> + + <nadpis>Complex filtering with Guile</nadpis> + <perex>filtering records with AND, OR and functions</perex> + <m:pořadí-příkladu>01400</m:pořadí-příkladu> + + <text xmlns="http://www.w3.org/1999/xhtml"> + + <p> + For simple filtering, we can use <code>relpipe-tr-grep</code>. + But what if we need to write some complex query that contains AND and OR operators? + What if we need e.g. compare numbers – not only match texts against regular expressions? + There is a tool capable to do this and much more: <code>relpipe-tr-guile</code>! + </p> + + <p> + <a href="https://www.gnu.org/software/guile/">Guile</a> is the GNU implementation of Scheme language (something like Lisp and also full of parenthesis). + The <code>relpipe-tr-guile</code> uses GNU Guile as a library, puts data in the Guile context and evaluates Guile expressions and then reads data from the Guile context back and generates relational output from them. + Good news are that it is not necessary to know Lisp/Scheme to use this tool. For the first steps, it can be used just as a query language – like SQL, just a bit Polish. + </p> + + <h2>Filtering numbers</h2> + + <p> + We are looking for „satanistic“ icons in our filesystem – those that have size = 666 bytes. + </p> + + <m:pre jazyk="bash"><![CDATA[$ find /usr/share/icons/ -type f -print0 \ + | relpipe-in-filesystem \ + | relpipe-tr-guile --relation 'files.*' --where '(= $size 666)' \ + | relpipe-out-tabular]]></m:pre> + + <p>Well, well… here we are:</p> + + <m:pre jazyk="text"><![CDATA[filesystem: + ╭───────────────────────────────────────────────────────────────────────┬───────────────┬────────────────┬────────────────┬────────────────╮ + │ path (string) │ type (string) │ size (integer) │ owner (string) │ group (string) │ + ├───────────────────────────────────────────────────────────────────────┼───────────────┼────────────────┼────────────────┼────────────────┤ + │ /usr/share/icons/elementary-xfce/actions/24/tab-new.png │ f │ 666 │ root │ root │ + │ /usr/share/icons/elementary-xfce/apps/16/clock.png │ f │ 666 │ root │ root │ + │ /usr/share/icons/elementary-xfce/mimes/22/x-office-spreadsheet.png │ f │ 666 │ root │ root │ + │ /usr/share/icons/Tango/22x22/apps/office-calendar.png │ f │ 666 │ root │ root │ + │ /usr/share/icons/Tango/16x16/actions/process-stop.png │ f │ 666 │ root │ root │ + │ /usr/share/icons/breeze/actions/24/align-vertical-center.svg │ f │ 666 │ root │ root │ + │ /usr/share/icons/breeze/devices/22/camera-photo.svg │ f │ 666 │ root │ root │ + │ /usr/share/icons/oxygen/base/48x48/actions/tab-detach.png │ f │ 666 │ root │ root │ + │ /usr/share/icons/oxygen/base/32x32/actions/insert-horizontal-rule.png │ f │ 666 │ root │ root │ + │ /usr/share/icons/breeze-dark/actions/24/align-vertical-center.svg │ f │ 666 │ root │ root │ + │ /usr/share/icons/breeze-dark/devices/22/camera-photo.svg │ f │ 666 │ root │ root │ + │ /usr/share/icons/gnome/22x22/status/weather-overcast.png │ f │ 666 │ root │ root │ + │ /usr/share/icons/gnome/16x16/actions/go-home.png │ f │ 666 │ root │ root │ + ╰───────────────────────────────────────────────────────────────────────┴───────────────┴────────────────┴────────────────┴────────────────╯ +Record count: 13]]></m:pre> + + <p>The <code>--relation 'files.*'</code> is a regular expression that says which relations should be processed in Guile – others are passed through unchanged.</p> + + <p> + The <code>--where '(= $size 666)'</code> is our condition. + The Polish<m:podČarou>see <a href="https://en.wikipedia.org/wiki/Polish_notation">Polish notation</a></m:podČarou> thing means that we write <code>= $size 666</code> instead of <code>$size = 666</code>. + It seems a bit weird but it makes sense – the <code>=</code> is a function that compares two numbers and returns a boolean value – + so we just call this function and pass <code>$size</code> and <code>666</code> arguments to it. + And because it is a function, there are <code>(</code>parentheses<code>)</code>. + </p> + + <p> + Relational attributes are mapped to Guile variables with same name, just prefixed with <code>$</code>. + (we considered <code> + <abbr title="Bitcoin">₿</abbr> + </code> symbol, but <code>$</code> seems to be still more common on keyboards in 2019) + While relational attribute name is an arbitrary string, Guile variable names have some limitations, thus not all attributes can be mapped – those with spaces and some special characters are currently unsupported (this will be fixed in later versions by some kind of encoding/escaping). + </p> + + <p> + We can also look for + <code>--where '(> $size 100)'</code> which means „size is greater than 100“ + or + <code>--where '(< $size 100)'</code> which means „size is smaller than 100“. + The <code>>=</code> and <code><=</code> also work as expected. + </p> + + <h2>Filtering strings</h2> + + <p> + Scheme is strongly typed language and we have to use proper functions/operators for each type. + For strings, it is <code>string=</code> instead of <code>=</code> function: + </p> + + <m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \ + | relpipe-tr-guile --relation 'fstab' --where '(string= $type "btrfs")' \ + | relpipe-out-tabular]]></m:pre> + + <p>The Btrfs filesystems in our <code>fstab</code>:</p> + + <m:pre jazyk="text"><![CDATA[fstab: + ╭─────────────────┬──────────────────────────────────────┬──────────────────────┬───────────────┬──────────────────┬────────────────┬────────────────╮ + │ scheme (string) │ device (string) │ mount_point (string) │ type (string) │ options (string) │ dump (integer) │ pass (integer) │ + ├─────────────────┼──────────────────────────────────────┼──────────────────────┼───────────────┼──────────────────┼────────────────┼────────────────┤ + │ UUID │ a2b5f230-a795-4f6f-a39b-9b57686c86d5 │ /home │ btrfs │ relatime │ 0 │ 2 │ + ╰─────────────────┴──────────────────────────────────────┴──────────────────────┴───────────────┴──────────────────┴────────────────┴────────────────╯ +Record count: 1]]></m:pre> + + <p> + There is also <code>string-prefix?</code> which evaluates whether the first string is a prefix of the second string: + </p> + + <m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \ + | relpipe-tr-guile --relation 'fstab' --where '(string-prefix? "/mnt" $mount_point)' \ + | relpipe-out-tabular]]></m:pre> + + <p>So we can find filesystems mounted somewhere under <code>/mnt</code>:</p> + + <m:pre jazyk="bash"><![CDATA[fstab: + ╭─────────────────┬───────────────────────┬──────────────────────┬───────────────┬───────────────────────────────────────┬────────────────┬────────────────╮ + │ scheme (string) │ device (string) │ mount_point (string) │ type (string) │ options (string) │ dump (integer) │ pass (integer) │ + ├─────────────────┼───────────────────────┼──────────────────────┼───────────────┼───────────────────────────────────────┼────────────────┼────────────────┤ + │ │ /dev/sde │ /mnt/data │ ext4 │ relatime,user_xattr,errors=remount-ro │ 0 │ 2 │ + │ │ /dev/mapper/sdf_crypt │ /mnt/private │ xfs │ relatime │ 0 │ 2 │ + ╰─────────────────┴───────────────────────┴──────────────────────┴───────────────┴───────────────────────────────────────┴────────────────┴────────────────╯ +Record count: 2]]></m:pre> + + <p> + There are much more functions – can be found in the <a href="https://www.gnu.org/software/guile/manual/guile.html">Guile documentation</a> + – like case-insensitive variants (e.g. <code>string-ci=</code>) or regular expression search (<code>string-match</code>). + </p> + + + <h2>AND and OR</h2> + + <p> + Like in SQL, we can join multiple conditions together with logical operators AND and OR. + In Guile/Scheme these operators are also functions – they are written in the same <code>(</code>fashion<code>)</code>. + </p> + + <p> + So we can e.g. look for icons that are „satanistic“ or „Orwellian“: + </p> + + <m:pre jazyk="bash"><![CDATA[find /usr/share/icons/ -type f -print0 \ + | relpipe-in-filesystem --file path --file size \ + | relpipe-tr-guile --relation 'files.*' --where '(or (= $size 666) (= $size 1984) )' \ + | relpipe-out-tabular]]></m:pre> + + <p>Files with sizes 666 bytes or 1984 bytes:</p> + + <m:pre jazyk="text"><![CDATA[filesystem: + ╭───────────────────────────────────────────────────────────────────────┬────────────────╮ + │ path (string) │ size (integer) │ + ├───────────────────────────────────────────────────────────────────────┼────────────────┤ + │ /usr/share/icons/elementary-xfce/actions/48/mail-mark-important.png │ 1984 │ + │ /usr/share/icons/elementary-xfce/actions/24/tab-new.png │ 666 │ + │ /usr/share/icons/elementary-xfce/apps/16/clock.png │ 666 │ + │ /usr/share/icons/elementary-xfce/mimes/22/x-office-spreadsheet.png │ 666 │ + │ /usr/share/icons/Humanity-Dark/status/22/krb-no-valid-ticket.svg │ 1984 │ + │ /usr/share/icons/Tango/22x22/apps/office-calendar.png │ 666 │ + │ /usr/share/icons/Tango/16x16/actions/process-stop.png │ 666 │ + │ /usr/share/icons/breeze/actions/24/align-vertical-center.svg │ 666 │ + │ /usr/share/icons/breeze/devices/22/camera-photo.svg │ 666 │ + │ /usr/share/icons/oxygen/base/48x48/actions/tab-detach.png │ 666 │ + │ /usr/share/icons/oxygen/base/32x32/actions/insert-horizontal-rule.png │ 666 │ + │ /usr/share/icons/Humanity/status/22/krb-no-valid-ticket.svg │ 1984 │ + │ /usr/share/icons/breeze-dark/actions/24/align-vertical-center.svg │ 666 │ + │ /usr/share/icons/breeze-dark/devices/22/camera-photo.svg │ 666 │ + │ /usr/share/icons/gnome/48x48/status/user-busy.png │ 1984 │ + │ /usr/share/icons/gnome/22x22/status/weather-overcast.png │ 666 │ + │ /usr/share/icons/gnome/16x16/actions/go-home.png │ 666 │ + ╰───────────────────────────────────────────────────────────────────────┴────────────────╯ +Record count: 17]]></m:pre> + + <p>Or we can look for icons that are in SVG format and (at the same time) Orwellian:</p> + + <m:pre jazyk="bash"><![CDATA[find /usr/share/icons/ -type f -print0 \ + | relpipe-in-filesystem --file path --file size \ + | relpipe-tr-guile \ + --relation 'files.*' \ + --where '(and (string-suffix? ".svg" $path) (= $size 1984) )' \ + | relpipe-out-tabular]]></m:pre> + + <p>Which is quite rare and we have only two such icons:</p> + + <m:pre jazyk="text"><![CDATA[filesystem: + ╭──────────────────────────────────────────────────────────────────┬────────────────╮ + │ path (string) │ size (integer) │ + ├──────────────────────────────────────────────────────────────────┼────────────────┤ + │ /usr/share/icons/Humanity-Dark/status/22/krb-no-valid-ticket.svg │ 1984 │ + │ /usr/share/icons/Humanity/status/22/krb-no-valid-ticket.svg │ 1984 │ + ╰──────────────────────────────────────────────────────────────────┴────────────────╯ +Record count: 2]]></m:pre> + + <p> + We can nest ANDs and ORs and other functions as deep as we need and build even very complex queries. + Prentheses nesting is fun, isn't it? + </p> + + + + + + </text> + +</stránka>