author | František Kučera <franta-hg@frantovo.cz> |
Mon, 21 Feb 2022 00:43:11 +0100 | |
branch | v_0 |
changeset 329 | 5bc2bb8b7946 |
parent 258 | 2868d772c27e |
permissions | -rw-r--r-- |
245
4919c8098008
examples: Complex filtering with Guile
František Kučera <franta-hg@frantovo.cz>
<stránka
	xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
	xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">

	<nadpis>Complex filtering with AWK</nadpis>
	<perex>filtering records with AND, OR and functions</perex>
	<m:pořadí-příkladu>02100</m:pořadí-příkladu>

	<text xmlns="http://www.w3.org/1999/xhtml">

		<p>
			If we need more complex filtering than <code>relpipe-tr-grep</code> can offer, we can write an AWK transformation.
			We can then combine conditions with AND and OR operators and call functions such as regular expression matching or numeric formulas.
		</p>

		<p>
			The <code>relpipe-tr-awk</code> tool calls a real AWK program (usually GNU AWK) installed on our system and passes the data of the given relation to it.
			Thus we can use any AWK feature in our pipeline while processing relational data.
			Relational attributes are mapped to AWK variables, so we can reference them by name instead of by mere field numbers.
		</p>
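
		<p>
			As a rough illustration of why names are easier to read than field numbers, the following sketch uses plain <code>awk</code> on made-up tab-separated input (the <code>./README</code> row and its size are hypothetical).
			It assigns the fields to variables by hand, which only imitates what <code>relpipe-tr-awk</code> gives us automatically; it does not show how the tool works internally.
		</p>

		<m:pre jazyk="bash"><![CDATA[# Sample data: path and size separated by a tab (partly made up for this sketch).
data=$(printf '%s\t%s\n' ./relpipe-tr-awk.cpp 2880 ./README 120 ./AwkHandler.h 17382)

# Name the fields first, then filter and print by name instead of by $1, $2.
# Adding 0 forces a numeric value, so the comparison is numeric and not textual.
named=$(printf '%s\n' "$data" \
	| awk -F '\t' '{ path = $1; size = $2 + 0 } (size > 2000) { print path }')

printf '%s\n' "$named"]]></m:pre>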

		<p>
			The <code>--for-each</code> option is used both for filtering (instead of <code>--where</code>)
			and for arbitrary code execution (data modifications, adding records, computations or intentional side effects).
			In AWK, filtering conditions are enclosed in <code>(…)</code> and actions in <code>{…}</code>.
			Both can be combined, and multiple statements can be separated by a semicolon (<code>;</code>).
			To emit a record, call the <code>record()</code> function instead of the AWK <code>print</code> statement (which should never be used directly).
			Calling <code>record()</code> is not necessary when we only filter and do not modify the data.
		</p>
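
		<p>
			The difference between a bare condition and a condition followed by an action can be tried out in plain <code>awk</code>, outside of any relational pipeline.
			This is only a sketch: the input is space-separated text, the <code>small.txt</code> row is hypothetical, fields are addressed by number,
			and <code>print</code> stands in for <code>record()</code>, which exists only inside <code>relpipe-tr-awk</code>.
		</p>

		<m:pre jazyk="bash"><![CDATA[# Sample rows: name and size (the first row is made up).
rows=$(printf '%s\n' 'small.txt 1500' 'CLIParser.h 5264' 'AwkHandler.h 17382')

# Condition only: matching rows pass through unchanged.
filtered=$(printf '%s\n' "$rows" | awk '($2 > 2000)')

# Condition plus action: matching rows are modified and then emitted explicitly.
doubled=$(printf '%s\n' "$rows" | awk '($2 > 2000) { $2 = $2 * 2; print }')

printf '%s\n' "$filtered" "$doubled"]]></m:pre>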

		<h2>Filtering numbers</h2>

		<p>With AWK we can filter records using standard numeric operators such as <code>==</code>, <code>&lt;</code>, <code>&gt;</code> and <code>&gt;=</code>:</p>

		<m:pre jazyk="bash"><![CDATA[find -print0 | relpipe-in-filesystem \
	| relpipe-tr-awk \
		--relation '.*' \
		--for-each '(size > 2000)' \
	| relpipe-out-tabular]]></m:pre>

		<p>For example, this lists the files whose size is greater than 2000:</p>

		<pre><![CDATA[filesystem:
 ╭──────────────────────┬───────────────┬────────────────┬────────────────┬────────────────╮
 │ path (string)        │ type (string) │ size (integer) │ owner (string) │ group (string) │
 ├──────────────────────┼───────────────┼────────────────┼────────────────┼────────────────┤
 │ ./relpipe-tr-awk.cpp │ f             │           2880 │ hacker         │ hacker         │
 │ ./CLIParser.h        │ f             │           5264 │ hacker         │ hacker         │
 │ ./AwkHandler.h       │ f             │          17382 │ hacker         │ hacker         │
 ╰──────────────────────┴───────────────┴────────────────┴────────────────┴────────────────╯
Record count: 3]]></pre>


		<h2>Filtering strings</h2>

		<p>String values can be matched against a regular expression:</p>

		<m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
	| relpipe-tr-awk \
		--relation '.*' \
		--for-each '(mount_point ~ /cdrom/)' \
	| relpipe-out-tabular]]></m:pre>

		<p>For example, this lists the <code>fstab</code> records that have <code>cdrom</code> in the <code>mount_point</code>:</p>

		<pre><![CDATA[fstab:
 ╭─────────────────┬─────────────────┬──────────────────────┬───────────────┬──────────────────┬────────────────┬────────────────╮
 │ scheme (string) │ device (string) │ mount_point (string) │ type (string) │ options (string) │ dump (integer) │ pass (integer) │
 ├─────────────────┼─────────────────┼──────────────────────┼───────────────┼──────────────────┼────────────────┼────────────────┤
 │                 │ /dev/sr0        │ /media/cdrom0        │ udf,iso9660   │ user,noauto      │              0 │              0 │
 ╰─────────────────┴─────────────────┴──────────────────────┴───────────────┴──────────────────┴────────────────┴────────────────╯
Record count: 1]]></pre>

		<p>Case-insensitive matching can be switched on (in GNU AWK) by adding this option:</p>

		<pre>--define IGNORECASE integer 1</pre>
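
		<p>
			<code>IGNORECASE</code> is a GNU AWK extension, so other AWK implementations ignore it.
			A portable alternative is to convert the value to lower case before matching.
			The sketch below shows the idea in plain <code>awk</code> on a plain list of mount points (<code>/media/CDROM1</code> and <code>/home</code> are made-up sample values);
			inside <code>relpipe-tr-awk</code> the same expression would presumably be written with the attribute name, e.g. <code>tolower(mount_point)</code>.
		</p>

		<m:pre jazyk="bash"><![CDATA[# Sample mount points, one per line (partly made up for this sketch).
mounts=$(printf '%s\n' /media/cdrom0 /media/CDROM1 /home)

# Lower-case the value first, then match it against a lower-case pattern.
matched=$(printf '%s\n' "$mounts" | awk '(tolower($0) ~ /cdrom/)')

printf '%s\n' "$matched"]]></m:pre>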

		<h2>AND and OR</h2>

		<p>We can combine multiple conditions using the <code>||</code> and <code>&amp;&amp;</code> logical operators:</p>

		<m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
	| relpipe-tr-awk \
		--relation '.*' \
		--for-each '(type == "btrfs" || pass == 1)' \
	| relpipe-out-tabular]]></m:pre>

		<p>This way we can build arbitrarily complex filters:</p>

		<pre><![CDATA[fstab:
 ╭─────────────────┬──────────────────────────────────────┬──────────────────────┬───────────────┬───────────────────────────────────────┬────────────────┬────────────────╮
 │ scheme (string) │ device (string)                      │ mount_point (string) │ type (string) │ options (string)                      │ dump (integer) │ pass (integer) │
 ├─────────────────┼──────────────────────────────────────┼──────────────────────┼───────────────┼───────────────────────────────────────┼────────────────┼────────────────┤
 │ UUID            │ 29758270-fd25-4a6c-a7bb-9a18302816af │ /                    │ ext4          │ relatime,user_xattr,errors=remount-ro │              0 │              1 │
 │ UUID            │ a2b5f230-a795-4f6f-a39b-9b57686c86d5 │ /home                │ btrfs         │ relatime                              │              0 │              2 │
 ╰─────────────────┴──────────────────────────────────────┴──────────────────────┴───────────────┴───────────────────────────────────────┴────────────────┴────────────────╯
Record count: 2]]></pre>

		<p>Nested parentheses <code>(…)</code> work as expected.</p>
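
		<p>
			Grouping matters because in AWK <code>&amp;&amp;</code> binds more tightly than <code>||</code>.
			The following sketch shows the difference in plain <code>awk</code>, using three space-separated columns (mount point, type, pass) taken from the outputs above;
			fields are addressed by number here, whereas in <code>relpipe-tr-awk</code> we would use the attribute names.
		</p>

		<m:pre jazyk="bash"><![CDATA[# Sample rows: mount_point, type, pass.
fstab=$(printf '%s\n' '/ ext4 1' '/home btrfs 2' '/media/cdrom0 udf,iso9660 0')

# Without inner parentheses: btrfs OR (ext4 AND pass == 1).
ungrouped=$(printf '%s\n' "$fstab" \
	| awk '($2 == "btrfs" || $2 == "ext4" && $3 == 1) { print $1 }')

# With inner parentheses: (btrfs OR ext4) AND pass == 1.
grouped=$(printf '%s\n' "$fstab" \
	| awk '(($2 == "btrfs" || $2 == "ext4") && $3 == 1) { print $1 }')

printf 'ungrouped:\n%s\ngrouped:\n%s\n' "$ungrouped" "$grouped"]]></m:pre>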

		<p>
			AWK can do much more: it offers plenty of functions and language constructs that we can use in our transformations.
			Comprehensive documentation can be found here: <a href="https://www.gnu.org/software/gawk/manual/">Gawk: Effective AWK Programming</a>.
		</p>

	</text>

</stránka>