<stránka
	xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
	xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
	
	<nadpis>Doing projection and restriction using cut and grep</nadpis>
	<perex>SELECT mount_point FROM fstab WHERE type IN ('btrfs', 'xfs')</perex>
	<m:pořadí-příkladu>01000</m:pořadí-příkladu>

	<text xmlns="http://www.w3.org/1999/xhtml">

		<p>
			When reading classic pipelines that involve the <code>grep</code> and <code>cut</code> commands,
			we may notice a certain similarity with simple SQL queries that look like this:
		</p>

		<m:pre jazyk="SQL">SELECT "some", "cut", "fields" FROM stdin WHERE grep_matches(whole_line);</m:pre>

		<p>
			And that is true: <code>grep</code> does restriction<m:podČarou>
				<a href="https://en.wikipedia.org/wiki/Selection_(relational_algebra)">selecting</a> only certain records from the original relation according to whether they match the given conditions</m:podČarou>
			and <code>cut</code> does projection<m:podČarou>a limited subset of what <a href="https://en.wikipedia.org/wiki/Projection_(relational_algebra)">projection</a> means</m:podČarou>.
			Now we can do these relational operations using our relational tools called <code>relpipe-tr-grep</code> and <code>relpipe-tr-cut</code>.
		</p>
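
		<p>
			To make the analogy concrete, here is a minimal sketch of such a classic pipeline.
			The sample records and the <code>fstab_sample</code> function are hypothetical and serve only as an illustration; a real <code>fstab</code> may separate fields by tabs or multiple spaces, which this sketch does not handle.
			Note that <code>grep</code> tests the whole line (like <code>grep_matches(whole_line)</code> in the query above), while <code>awk</code> can restrict on a single field:
		</p>

		<m:pre jazyk="bash"><![CDATA[# hypothetical sample data in the fstab layout:
# device, mount point, type, options, dump, pass
fstab_sample() {
	printf '%s\n' \
		'/dev/sda1 / btrfs defaults 0 1' \
		'/dev/sda2 /home xfs defaults 0 2' \
		'/dev/sda3 /boot ext4 defaults 0 2' \
		'/dev/sda4 /mnt/xfs-backup ext4 defaults 0 2'
}

# restriction by grep, projection by cut; grep sees the whole line,
# so the last record matches because of "xfs" in its mount point
fstab_sample | grep -E 'btrfs|xfs' | cut -d ' ' -f 2

# restriction on the third field only, projection of the second one
fstab_sample | awk '$3 ~ /^(btrfs|xfs)$/ { print $2 }']]></m:pre>

		<p>
			The first pipeline prints three mount points including <code>/mnt/xfs-backup</code>, the second one only <code>/</code> and <code>/home</code>.
		</p>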

		<p>
			Assume that we need only the <code>mount_point</code> fields from our <code>fstab</code> where <code>type</code> is <code>btrfs</code> or <code>xfs</code>
			and we want to do something (run a block of shell script) with these directory paths.
		</p>

		<m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
	| relpipe-tr-grep 'fstab' 'type' '^(btrfs|xfs)$' \
	| relpipe-tr-cut 'fstab' 'mount_point' \
	| relpipe-out-nullbyte \
	| while read -r -d '' m; do
		# any block of shell commands can process the value here
		echo "$m";
	done]]></m:pre>

		<p>
			The <code>relpipe-tr-cut</code> tool has a syntax similar to its <em>grep</em> and <em>sed</em> siblings and also uses the power of regular expressions.
			In this case it modifies the <code>fstab</code> relation on the fly and drops all its attributes except <code>mount_point</code>.
		</p>
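
		<p>
			Because the patterns are regular expressions, the usual precedence rules apply: alternation binds more loosely than anchors.
			The following sketch illustrates this with <code>grep -E</code>; using it as a stand-in for the regex dialect of the relational tools is an assumption, and the listed type names are only sample values:
		</p>

		<m:pre jazyk="bash"><![CDATA[# sample values (some of them made up) to test the patterns against
types() {
	printf '%s\n' 'btrfs' 'xfs' 'btrfs-old' 'exfs' 'ext4'
}

# means: starts with "btrfs" OR ends with "xfs"
types | grep -E '^btrfs|xfs$'

# means: is exactly "btrfs" or exactly "xfs"
types | grep -E '^(btrfs|xfs)$']]></m:pre>

		<p>
			The first pattern also lets <code>btrfs-old</code> and <code>exfs</code> through, while the grouped one matches only the two intended values.
		</p>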

		<p>
			Then we pass the data to the Bash <code>while</code> loop.
			In such a simple scenario (just <code>echo</code>), we could use <code>xargs</code> as in the examples above,
			but with this syntax we can write a whole block of shell commands for each record/value and do more complex actions with them.
		</p>
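
		<p>
			A minimal sketch of such a block follows. To keep it runnable without any relational data, the hypothetical <code>mount_points</code> function stands in for the pipeline above and emits a few made-up, null-byte-terminated paths;
			<code>IFS=</code> is added so that <code>read</code> does not trim leading or trailing whitespace from the values:
		</p>

		<m:pre jazyk="bash"><![CDATA[# hypothetical stand-in for the pipeline ending with relpipe-out-nullbyte
mount_points() {
	printf '%s\0' '/' '/home' '/mnt/my data'
}

report() {
	mount_points | while IFS= read -r -d '' m; do
		# several commands per value; values containing spaces stay intact
		if [ -d "$m" ]; then state='directory'; else state='missing'; fi
		printf '%s\t%s\n' "$m" "$state"
	done
}

report]]></m:pre>

		<p>
			Each value is printed on its own line together with the information whether such a directory exists on the current machine.
		</p>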

		<h2>More projections with relpipe-tr-cut</h2>

		<p>
			Assume that we have a simple relation containing numbers:
		</p>

		<m:pre jazyk="bash"><![CDATA[seq 0 8 \
	| tr \\n \\0 \
	| relpipe-in-cli generate-from-stdin numbers 3 a integer b integer c integer \
	> numbers.rp]]></m:pre>

		<p>and a second one containing letters:</p>

		<m:pre jazyk="bash"><![CDATA[relpipe-in-cli generate letters 2 a string b string A B C D > letters.rp]]></m:pre>

		<p>We saved them into two files and then concatenated these into a single file, so we can work with them as a single stream of relations:</p>

		<m:pre jazyk="bash"><![CDATA[cat numbers.rp letters.rp > both.rp;
cat both.rp | relpipe-out-tabular]]></m:pre>

		<p>This will print:</p>

		<pre><![CDATA[numbers:
╭─────────────┬─────────────┬─────────────╮
│ a (integer) │ b (integer) │ c (integer) │
├─────────────┼─────────────┼─────────────┤
│           0 │           1 │           2 │
│           3 │           4 │           5 │
│           6 │           7 │           8 │
╰─────────────┴─────────────┴─────────────╯
Record count: 3
letters:
╭─────────────┬─────────────╮
│ a  (string) │ b  (string) │
├─────────────┼─────────────┤
│ A           │ B           │
│ C           │ D           │
╰─────────────┴─────────────╯
Record count: 2]]></pre>

		<p>We can drop the <code>a</code> attribute from the <code>numbers</code> relation:</p>

		<m:pre jazyk="bash">cat both.rp | relpipe-tr-cut 'numbers' 'b|c' | relpipe-out-tabular</m:pre>

		<p>and leave the <code>letters</code> relation unaffected:</p>

		<pre><![CDATA[numbers:
╭─────────────┬─────────────╮
│ b (integer) │ c (integer) │
├─────────────┼─────────────┤
│           1 │           2 │
│           4 │           5 │
│           7 │           8 │
╰─────────────┴─────────────╯
Record count: 3
letters:
╭─────────────┬─────────────╮
│ a  (string) │ b  (string) │
├─────────────┼─────────────┤
│ A           │ B           │
│ C           │ D           │
╰─────────────┴─────────────╯
Record count: 2]]></pre>

		<p>Or we can remove <code>a</code> from both relations, i.e. keep only the attributes whose names match the <code>'b|c'</code> regex:</p>

		<m:pre jazyk="bash">cat both.rp | relpipe-tr-cut '.*' 'b|c' | relpipe-out-tabular</m:pre>

		<p>Instead of <code>'.*'</code> we could use <code>'numbers|letters'</code>; in this case, it gives the same result:</p>

		<pre><![CDATA[numbers:
╭─────────────┬─────────────╮
│ b (integer) │ c (integer) │
├─────────────┼─────────────┤
│           1 │           2 │
│           4 │           5 │
│           7 │           8 │
╰─────────────┴─────────────╯
Record count: 3
letters:
╭─────────────╮
│ b  (string) │
├─────────────┤
│ B           │
│ D           │
╰─────────────╯
Record count: 2]]></pre>

		<p>So far, we have only been reducing the attributes. But we can also duplicate them or change their order:</p>

		<m:pre jazyk="bash">cat both.rp | relpipe-tr-cut 'numbers' 'b|a|c' 'b' 'a' 'a' | relpipe-out-tabular</m:pre>

		<p>
			N.B. the order within <code>'b|a|c'</code> does not matter: if such a regex matches, the original order of the attributes is preserved.
			But if we use multiple regexes to specify the attributes, their order and count do matter:
		</p>

		<pre><![CDATA[numbers:
╭─────────────┬─────────────┬─────────────┬─────────────┬─────────────┬─────────────╮
│ a (integer) │ b (integer) │ c (integer) │ b (integer) │ a (integer) │ a (integer) │
├─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┤
│           0 │           1 │           2 │           1 │           0 │           0 │
│           3 │           4 │           5 │           4 │           3 │           3 │
│           6 │           7 │           8 │           7 │           6 │           6 │
╰─────────────┴─────────────┴─────────────┴─────────────┴─────────────┴─────────────╯
Record count: 3
letters:
╭─────────────┬─────────────╮
│ a  (string) │ b  (string) │
├─────────────┼─────────────┤
│ A           │ B           │
│ C           │ D           │
╰─────────────┴─────────────╯
Record count: 2]]></pre>

		<p>
			The <code>letters</code> relation stays rock steady and <code>relpipe-tr-cut 'numbers'</code> does not affect it in any way.
		</p>

	</text>

</stránka>