relpipe-data/examples-grep-cut-fstab.xml
branch v_0
changeset 244 d4f401b5f90c
parent 241 f71d300205b7
child 301 7029e6c47700
243:9c1d0c5ed599 244:d4f401b5f90c
       
     1 <stránka
       
     2 	xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
       
     3 	xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
       
     4 	
       
     5 	<nadpis>Doing projection and restriction using cut and grep</nadpis>
       
     6 	<perex>SELECT mount_point FROM fstab WHERE type IN ('btrfs', 'xfs')</perex>
       
     7 	<m:pořadí-příkladu>01000</m:pořadí-příkladu>
       
     8 
       
     9 	<text xmlns="http://www.w3.org/1999/xhtml">
       
    10 		
       
    11 		<p>
       
    12 			When reading classic pipelines built from the <code>grep</code> and <code>cut</code> commands,
       
    13 			we notice a similarity with simple SQL queries such as:
       
    14 		</p>
       
    15 		
       
    16 		<m:pre jazyk="SQL">SELECT "some", "cut", "fields" FROM stdin WHERE grep_matches(whole_line);</m:pre>
       
    17 		
       
    18 		<p>
       
    19 			And that is true: <code>grep</code> does restriction<m:podČarou>
       
    20 				<a href="https://en.wikipedia.org/wiki/Selection_(relational_algebra)">selecting</a> only certain records from the original relation according to their match with given conditions</m:podČarou>
       
    21 			and <code>cut</code> does projection<m:podČarou>a limited subset of what <a href="https://en.wikipedia.org/wiki/Projection_(relational_algebra)">projection</a> means</m:podČarou>.
       
    22 			We can now perform these relational operations with our relational tools <code>relpipe-tr-grep</code> and <code>relpipe-tr-cut</code>.
       
    23 		</p>
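		<p>
			As a minimal sketch of this analogy using only the classic tools: the input below is hypothetical sample data in a passwd-like format (not taken from a real system),
			<code>grep</code> plays the role of the <code>WHERE</code> clause and <code>cut</code> the role of the <code>SELECT</code> list.
		</p>

```shell
# Hypothetical sample input: name:uid:shell
sample='alice:1000:/bin/bash
bob:1001:/bin/zsh
carol:1002:/bin/bash'

# Restriction: keep only the lines that match the condition (WHERE ...)
# Projection:  keep only the first colon-separated field (SELECT name)
result="$(printf '%s\n' "$sample" | grep '/bin/bash$' | cut -d ':' -f 1)"

printf '%s\n' "$result"
```

		<p>
			Note that <code>grep</code> tests the whole line rather than a single field, which is exactly the limitation the SQL-like pseudo-query above expresses with <code>grep_matches(whole_line)</code>.
		</p>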
       
    24 		
       
    25 		<p>
       
    26 			Assume that we need only the <code>mount_point</code> field from the <code>fstab</code> records whose <code>type</code> is <code>btrfs</code> or <code>xfs</code>,
       
    27 			and that we want to run a block of shell commands on each of these directory paths.
       
    28 		</p>
       
    29 		
       
    30 		<m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
       
    31 	| relpipe-tr-grep 'fstab' 'type' '^(btrfs|xfs)$' \
       
    32 	| relpipe-tr-cut 'fstab' 'mount_point' \
       
    33 	| relpipe-out-nullbyte \
       
    34 	| while IFS= read -r -d '' m; do
       
    35 		echo "$m";
       
    36 	done]]></m:pre>
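		<p>
			One detail of such patterns deserves care: alternation binds more loosely than the anchors, so <code>^btrfs|xfs$</code> and <code>^(btrfs|xfs)$</code> are different expressions whenever the regex is searched for inside a value.
			The sketch below shows the difference with plain <code>grep -E</code> on hypothetical type names; whether <code>relpipe-tr-grep</code> searches or requires a full match is not stated on this page, so the grouped form is the safer choice.
		</p>

```shell
# Hypothetical file system type names, one per line
types='btrfs
xfs
btrfs-like
notxfs
ext4'

# Ungrouped: means "starts with btrfs" OR "ends with xfs"
loose="$(printf '%s\n' "$types" | grep -E '^btrfs|xfs$')"

# Grouped: the whole value must be exactly btrfs or xfs
strict="$(printf '%s\n' "$types" | grep -E '^(btrfs|xfs)$')"

printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose" "$strict"
```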
       
    37 	
       
    38 		<p>
       
    39 			The <code>relpipe-tr-cut</code> tool has a syntax similar to its <em>grep</em> and <em>sed</em> siblings and also uses the power of regular expressions.
       
    40 			In this case it modifies the <code>fstab</code> relation on the fly and drops all its attributes except <code>mount_point</code>.
       
    41 		</p>
       
    42 		
       
    43 		<p>
       
    44 			Then we pass the data to a Bash <code>while</code> loop.
       
    45 			In such a simple scenario (just <code>echo</code>), we could use <code>xargs</code> as in the examples above,
       
    46 			but with this syntax we can write a whole block of shell commands for each record/value and perform more complex actions on it.
       
    47 		</p>
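		<p>
			To illustrate what such a block might look like, here is a small sketch that runs without any relational tools.
			The function <code>emit_mount_points</code> is a hypothetical stand-in for the pipeline above: it is assumed to emit each value terminated by a null byte, which is what the <code>read -d ''</code> loop expects.
			The paths are made-up sample data.
		</p>

```shell
# Hypothetical stand-in for the relpipe pipeline: emits sample mount points,
# each terminated by a null byte (the second path contains a space on purpose).
emit_mount_points() {
	printf '%s\0' '/home' '/mnt/my data' '/srv'
}

output="$(
	emit_mount_points | while IFS= read -r -d '' m; do
		# A block of several commands per value:
		# derive the last path component, then print both.
		name="${m##*/}"
		printf '%s -> %s\n' "$m" "$name"
	done
)"

printf '%s\n' "$output"
```

		<p>
			Because the values are delimited by null bytes and read with <code>IFS=</code> and <code>-r</code>, spaces and backslashes in the paths pass through unchanged.
		</p>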
       
    48 		
       
    49 		<h2>More projections with relpipe-tr-cut</h2>
       
    50 		
       
    51 		<p>
       
    52 			Assume that we have a simple relation containing numbers:
       
    53 		</p>
       
    54 	
       
    55 		<m:pre jazyk="bash"><![CDATA[seq 0 8 \
       
    56 	| tr \\n \\0 \
       
    57 	| relpipe-in-cli generate-from-stdin numbers 3 a integer b integer c integer \
       
    58 	> numbers.rp]]></m:pre>
       
    59 
       
    60 		<p>and a second one containing letters:</p>
       
    61 
       
    62 		<m:pre jazyk="bash"><![CDATA[relpipe-in-cli generate letters 2 a string b string A B C D > letters.rp]]></m:pre>
       
    63 
       
    64 		<p>We saved them into two files. Now we combine them into a single file and work with them as a single stream of relations:</p>
       
    65 		
       
    66 		<m:pre jazyk="bash"><![CDATA[cat numbers.rp letters.rp > both.rp;
       
    67 cat both.rp | relpipe-out-tabular]]></m:pre>
       
    68 		
       
    69 		<p>This prints:</p>
       
    70 		
       
    71 		<pre><![CDATA[numbers:
       
    72  ╭─────────────┬─────────────┬─────────────╮
       
    73  │ a (integer) │ b (integer) │ c (integer) │
       
    74  ├─────────────┼─────────────┼─────────────┤
       
    75  │           0 │           1 │           2 │
       
    76  │           3 │           4 │           5 │
       
    77  │           6 │           7 │           8 │
       
    78  ╰─────────────┴─────────────┴─────────────╯
       
    79 Record count: 3
       
    80 letters:
       
    81  ╭─────────────┬─────────────╮
       
    82  │ a  (string) │ b  (string) │
       
    83  ├─────────────┼─────────────┤
       
    84  │ A           │ B           │
       
    85  │ C           │ D           │
       
    86  ╰─────────────┴─────────────╯
       
    87 Record count: 2]]></pre>
       
    88 
       
    89 		<p>We can drop the <code>a</code> attribute from the <code>numbers</code> relation:</p>
       
    90 		
       
    91 		<m:pre jazyk="bash">cat both.rp | relpipe-tr-cut 'numbers' 'b|c' | relpipe-out-tabular</m:pre>
       
    92 		
       
    93 		<p>and leave the <code>letters</code> relation unaffected:</p>
       
    94 		
       
    95 		<pre><![CDATA[numbers:
       
    96  ╭─────────────┬─────────────╮
       
    97  │ b (integer) │ c (integer) │
       
    98  ├─────────────┼─────────────┤
       
    99  │           1 │           2 │
       
   100  │           4 │           5 │
       
   101  │           7 │           8 │
       
   102  ╰─────────────┴─────────────╯
       
   103 Record count: 3
       
   104 letters:
       
   105  ╭─────────────┬─────────────╮
       
   106  │ a  (string) │ b  (string) │
       
   107  ├─────────────┼─────────────┤
       
   108  │ A           │ B           │
       
   109  │ C           │ D           │
       
   110  ╰─────────────┴─────────────╯
       
   111 Record count: 2]]></pre>
       
   112 
       
   113 		<p>Or we can remove <code>a</code> from both relations, or more precisely, keep only the attributes whose names match the <code>'b|c'</code> regex:</p>
       
   114 
       
   115 		<m:pre jazyk="bash">cat both.rp | relpipe-tr-cut '.*' 'b|c' | relpipe-out-tabular</m:pre>
       
   116 		
       
   117 		<p>Instead of <code>'.*'</code> we could use <code>'numbers|letters'</code> and in this case it will give the same result:</p>
       
   118 		
       
   119 		<pre><![CDATA[numbers:
       
   120  ╭─────────────┬─────────────╮
       
   121  │ b (integer) │ c (integer) │
       
   122  ├─────────────┼─────────────┤
       
   123  │           1 │           2 │
       
   124  │           4 │           5 │
       
   125  │           7 │           8 │
       
   126  ╰─────────────┴─────────────╯
       
   127 Record count: 3
       
   128 letters:
       
   129  ╭─────────────╮
       
   130  │ b  (string) │
       
   131  ├─────────────┤
       
   132  │ B           │
       
   133  │ D           │
       
   134  ╰─────────────╯
       
   135 Record count: 2]]></pre>
       
   136 
       
   137 		<p>So far we have only been reducing the attributes. But we can also duplicate them or change their order:</p>
       
   138 		
       
   139 		<m:pre jazyk="bash">cat both.rp | relpipe-tr-cut 'numbers' 'b|a|c' 'b' 'a' 'a' | relpipe-out-tabular</m:pre>
       
   140 		
       
   141 		<p>
       
   142 			N.B. the order of the alternatives inside <code>'b|a|c'</code> does not matter: when such a regex matches, the original order of the attributes is preserved.
       
   143 			But if we use multiple regexes to specify attributes, their order and count do matter:
       
   144 		</p>
       
   145 		
       
   146 		<pre><![CDATA[numbers:
       
   147  ╭─────────────┬─────────────┬─────────────┬─────────────┬─────────────┬─────────────╮
       
   148  │ a (integer) │ b (integer) │ c (integer) │ b (integer) │ a (integer) │ a (integer) │
       
   149  ├─────────────┼─────────────┼─────────────┼─────────────┼─────────────┼─────────────┤
       
   150  │           0 │           1 │           2 │           1 │           0 │           0 │
       
   151  │           3 │           4 │           5 │           4 │           3 │           3 │
       
   152  │           6 │           7 │           8 │           7 │           6 │           6 │
       
   153  ╰─────────────┴─────────────┴─────────────┴─────────────┴─────────────┴─────────────╯
       
   154 Record count: 3
       
   155 letters:
       
   156  ╭─────────────┬─────────────╮
       
   157  │ a  (string) │ b  (string) │
       
   158  ├─────────────┼─────────────┤
       
   159  │ A           │ B           │
       
   160  │ C           │ D           │
       
   161  ╰─────────────┴─────────────╯
       
   162 Record count: 2]]></pre>
       
   163 
       
   164 		<p>
       
   165 			The <code>letters</code> relation stays untouched: <code>relpipe-tr-cut 'numbers'</code> does not affect it in any way.
       
   166 		</p>
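		<p>
			The ordering rule from the last example can be modelled in a few lines of Bash.
			This is only a sketch of the behaviour described above, not the tool's actual implementation; the function <code>select_attributes</code> is hypothetical, and it assumes that an attribute name must match the whole pattern.
		</p>

```shell
# Attribute names of the input relation, in their original order
attributes=(a b c)

# Emulates the selection rule described above (not the tool's real code):
# patterns are processed in the given order; within one pattern,
# the matching attributes keep their original order.
select_attributes() {
	local pattern name re selected=()
	for pattern in "$@"; do
		# Assumption: the whole attribute name must match the pattern
		re="^(${pattern})$"
		for name in "${attributes[@]}"; do
			if [[ "$name" =~ $re ]]; then
				selected+=("$name")
			fi
		done
	done
	echo "${selected[*]}"
}

result="$(select_attributes 'b|a|c' 'b' 'a' 'a')"
echo "$result"   # a b c b a a
```

		<p>
			The first pattern yields <code>a b c</code> in the original order regardless of how its alternatives are written, and each further pattern appends its own matches, which gives the six columns shown in the table above.
		</p>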
       
   167 		
       
   168 	</text>
       
   169 
       
   170 </stránka>