<stránka
xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
<nadpis>Integrating Relational pipes with GNU Recutils</nadpis>
<perex>using the recfile format as input and output, and filtering with recsel</perex>
<m:pořadí-příkladu>01900</m:pořadí-příkladu>
<text xmlns="http://www.w3.org/1999/xhtml">
<p>
Recfile is the native format of <a href="https://www.gnu.org/software/recutils/">GNU Recutils</a>.
Recfiles are text files that contain records of various types.
They are human-editable and serve as simple databases.
<m:name/> have supported input and output in this format since v0.11.
</p>
<p>
We can convert any relational data to the recfile format by using <code>relpipe-out-recfile</code> – e.g. our <code>fstab</code> will look like this:
</p>
<m:pre jazyk="text" src="examples/relpipe-out-fstab.rec.txt"/>
<p>
Then we can edit this data (e.g. in GNU Emacs, which has a mode for this format) or store it in a version control system like Mercurial or Git.
Because it is a text format (like XML, which is also supported and good for this purpose),
we can efficiently track changes in the data across versions, run <code>diff</code> or (with some care) even <code>patch</code>.
And we can use the whole GNU Recutils toolchain while working with such data.
</p>
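<p>
As a minimal sketch of such change tracking, the following script compares two snapshots of a recfile using plain <code>diff</code>.
The snapshot content and the file names are made up for illustration;
in practice each snapshot would come from <code>relpipe-in-fstab | relpipe-out-recfile</code>.
</p>
<m:pre jazyk="bash"><![CDATA[# Hypothetical snapshots; in practice each would be produced by:
#   relpipe-in-fstab | relpipe-out-recfile > fstab.rec
workdir="$(mktemp -d)"

cat > "$workdir/fstab.old.rec" <<'EOF'
device: /dev/mapper/sdf_crypt
mount_point: /mnt/private
type: xfs
options: relatime
EOF

# The same record with one edited field (options).
sed 's/^options: relatime$/options: noatime/' \
	"$workdir/fstab.old.rec" > "$workdir/fstab.new.rec"

# diff exits with status 1 when the files differ, so we tolerate that.
changes="$(diff -u "$workdir/fstab.old.rec" "$workdir/fstab.new.rec" || true)"
printf '%s\n' "$changes"]]></m:pre>
<p>
Because each field sits on its own line, the diff points at the exact field that changed (here only the <code>options</code> line).
</p>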
<p>
The obligatory example of filtering our <code>fstab</code>:
</p>
<m:pre jazyk="bash"><![CDATA[relpipe-in-fstab | relpipe-out-recfile | recsel -e "type = 'btrfs' || type = 'xfs'"]]></m:pre>
<p>This gives us a recfile:</p>
<m:pre jazyk="text"><![CDATA[scheme: UUID
device: a2b5f230-a795-4f6f-a39b-9b57686c86d5
mount_point: /home
type: btrfs
options: relatime
dump: 0
pass: 2

scheme:
device: /dev/mapper/sdf_crypt
mount_point: /mnt/private
type: xfs
options: relatime
dump: 0
pass: 2]]></m:pre>
<p>And we can convert it back to the relational format using <code>relpipe-in-recfile</code>:</p>
<m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
| relpipe-out-recfile \
| recsel -e "type = 'btrfs' || type = 'xfs'" \
| relpipe-in-recfile \
| relpipe-out-tabular]]></m:pre>
<p>and print it as a table in our terminal:</p>
<m:pre jazyk="text"><![CDATA[recfile:
╭─────────────────┬──────────────────────────────────────┬──────────────────────┬───────────────┬──────────────────┬───────────────┬───────────────╮
│ scheme (string) │ device (string)                      │ mount_point (string) │ type (string) │ options (string) │ dump (string) │ pass (string) │
├─────────────────┼──────────────────────────────────────┼──────────────────────┼───────────────┼──────────────────┼───────────────┼───────────────┤
│ UUID            │ a2b5f230-a795-4f6f-a39b-9b57686c86d5 │ /home                │ btrfs         │ relatime         │ 0             │ 2             │
│                 │ /dev/mapper/sdf_crypt                │ /mnt/private         │ xfs           │ relatime         │ 0             │ 2             │
╰─────────────────┴──────────────────────────────────────┴──────────────────────┴───────────────┴──────────────────┴───────────────┴───────────────╯
Record count: 2]]></m:pre>
<p>
N.B. In v0.11, the conversion to recfiles and back is not 100% lossless (unlike XML),
because <m:name/> support only three data types (string, unsigned integer and boolean) in this version.
This will be improved in later releases (more data types are planned before v1.0).
</p>
<p>
Because some web browsers or tools can store the original URL in extended attributes while downloading a file,
we can use <code>recsel</code> to find files downloaded from a particular domain:
</p>
<m:pre jazyk="bash"><![CDATA[find -print0 | relpipe-in-filesystem \
--file path \
--file size \
--file type \
--xattr xdg.origin.url --as url \
| relpipe-out-recfile \
| recsel -e 'url ~ "^https?://([^/]*\.)?archive\.org/"']]></m:pre>
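<p>
The same selection can be narrowed to a few fields with the <code>-p</code> option of <code>recsel</code>.
The following is only a sketch: the function name <code>files_from_archive_org</code> is hypothetical
and the function assumes that <m:name/> and GNU Recutils are installed.
</p>
<m:pre jazyk="bash"><![CDATA[# Hypothetical helper: lists files whose origin URL points to archive.org
# and prints only the path and url fields of each matching record.
# The optional first argument is the directory to search (default: current).
files_from_archive_org() {
	find "${1:-.}" -print0 | relpipe-in-filesystem \
		--file path \
		--xattr xdg.origin.url --as url \
		| relpipe-out-recfile \
		| recsel -e 'url ~ "^https?://([^/]*\.)?archive\.org/"' -p path,url
}]]></m:pre>
<p>
Calling <code>files_from_archive_org ~/Downloads</code> would then print a recfile containing just these two fields.
</p>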
<p>
<m:name/> can also be used together with <a href="https://sql-dk.globalcode.info/">SQL-DK</a> (in the 2019-03-05 development version)
to pipe data from big relational databases like PostgreSQL or MariaDB to other formats like recfiles.
Suppose we have this script:
</p>
<m:pre jazyk="bash" src="examples/sql-dk_pg_1.sh" odkaz="ano"/>
<p>
We can convert the result set of any SQL query to the relational format and then work with the data without a connection to the original database.
Thus we can cache (<em>materialize</em>) the results locally in a file and use them even offline.
Or we can run the SQL query each time and have fresh data:
</p>
<m:pre jazyk="text"><![CDATA[sql-dk_pg_1.sh | relpipe-out-recfile]]></m:pre>
<p>This results in:</p>
<m:pre jazyk="text" src="examples/sql-dk_pg_1.rec.txt"/>
<p>Or we can view the data in the classic tabular way using <code>relpipe-out-tabular</code>:</p>
<m:pre jazyk="text" src="examples/sql-dk_pg_1.tabular.txt"/>
<p>
Materialized (or fresh) data from the database can be further transformed
using <code>relpipe-tr-*</code> commands like grep, sed, cut, guile,
or (through the recfile conversion) by the <code>recsel</code> command from GNU Recutils.
</p>
<p>
The <code>relpipe-in-recfile</code> command helps with converting recfiles to various formats like XHTML,
with pretty-printing, or with xargs-like processing
(using <code>relpipe-out-nullbyte</code> and regular <code>xargs</code> or the <code>read_nullbyte</code> function,
as described in the <m:a href="examples-out-bash">Writing an output filter in Bash</m:a> example).
Thus we can have data-driven Bash scripts based on our recfiles.
</p>
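<p>
A minimal sketch of such a data-driven script, using regular <code>xargs</code>.
The <code>sample_values</code> function is a hypothetical stand-in for <code>relpipe-in-recfile | relpipe-out-nullbyte</code>;
it assumes a relation with just two attributes (<code>mount_point</code> and <code>type</code>)
and an output that consists of attribute values only, each terminated by a null byte.
</p>
<m:pre jazyk="bash"><![CDATA[# Stand-in (assumption) for:
#   relpipe-in-recfile < fstab.rec | relpipe-out-nullbyte
# Emits the values of two records, each value terminated by a null byte.
sample_values() {
	printf '%s\000' \
		'/home'        'btrfs' \
		'/mnt/private' 'xfs'
}

# xargs passes two values (one record) to each invocation of the inner script:
# $1 is the mount point and $2 is the file system type.
report="$(sample_values \
	| xargs -0 -n 2 sh -c 'printf "%s is mounted at %s\n" "$2" "$1"' sh)"

printf '%s\n' "$report"]]></m:pre>
<p>
The number given to <code>-n</code> must match the number of attributes in the relation; otherwise values from different records would be mixed together.
</p>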
</text>
</stránka>