relpipe-data/specification.xml
author František Kučera <franta-hg@frantovo.cz>
Mon, 21 Feb 2022 00:43:11 +0100
branch v_0
changeset 329 5bc2bb8b7946
parent 302 e536a3aaee77
permissions -rw-r--r--
Release v0.18

<stránka
	xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
	xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
	
	<nadpis>Specification</nadpis>
	<perex>Specification of the Relational pipes data format</perex>
	<pořadí>20</pořadí>

	<text xmlns="http://www.w3.org/1999/xhtml">
		<p>
			Currently, only fragments of the specification are published,
			and incompatible changes might (and will) come before the v1.0.0 release.
			Please stay tuned for that stable version, which will deliver a specification complete and precise enough
			that an independent implementation of the format will be possible.
		</p>
		
		<h2>
			<m:name/> data format
		</h2>
			
		<h3>Stream structure</h3>
		
		
		<m:tabulka>
			id	name	description
			0x1D	start	start of the relation
			0x1E	record	start of the record
		</m:tabulka>
		
		
		<h3>Data types</h3>
		
		<p>Currently, there are only three data types.</p>
		
		<m:tabulka>
			typeId	code	description
			0x01	boolean	logical value, true/false
			0x02	integer	signed integer number of arbitrary length
			0x03	string	character string in UTF-8
		</m:tabulka>
		
		<h2>Libraries</h2>
		
		<h3>relpipe-lib-writer</h3>
		
		<p>Wraps an output stream (usually STDOUT), accepts method calls (relations, attributes) and generates <m:name/> data on the stream.</p>
		
		<h3>relpipe-lib-reader</h3>
		
		<p>
			Wraps an input stream (usually STDIN). The caller creates and registers handlers (zero or more) using <code>addHandler()</code> and then calls the <code>process()</code> method.
			During this method call, the reader reads the input and calls the handlers.
			The handlers receive relations and attributes.
		</p>
		
		<h2>Tools</h2>
		
		<h3>relpipe-in-cli</h3>
		<p>
			A tool that generates a single relation. If we want more relations in a single stream, we just call this command multiple times:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[
(relpipe-in-cli ... ; relpipe-in-cli ... ; relpipe-in-cli ... ) | relpipe-out-tabular
]]></m:pre>

		<p>We can also concatenate several files, or combine files and commands.</p>
		
		<p>This command accepts these arguments:</p>
		
		<ul>
			<li>relation name</li>
			<li>attribute count</li>
			<li>names of attributes</li>
			<li>types of attributes</li>
			<li>attribute values</li>
		</ul>
		
		<p>
			These data might be passed as CLI arguments on the command line or as a list of values separated by null bytes (<code>\0</code>) on STDIN.
			Both ways can be combined, e.g. the relation name and metadata are passed as CLI arguments and the data on STDIN.
			The tool simply starts with CLI arguments (if any) and continues with values from STDIN (if any).
		</p>
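		
		<p>
			A minimal sketch of the STDIN side: <code>printf '%s\0'</code> follows each value with a null byte, so values containing spaces survive intact.
			The helper name <code>emit_values</code> and the sample values are hypothetical, and the <code>relpipe-in-cli</code> arguments stay elided, as in the example above:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[
# Hypothetical helper: print every argument followed by a null byte.
emit_values() {
	printf '%s\0' "$@"
}

# Two sample records with three attribute values each (illustrative data).
# The relation name and metadata would be given as CLI arguments (elided here):
#   emit_values 1 "Hello world" true 2 "Second line" false | relpipe-in-cli ...

# Inspect the separators without any relpipe tool: show null bytes as newlines.
emit_values 1 "Hello world" true | tr '\0' '\n'
]]></m:pre>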
		
		<p>
			This tool is a good entry point to the <m:name/> world because it requires no programming, and the argument list or the <code>\0</code>-separated list can be constructed in any language or environment.
			Tools like <code>perl</code> or <code>tr</code> can convert almost any data to this form and pass it to <code>relpipe-in-cli</code>.
		</p>
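		
		<p>
			As an illustration, assuming line-oriented input with colon-separated fields, <code>tr</code> alone can produce the null-byte form.
			The function name <code>to_nullbyte</code> and the sample data are hypothetical, and this naive approach breaks if a field itself contains a colon or a line break:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[
# Hypothetical helper: turn colon-separated fields on separate lines
# into one flat list of null-byte terminated values.
to_nullbyte() {
	tr ':\n' '\0\0'
}

# Two input lines with three fields each yield six values.
# Null bytes are mapped back to newlines here only for readability:
printf 'alice:1000:/home/alice\nbob:1001:/home/bob\n' | to_nullbyte | tr '\0' '\n'
]]></m:pre>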
		
		
		<h3>relpipe-in-csv</h3>
		
		<p>
			A tool that parses a CSV (comma-separated values) input and generates a single relation from it.
			Values might be "quoted".
			If a quoted value contains a quote literal, the quote is escaped by doubling it.
			Encoding must be UTF-8.
			Line ends might be LF or CRLF.
		</p>
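		
		<p>
			The quoting rule can be sketched from the producer side, i.e. how a script might prepare a value for such CSV input.
			The helper name <code>csv_quote</code> is hypothetical; note that the command substitution drops trailing line breaks from the value:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[
# Hypothetical helper: wrap one value in double quotes,
# doubling any quote characters inside it.
csv_quote() {
	escaped=$(printf '%s' "$1" | sed 's/"/""/g')
	printf '"%s"' "$escaped"
}

# Build one CSV line from a value that contains both a comma and quotes.
printf '%s,%s\n' "$(csv_quote 'plain')" "$(csv_quote 'say "hi", twice')"
# prints: "plain","say ""hi"", twice"
]]></m:pre>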
		
		
		<h3>relpipe-in-fstab</h3>
		
		<p>
			A tool that parses an <code>fstab</code> (or <code>mtab</code>) file containing a list of devices, mount points and their options, and generates a single relation from it.
		</p>
		
		<p>
			If executed with STDIN attached to a TTY, it reads data from the default location: <code>/etc/fstab</code>.
			If executed with STDIN attached to a file or another command (in a pipe), it reads data from this stream.
		</p>
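		
		<p>
			How the tool detects this internally is not described here; the same distinction can be made in a shell script with <code>test -t 0</code>.
			The following is only a sketch of the idea and the function name <code>input_source</code> is hypothetical:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[
# Hypothetical helper: report where the data would come from.
# "[ -t 0 ]" succeeds only when STDIN (file descriptor 0) is a terminal.
input_source() {
	if [ -t 0 ]; then
		echo "/etc/fstab"
	else
		echo "STDIN"
	fi
}

input_source               # on an interactive terminal: /etc/fstab
input_source < /dev/null   # STDIN redirected from a file: STDIN
echo | input_source        # STDIN attached to a pipe: STDIN
]]></m:pre>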
		
		<h3>relpipe-in-xml</h3>
		
		<p>
			A tool that reads XML data in the format generated by <code>relpipe-out-xml</code>
			and converts it back to the relational format.
			It can be used together with an XSLT processor, an XQuery engine or another XML generator in two basic scenarios:
		</p>
				
		<ul>
			<li>as an input filter: to convert other formats to relational data</li>
			<li>as a transformation in a pipeline: <code>relpipe-out-xml | some-xml-processor | relpipe-in-xml</code></li>
		</ul>
		
		<p>
			XQuery is a very powerful language that can do various transformations, filtering and even JOIN and UNION operations,
			so its power is comparable to that of relational databases.
			In future <m:name/> releases, there will also be an SQL transformation tool, where these operations will be defined in classic SQL syntax.
		</p>
		
		<h3>relpipe-in-filesystem</h3>
		
		<p>
			A tool that reads a <code>\0</code>-separated list of file paths and generates a relation
			with the metadata of the given files.
			It can read basic metadata like file path, name, size, owner…
			and also extended attributes (xattr).
		</p>
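		
		<p>
			A list of this kind is what <code>find -print0</code> produces.
			The sketch below assumes that <code>relpipe-in-filesystem</code> can be run without further arguments and only calls it when it is installed; the scratch directory and the helper name <code>list_paths</code> are illustrative:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[
# Create a scratch directory with two files whose names contain spaces.
dir=$(mktemp -d)
touch "$dir/first file.txt" "$dir/second file.txt"

# Hypothetical helper: find -print0 terminates each path with a null byte,
# so unusual characters in file names cannot split a path.
list_paths() {
	find "$1" -type f -print0
}

# Feed the list to the tools only if they are installed on this machine
# (assumption: no further arguments are needed).
if command -v relpipe-in-filesystem >/dev/null 2>&1 &&
   command -v relpipe-out-tabular >/dev/null 2>&1; then
	list_paths "$dir" | relpipe-in-filesystem | relpipe-out-tabular
fi

# Remove the scratch directory again.
rm -rf "$dir"
]]></m:pre>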
		
		<h3>relpipe-out-tabular</h3>
		
		<p>
			A tool that formats relational data as tables with Unicode borders and ANSI colors.
			Good for viewing relational data in a terminal or redirecting such view to a file or clipboard (to be pasted anywhere, where fixed-width font is used).
		</p>
		
		<h3>relpipe-out-gui</h3>
		
		<p>
			A tool that displays relational data in a GUI window. Relations are displayed as panels with tables.
			A particular implementation might also offer additional features like chart drawing or simple statistics (sum, max, min, avg, percentiles etc.).
		</p>
		
		<h3>relpipe-out-nullbyte</h3>
		
		<p>
			A tool that converts relational data to a list of null-byte (<code>\0</code>) separated values.
			It makes sense only for a single relation, because the boundaries between relations are lost if there are more of them.
			Attribute names and types are also lost by default.
		</p>
		<p>
			It is suitable for passing a single relation to <code>xargs --null --max-args=X</code> (where X is the attribute count)
			or to another command that accepts values separated by a null byte.
		</p>
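		
		<p>
			The <code>xargs</code> side can be tried without any relational data.
			In this sketch, the hypothetical function <code>sample_values</code> stands in for the tool's output and emits one relation with two attributes and three records (made-up values):
		</p>
		
		<m:pre jazyk="bash"><![CDATA[
# Stand-in for the output of relpipe-out-nullbyte: one relation with
# two attributes (name, size) and three records, flattened to six values.
sample_values() {
	printf '%s\0' "a.txt" 10 "b c.txt" 20 "d.txt" 30
}

# -0 and -n 2 are the short forms of --null and --max-args=2:
# xargs runs the command once per record, passing two values each time.
sample_values | xargs -0 -n 2 printf 'name=%s size=%s\n'
# prints:
#   name=a.txt size=10
#   name=b c.txt size=20
#   name=d.txt size=30
]]></m:pre>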
		
		<h3>relpipe-out-xml</h3>
		
		<p>
			A tool that converts relational data to its XML representation.
			It is useful for further processing in other tools (e.g. an XSLT processor that generates XHTML or another XML-based format)
			or for storing relational data in a text form (good for version control systems, diff, manual editing or review).
		</p>
		
		<h3>relpipe-out-ods</h3>
		
		<p>
			A tool that converts relational data to its ODS (OpenDocument) representation, more precisely to its <em>flat</em> variant (a single XML file instead of a ZIP containing many files).
			This OpenDocument output can be opened in tools like LibreOffice and processed further (adding charts, calculations etc.).
		</p>
		
		<h3>relpipe-out-csv</h3>
		
		<p>
			A tool that converts relational data into the CSV format (comma-separated values).
			It makes sense only for a single relation, because the boundaries between relations are lost if there are more of them.
		</p>
		
		<h3>relpipe-tr-validator</h3>
		
		<p>
			A tool that behaves like the <code>cat</code>, <code>dd</code> or <code>pv</code> commands, i.e. it reads data from STDIN and outputs the same data on STDOUT.
			Unlike these tools, however, it parses the data and then converts them back to the <m:name/> format.
			Thus, if the input was not in this format, the process fails (exit code != 0).
			Errors (if any) are reported on STDERR.
		</p>
		
		<h3>relpipe-tr-sed</h3>
		
		<p>
			A tool that modifies attribute values according to a given regular expression and replacement string.
			It works with the given relations and attributes only (others stay untouched).
			The relation and attribute names are also specified as regular expressions.
			Thus a single <code>relpipe-tr-sed</code> run may modify multiple relations or attributes.
		</p>
		
		<h3>relpipe-tr-grep</h3>
		
		<p>
			A tool that drops records (does restriction) depending on whether they match a given regular expression.
			The relation and attribute names are also specified as regular expressions.
			Thus a single <code>relpipe-tr-grep</code> run may modify multiple relations or match on multiple attributes.
		</p>
		
		<h3>relpipe-tr-cut</h3>
		
		<p>
			A tool that drops or duplicates attributes or changes their order (does projection) depending on whether their names match a given regular expression.
			The relation name is also specified as a regular expression, and there might be multiple regular expressions specifying the desired attributes.
			Thus a single <code>relpipe-tr-cut</code> run may modify multiple relations or pick multiple attributes from them.
		</p>
		
	</text>

</stránka>