<!-- relpipe/relpipe-web: relpipe-data/roadmap.xml@70e7eb578cfa -->


<stránka
	xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
	xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
	
	<nadpis>Roadmap</nadpis>
	<perex>Vision of future versions</perex>
	<pořadí>14</pořadí>

	<text xmlns="http://www.w3.org/1999/xhtml">
		<p>
			Releases before v1.0.0 are development releases and are not intended for production use.
			Releases after v1.0.0 should follow the rules of <a href="http://semver.org/">Semantic Versioning</a>.
		</p>
		
		<p>
			Released versions are described on the <m:a href="download">download</m:a> page.
		</p>
		
		<h2>v0.19, v0.20, v0.21 etc.</h2>
		
		<p>
			Releases for discussion and verification of the format and API design.
		</p>
		
		<p>
			This phase (before v1.0.0) might seem quite long,
			but it is important to verify the ideas and the design in various scenarios and use cases.
			The general idea and the big picture are quite clear and stable.
			However, there are many technical details that need to be carefully tuned.
		</p>
		
		<h3>Data types</h3>
		<ul>
			<li>fractions</li>
			
			<li>arrays/streams of bytes (octets)</li>
			<li>arrays of other types</li>
			<li>arrays of other types with support for NULL values</li>
			<li>nested relations</li>
			<li>date and time</li>
			<li>floating-point numbers (IEEE 754)</li>
			<li>support NULL values (introduce bitmaps)</li>
			<li>support long strings (chunked, similar to octet streams)</li>
			
			<li>fixed size integers</li>
			<li>precise BigDecimal</li>
			<li>string in UTF-16</li>
			<li>string in ASCII</li>
			<li>string in ISO 8859-1</li>
			<li>string in ISO 8859-2</li>
		</ul>
		
		<h3>Inputs</h3>
		<p>Probably no new ones before v1.0.0.</p>
		
		<h3>Transformations</h3>
		<ul>
			<li>relpipe-tr-streamlet: based on the same interface as used by --streamlet in relpipe-in-filesystem</li>
		</ul>
		
		<h3>Outputs</h3>
		<p>Probably no new ones before v1.0.0.</p>
		
		<h3>Other tasks</h3>
		<ul>
			<li>publish documentation of the stable API and file format</li>
			<li>publish automated complex tests (specification vs. implementation compliance)</li>
			<li>verify the format from the performance point of view</li>
			<li>relpipe-lib-writer: several modes of output buffering (auto, relation, record, value) or manual control</li>
			<li>improve parsing (corrupted input may currently lead to huge memory allocations), more fuzzing</li>
			<li>code clean-up and refactoring, move some reusable parts to common libraries</li>
			<li>test the build with another compiler and tune the code</li>
			<li>pkg-config: version numbers, debug vs. release</li>
			<li>packaging for Guix SD and .deb and .rpm distributions, Snapcraft, Flatpak etc.</li>
			<li>more examples, screenshots, videos, asciinema etc.</li>
		</ul>
		
		<h2>v1.0</h2>
		
		<p>
			The first version for production use.
			It brings no new features, just the stabilized result of previous development.
			The following parts must be stable:
		</p>
		
		<ul>
			<li>format specification</li>
			<li>relpipe-lib-writer</li>
			<li>relpipe-lib-reader</li>
			<li>relpipe-in-cli</li>
			<li>relpipe-out-tabular</li>
			<li>relpipe-in-xml</li>
			<li>relpipe-out-xml</li>
			<li>relpipe-in-csv</li>
			<li>relpipe-out-csv</li>
			<li>relpipe-out-ods</li>
		</ul>
		
		<p>
			Other parts might be released as stable later.
		</p>
		<p>
			After this point, all components (the format specification, particular libraries and particular tools)
			will be versioned independently and a compatibility matrix will be maintained.
		</p>
		
		<h2>Further versions</h2>
		<p>
			Plans for the next decades:
		</p>
		
		<h3>Data types</h3>
		<ul>
			<li>intervals of various types</li>
			<li>IPv4 address and subnet</li>
			<li>IPv6 address and subnet</li>
			<li>UUID</li>
			<li>e-mail</li>
			<li>URL / URI</li>
			<li>geographic locations</li>
			<li>OID: Object identifier</li>
		</ul>
		<h3>Inputs</h3>
		<p>Systems, commands:</p>
		<ul>
			<li>network information: ip, iptables, netstat, ss, dhcp-lease-list</li>
			<li>network interaction: ping, host, wget, curl</li>
			<li>system information: ps, lsof</li>
			<li>versioning systems (Mercurial, Git, Subversion, Monotone, Bazaar)</li>
			<li>D-Bus</li>
			<li>POSIX MQ</li>
		</ul>
			
		<p>Formats:</p>
		<ul>
			<li>regular expression (regex groups → attributes)</li>
			<li>ODS (LibreOffice)</li>
			<li>iCalendar</li>
			<li>vCard</li>
			<li>MIME (e-mail messages)</li>
			<li>YAML, JSON, INI, TOML etc. (probably through alt2xml + in streamlets)</li>
			<li>Fsdb</li>
			<li>Java .properties</li>
			<li>Gettext / .po files</li>
			<li>STDIO log: captured STDIN, STDOUT and STDERR of another process with precise timing of all events</li>
			<li>pcap / tcpdump</li>
			<li>Inverse tabular</li>
			<li>Wikipedia / MediaWiki: harvest tables from a given article</li>
			<li>(X)HTML tables generic import (probably implemented in XQuery)</li>
		</ul>
		<h3>Transformations</h3>
		<ul>
			<li>iconv: character encoding converter</li>
			<li>rename: relations, attributes</li>
			<li>filter/skip: relations, attributes</li>
			<li>add constant attribute</li>
			<li>WHERE-like filter</li>
			<li>JOIN</li>
			<li>ORDER BY</li>
			<li>UNION, UNION ALL, explicit or implicit (relation interleaving)</li>
			<li>split a single relation into multiple relations according to an attribute value</li>
			<li>union multiple relations and add an attribute with the original relation name</li>
			<li>XPath, XSLT, XQuery</li>
			<li>statistics: compute/add aggregated values like min(), max(), avg(), sum() or percentiles</li>
			<li>function calls (probably should be part of SQL or XPath)</li>
			<li>pack and unpack: single record with arrays vs. multiple records with scalars</li>
			<li>Lua, Perl, Python</li>
			<li>system command executor</li>
		</ul>
		<h3>Outputs</h3>
		<ul>
			<li>GUI in GTK</li>
			<li>(La)TeX</li>
			<li>XML tree</li>
			<li>iCalendar</li>
			<li>vCard</li>
			<li>tar: directories and files <!--relations and records = directories; attribute values = files--></li>
			<li>YAML, JSON, INI</li>
			<li>Fsdb</li>
			<li>Java .properties</li>
			<li>Gettext / .po files</li>
			<li>GraphViz: .dot files</li>
			<li>SQL: CREATE TABLE, INSERT</li>
			<li>SMTP, IMAP, files, HTTP, SOAP, ZeroMQ, TCP, UDP</li>
			<li>D-Bus</li>
			<li>POSIX MQ</li>
			<li>source code / literals in Java, C, C++, Bash</li>
		</ul>
		
		<h3>Libraries and tools</h3>
		<p>Readers (SAX-like parsers) and writers (generators) for:</p>
		<ul>
			<li>C++</li>
			<li>C</li>
			<li>Java</li>
			<li>D</li>
			<li>Rust</li>
			<li>Go</li>
			<li>Python</li>
			<li>Perl</li>
			<li>PHP</li>
		</ul>
		<p>Other libraries and tools:</p>
		<ul>
			<li>ORM API: mapping between classes/objects and relations/records</li>
			<li>JDBC and ODBC drivers</li>
			<li>schemas (XSD-like)</li>
			<li>generators, compilers, validators, comparators</li>
			<li>transformer helpers</li>
			<li>Ragel input helpers</li>
			<li>PEG input helpers</li>
			<li>repair tools (for corrupted data)</li>
			<li>pull parsers</li>
			<li>SAX and DOM parsers (read relational data as if it were XML)</li>
			<li>visual editor for pipeline design</li>
		</ul>
		
		
	</text>

</stránka>
<!--
	author: František Kučera <franta-hg@frantovo.cz>
	date: Mon, 21 Feb 2022 01:21:22 +0100
	branch: v_0
	changeset: 330 (70e7eb578cfa)
	parent: 329 (5bc2bb8b7946)
-->