<!--
	relpipe-data/roadmap.xml
	author: František Kučera <franta-hg@frantovo.cz>
	date: Sat, 12 Sep 2020 13:20:21 +0200
	branch: v_0
	changeset: 315 d4c2968a391f
	parent: 297 192b0059a6c4
	child: 317 fce3d6290c40
	permissions: 0644
	commit message: roadmap: relpipe-tr-sql already uses ODBC
-->

<stránka
	xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
	xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
	
	<nadpis>Roadmap</nadpis>
	<perex>Vision of future versions</perex>
	<pořadí>14</pořadí>

	<text xmlns="http://www.w3.org/1999/xhtml">
		<p>
			Releases before v1.0.0 are development releases and are not intended for production use.
			Releases after v1.0.0 should follow the rules of <a href="http://semver.org/">Semantic Versioning</a>.
		</p>
		
		<p>
			Released versions are described on the <m:a href="download">download</m:a> page.
		</p>
		
		<h2>v0.17, v0.18, v0.19 etc.</h2>
		
		<p>
			Releases for discussion and verification of the format and API design.
		</p>
		
		<h3>Data types</h3>
		<ul>
			<li>fractions</li>
			
			<li>arrays/streams of bytes (octets)</li>
			<li>arrays of other types</li>
			<li>arrays of other types with support for NULL values</li>
			<li>nested relations</li>
			<li>date and time</li>
			<li>floating-point numbers (IEEE 754)</li>
			<li>support NULL values (introduce bitmaps)</li>
			<li>support long strings (chunked, similar to octet streams)</li>
			
			<li>fixed size integers</li>
			<li>precise BigDecimal</li>
			<li>string in UTF-16</li>
			<li>string in ASCII</li>
			<li>string in ISO 8859-1</li>
			<li>string in ISO 8859-2</li>
		</ul>
		
		<h3>Inputs</h3>
		<p>Probably no new ones before v1.0.0.</p>
		
		<h3>Transformations</h3>
		<ul>
			<li>relpipe-tr-streamlet: based on the same interface as used by --streamlet in relpipe-in-filesystem</li>
		</ul>
		
		<h3>Outputs</h3>
		<p>Probably no new ones before v1.0.0.</p>
		
		<h3>Other tasks</h3>
		<ul>
			<li>publish documentation of the stable API and file format</li>
			<li>publish automated complex tests (specification vs. implementation compliance)</li>
			<li>verify the format from the performance point of view</li>
			<li>improve parsing (corrupted input may currently lead to huge memory allocations), more fuzzing</li>
			<li>code clean-up and refactoring, move some reusable parts to common libraries</li>
			<li>test the build with another compiler and tune the code</li>
			<li>pkg-config: version numbers, debug vs. release</li>
			<li>packaging for Guix SD and .deb and .rpm distributions, Snapcraft, Flatpak etc.</li>
		</ul>
		
		<h2>v1.0</h2>
		
		<p>
			The first version intended for production use.
			It brings no new features, just the stabilized result of previous development.
			The following parts must be stable:
		</p>
		
		<ul>
			<li>format specification</li>
			<li>relpipe-lib-writer</li>
			<li>relpipe-lib-reader</li>
			<li>relpipe-in-cli</li>
			<li>relpipe-out-tabular</li>
			<li>relpipe-in-xml</li>
			<li>relpipe-out-xml</li>
			<li>relpipe-in-csv</li>
			<li>relpipe-out-csv</li>
			<li>relpipe-out-ods</li>
		</ul>
		
		<p>
			Other parts might be released as stable later.
		</p>
		<p>
			After this point, all components (the format specification, particular libraries and particular tools)
			will be versioned independently and a compatibility matrix will be maintained.
		</p>
		
		<h2>Further versions</h2>
		<p>
			Plans for the next decades:
		</p>
		
		<h3>Data types</h3>
		<ul>
			<li>intervals of various types</li>
			<li>IPv4 address and subnet</li>
			<li>IPv6 address and subnet</li>
			<li>UUID</li>
			<li>e-mail</li>
			<li>URL / URI</li>
			<li>geographic locations</li>
			<li>OID: Object identifier</li>
		</ul>
		<h3>Inputs</h3>
		<p>Systems, commands:</p>
		<ul>
			<li>network information: ip, iptables, netstat, ss, dhcp-lease-list</li>
			<li>network interaction: ping, host, wget, curl</li>
			<li>system information: ps, lsof</li>
			<li>versioning systems (Mercurial, Git, Subversion, Monotone, Bazaar)</li>
			<li>D-Bus</li>
			<li>POSIX MQ</li>
		</ul>
			
		<p>Formats:</p>
		<ul>
			<li>regular expression (regex groups → attributes)</li>
			<li>ODS (LibreOffice)</li>
			<li>iCalendar</li>
			<li>vCard</li>
			<li>MIME (e-mail messages)</li>
			<li>YAML, JSON, INI, TOML etc. (probably through alt2xml + in streamlets)</li>
			<li>Fsdb</li>
			<li>Java .properties</li>
			<li>Gettext / .po files</li>
			<li>STDIO log: captured STDIN, STDOUT and STDERR of another process with precise timing of all events</li>
			<li>pcap / tcpdump</li>
			<li>Inverse tabular</li>
			<li>Wikipedia / MediaWiki: harvest tables from a given article</li>
			<li>(X)HTML tables generic import (probably implemented in XQuery)</li>
		</ul>
		<h3>Transformations</h3>
		<ul>
			<li>iconv: character encoding converter</li>
			<li>rename: relations, attributes</li>
			<li>filter/skip: relations, attributes</li>
			<li>add constant attribute</li>
			<li>WHERE-like filter</li>
			<li>JOIN</li>
			<li>ORDER BY</li>
			<li>UNION, UNION ALL, explicit or implicit (relation interleaving)</li>
			<li>split a single relation into multiple relations according to an attribute value</li>
			<li>union multiple relations and add an attribute with the original relation name</li>
			<li>XPath, XSLT, XQuery</li>
			<li>statistics: compute/add aggregated values like min(), max(), avg(), sum() or percentiles</li>
			<li>function calls (probably should be part of SQL or XPath)</li>
			<li>pack and unpack: single record with arrays vs. multiple records with scalars</li>
			<li>Lua, Perl, Python</li>
			<li>system command executor</li>
		</ul>
		<h3>Outputs</h3>
		<ul>
			<li>GUI in GTK</li>
			<li>(La)TeX</li>
			<li>XML tree</li>
			<li>iCalendar</li>
			<li>vCard</li>
			<li>tar: directories and files <!--relations and records = directories; attribute values = files--></li>
			<li>YAML, JSON, INI</li>
			<li>Fsdb</li>
			<li>Java .properties</li>
			<li>Gettext / .po files</li>
			<li>GraphViz: .dot files</li>
			<li>SQL: CREATE TABLE, INSERT</li>
			<li>SMTP, IMAP, files, HTTP, SOAP, ZeroMQ, TCP, UDP</li>
			<li>D-Bus</li>
			<li>POSIX MQ</li>
			<li>source code / literals in Java, C, C++, Bash</li>
		</ul>
		
		<h3>Libraries and tools</h3>
		<p>Readers (SAX-like parsers) and writers (generators) for:</p>
		<ul>
			<li>C++</li>
			<li>C</li>
			<li>Java</li>
			<li>D</li>
			<li>Rust</li>
			<li>Go</li>
			<li>Python</li>
			<li>Perl</li>
			<li>PHP</li>
		</ul>
		<p>Other libraries and tools:</p>
		<ul>
			<li>ORM API: mapping between classes/objects and relations/records</li>
			<li>JDBC and ODBC drivers</li>
			<li>schemas (XSD-like)</li>
			<li>generators, compilers, validators, comparators</li>
			<li>transformer helpers</li>
			<li>Ragel input helpers</li>
			<li>PEG input helpers</li>
			<li>repair tools (for corrupted data)</li>
			<li>pull parsers</li>
			<li>SAX and DOM parsers (read relational data as if it were XML)</li>
			<li>visual editor for pipeline design</li>
		</ul>
		
		
	</text>

</stránka>