relpipe-data/release-v0.15.xml
author František Kučera <franta-hg@frantovo.cz>
Mon, 21 Feb 2022 00:43:11 +0100
branch v_0
changeset 329 5bc2bb8b7946
parent 299 dd7aeff5ef0c
permissions -rw-r--r--
Release v0.18
<stránka
	xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
	xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
	
	<nadpis>Release v0.15</nadpis>
	<perex>new public release of Relational pipes</perex>
	<m:release>v0.15</m:release>

	<text xmlns="http://www.w3.org/1999/xhtml">
		<p>
			We are pleased to introduce the new development version of <m:name/>.
			This release brings two big new features, streamlets and parallel processing, plus several smaller improvements.
		</p>
		
		<ul>
			<li>
				<strong>SLEB128</strong>: variable-length integers are now signed (i.e. they can also be negative) and encoded as SLEB128</li>
			<li>
				<strong>streamlets in relpipe-in-filesystem</strong>: see details below</li>
			<li>
				<strong>parallel processing in relpipe-in-filesystem</strong>: see details below</li>
			<li>
				<strong>multiple modes in relpipe-in-xmltable</strong>: see details below</li>
			<li>
				<strong>XInclude in relpipe-in-xmltable</strong>: use <code>--xinclude true</code> to process XIncludes before converting XML to relations</li>
			<li>
				<strong>relpipe-lib-protocol → relpipe-lib-common</strong>: this module was renamed and converted to a shared library; it will contain some common functions instead of just the header files</li>
		</ul>
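		<p>
			To illustrate the encoding mentioned in the first item, here is a minimal Python sketch of the generic SLEB128 algorithm.
			It is not taken from the <m:name/> sources (which are written in C++); the function names are ours and the real implementation may differ in details.
			Each byte carries seven bits of the number, the highest bit says whether another byte follows, and the sign is extended from the last group.
		</p>

```python
def sleb128_encode(value: int) -> bytes:
    """Encode a signed integer as SLEB128 (7 payload bits per byte)."""
    out = bytearray()
    while True:
        # Floor division by 128 equals an arithmetic shift right by 7 bits;
        # the remainder is the low 7-bit group (always 0..127 in Python).
        value, low = divmod(value, 128)
        sign_bit_set = low >= 64
        done = (value == 0 and not sign_bit_set) or (value == -1 and sign_bit_set)
        out.append(low if done else low | 0x80)  # 0x80 = continuation bit
        if done:
            return bytes(out)


def sleb128_decode(data: bytes) -> int:
    """Decode one SLEB128 value from the beginning of data."""
    result = 0
    weight = 1  # 128 ** (number of groups read so far)
    for byte in data:
        payload = byte % 128
        result += payload * weight
        weight *= 128
        if byte >= 128:
            continue  # continuation bit set, more groups follow
        if payload >= 64:
            result -= weight  # sign bit of the last group set: sign-extend
        return result
    raise ValueError("truncated SLEB128 value")


print(sleb128_encode(-123456).hex())        # c0bb78
print(sleb128_decode(bytes([0xC0, 0xBB, 0x78])))  # -123456
```

		<p>
			Small values in both directions stay small on the wire: 0 and -1 each take a single byte.
		</p>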
		
		<p>
			See the <m:a href="examples">examples</m:a> and <m:a href="screenshots">screenshots</m:a> pages for details.
		</p>
		
		<p>
			Please note that this is still a development release and thus the API (libraries, CLI arguments, formats) might and will change.
			Any suggestions, ideas and bug reports are welcome in our <m:a href="contact">mail box</m:a>.
		</p>
		
		<h2>Streamlets</h2>
		
		<p>
			A <em>streamlet</em> is a small stream that flows into the main stream, fuses with it and (typically) brings new attributes.
		</p>
		<p>
			From the technical point of view, streamlets are something between classic <m:a href="classic-example">filters</m:a> and functions.
			Unlike a function, a streamlet can be written in any programming language and runs as a separate process.
			Unlike a filter, a streamlet does not replace the whole stream with a new one, but reads certain attributes from the original stream and adds some new ones back.
			A common feature of filters and streamlets is that both continually read the input and continually deliver outputs, so the memory requirements are usually constant and “infinite” streams can be processed.
			And unlike ordinary commands (executed e.g. using <code>xargs</code> or a shell loop over a set of files), a streamlet does not <code>fork()</code> and <code>exec()</code> for each input file – the single streamlet process is reused for all records in the stream, which is much more efficient (especially if there is an expensive initialization phase).
		</p>
		
		<p>
			Because streamlets are small scripts or compiled programs, they can be used for extending <m:name/> with minimal effort.
			A streamlet can be e.g. a Bash script of a few lines or, at the other end of the scale, a more powerful C++ or Java program.
			Currently we have templates/examples written in Bash, C++ and Java, but it is possible to use any scripting or programming language.
			The streamlet communicates with its parent (which manages the whole stream) through a simple <a href="img/streamlet-release-v0.15.png">message-based protocol</a>.
			Full documentation will be published as a part of the public API once the protocol is stable (before v1.0.0).
		</p>
		
		<p>
			The first module where streamlets have been implemented is <code>relpipe-in-filesystem</code>.
			Streamlets in this module get a single input attribute (the file path) and add various file metadata to the stream.
			We have e.g. streamlets that compute hashes (SHA-256 etc.), extract metadata from image files (PNG, JPEG etc.) or PDF documents (title, author… or even the full content in plain text), recognize text in images using <m:a href="streamlets-preview">OCR</m:a>,
			count lines of code, extract portions of XML files using XPath or read some metadata from JAR/ZIP files.
			Streamlets are a way to keep <code>relpipe-in-filesystem</code> simple, with a small code footprint, while making it extensible and thus powerful.
			We will not have to face the question: “Should we add this nice feature (+) and thus also this library dependency (-)? Would it be bloatware or not?”
			We (or the users) can add any feature through streamlets while the core <code>relpipe-in-filesystem</code> stays simple, and nobody who does not need that feature will suffer from the growing complexity.
		</p>
		<p>
			But streamlets are not limited to <code>relpipe-in-filesystem</code> – they are a general concept and there will be a <code>relpipe-tr-streamlet</code> module.
			Such streamlets will get any set of input attributes (not only file names) defined by the user and compute values based on them.
			Such a streamlet can e.g. modify a text attribute, compute a sum of numeric attributes, encrypt or decrypt values or interact with some external systems.
			Writing a streamlet is easier than writing a transformation (like <code>relpipe-tr-*</code>) and it is more than OK to write simple single-purpose ad-hoc streamlets.
			It is like writing simple shell scripts or functions.
			Examples of really simple streamlets are <code>inode</code> (Bash) and <code>pid</code> (C++).
			A simple streamlet requires implementing only two functions: the first one returns the names and types of the output attributes and the second one returns those attributes for a record.
			However, streamlets might be parametrized through options, might return a dynamic number of output attributes and might provide complex logic.
			Some streamlets will become a stable part of the <m:name/> specification and API (<code>xpath</code> and <code>hash</code> seem to be such candidates).
		</p>
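		<p>
			The “two functions” idea can be sketched in Python as follows.
			This is only an illustration under our own assumptions: the names <code>output_attributes</code>, <code>process_record</code> and <code>run_streamlet</code> are hypothetical, and plain function calls stand in for the real message-based protocol, which is not documented yet.
			The sketch shows a file streamlet that adds a SHA-256 hash and a size to each record, and a driver that reuses the same streamlet for every record.
		</p>

```python
import hashlib

# Hypothetical streamlet interface: the names below are illustrative only
# and do not mirror the real relpipe message-based protocol.


def output_attributes():
    """First function: names and types of the attributes this streamlet adds."""
    return [("sha256", "string"), ("size", "integer")]


def process_record(path):
    """Second function: values of those attributes for one record (a file path)."""
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as stream:
        # Read in chunks so that memory use stays constant for big files.
        for chunk in iter(lambda: stream.read(65536), b""):
            digest.update(chunk)
            size += len(chunk)
    return [digest.hexdigest(), size]


def run_streamlet(paths):
    """Stand-in for the parent: one streamlet instance serves every record."""
    names = [name for name, _type in output_attributes()]
    return [dict(zip(names, process_record(path))) for path in paths]
```

		<p>
			Calling <code>run_streamlet</code> with a list of existing file paths returns one dictionary per file; any expensive initialization would be paid once, not once per file.
		</p>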
		<p>
			One of the open questions is whether to keep streamlets in <code>relpipe-in-filesystem</code> once we have <code>relpipe-tr-streamlet</code>.
			<em>One tool should do one thing</em> and <em>we should not duplicate the effort</em>…
			But it still makes some sense because the file streamlets are a specific kind of streamlets and e.g. Bash completion should suggest them if we work with files but not with other data.
			And it is also nicer to have all metadata collecting on the same level in a single command (i.e. <code>--streamlet</code> beside <code>--file</code> and <code>--xattr</code>)
			than to collect basic and extended file attributes using one command and other file metadata using a different command.
		</p>
		
		<h2>Parallel processing</h2>
		
		<p>
			There are two kinds of parallelism: over attributes and over records.
		</p>
		
		<p>
			Because streamlets are forked processes, they are quite naturally parallelized over attributes.
			We can e.g. compute a SHA-1 hash in one streamlet and a SHA-256 hash in another streamlet and thus utilize two CPU cores (or we can ask one streamlet to compute both the SHA-1 and SHA-256 hashes and then we will utilize only one CPU core).
			The <code>relpipe-in-filesystem</code> tool simply 1) feeds all streamlet instances with the current file name, 2) lets the streamlets work in parallel and then 3) collects the results from all streamlets.
		</p>
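		<p>
			The three steps can be sketched in Python like this.
			It is a simplified model, not the real tool: threads stand in for the forked streamlet processes, the input is a byte string instead of a file name, and the function names are made up for this example.
		</p>

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def sha1_streamlet(data):
    return hashlib.sha1(data).hexdigest()


def sha256_streamlet(data):
    return hashlib.sha256(data).hexdigest()


def collect_attributes(data, streamlets):
    """Feed every streamlet the same input, let them run in parallel, collect results."""
    with ThreadPoolExecutor(max_workers=len(streamlets)) as pool:
        # 1) feed all streamlets with the same record
        futures = [pool.submit(streamlet, data) for streamlet in streamlets]
        # 2) they work concurrently; 3) results are collected in streamlet order
        return [future.result() for future in futures]


sha1, sha256 = collect_attributes(b"example record", [sha1_streamlet, sha256_streamlet])
```

		<p>
			The results come back in the order of the streamlets, not in the order in which they finished, so the output attributes keep a stable position in the record.
		</p>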
		
		<p>
			But that would not be enough. Today, we usually have more CPU cores than heavy attributes (like hashes).
			So we need to process multiple records in parallel.
			The first design proposal (not implemented) was that the tool would simply distribute the file names to the STDINs of the particular streamlet processes in a round-robin fashion and the processes would write to the common STDOUT (just with a lock for synchronization to keep the records atomic – the <m:name/> data format is specifically designed for such use).
			This would be really simple and somewhat helpful (better than nothing).
			But this design has a significant flaw: the tool is not aware of how busy the particular streamlet processes are and would feed them with tasks (file names) equally.
			So it would work satisfactorily only if all tasks were similarly difficult.
			This is unfortunately not the usual case because e.g. computing a hash of a big file takes much more time than computing a hash of a small file.
			Thus some streamlet processes would be overloaded while others would be idle, and in the end the whole group would be waiting for the overloaded ones (and only one or a few CPU cores would be utilized).
			So this is not a good way to go.
		</p>
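		<p>
			A tiny simulation makes the flaw visible.
			The durations below are invented numbers (two big files among small ones) and the function names are ours; the code only counts abstract time units and does not run any streamlet.
		</p>

```python
def round_robin_finish_time(durations, workers):
    """Finish time when task i is blindly assigned to worker i % workers."""
    load = [0] * workers
    for index, duration in enumerate(durations):
        load[index % workers] += duration
    return max(load)  # the group waits for the most loaded worker


def lower_bound(durations, workers):
    """No schedule can finish sooner than the longest task or the average load."""
    return max(max(durations), sum(durations) / workers)


durations = [10, 1, 1, 1, 10, 1, 1, 1]  # made-up task costs
print(round_robin_finish_time(durations, 4))  # 20
print(lower_bound(durations, 4))              # 10
```

		<p>
			Both big tasks land on the same worker, so three of the four workers are idle most of the time and the whole run takes twice as long as it would have to.
		</p>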
		
		<p>
			The solution is to use a queue. The tool feeds the tasks (file names in the <code>relpipe-in-filesystem</code> case) to the queue
			and the streamlet processes fetch them from the queue as soon as they are idle.
			So we utilize all the CPU cores all the time (provided that we have more records than CPU cores, which is usually true).
			Because our target platforms are POSIX operating systems (the primary one being GNU/Linux), we chose POSIX MQ as the queue.
			POSIX MQ is a nice and simple technology; it is standardized and really classic. It does not require any broker process or any third-party library, so it does not bring additional dependencies – it is provided directly by the OS.
			However, a fallback is still possible:
			a) if we set <code>--parallel 1</code> (which is the default behavior), the tool runs directly in a single process without the queue;
			b) POSIX MQ has quite a simple API, so it is possible to write an adapter and port the tool to another system that does not have POSIX MQ and still enjoy the parallelism (or simply reimplement this API using shared memory and a semaphore).
		</p>
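		<p>
			The pull model can be sketched in Python as follows.
			Please take it as an analogy only: the in-process <code>queue.Queue</code> and threads are stand-ins for POSIX MQ and the streamlet processes, and <code>run_workers</code> is a name invented for this example.
			What matters is the shape: one shared queue, and every worker asks for the next task only when it has finished the previous one.
		</p>

```python
import queue
import threading


def run_workers(tasks, worker_count, handle):
    """Feed tasks into one shared queue; each worker fetches the next task when idle."""
    task_queue = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            task = task_queue.get()  # blocks until a task is available
            if task is None:  # sentinel: no more work (tasks must not be None)
                return
            value = handle(task)
            with results_lock:
                results.append((task, value))

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for task in tasks:
        task_queue.put(task)
    for _ in threads:
        task_queue.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    return results  # completion order, not input order


results = run_workers(["a", "bb", "ccc"], 2, len)
```

		<p>
			A worker stuck on one expensive task simply takes fewer tasks from the queue, so the others are never left idle while work remains.
		</p>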
		
		<p>
			We could add another queue to the output side and use it for serialization of the stream (which flows to the single STDOUT/FD).
			But it is not necessary (thanks to the <m:name/> format design) and would just add some overhead.
			So on the output side, we use just a POSIX semaphore (and a lock/guard based on it).
			Thus the tool still has no dependencies other than the standard library and the operating system.
		</p>
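		<p>
			The role of the guard on the output side can be shown with a small Python sketch.
			Again, this is a model under our own assumptions: threads share an in-memory buffer instead of processes sharing STDOUT, <code>threading.Semaphore</code> stands in for the POSIX semaphore, and the bracketed record format is made up (it is not the <m:name/> data format).
		</p>

```python
import io
import threading


def write_records(writer_count, records_per_writer):
    """Several writers share one output; the guard keeps each record contiguous."""
    output = io.StringIO()
    guard = threading.Semaphore(1)  # binary semaphore used as a lock

    def writer(name):
        for number in range(records_per_writer):
            with guard:
                # One record is emitted in several small writes;
                # nobody else may write until the record is complete.
                for part in ("[", name, ":", str(number), "]"):
                    output.write(part)

    threads = [
        threading.Thread(target=writer, args=(f"w{n}",)) for n in range(writer_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return output.getvalue()


stream = write_records(4, 200)
```

		<p>
			The records of different writers may interleave in any order, but each single record stays in one piece, which is all the consumer of the stream needs.
		</p>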
		
		<p>
			If we still have idle CPU cores or machines and need even more parallelism, streamlets can fork their own sub-processes, use threads or some technology like MPI or OpenMP.
			However, simple parallel processing of records (<code>--parallel N</code>) is usually more than sufficient and utilizes our hardware efficiently.
		</p>
		
		<h2>XPath modes</h2>
		
		<p>
			Both <code>relpipe-in-xmltable</code> and the <code>xpath</code> streamlet use the XPath language to extract values from XML documents.
			There are several modes of value extraction:
		</p>
abbc9bcfbcc4 Release v0.15 – streamlets, parallel processing
František Kučera <franta-hg@frantovo.cz>
parents: 282
diff changeset
   141
		<ul>
abbc9bcfbcc4 Release v0.15 – streamlets, parallel processing
František Kučera <franta-hg@frantovo.cz>
parents: 282
diff changeset
   142
			<li>
abbc9bcfbcc4 Release v0.15 – streamlets, parallel processing
František Kučera <franta-hg@frantovo.cz>
parents: 282
diff changeset
   143
				<code>string</code>: this is default option, simply the text content
			</li>
			<li>
				<code>boolean</code>: the value converted to a boolean in the XPath fashion;
				<!--
				can be used also to check whether given file is valid XML:
				<code>- -streamlet xpath - -option attribute . - -option mode boolean - -as valid_xml</code>
				-->
			</li>
			<li>
				<code>raw-xml</code>: a portion of the original XML document;
				this is a way to put multiple values or any structured data into a single attribute;
				if the XPath points to multiple nodes, they can still be returned as a valid XML document using a configurable wrapper node (so we can return e.g. all headlines from a document)
			</li>
			<li>
				<code>line-number</code>: the number of the line where the given node was found;
				this can be used for referencing a particular place in the document</li>
			<li>
				<code>xpath</code>: an XPath pointing to the particular node;
				it may be a different XPath expression than the original one (which might point to a set of nodes);
				this can also be used for referencing a particular place in the document</li>
		</ul>
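		<p>
			To illustrate what "in the XPath fashion" means for the <code>boolean</code> mode, here is a minimal Python sketch of the conversion rules of the XPath 1.0 <code>boolean()</code> function.
			It is not the code used by the tools: the function name <code>xpath_boolean</code> is hypothetical and a plain Python list stands in for a node-set.
		</p>
		<pre><![CDATA[
```python
import math


def xpath_boolean(value):
    """Convert a value to a boolean following the XPath 1.0 boolean() rules."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        # A number is true unless it is zero or NaN.
        return not (value == 0 or math.isnan(value))
    if isinstance(value, str):
        # A string is true if and only if it is non-empty.
        return len(value) > 0
    # Assumption of this sketch: any other value is a list standing in
    # for a node-set, which is true if it contains at least one node.
    return len(value) > 0


if __name__ == "__main__":
    for sample in ([], ["node"], "", "false", 0.0, float("nan"), -1):
        print(repr(sample), xpath_boolean(sample))
```
]]></pre>
		<p>
			Note that the string <code>false</code> converts to true, because only the empty string is false.
		</p>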

		<p>
			Both tools share the naming convention and are configured in a similar way, e.g. using <code>relpipe-in-xmltable --mode raw-xml</code> or <code>--streamlet xpath --option mode raw-xml</code>.
		</p>
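		<p>
			The idea behind the wrapper node in the <code>raw-xml</code> mode can be sketched in Python with the standard <code>xml.etree.ElementTree</code> module, which supports only a limited subset of XPath.
			This is an illustration under stated assumptions, not the implementation used by the tools: the function <code>wrap_matches</code>, the wrapper name <code>headlines</code> and the sample document are all hypothetical.
		</p>
		<pre><![CDATA[
```python
import copy
import xml.etree.ElementTree as ET


def wrap_matches(document, path, wrapper_name="wrapper"):
    """Return all nodes matched by path as one well-formed XML document."""
    root = ET.fromstring(document)
    wrapper = ET.Element(wrapper_name)
    for node in root.findall(path):
        clone = copy.deepcopy(node)
        clone.tail = None  # drop text that followed the node in the source
        wrapper.append(clone)
    return ET.tostring(wrapper, encoding="unicode")


# Hypothetical sample input: two headlines and one paragraph.
doc = "<html><body><h1>First</h1><p>text</p><h1>Second</h1></body></html>"
result = wrap_matches(doc, ".//h1", "headlines")

if __name__ == "__main__":
    print(result)
```
]]></pre>
		<p>
			Without the wrapper, two sibling <code>h1</code> elements would not form a single XML document; with it, the whole result fits into one attribute and can be parsed again later.
		</p>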

		<h2>Feature overview</h2>

		<h3>Data types</h3>
		<ul>
			<li m:since="v0.8">boolean</li>
			<li m:since="v0.15">variable-length signed integer (SLEB128)</li>
			<li m:since="v0.8">string in UTF-8</li>
		</ul>
		<h3>Inputs</h3>
		<ul>
			<li m:since="v0.11">Recfile</li>
			<li m:since="v0.9">XML</li>
			<li m:since="v0.13">XMLTable</li>
			<li m:since="v0.9">CSV</li>
			<li m:since="v0.9">file system</li>
			<li m:since="v0.8">CLI</li>
			<li m:since="v0.8">fstab</li>
			<li m:since="v0.14">SQL script</li>
		</ul>
		<h3>Transformations</h3>
		<ul>
			<li m:since="v0.13">sql: filtering and transformations using the SQL language</li>
			<li m:since="v0.12">awk: filtering and transformations using the classic AWK tool and language</li>
			<li m:since="v0.10">guile: filtering and transformations defined in the Scheme language using GNU Guile</li>
			<li m:since="v0.8">grep: regular expression filter, removes unwanted records from the relation</li>
			<li m:since="v0.8">cut: regular expression attribute cutter (removes or duplicates attributes and can also DROP whole relation)</li>
			<li m:since="v0.8">sed: regular expression replacer</li>
			<li m:since="v0.8">validator: just a pass-through filter that crashes on invalid data</li>
			<li m:since="v0.8">python: highly experimental</li>
		</ul>
		<h3>Streamlets</h3>
		<ul>
			<li m:since="v0.15">xpath (example, unstable)</li>
			<li m:since="v0.15">hash (example, unstable)</li>
			<li m:since="v0.15">jar_info (example, unstable)</li>
			<li m:since="v0.15">mime_type (example, unstable)</li>
			<li m:since="v0.15">exiftool (example, unstable)</li>
			<li m:since="v0.15">pid (example, unstable)</li>
			<li m:since="v0.15">cloc (example, unstable)</li>
			<li m:since="v0.15">exiv2 (example, unstable)</li>
			<li m:since="v0.15">inode (example, unstable)</li>
			<li m:since="v0.15">lines_count (example, unstable)</li>
			<li m:since="v0.15">pdftotext (example, unstable)</li>
			<li m:since="v0.15">pdfinfo (example, unstable)</li>
			<li m:since="v0.15">tesseract (example, unstable)</li>
		</ul>
		<h3>Outputs</h3>
		<ul>
			<li m:since="v0.11">ASN.1 BER</li>
			<li m:since="v0.11">Recfile</li>
			<li m:since="v0.9">CSV</li>
			<li m:since="v0.8">tabular</li>
			<li m:since="v0.8">XML</li>
			<li m:since="v0.8">nullbyte</li>
			<li m:since="v0.8">GUI in Qt</li>
			<li m:since="v0.8">ODS (LibreOffice)</li>
		</ul>

		<h2>New examples</h2>
		<ul>
			<li><m:a href="examples-parallel-hashes">Computing hashes in parallel</m:a></li>
			<li><m:a href="examples-runnable-jars">Finding runnable JARs</m:a></li>
			<li><m:a href="examples-xhtml-filesystem-xpath">Collecting statistics from XHTML pages</m:a></li>
		</ul>

		<h2>Backward incompatible changes</h2>

		<p>
			The data format has changed: SLEB128 is now used for encoding numbers.
			If the data format was used only on the fly, no additional steps are required during the upgrade.
			If the data format was used for persistence (streams redirected to files), the recommended upgrade procedure is:
			convert the files to XML using the old version of <code>relpipe-out-xml</code> and then convert them back from XML using the new version of <code>relpipe-in-xml</code>.
		</p>
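		<p>
			For readers unfamiliar with SLEB128 (signed LEB128), the following Python sketch shows how such a variable-length integer is encoded and decoded.
			It illustrates the generic encoding only and says nothing else about the layout of the data format; the function names <code>sleb128_encode</code> and <code>sleb128_decode</code> are hypothetical.
		</p>
		<pre><![CDATA[
```python
def sleb128_encode(value):
    """Encode a Python integer as signed LEB128 bytes."""
    out = bytearray()
    while True:
        byte = value & 0x7F      # lowest 7 bits of the two's complement form
        value >>= 7              # arithmetic shift, the sign is preserved
        sign_bit_set = bool(byte & 0x40)
        # Stop once the remaining value is pure sign extension of this byte.
        if (value == 0 and not sign_bit_set) or (value == -1 and sign_bit_set):
            out.append(byte)
            return bytes(out)
        out.append(byte | 0x80)  # the high bit marks "more bytes follow"


def sleb128_decode(data):
    """Decode one signed LEB128 integer; return (value, bytes consumed)."""
    result = 0
    shift = 0
    for index, byte in enumerate(data):
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            if byte & 0x40:
                result -= 1 << shift  # sign-extend a negative number
            return result, index + 1
    raise ValueError("truncated SLEB128 value")


if __name__ == "__main__":
    for number in (0, -1, 63, 64, -64, -65, -123456):
        encoded = sleb128_encode(number)
        print(number, encoded.hex(), sleb128_decode(encoded))
```
]]></pre>
		<p>
			Small numbers take a single byte (e.g. -1 is encoded as <code>7f</code>) and larger ones grow by one byte per seven bits (e.g. -123456 is encoded as <code>c0 bb 78</code>).
		</p>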

		<h2>Installation</h2>

		<p>
			Installation was tested on Debian GNU/Linux 10.2.
			The process should be similar on other distributions.
		</p>

		<m:pre src="examples/release-v0.15.sh" jazyk="bash" odkaz="ano"/>

		<p>
			<m:name/> are modular, so you can download and install only the parts you need (the libraries are always needed).
			The tools <code>out-gui.qt</code> and <code>tr-python</code> require additional libraries and are not built by default.
		</p>

	</text>

</stránka>