relpipe-data/principles.xml
author František Kučera <franta-hg@frantovo.cz>
Mon, 21 Feb 2022 01:21:22 +0100
branch v_0
changeset 330 70e7eb578cfa
parent 329 5bc2bb8b7946
permissions -rw-r--r--
Added tag relpipe-v0.18 for changeset 5bc2bb8b7946
<stránka
	xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
	xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">

	<nadpis>Principles</nadpis>
	<perex>Basic ideas, principles and rules behind the Relational pipes</perex>
	<pořadí>12</pořadí>

	<text xmlns="http://www.w3.org/1999/xhtml">

		<h2>Sane software</h2>
		<p>
			<m:name/> (both the specification and the reference implementation) should be developed according to the <a href="https://sane-software.globalcode.info/">Sane software manifesto</a> (draft).
			Many of the principles mentioned below are part of <em>being sane</em>.
		</p>

		<h2>Free software and open specification</h2>

		<p>
			<m:name/> are and always will be <a href="https://www.gnu.org/philosophy/free-sw.html">free software</a>, and the specification of the format, tools and libraries will be open.
			It must not be impaired by software patents or other similar restrictions.
			In our country, we do not accept the existence of patents at all.
		</p>

		<h2>Divide and conquer</h2>
		<p>
			Each program should do one thing and do it well. We should separate these three tasks:
		</p>

		<ul>
			<li>data acquisition / creation</li>
			<li>data transformation</li>
			<li>data presentation</li>
		</ul>

		<p>
			A single program should not combine two or more of these tasks, or it should at least offer a mode that performs only one of them.
			Thus we should be able to combine various programs and get various presentations of the same data, regardless of the presentation features of the program that created the data.
			We should be able to add another transformation on the path between the data origin and the data destination. For example, we could filter out some unwanted data, or modify or enhance the values.
			We should also be able to generate mock/testing data and pass it through the original pipeline (the sequence of transformations and the output filter) instead of the live data.
			We should be free in how we combine the tools together.
			We should even be able to build pipelines that were not anticipated by the authors of the particular tools we used.
		</p>
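
		<p>
			The separation can be sketched with plain shell functions.
			This is only an illustration: the function names (<code>acquire</code>, <code>acquire_mock</code>, <code>transform</code>, <code>present</code>) and the tab-separated intermediate text are assumptions made for this example and are not part of <m:name/>.
		</p>

		<m:pre jazyk="bash"><![CDATA[
# Hypothetical three-stage pipeline; names and the tab-separated layout are illustrative only.

# Acquisition: emits raw records (name <TAB> size) without any formatting.
acquire() {
	printf '%s\t%s\n' "a.txt" 10 "b.log" 2500 "c.txt" 300
}

# Mock acquisition: same record layout, used instead of the live data.
acquire_mock() {
	printf '%s\t%s\n' "mock.txt" 1 "mock.log" 2
}

# Transformation: keeps only the records whose name ends with .txt.
transform() {
	awk -F '\t' '$1 ~ /\.txt$/'
}

# Presentation: the only stage that knows about the output layout.
present() {
	awk -F '\t' '{ printf "%s: %s bytes\n", $1, $2 }'
}

acquire      | transform | present
acquire_mock | transform | present
]]></m:pre>

		<p>
			Swapping <code>acquire</code> for <code>acquire_mock</code> changes the data source only; the transformation and the presentation stay untouched.
		</p>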

		<p>
			Authors should focus on their task only – e.g. <em>interaction with the kernel and capturing the inotify events</em> – and should not have to care about the presentation of the captured data.
			There might be many output formats that make sense (CSV, XML, table, YAML, \0-separated values etc.),
			but we should keep it <abbr title="Don't repeat yourself">DRY</abbr> and not implement every format in every tool.
			It would be a waste of time and also a source of errors, because when we develop an additional format (which is not our core business) only <em>as a side task</em>, we will probably get it wrong.
		</p>


		<h2>Inputs, outputs and transformations as reusable libraries</h2>

		<p>
			Parts of the <m:name/> implementation might be used as a library instead of as a filter in a pipeline.
			This is not the primary purpose of our software, but sometimes it might be useful.
			In such a scenario, the data are never serialized in the <m:name/> format but flow through a single process and its method/function calls.
			For instance, if we need a tabular or CSV output in our program, we could adopt the code from the <m:name/> implementation as a library and call it internally without generating data in the <m:name/> format.
			This might bring some performance benefits.
		</p>

		<p>
			This is not a recommended approach, but it should be possible.
		</p>

		<p>
			However, in any case, we should also provide an option to produce <em>raw</em> data in the <m:name/> format and allow others to convert it to any other format according to their needs.
		</p>

		<h2>Specification-first, contract-first</h2>

		<p>
			The starting point for any developer should be the <m:a href="specification">specification</m:a> that defines the contract and the interface between the system components.
			It should cover the data format and also the tools (inputs, transformers and outputs).
			The specification must be verified by creating a reference implementation in at least one programming language.
		</p>

		<h2>Small code footprint and modular design</h2>

		<p>
			The length of the program measured in source lines of code (SLOC) should be as small as possible.
			Of course, the goal is not to put multiple statements on a single line.
			We should avoid unnecessary complexity (see <a href="https://en.wikipedia.org/wiki/Cyclomatic_complexity">Cyclomatic complexity</a> – but SLOC are easier to count and also give quite relevant information).
		</p>

		<p>
			Modular design allows users to include (download, compile, run) only the portions of software they need.
			If a user needs e.g. only regular expressions and XML output, they should not be forced to also include the code for CSV, YAML, JSON and PDF.
		</p>

		<p>
			Sane software is minimalistic in this way, which means that it is easy to audit, debug or modify.
			Looking for a bug (or even a backdoor) or looking for the place where a new feature should be added
			is much easier in software that has hundreds or thousands of SLOC than in software consisting of hundreds of thousands or even millions of SLOC.
		</p>

		<p>
			A developer who wants to generate (or, on the other side, consume) relational data should need to include only a few hundred SLOC.
			This is an amount of code that can be read through in an hour or two.
			<!--
			Thus implementing the relational output to an existing program should be matter of few hours.
			-->
		</p>

		<h2 id="optional_complexity">Optional complexity</h2>

		<p>
			We are not scared of things like XML, SQL, RDF, Java or even C++ and we do not hate them.
			There are use cases where their complexity is reasonable and makes sense.
			But on the other hand, there are many scenarios where such complexity is not necessary or is even harmful.
			This leads us to the conclusion: <em>the complexity must be optional</em>.
		</p>
		<p>
			Thus the <m:name/> data format is independent of the above-mentioned <em>complex</em> technologies
			and our implementation is divided into many separate modules (tools).
			So users can download, compile and run only the parts they really need.
		</p>
		<p>
			<m:name/> can serve as a <em>bridge</em> between the <em>complex world</em> and the <em>simple world</em>.
		</p>


		<h2>Sane dependencies</h2>

		<p>
			The libraries and the tools should not depend on any libraries other than the standard library of the given programming language.
			In the best case, of course.
			This might conflict with the small code footprint rule above, and then the question is which is the lesser harm.
			It definitely makes no sense to write e.g. an XML or YAML parser ourselves as a part of our tool.
			Using a high-quality and well-tested library is the only sane option.
			But what about XML output? We can develop a reliable XML generator in a few lines of code because we can implement only the subset of the standard that we need.
			Writing such code is much more sane than including some bulky library that has several orders of magnitude more lines of code than our program.
		</p>
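
		<p>
			A minimal sketch of such a generator, assuming that we only need elements with text content (no attributes, namespaces or comments) and that the element names come from a trusted source.
			The function names <code>xml_escape</code> and <code>xml_element</code> are made up for this example.
		</p>

		<m:pre jazyk="bash"><![CDATA[
# Escapes the characters that must not appear literally in XML text content.
# The ampersand is handled first so that the entities produced later stay intact.
xml_escape() {
	sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Writes one element; $1 is a trusted element name, $2 is arbitrary text.
xml_element() {
	printf '<%s>%s</%s>\n' "$1" "$(printf '%s' "$2" | xml_escape)" "$1"
}

xml_element name 'Tom & Jerry <cartoon>'
# prints: <name>Tom &amp; Jerry &lt;cartoon&gt;</name>
]]></m:pre>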

		<h2>Concise data serialization</h2>

		<p>
			The <m:name/> data format should be concise – the data should be represented by a reasonably small number of bytes.
			The format should support large amounts of small values and also sparse data (structures with many NULL/missing values) without wasting too much space.
			Data that are never written do not need to be compressed, and thus have the best compression ratio.
		</p>

		<h2>Streaming</h2>

		<p>
			Relational tools should process streams of data and should hold only the necessary data in memory,
			i.e. a tool should produce output (the first record) as soon as possible, while it is still reading the input (the following records).
			Thus the memory usage does not depend on the volume of processed data.
		</p>
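
		<p>
			The behaviour is easy to observe with line-oriented text as a stand-in for relational records (an assumption of this sketch; the function name <code>stream_filter</code> is also hypothetical).
			The filter handles one record at a time, so its memory usage stays constant and each result appears before the next record is read.
		</p>

		<m:pre jazyk="bash"><![CDATA[
# Streaming filter: reads one record, writes one result, keeps nothing else in memory.
stream_filter() {
	while IFS= read -r record; do
		printf 'seen: %s\n' "$record"
	done
}

# A slow producer emitting one record per second. The first result shows up
# after the first record, not after the whole input has been read.
(for i in one two three; do echo "$i"; sleep 1; done) | stream_filter
]]></m:pre>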

		<p>
			However, there are cases where such streaming is not feasible, e.g. if we need to compute some statistics or the column widths while printing a table in the terminal.
			In such a situation, we must read the whole relation and only then generate the output.
			But we should still be able to do streaming at the level of relations, i.e. if there are several relations, we hold only one of them in memory at a time.
		</p>
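
		<p>
			A sketch of this relation-level streaming, assuming a simplified text stand-in for the real format: records are tab-separated lines and relations are separated by an empty line.
			The function name <code>print_tables</code> is hypothetical.
			Only the current relation is buffered; it is printed and dropped from memory as soon as the relation ends.
		</p>

		<m:pre jazyk="bash"><![CDATA[
print_tables() {
	awk -F '\t' '
		# Prints the buffered relation with padded columns and empties the buffer.
		function flush_relation(   r, c) {
			for (r = 1; r <= rows; r++) {
				for (c = 1; c <= cols; c++) {
					if (c < cols) printf("%-" width[c] "s | ", cell[r, c])
					else print cell[r, c]
				}
			}
			split("", cell); split("", width); rows = 0; cols = 0
		}
		# An empty line closes the current relation.
		/^$/ { if (rows) flush_relation(); print ""; next }
		# Any other line is a record: buffer it and update the column widths.
		{
			rows++
			if (NF > cols) cols = NF
			for (c = 1; c <= NF; c++) {
				cell[rows, c] = $c
				if (length($c) > width[c]) width[c] = length($c)
			}
		}
		END { if (rows) flush_relation() }
	'
}

# Two relations; the second one is read only after the first one was printed.
printf 'id\tname\n1\talpha\n22\tb\n\nx\nyy\n' | print_tables
]]></m:pre>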

		<p>
			This rule is important not only from the performance point of view but also for user experience.
			The user should see the output as soon as possible, i.e. long-running processes should produce results continuously instead of flushing everything at the end.
			This is also good for debugging and <em>looking inside the things</em>.
		</p>

		<h2>Unambiguity</h2>

		<p>
			There should be only one way to represent a single value.
			For example, a boolean can be written only as <code>00</code> (false) or <code>01</code> (true), and every other value (<code>02..FF</code>) should be invalid/unsupported.
			Exceptions might occur if there are relevant reasons, but they should be rare.
		</p>
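
		<p>
			The boolean rule can be sketched as a strict decoder.
			The function name <code>decode_boolean</code> and the use of a two-digit hexadecimal string as its input are assumptions of this example; it does not describe the actual parser.
		</p>

		<m:pre jazyk="bash"><![CDATA[
# Accepts exactly one representation per value: 00 is false, 01 is true.
# Everything else (02..FF, "1", "true", an empty string) is rejected.
decode_boolean() {
	case "$1" in
		00) echo false ;;
		01) echo true ;;
		*)  echo "invalid boolean: $1" >&2; return 1 ;;
	esac
}

decode_boolean 01                      # prints: true
decode_boolean FF || echo "rejected"   # error on STDERR, then prints: rejected
]]></m:pre>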


		<h2>Multiple files concatenation</h2>

		<p>
			It should be possible to concatenate multiple files or streams of relational data as easily as we can concatenate multiple text files
			(given that such text files have the same character encoding, have no BOM at the beginning and have a newline at the end).
			If we can do:
		</p>

		<m:pre jazyk="bash"><![CDATA[
(cat file-1.txt; echo "some additional middle data"; cat file-2.txt) | wc -l
]]></m:pre>

		<p>
			We should also be able to do:
		</p>

		<m:pre jazyk="bash"><![CDATA[
(cat file-1.rp; relpipe-in-fstab; cat file-2.rp) | relpipe-out-xml
]]></m:pre>

		<p>
			Also, it should be possible to append (<code>&gt;&gt;</code>) new records to the last relation without modifying the already written data.
		</p>

		<h2>Work primarily with STDIO</h2>

		<p>
			The tools should work primarily and by default with the standard input and standard output (STDIN and STDOUT).
			Reading/writing from/to files or network should be (if present) a secondary and optional scenario.
		</p>
d51787006954 principles
František Kučera <franta-hg@frantovo.cz>
parents: 147
diff changeset
   203
		
d51787006954 principles
František Kučera <franta-hg@frantovo.cz>
parents: 147
diff changeset
   204
		<p>
d51787006954 principles
František Kučera <franta-hg@frantovo.cz>
parents: 147
diff changeset
   205
			Standard error output (STDERR) should be used for errors/warnings/logs. By default, it should not produce any output, if everything goes well.
d51787006954 principles
František Kučera <franta-hg@frantovo.cz>
parents: 147
diff changeset
   206
		</p>
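		
		<p>
			A minimal sketch of this convention in Bash. The function name <code>to_upper</code> and the <code>VERBOSE</code> variable are hypothetical, chosen only for illustration, and are not part of any existing tool:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[
# A hypothetical filter: data flows from STDIN to STDOUT,
# diagnostics go to STDERR and only when explicitly requested.
to_upper() {
	if [ "${VERBOSE:-0}" = "1" ]; then
		echo "to_upper: reading from STDIN" >&2
	fi
	tr '[:lower:]' '[:upper:]'
}

# STDOUT carries only data, so the filter can be chained with other tools:
printf 'a\nb\n' | to_upper | wc -l]]></m:pre>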
		
		<h2>Tools might be TTY-aware</h2>
		
		<p>
			The input and output tools processing relational data might adapt their behaviour depending on whether their input or output, respectively, is a terminal (TTY).
		</p>
		<p>
			If the output is a TTY, it means that the output is displayed to the user,
			so the tool might e.g. colorize its output or do some other human-friendly formatting,
			which makes no sense if the output is redirected to a file or piped to another program.
			Example:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[
# This would print a table with fancy colors using ANSI sequences:
relpipe-in-fstab | relpipe-out-tabular

# This would store the same table in a file but without any colors:
relpipe-in-fstab | relpipe-out-tabular > table.txt]]></m:pre>
		
		<p>
			If the input is a TTY, it means that the user is typing the values.
			In such a situation, the tool might accept another input format (textual, human-friendly) or read from some default file location instead.
			Example:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[
# This would read the /etc/fstab (which is the default location):
relpipe-in-fstab | relpipe-out-tabular

# These would read the /etc/mtab instead:
cat /etc/mtab | relpipe-in-fstab | relpipe-out-tabular
relpipe-in-fstab < /etc/mtab | relpipe-out-tabular]]></m:pre>

		<p>
			However, the behaviour should be modified only in a visual and predictable manner.
			It should not e.g. switch from XML to YAML.
		</p>
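		
		<p>
			How a particular tool detects a terminal is not prescribed here. As an illustrative sketch, a shell script can use the POSIX <code>test -t</code> check on a file descriptor. The function name <code>describe_stdio</code> is hypothetical:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[
# Reports whether STDIN (descriptor 0) and STDOUT (descriptor 1) are terminals.
# The report goes to STDERR, so it stays visible even if STDOUT is redirected.
describe_stdio() {
	if [ -t 0 ]; then echo "input: TTY" >&2; else echo "input: not a TTY" >&2; fi
	if [ -t 1 ]; then echo "output: TTY" >&2; else echo "output: not a TTY" >&2; fi
}

# Both ends are redirected here, so neither of them is reported as a TTY:
describe_stdio < /dev/null > /dev/null]]></m:pre>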
		
		<h2>Use --long-options</h2>
		
		<p>
			Tools should accept arguments (if any) as <code>--long-options</code>.
			When looking at a script, it should be clear at first sight what it does,
			which would not be the case if some cryptic short options like <code>-a -x -Z</code> were used.
			To save our keyboards, there are features like <em>Bash completion</em>.
		</p>
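		
		<p>
			For illustration, a script can parse such options with a plain <code>while</code>/<code>case</code> loop. This is only a sketch: the function <code>parse_options</code> and the options <code>--relation</code> and <code>--write-header</code> are hypothetical and do not describe the interface of any particular tool:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[
# Parses hypothetical long options and prints the resulting settings.
parse_options() {
	local relation="" write_header="false"
	while [ "$#" -gt 0 ]; do
		case "$1" in
			--relation)
				# This option requires a value:
				[ "$#" -ge 2 ] || { echo "Missing value for $1" >&2; return 1; }
				relation="$2"; shift 2;;
			--write-header)
				# This option is a simple flag without a value:
				write_header="true"; shift;;
			*)
				# Unknown options are errors, not something to guess:
				echo "Unknown option: $1" >&2; return 1;;
		esac
	done
	echo "relation=$relation write_header=$write_header"
}

parse_options --relation fstab --write-header]]></m:pre>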
		
		<h2>Be exact and reliable</h2>
		
		<p>
			<m:name/> should convey data without corrupting or arbitrarily modifying them.
			Implementation details (e.g. how values are encoded in the stream) should affect neither the transferred data nor the user.
		</p>
		
		<h2>Fail-fast, be strict</h2>
		
		<p>
			Because the relational data will be created by machines instead of being manually typed by erring humans,
			we should fail fast on an error. We should be strict and accept valid inputs only.
			Any error should be revealed as soon as possible and fixed.
		</p>
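		
		<p>
			The same idea applies to the scripts that connect the tools. By default, a shell reports only the exit status of the last command in a pipeline, so a failure of an earlier stage can go unnoticed. The following sketch uses the standard Bash option <code>pipefail</code>; the function names <code>lenient</code> and <code>strict</code> are hypothetical, and <code>false | cat</code> merely stands for a pipeline whose first stage fails:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[
# Default behaviour: only the last stage of the pipeline determines the status.
lenient() ( set +o pipefail; false | cat )

# Strict behaviour: the pipeline fails if any of its stages fails.
strict() ( set -o pipefail; false | cat )

lenient && echo "lenient: the failure went unnoticed"
strict  || echo "strict: the failure was detected" >&2]]></m:pre>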
		
		<p>
			There might be tools or options for recovering corrupted data (caused e.g. by a failing HDD, a faulty network or buggy software).
			But the recovery mode is not the default one.
		</p>
		
		<p>
			We demand reliable systems, not random and accidental behaviour caused by software guessing <em>What might these bytes probably mean?</em>
		</p>
		
	</text>
</stránka>