author | František Kučera <franta-hg@frantovo.cz> |
Fri, 25 Oct 2019 13:00:58 +0200 | |
branch | v_0 |
changeset 275 | 1cdb74e845d0 |
parent 226 | fc68cd31db78 |
child 311 | f677eba0c86c |
permissions | -rw-r--r-- |
<stránka
	xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
	xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">

	<nadpis>FAQ</nadpis>
	<perex>Frequently asked questions</perex>
	<pořadí>16</pořadí>

	<text xmlns="http://www.w3.org/1999/xhtml">

		<p>
			<strong>When will the stable version be released?</strong>
			<br/>
			We don't know; there is no exact date.
			<m:name/> are something that should have been released about twenty years ago, but real work only started in 2018.
			So it does not make a big difference whether the release happens this month or the next one.
			We understand the <em>release early, release often</em> rule,
			but it fits application software better than standards and APIs.
			Of course, we expect some evolution after the v1.0.0 release, but we need to stabilize and verify many things before the release in order to be able to maintain backward compatibility in the future.
		</p>

		<p>
			<strong>When did the project start?</strong>
			<br/>
			The first commit was in July 2018. Before that, there was a nonpublic prototype which started in April 2018.
			But the original ideas are much older.
			Predecessors of <m:name/> were <a href="https://sql-api.globalcode.info/">SQL-API</a> (September 2014) and <a href="https://alt2xml.globalcode.info/">alt2xml</a> (January 2012).
			SQL-API was a prototype of an API for operating systems: it allowed SELECTing users, processes, fstab etc.
			This prototype was based on PostgreSQL and Perl and is being replaced by particular modules of <m:name/>.
			alt2xml uses a different data model (a tree instead of relations) but despite that, it is based on the same ideas as <m:name/>: converting data from various formats to a uniform model, streaming, and processing through reusable transformations (filters).
			This tool might be developed further in the future and can be used together with <m:name/> (e.g. read INI, JSON or Java properties files using alt2xml, pass them to <code>relpipe-in-xmltable</code> and continue the processing in the relational way).
		</p>

		<p>
			<strong>How can I help you?</strong>
			<br/>
			<ul>
				<li>Suggest more examples of how <m:name/> can be used, especially how YOU would like to use it.</li>
				<li>We are looking for illustrations that would supplement our documentation and website.</li>
				<li>
					As an author of a program that generates or consumes some data, you could add relational input and output to your program.
					But please note that we do not have v1.0 yet, so these features should be marked as experimental.
					The API might/will change.
					Another (and maybe better for now) option is to add input/output of values separated by a null byte (<code>\0</code>).
					This "API" will be supported for sure and the data are simply the attribute values. There are no record separators (we know the number of attributes, so they are not needed).
					The disadvantage of this approach is that the stream can contain only a single relation, and that the metadata are not embedded in the stream and must be passed separately.
				</li>
				<li>Review our source code and suggest improvements and fixes. Constructive criticism is always welcome. This is one of the reasons why we publish our programs as free software.</li>
				<li>Native speakers could suggest improvements and corrections to our English texts.</li>
			</ul>
		</p>

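		<p>
			The null-byte option can be sketched as follows. This is only a minimal illustration in Python, not a part of the <m:name/> tools:
			the function names <code>write_records</code> and <code>read_records</code> and the sample data are hypothetical,
			and we assume that every value is terminated by <code>\0</code> and that the reader learns the number of attributes through some separate channel.
		</p>

```python
import io


def write_records(stream, records):
    """Write every attribute value followed by a NUL byte; no record separators."""
    for record in records:
        for value in record:
            stream.write(value.encode("utf-8") + b"\0")


def read_records(stream, attribute_count):
    """Split the NUL-terminated values and regroup them into records."""
    # Assumption: the stream ends with a NUL, so the last piece is empty and is dropped.
    values = stream.read().split(b"\0")[:-1]
    if len(values) % attribute_count != 0:
        raise ValueError("incomplete record at the end of the stream")
    return [
        tuple(v.decode("utf-8") for v in values[i:i + attribute_count])
        for i in range(0, len(values), attribute_count)
    ]


# Hypothetical sample data: two records with the attributes (device, type).
buffer = io.BytesIO()
write_records(buffer, [("/dev/sda1", "ext4"), ("/dev/sdb1", "xfs")])
buffer.seek(0)
records = read_records(buffer, attribute_count=2)
```

		<p>
			The reader needs only the attribute count to rebuild the records; attribute names and types would have to be agreed on outside the stream.
		</p>
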
		<p>
			<strong>Why do you speak about <em>relations</em> instead of <em>tables</em>?</strong>
			<br/>
			The terminology might be unfamiliar to some, but <em>relations</em> and <em>attributes</em> signal
			that we focus on the substance of the data. Pure data are conveyed through the pipelines
			and the presentation of such data is only the last step.
			The data might be presented/visualized in many different forms,
			and tables (consisting of rows and columns) are only one of many possible options.
		</p>

		<m:tabulka>
			Relational	SQL	alternative terms
			relation	table
			attribute	column	field
			record	row	tuple
		</m:tabulka>

		<p>
			<strong>What about duplicate records?</strong>
			<br/>
			In the relational model, the records must be unique.
			In <m:name/> there is no central authority that would prevent you from appending duplicate records to the relational stream.
			This means that at some points in the relational pipeline, data may occur that do not fit the rules of the relational model.
			Deduplication is generally not done on the output side of particular steps, but is postponed and done on the input side of those steps where uniqueness is important (e.g. JOIN or UNION).
			You should not put duplicate records in the relational stream, but you can.
			Duplicates can also occur after some transformations like <code>relpipe-tr-cut</code> (e.g. if you choose only the <code>dump</code> or <code>type</code> attributes from your <code>fstab</code> and omit the primary/unique key field).
			Such data are not considered invalid, but should be processed as if there were no duplicates (if uniqueness is important for the particular step)
			or should be passed through if that does not conflict with the goal of the given step (e.g. calling an <code>uppercase()</code> function on some field or doing UNION ALL).
			Each tool must document how it handles duplicate records.
		</p>

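		<p>
			The difference between a step that needs uniqueness and a step that does not can be sketched in Python.
			This is only an illustration under our own assumptions: <code>unique</code> and <code>uppercase_first</code> are hypothetical helper names,
			records are modelled as plain tuples, and the code does not show how any particular <m:name/> tool is implemented.
		</p>

```python
def unique(records):
    """Input side of a step that needs uniqueness: yield each record once."""
    seen = set()
    for record in records:
        if record not in seen:
            seen.add(record)
            yield record


def uppercase_first(records):
    """A step that does not depend on uniqueness: duplicates pass through."""
    for record in records:
        yield (record[0].upper(),) + record[1:]


# Hypothetical stream containing one duplicate record.
stream = [("ext4",), ("xfs",), ("ext4",)]

passed = list(uppercase_first(stream))
deduplicated = list(unique(stream))
```

		<p>
			Note that <code>unique</code> has to remember every record it has already seen, so only the steps that really need uniqueness pay this price.
		</p>
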
		<p>
			There are two reasons for this <em>transient tolerance of duplicate records</em>.
			1) Performance: guaranteeing uniqueness at every moment would negate streaming and would require holding the whole relation in memory and always sorting the records.
			2) Modularity: many tasks would have to be done by a single bulky tool that does everything, e.g. if you want to cut only the <code>type</code> field from your <code>fstab</code> and then compute statistics on how many times particular filesystems are used.
		</p>

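		<p>
			The modularity argument can be illustrated by two small, independent steps. The Python sketch below is only a stand-in:
			the <code>cut</code> and <code>count</code> functions and the sample <code>fstab</code> records are hypothetical and do not reproduce the behaviour of <code>relpipe-tr-cut</code> or any other tool.
			The point is that the counting step works only because the cutting step was allowed to emit duplicates.
		</p>

```python
from collections import Counter

# Hypothetical fstab relation with the attributes (device, mount_point, type).
fstab = [
    ("/dev/sda1", "/", "ext4"),
    ("/dev/sda2", "/home", "ext4"),
    ("/dev/sdb1", "/data", "xfs"),
]


def cut(records, indexes):
    """Keep only the selected attributes; the output may contain duplicates."""
    for record in records:
        yield tuple(record[i] for i in indexes)


def count(records):
    """Count how many times each record occurs in the stream."""
    return Counter(records)


# Cut only the type attribute (index 2), then count the occurrences.
counts = count(cut(fstab, [2]))
```
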
		<!--
		<p>
			<strong>?</strong>
			<br/>
			...
		</p>

		<p>
			<strong>Why not build on XML? It has been a standard since 1998 and there are many tools and libraries for it.</strong>
			<br/>
			XML is a great and mature (meta)format and its ecosystem is respectable and inspiring.
			But XML does not conform to our <m:a href="principles">principles</m:a>, especially the ability to concatenate multiple files/streams and to append new records to an already existing relation.
			XML is also not concise.
			And the implementation of an XML parser in various environments would be <em>a bit more complex</em>.
		</p>
		<p>
			We prefer XML as an input and output format and look forward to cooperation with the XML ecosystem (XSD, XPath, XSLT, XQuery etc.).
			Such steps might be at the beginning, at the end, or even in the middle of the relational pipeline.
		</p>

		<p>
			<strong>?</strong>
			<br/>
			...
		</p>
		-->

		<p>
			<strong>Why C++?</strong>
			<br/>
			Firstly, <m:name/> are a specification of a data format and as such are not bound to any programming language.
			This specification is totally language- and platform-independent.
		</p>
		<p>
			The ideal/perfect language does not exist and our implementations will be written in various languages.
			We started our prototype and first real implementations in C++ for several reasons:
		</p>
		<ul>
			<li>It is mature and widespread: GCC runs almost everywhere and other compilers/toolchains are also available.</li>
			<li>Programs written in C++ start immediately, which is very important for CLI tools.</li>
			<li>It can be seamlessly mixed with C and its libraries, and it is good for interaction with the operating system.</li>
			<li>Modern C++ is quite a good language.</li>
			<li>We are not C++ gurus and C++ is not our first-choice language, i.e. the fact that we are able to do the implementation in C++ proves that the specification is simple enough to be reasonably implemented by an average software engineer in any other language :-)</li>
		</ul>

		<p>Implementations in other languages will follow. Java is the next one, then probably Perl, Python, Rust, Go, PHP etc. (depending on community involvement).</p>

		<p>
			<strong>Have you seen <a href="https://xkcd.com/927/">XKCD 927</a>?</strong>
			<br/>
			Yes. And we liked it so much that we followed their instructions and created <m:name/>.
		</p>

		<p>
			<strong>Are <m:name/> compatible with cloud, IoT, SPA/PWA, AI, blockchain and mobile-first? Should our DevOps use it in our serverless hipster fintech app with strong focus on SEO, UX and machine learning?</strong>
			<br/>
			Go @#$%&amp; yourself. We are pretty old school hackers and we enjoy our green screen terminals!<br/>
			Of course, you can use <m:name/> anywhere if it makes sense for you.
			<m:name/> are designed to be generic enough, i.e. not specific to any industry (banking, telecommunications, embedded etc.) or platform.
			Data in this format are very concise, so they can be used even on very small devices.
			The native data structure is a relation (table) but the format can also handle tree-structured data (i.e. any data).
			It is designed for streaming rather than for storage (but under some circumstances it is also meaningful to use it for storage).
		</p>

		<p>
			<strong>What about your hobbies?</strong>
			<br/>
			It is a bit of a personal question, but I can reveal that I collect signed photos of Ally Sheedy, Winona Ryder and Richard Stallman.
		</p>

	</text>

</stránka>