# HG changeset patch # User František Kučera # Date 1543254981 -3600 # Node ID 8c2e2dbee5cc3c85e9283aa1ea2d55a9d49158a4 # Parent 42bbbccd87f373cb9eaeccb5b311af6bb7a53654 format, structure and logical model – the relational model diff -r 42bbbccd87f3 -r 8c2e2dbee5cc relpipe-data/classic-example.xml --- a/relpipe-data/classic-example.xml Mon Nov 26 12:15:40 2018 +0100 +++ b/relpipe-data/classic-example.xml Mon Nov 26 18:56:21 2018 +0100 @@ -42,8 +42,8 @@ WHITE]]>

- So we have a list of colors of our dogs printed in upper-case. - In case we have several dogs of the same color, we could avoid duplicates simply by adding | sort -u in the pipeline (after the cut part). + So we have a list of colors of our dogs printed in big letters. + In case we have several dogs of the same color, we could avoid duplicates simply by adding | sort -u in the pipeline (after the cut step).

The great parts

diff -r 42bbbccd87f3 -r 8c2e2dbee5cc relpipe-data/index.xml --- a/relpipe-data/index.xml Mon Nov 26 12:15:40 2018 +0100 +++ b/relpipe-data/index.xml Mon Nov 26 18:56:21 2018 +0100 @@ -9,9 +9,8 @@

One of the great parts of the - - - culture is the invention (which is attributed to Doug McIlroy, see The Art of Unix Programming: Pipes, Redirection, and Filters) + + culture is the invention (which is attributed to Doug McIlroy, see The Art of Unix Programming: Pipes, Redirection, and Filters) of pipes and the idea (see The Art of Unix Programming: Basics of the Unix Philosophy) that one program should do one thing and do it well.

@@ -47,7 +46,7 @@ Such single-purpose programs (often called filters) are much easier to create, test and optimize and their authors don't have to bother about the complexity of the final pipeline. They don't even have to know how their programs will be used in the future by others. This is a great design principle that brings us advanced flexibility, reusability, efficiency and reliability. - Being in any role (author of a filter, builder of a pipeline etc.), we can always focus on our task only and do it well. + Being in any role (author of a filter, builder of a pipeline etc.), we can always focus on our task only and do it well. (See cluelessness by Jaroslav Tulach in his Practical API Design. Confessions of a Java Framework Architect.) And we can collaborate with others even if we don't know about them and we don't know that we are collaborating. Now think about putting this together with the free software ideas... How very!

@@ -84,17 +83,45 @@ --> -

Bytes, text, structured data? XML, YAML, JSON, ASN.1

+

+ Now the question is how the data passed through pipes should be formatted and structured. + There is a wide spectrum of options, from simple unstructured text files (just arrays of lines) + through various DSV + to formats like XML (YAML, JSON, ASN.1, Diameter, S-expressions etc.). + Simpler formats look tempting but have many problems and limitations (see the Pitfalls section in the Classic pipeline example). + On the other hand, the advanced formats are capable of representing arbitrary object tree structures or even arbitrary graphs. + They offer unlimited possibilities – and this is their strength and weakness at the same time. +

-

Rules:

+ - +

+ It is not about the shape of the brackets, apostrophes, quotes or text vs. binary. + It is not a technical question – it lies in the semantic layer and in the human brain. + Generic formats and their arbitrary object trees/graphs are (for humans, not for computers) difficult to understand and work with + – compared to simpler structures like arrays, maps or matrices. +

+

+ This is the reason why we have chosen the relational model as our logical model. + This model comes from 1969 (invented and described by Edgar F. Codd, + see Derivability, Redundancy, and Consistency of Relations Stored in Large Data Banks, Research Report, IBM from 1969 + and A Relational Model of Data for Large Shared Data Banks from 1970, + see also Relational model) + + and through the decades it has proven its qualities and viability. + This logical model is powerful enough to describe almost any data and – at the same time – it is still simple and easy for humans to understand. +

+ +

+ Thus the are streams containing zero or more relations. + Each relation has a name, one or more attributes and zero or more records (tuples). + Each attribute has a name and a data-type. + Records contain attribute values. + We can imagine this stream as a sequence of tables (but the table is only one of many possible visual representations of such relational data). +
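The logical model described in the paragraph above can be sketched in a few lines of Python. This is only an illustration of the structure (a stream of relations, each with a name, typed attributes and records); it is not the project's wire format or API. The class names `Attribute` and `Relation`, the helper `as_table`, the type label `"string"` and the sample dog data are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple


@dataclass
class Attribute:
    """An attribute has a name and a data type (type labels are hypothetical)."""
    name: str
    data_type: str


@dataclass
class Relation:
    """A relation: a name, one or more attributes, zero or more records."""
    name: str
    attributes: List[Attribute]
    records: List[Tuple[Any, ...]] = field(default_factory=list)

    def __post_init__(self) -> None:
        # The text requires at least one attribute per relation.
        if not self.attributes:
            raise ValueError("a relation needs at least one attribute")
        for record in self.records:
            self._check(record)

    def _check(self, record: Tuple[Any, ...]) -> None:
        # Each record carries exactly one value per attribute.
        if len(record) != len(self.attributes):
            raise ValueError("record length does not match attribute count")

    def add_record(self, *values: Any) -> None:
        self._check(values)
        self.records.append(tuple(values))


def as_table(relation: Relation) -> str:
    """Render a relation as a plain-text table, one of many possible views."""
    header = " | ".join(f"{a.name}:{a.data_type}" for a in relation.attributes)
    rows = [" | ".join(str(v) for v in record) for record in relation.records]
    return "\n".join([relation.name, header] + rows)


# Illustrative data only: a stream is a sequence of zero or more relations.
dogs = Relation("dog", [Attribute("name", "string"), Attribute("color", "string")])
dogs.add_record("Rex", "white")
dogs.add_record("Bella", "black")
empty = Relation("cat", [Attribute("name", "string")])  # zero records is valid
stream = [dogs, empty]
```

Calling `as_table(dogs)` gives one textual view of the relation; the same data could equally be rendered in other ways, which is the point of keeping the logical model separate from its visual representation.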

What are?

diff -r 42bbbccd87f3 -r 8c2e2dbee5cc relpipe-data/makra/unix.xsl --- a/relpipe-data/makra/unix.xsl Mon Nov 26 12:15:40 2018 +0100 +++ b/relpipe-data/makra/unix.xsl Mon Nov 26 18:56:21 2018 +0100 @@ -13,7 +13,7 @@ *NIX (formerly UNIX, now mostly GNU/Linux) *NIX - formerly UNIX, now mostly GNU/Linux + formerly UNIX, now mostly GNU/Linux *NIX