# HG changeset patch
# User František Kučera
# Date 1544117914 -3600
# Node ID 5b0fab48d59e0f4042074dd8aa83a143bfc85150
# Parent c952261978e8ada3093e44e652a1374cf6f9fb0c
principles: streaming

diff -r c952261978e8 -r 5b0fab48d59e relpipe-data/principles.xml
--- a/relpipe-data/principles.xml Thu Dec 06 17:11:29 2018 +0100
+++ b/relpipe-data/principles.xml Thu Dec 06 18:38:34 2018 +0100
@@ -124,6 +124,26 @@
 The data that are not written don't need to be compressed and thus have the best compression ratio.

+

Streaming

+ +

+ Relational tools should process data as streams and hold only the necessary data in memory,
+ i.e. a tool should start producing output (the first record) as soon as possible, while it is still reading the input (the following records).
+ Thus memory usage does not depend on the volume of processed data.
+
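
The record-level streaming described above might look roughly like the following Python sketch. It is only an illustration and not code from this changeset: the function name `stream_records` and the `transform` parameter are hypothetical, and records are assumed to be plain strings.

```python
from typing import Callable, Iterable, Iterator


def stream_records(records: Iterable[str],
                   transform: Callable[[str], str]) -> Iterator[str]:
    """Yield each transformed record as soon as it has been read.

    Hypothetical helper: only the current record is held in memory,
    so memory usage does not grow with the size of the input.
    """
    for record in records:
        # The result is emitted before the next record is read.
        yield transform(record)


if __name__ == "__main__":
    import sys

    # Read standard input lazily and write each result immediately.
    for line in stream_records(sys.stdin, str.upper):
        sys.stdout.write(line)
        sys.stdout.flush()
```

Because `stream_records` is a generator, the first output record is available after a single input record has been consumed, which is the behaviour the paragraph asks for.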

+ +

+ However, there are cases where such streaming is not feasible, e.g. if we need to compute some statistics or the column widths while printing a table in the terminal.
+ In such a situation, we must read the whole relation and only then generate the output.
+ But we should still be able to stream at the relation level, i.e. if there are more relations, we always hold only one of them in memory.
+
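
The relation-level case can be sketched in a similar way. The example below is an assumption-laden illustration, not the project's implementation: `format_relation` and `format_relations` are hypothetical names, a relation is modelled as an iterable of rows, and each row is a list of strings. Column widths need the whole relation, so one relation is buffered, printed, and then released before the next one is read.

```python
from typing import Iterable, Iterator, List


def format_relation(rows: List[List[str]]) -> List[str]:
    """Format one fully buffered relation as aligned text lines."""
    # Column widths require every row, so the relation must be complete here.
    widths = [max(len(cell) for cell in column) for column in zip(*rows)]
    return [
        " | ".join(cell.ljust(width) for cell, width in zip(row, widths))
        for row in rows
    ]


def format_relations(
    relations: Iterable[Iterable[List[str]]],
) -> Iterator[str]:
    """Stream at the relation level: hold only one relation at a time."""
    for relation in relations:
        rows = list(relation)  # buffer only the current relation
        yield from format_relation(rows)
        # `rows` is rebound on the next iteration, so the previous
        # relation can be garbage collected.
```

Memory usage in this sketch is bounded by the largest single relation rather than by the total input.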

+ +

+ This rule is important not only for performance but also for the user experience.
+ The user should see the output as soon as possible, i.e. longer-running processes should produce results continuously instead of flushing everything at the end.
+ This also helps with debugging and with seeing what is going on inside the processing.
+

+

Unambiguity