relpipe/relpipe-web: relpipe-data/examples-guile-aggregations.xml@5bc2bb8b7946


<stránka
	xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
	xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
	
	<nadpis>Aggregating data with Scheme</nadpis>
	<perex>counting records and computing a sum</perex>
	<m:pořadí-příkladu>01700</m:pořadí-příkladu>

	<text xmlns="http://www.w3.org/1999/xhtml">
		
		<p>
			In <code>relpipe-tr-scheme</code> we can generate new records, not only modify records from the input.
			There is a <code>--has-more-records</code> option which, if it evaluates to true, says: “read one more record from the Scheme context and call me again”.
			We can also suppress all original records with <code>--where '#f'</code>.
			And we can change the structure of the relation (see previous examples).
			Thus we can iterate through a relation but completely replace its structure and content.
		</p>
		
		<p>
			What is it good for? We can do aggregations: count records, compute a sum, maximum, minimum or average value etc.
		</p>
		
		<m:pre jazyk="bash" src="examples/guile-file-count-size-sum.sh"/>
		
		<p>Usage example:</p>
		
		<m:pre jazyk="text"><![CDATA[$ ./guile-file-count-size-sum.sh /usr/share/icons/oxygen/
filesystem:
 ╭─────────────────┬───────────────╮
 │ count (integer) │ sum (integer) │
 ├─────────────────┼───────────────┤
 │            6260 │      31091700 │
 ╰─────────────────┴───────────────╯
Record count: 1]]></m:pre>

		<p>
			In SQL, the same result can be achieved by:
		</p>

		<m:pre jazyk="sql"><![CDATA[SELECT
	count(*) AS count,
	sum(size) AS sum
FROM filesystem;]]></m:pre>

		<p>
			This should be possible with <code>relpipe-tr-sql</code> in later versions.
			SQL is much more declarative and in many cases a better tool.
			In SQL we describe “what the result should look like” instead of “how the result should be produced step by step”.
		</p>
		
		<p>
			One day, there might also be a translator that parses SQL code and generates Scheme code,
			so we could have the advantages of both worlds:
			a) the concise and declarative syntax of SQL and
			b) streaming, which means there is no need to put all the data in RAM or on disk.
		</p>
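
		<p>
			To illustrate why a streaming aggregation needs no intermediate storage, here is a minimal sketch in plain shell and awk.
			It is not the relpipe script from this example and uses no relpipe tools; the function name <code>count_and_sum</code> is hypothetical.
			It only shows the general idea: keep two accumulators, update them for each input line and emit a single record after the last one.
		</p>

```shell
#!/bin/sh
# Hypothetical helper (not part of relpipe): reads one integer per line
# from standard input and prints "count TAB sum" after the last line.
# Only the two accumulators are kept in memory, never the whole input.
count_and_sum() {
	awk '{ count++; sum += $1 } END { printf "%d\t%d\n", count, sum }'
}

# Example input: three sizes in bytes.
printf '%s\n' 10 20 30 | count_and_sum   # prints: 3	60
```

		<p>
			Memory consumption stays constant regardless of the number of input lines, which is the property meant by streaming above.
			With GNU find, the sizes could come from something like <code>find … -printf '%s\n'</code> (<code>-printf</code> is a GNU extension).
		</p>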


		
	</text>

</stránka>
author	František Kučera <franta-hg@frantovo.cz>
	Mon, 21 Feb 2022 00:43:11 +0100
branch	v_0
changeset 329	5bc2bb8b7946
parent 316	d7ae02390fac
permissions	-rw-r--r--