relpipe/relpipe-web: relpipe-data/examples-in-sql-selecting-existing-database.xml@de1b49ba06f1


<stránka
	xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
	xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
	
	<nadpis>SELECTing from SQLite databases</nadpis>
	<perex>read from an existing SQLite file or update it</perex>
	<m:pořadí-příkladu>03300</m:pořadí-příkladu>

	<text xmlns="http://www.w3.org/1999/xhtml">
		
		<p>
			Both the <code>relpipe-tr-sql</code> and <code>relpipe-in-sql</code>
			support the option <code>--file</code> and can store data in a file instead of just in-memory.
			Primary purpose of this option is to offload to HDD and be able to process more data whithout consuming too much RAM.
			So such file is usually a temporary one and is deleted immediately after the transformation.
			But we can also direct it to an existing file and append new relations to it.
			Or we can create a new file and do not delete it – using the <code>--file-keep true</code> option.
		</p>
		
		<p>
			Thus we can use the <code>relpipe-in-sql</code> as a database client to access existing SQLite files and SELECT from them (and then convert the results to any supported format).
			An we can use the <code>relpipe-tr-sql</code> as an output filter which converts relational data to a SQLite file that can be queried later.
		</p>
		
		<p>
			SQLite is a popular format used by many applications to store structured data.
			Version control systems (VCS) like <a href="https://www.monotone.ca/">Monotone</a> or <a href="https://fossil-scm.org/">Fossil</a>
			use it as a repository format and put whole history, data and metadata into it.
		</p>
		
		<p>
			So if we clone a Fossil repository (the SQLite one in this case):
		</p>
		
		<m:pre jazyk="bash"><![CDATA[fossil clone http://www.sqlite.org/cgi/src sqlite.fossil]]></m:pre>
		
		<p>
			We can then query it using <m:name/> tools:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[relpipe-in-sql \
	--file sqlite.fossil \
	--relation tickets "
		SELECT 
			type,
			status,
			count(*) AS count
		FROM ticket 
		GROUP BY type, status
		ORDER BY count DESC;
	" | relpipe-out-tabular]]></m:pre>
	
		<p>and get some aggregated statistics:</p>
		
		<m:pre jazyk="text"><![CDATA[tickets:
 ╭───────────────────┬─────────────────┬────────────────╮
 │ type     (string) │ status (string) │ count (string) │
 ├───────────────────┼─────────────────┼────────────────┤
 │ Code_Defect       │ Fixed           │ 394            │
 │ Code_Defect       │ Closed          │ 115            │
 │ Feature_Request   │ Closed          │ 36             │
 │ Feature_Request   │ Fixed           │ 16             │
 │ Build_Problem     │ Closed          │ 10             │
 │ Documentation     │ Fixed           │ 10             │
 │ Documentation     │ Closed          │ 8              │
 │ Incident          │ Closed          │ 4              │
 │ Build_Problem     │ Fixed           │ 3              │
 │ Performance_Issue │ Closed          │ 3              │
 │ Support_Request   │ Closed          │ 3              │
 │ Code_Defect       │ Open            │ 2              │
 │ Feature_Request   │ Open            │ 2              │
 │ Performance_Issue │ Fixed           │ 2              │
 │ Code_Defect       │ Deferred        │ 1              │
 │ Compiler_Warning  │ Closed          │ 1              │
 │ Compiler_Warning  │ Fixed           │ 1              │
 │ Not_A_Bug         │ Closed          │ 1              │
 │ Performance_Issue │ Open            │ 1              │
 │ Portability       │ Fixed           │ 1              │
 ╰───────────────────┴─────────────────┴────────────────╯
Record count: 20]]></m:pre>
		
		<p>
			Accessing internal data structures of another application might be bit tricky (if the application does not guarantee it as a stable API),
			but it can be used for ad-hoc digging and even in regular use, there are chances that such „API“ will not break for a reasonable period of time.
		</p>
		
		<p>
			However, best results can be achieved, if the database scheme is under our control or defined as a stable API by someone else.
		</p>
		
		<p>
			All relations or their subset can be dumped even without writing any SELECTs: just use <code>--copy '.*'</code>
			or e.g. <code>--copy 'tickets|tag|evet'</code> or <code>--copy-renamed 'tickets|tag|evet' 'copy_of_$0'</code> and get exported some relations under modified names (using regular expressions).
			Such export can be converted e.g. to Recfile or XML format using corresponding output filter.
		</p>
		
	</text>

</stránka>
author	František Kučera <franta-hg@frantovo.cz>
	Tue, 29 Oct 2019 16:40:23 +0100
branch	v_0
changeset 279	de1b49ba06f1
child 297	192b0059a6c4
permissions	-rw-r--r--