relpipe-data/examples-in-sql-selecting-existing-database.xml
author František Kučera <franta-hg@frantovo.cz>
Mon, 21 Feb 2022 00:43:11 +0100
branchv_0
changeset 329 5bc2bb8b7946
parent 297 192b0059a6c4
permissions -rw-r--r--
Release v0.18

<stránka
	xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
	xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
	
	<nadpis>SELECTing from SQLite databases</nadpis>
	<perex>read from an existing SQLite file or update it</perex>
	<m:pořadí-příkladu>03300</m:pořadí-příkladu>

	<text xmlns="http://www.w3.org/1999/xhtml">
		
		<p>
			Both the <code>relpipe-tr-sql</code> and <code>relpipe-in-sql</code>
			support options <code>--data-source-name</code> and <code>--data-source-string</code>
			that can be used for connecting to a different database than the in-memory SQLite one.
			This is useful if we want to offload to HDD and be able to process more data whithout consuming too much RAM.
			Or if we want to access an existing database instead of just transforming records from the STDIN to STDOUT.
			These options allow connecting to any DBMS.
		</p>
		
		<p>
			Thus we can use the <code>relpipe-in-sql</code> as a database client to access existing SQLite files and SELECT from them (and then convert the results to any supported format).
			And we can use the <code>relpipe-tr-sql</code> as an output filter which converts relational data to a SQLite file that can be queried later.
		</p>
		
		<p>
			SQLite is a popular format used by many applications to store structured data.
			Version control systems (VCS) like <a href="https://www.monotone.ca/">Monotone</a> or <a href="https://fossil-scm.org/">Fossil</a>
			use it as a repository format and put whole history, data and metadata into it.
		</p>
		
		<p>
			So if we clone a Fossil repository (the SQLite one in this case):
		</p>
		
		<m:pre jazyk="bash"><![CDATA[fossil clone http://www.sqlite.org/cgi/src sqlite.fossil]]></m:pre>
		
		<p>
			We can then query the database file <code>sqlite.fossil</code> using <m:name/> tools:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[relpipe-in-sql \
	--data-source-string 'Driver=SQLite3;Database=file:sqlite.fossil' \
	--relation tickets "
		SELECT 
			type,
			status,
			count(*) AS count
		FROM ticket 
		GROUP BY type, status
		ORDER BY count DESC;
	" | relpipe-out-tabular]]></m:pre>
	
		<p>and get some aggregated statistics:</p>
		
		<m:pre jazyk="text"><![CDATA[tickets:
 ╭───────────────────┬─────────────────┬────────────────╮
 │ type     (string) │ status (string) │ count (string) │
 ├───────────────────┼─────────────────┼────────────────┤
 │ Code_Defect       │ Fixed           │ 394            │
 │ Code_Defect       │ Closed          │ 115            │
 │ Feature_Request   │ Closed          │ 36             │
 │ Feature_Request   │ Fixed           │ 16             │
 │ Build_Problem     │ Closed          │ 10             │
 │ Documentation     │ Fixed           │ 10             │
 │ Documentation     │ Closed          │ 8              │
 │ Incident          │ Closed          │ 4              │
 │ Build_Problem     │ Fixed           │ 3              │
 │ Performance_Issue │ Closed          │ 3              │
 │ Support_Request   │ Closed          │ 3              │
 │ Code_Defect       │ Open            │ 2              │
 │ Feature_Request   │ Open            │ 2              │
 │ Performance_Issue │ Fixed           │ 2              │
 │ Code_Defect       │ Deferred        │ 1              │
 │ Compiler_Warning  │ Closed          │ 1              │
 │ Compiler_Warning  │ Fixed           │ 1              │
 │ Not_A_Bug         │ Closed          │ 1              │
 │ Performance_Issue │ Open            │ 1              │
 │ Portability       │ Fixed           │ 1              │
 ╰───────────────────┴─────────────────┴────────────────╯
Record count: 20]]></m:pre>
		
		<p>
			Accessing internal data structures of another application might be bit tricky (if the application does not guarantee it as a stable API),
			but it can be used for ad-hoc digging and even in regular use, there are chances that such „API“ will not break for a reasonable period of time.
		</p>
		
		<p>
			However, best results can be achieved, if the database scheme is under our control or defined as a stable API by someone else.
		</p>
		
		<p>
			All relations or their subset can be dumped even without writing any SELECTs: just use <code>--copy '.*'</code>
			or e.g. <code>--copy 'tickets|tag|evet'</code> or <code>--copy-renamed 'tickets|tag|evet' 'copy_of_$0'</code> and get exported some relations under modified names (using regular expressions).
			Such export can be converted e.g. to Recfile or XML format using corresponding output filter.
		</p>
		
	</text>

</stránka>