relpipe-data/examples-tr-sqlite-custom-version.xml
author František Kučera <franta-hg@frantovo.cz>
Sat, 06 Jun 2020 13:21:38 +0200
branch v_0
changeset 299 dd7aeff5ef0c
parent 297 192b0059a6c4
permissions -rw-r--r--
fix typo: relasease → release

<stránka
	xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
	xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
	
	<nadpis>Using custom version of SQLite (LD_PRELOAD)</nadpis>
	<perex>switch to a newer or modified version of a library using a little hack</perex>
	<m:pořadí-příkladu>03600</m:pořadí-příkladu>

	<text xmlns="http://www.w3.org/1999/xhtml">
		
		<p>
			One of the reasons why we prefer shared libraries (<code>.so</code>) to static linking
			is that shared libraries are much more hacker-friendly: they allow the user to switch to a newer or modified library without recompiling the program.
		</p>
		
		<p>
			<strong>
				n.b. This method is obsolete since <m:a href="release-v0.16">v0.16</m:a>, which does not use the SQLite library directly
				but accesses an arbitrary database driver (including the SQLite one) through an abstraction layer (ODBC).
				This article is still valid as an example of the LD_PRELOAD hack and can be used with older versions of <m:name/>.
				Since v0.16 we can easily replace the whole ODBC driver (and thus also use a different version of SQLite),
				so there is no need for LD_PRELOAD hacking:
				we can just configure the desired driver (the <code>.so</code> file) in the INI file or ad hoc in the connection string.
			</strong>
		</p>
		
		<p>
			By default, <code>relpipe-tr-sql</code> links to the SQLite library available in our distribution (e.g. 3.22),
			as we can check:
		</p>
		
		<m:pre jazyk="text"><![CDATA[$ ldd $(which relpipe-tr-sql) | grep sqlite
    libsqlite3.so.0 => /usr/lib/x86_64-linux-gnu/libsqlite3.so.0 (0x00007f4c73888000)]]></m:pre>
	
		<p>
			But what if we want to use some new features like <a href="https://www.sqlite.org/windowfunctions.html">window functions</a>, which are available only in later library versions (3.25 and newer)?
			Or what if we want to add our own custom modifications to this library?
			Do we have to recompile the <m:name/> tools?
		</p>
		
		<p>
			No, we can just plug the new/modified library in and use it instead of the distribution one.
		</p>
		
		<h2>Download and compile SQLite library</h2>
		
		<p>
			The build process of SQLite should be straightforward:
		</p>
		
		<m:pre jazyk="bash"><![CDATA[
# Switch to the user for such experiments (optional but recommended):
su - hacker

# Create directories, download and extract sources:
mkdir -p ~/src/sqlite
cd ~/src/sqlite
wget https://www.sqlite.org/2019/sqlite-autoconf-3300100.tar.gz
tar xvzf sqlite-autoconf-3300100.tar.gz
cd sqlite-autoconf-3300100/
			
# Optional: make some changes to the source code

# Build SQLite:
./configure
make

# Test it:
echo "SELECT 'hello world'" | ./sqlite3
]]></m:pre>

		<p>
			The desired shared libraries are located in the <code>.libs</code> directory.
		</p>
		
		<h2>Use this library in the <m:name/> tools</h2>
		
		<p>
			We have already built and installed <code>relpipe-tr-sql</code>, which is linked to the SQLite library available in our distribution.
			Switching to a custom version of the library is then very easy: we just need to set an environment variable.
		</p>
		
		<m:pre jazyk="bash"><![CDATA[export LD_PRELOAD=~/src/sqlite/sqlite-autoconf-3300100/.libs/libsqlite3.so.0]]></m:pre>
		
		<p>Then <code>relpipe-tr-sql</code> will use the newer library version, as we can check:</p>
		
		<m:pre jazyk="text"><![CDATA[$ ldd $(which relpipe-tr-sql) | grep sqlite
    /home/hacker/src/sqlite/sqlite-autoconf-3300100/.libs/libsqlite3.so.0 (0x00007f9979578000)]]></m:pre>
		
		<p>Now we can use new features like window functions:</p>
		
		<m:pre jazyk="bash"><![CDATA[relpipe-in-fstab \
	| relpipe-tr-sql \
		--relation "fs_types" "
			SELECT 
				mount_point, 
				type, 
				count(*) OVER (PARTITION BY type) AS same_type_count 
			FROM fstab" \
	| relpipe-out-tabular]]></m:pre>
	
		<p>And we get a result like this one:</p>
		<m:pre jazyk="text"><![CDATA[fstab:
 ╭──────────────────────┬───────────────┬──────────────────────────╮
 │ mount_point (string) │ type (string) │ same_type_count (string) │
 ├──────────────────────┼───────────────┼──────────────────────────┤
 │ /home                │ btrfs         │ 1                        │
 │ /                    │ ext4          │ 2                        │
 │ /mnt/data            │ ext4          │ 2                        │
 │ /media/cdrom0        │ udf,iso9660   │ 1                        │
 │ /mnt/private         │ xfs           │ 1                        │
 ╰──────────────────────┴───────────────┴──────────────────────────╯
Record count: 5]]></m:pre>
		
		
		<p>That would not be possible with older versions of the SQLite library – as we can check by unsetting the <code>LD_PRELOAD</code> variable:</p>
		<m:pre jazyk="bash"><![CDATA[unset LD_PRELOAD]]></m:pre>
		<p>This returns us to the previous state, where SQLite from our distribution was used, and running the same SQL query then leads to an error.</p>
		
		<p>
			The <code>LD_PRELOAD</code> hack can be used with any other software – it is not specific to <m:name/>.
			Another example is the <a href="https://mouse.globalcode.info/v_0/spacenav-hack.xhtml">Spacenav Hack</a>, which bridges/translates two APIs of a library.
		</p>
		
		<p>
			n.b. if we run <code>export LD_PRELOAD</code>, it will affect all programs started from the given shell session,
			and if we put it in our <code>.bashrc</code>, it will affect all Bash sessions started later and the programs started from them,
			which might not be the desired behavior. So sometimes it is better to set the <code>LD_PRELOAD</code> variable only for a single command, not globally.
			This can be done through a custom wrapper script or an alias:
		</p>
		<m:pre jazyk="bash"><![CDATA[alias relpipe-tr-sql='LD_PRELOAD=~/src/sqlite/sqlite-autoconf-3300100/.libs/libsqlite3.so.0 relpipe-tr-sql']]></m:pre>
		<p>We can safely put this line into our <code>.bashrc</code> without affecting any other software.</p>
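		
		<p>
			The wrapper variant mentioned above can be sketched as a small Bash function.
			This is only a minimal sketch: the name <code>with_custom_sqlite</code> is hypothetical and the library path assumes the build directory used earlier in this article.
			The variable is set just for the single command passed as arguments, so the surrounding shell session stays untouched.
		</p>
		<m:pre jazyk="bash"><![CDATA[# Hypothetical helper: runs one command with the custom SQLite library preloaded.
# The path assumes the build directory used earlier in this article.
with_custom_sqlite() {
	local library="$HOME/src/sqlite/sqlite-autoconf-3300100/.libs/libsqlite3.so.0"
	# The assignment prefix applies to this single command only:
	LD_PRELOAD="$library" "$@"
}]]></m:pre>
		<p>
			It can be called as <code>with_custom_sqlite relpipe-tr-sql</code> followed by the usual arguments.
			The same idea also works as a standalone script, which may be handy because Bash does not expand aliases in non-interactive shells by default.
		</p>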
				
		<h2>ABI compatibility</h2>
		
		<p>
			The prerequisite for such easy library swapping is the compatibility of the ABI (application binary interface).
			It means that we can change the library internals (the SQL language features in this case), but we must keep the compiled representation of the C functions compatible,
			so that both parts (the library and the program) still fit together. For example, we cannot remove a C function.
			We also should not make any incompatible changes on the semantic level (although the parts could still link together, it would lead to unwanted results).
		</p>
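		
		<p>
			A rough way to check that no C function has been removed is to compare the symbols exported by both libraries.
			The following sketch assumes that the <code>nm</code> tool from GNU binutils is installed; the function names are hypothetical.
			It catches only missing symbols, not changed function signatures or changed semantics.
		</p>
		<m:pre jazyk="bash"><![CDATA[# List the names of the symbols exported by a shared library (one per line).
exported_symbols() {
	nm -D --defined-only "$1" | awk '{ print $NF }' | sort -u
}

# Print the symbols that the old library ($1) exports and the new one ($2) lacks.
missing_symbols() {
	comm -23 <(exported_symbols "$1") <(exported_symbols "$2")
}]]></m:pre>
		<p>
			Called as <code>missing_symbols /usr/lib/x86_64-linux-gnu/libsqlite3.so.0 ~/src/sqlite/sqlite-autoconf-3300100/.libs/libsqlite3.so.0</code>,
			an empty output means that the custom build still exports every symbol of the distribution library.
		</p>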
		
		<p>
			In the case of libraries that follow <a href="https://semver.org/">Semantic Versioning</a> (as required by the <a href="https://sane-software.globalcode.info/">Sane software manifesto</a>) for their ABI,
			it is easy to say which versions are compatible and which would require recompiling the program or even changing its source code.
			If only the <em>patch</em> or <em>minor</em> version number has changed, the libraries can be swapped this way.
			If the <em>major</em> version has changed, it will probably be necessary to also modify the software that uses this library.
		</p>
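		
		<p>
			The rule above can be sketched as a small Bash function. The name <code>semver_compatible</code> is hypothetical,
			and the function is slightly stricter than the text: as an added assumption, it also rejects a library with an older <em>minor</em> version than the one the program was built against,
			because the program might rely on functions added later.
			The version numbers are just sample inputs taken from this article, not a statement about the versioning policy of SQLite.
		</p>
		<m:pre jazyk="bash"><![CDATA[# Succeeds if a program built against library version $1 can be run with version $2.
# Expects plain MAJOR.MINOR.PATCH numbers; anything else is not handled by this sketch.
semver_compatible() {
	local old_major old_minor old_rest new_major new_minor new_rest
	IFS=. read -r old_major old_minor old_rest <<< "$1"
	IFS=. read -r new_major new_minor new_rest <<< "$2"
	# Same major version and a minor version that is not older:
	[ "$new_major" -eq "$old_major" ] && [ "$new_minor" -ge "$old_minor" ]
}

semver_compatible 3.22.0 3.30.1 && echo "can be swapped"
semver_compatible 3.22.0 4.0.0 || echo "rebuild or code changes needed"]]></m:pre>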
		
	</text>

</stránka>