<stránka
	xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
	xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">

	<nadpis>Accessing SQLite, PostgreSQL and MySQL through ODBC</nadpis>
	<perex>use various DBMS for SQL transformations or data access</perex>
	<m:pořadí-příkladu>04200</m:pořadí-příkladu>

	<text xmlns="http://www.w3.org/1999/xhtml">

		<p>
			Since <m:a href="release-v0.16">v0.16</m:a>, the <code>relpipe-tr-sql</code> module
			uses the ODBC abstraction layer, so we can access data in any DBMS (database management system) that has an ODBC driver.
			Our program depends only on the generic API; the driver for a particular DBMS is loaded dynamically according to the configuration.
		</p>

		<blockquote>
			<p>
				ODBC (Open Database Connectivity) is an industry standard that provides an API for accessing a DBMS.
				In the late 1980s several vendors (mostly from the Unix and database communities) established the SQL Access Group (SAG)
				and then specified the Call Level Interface (CLI). ODBC, which is based on CLI, was published in the early 1990s.
				ODBC is available on many operating systems and there are at least two free software implementations:
				<a href="http://www.unixodbc.org/">unixODBC</a> and <a href="http://www.iodbc.org/">iODBC</a>.
			</p>
		</blockquote>

		<p>For more information see the <m:a href="release-v0.16">v0.16 release notes</m:a>.</p>

		<h2>General concepts and configuration</h2>

		<p>
			<strong>ODBC</strong>:
			the API consisting of C functions; see the files <code>sql.h</code> and <code>sqlext.h</code> e.g. in unixODBC.
		</p>
		<p>
			<strong>Database driver</strong>:
			a shared library (an <code>.so</code> file)
			that implements the API and connects to a particular DBMS (SQLite, PostgreSQL, MySQL, MariaDB, Firebird etc.);
			it is usually provided by the authors of the given DBMS, sometimes written by a third party
		</p>
		<p>
			<strong>Client</strong>:
			a program that calls the API in order to access a database; our <code>relpipe-tr-sql</code> is a client
		</p>
		<p>
			<strong>Data Source Name (DSN)</strong>:
			the name of a preconfigured data source – when connecting, we need to know only the DSN – all other parameters
			(like server name, user name, password etc.) are then looked up in the configuration
		</p>
		<p>
			<strong>Connection string</strong>:
			a text string consisting of serialized parameters needed for connecting
			– we can specify all parameters ad-hoc in the connection string without creating any permanent configuration;
			a connection string can also refer to a DSN and add or override some parameters
		</p>
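
		<p>
			As an illustration of the two forms, the following connection strings are a sketch rather than a tested configuration.
			The first one is fully ad-hoc and names the driver directly.
			The second one refers to a DSN through the standard <code>DSN</code> keyword and overrides the user name through the standard <code>UID</code> keyword;
			the DSN <code>MyServer</code> and the user <code>alice</code> are hypothetical, and the remaining keywords depend on the particular driver.
		</p>

		<pre>Driver=SQLite3;Database=file:MyDatabase.sqlite
DSN=MyServer;UID=alice</pre>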

		<p>
			There is some global configuration in the <code>/etc</code> directory.
			In <code>/etc/odbcinst.ini</code> we can find a list of ODBC drivers.
			Thanks to it, we can refer to a driver by its name (e.g. <code>SQLite3</code>)
			instead of the path to the shared library (e.g. <code>/usr/lib/x86_64-linux-gnu/odbc/libsqlite3odbc.so</code>).
			In <code>/etc/odbc.ini</code> we can find a list of global (machine-wide) data sources.
			It is uncommon to put complete configurations in this file, because anyone would be able to read the passwords,
			but we can provide just a <i>template</i> there with public parameters like server name, port etc.,
			and each user supplies their own user name and password in the connection string or in their personal configuration file.
		</p>
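
		<p>
			A minimal sketch of such global configuration might look as follows.
			The driver name and the library path are the ones mentioned above; the <code>Description</code> text is only illustrative.
			This would be an entry in <code>/etc/odbcinst.ini</code>:
		</p>

		<m:pre jazyk="ini"><![CDATA[[SQLite3]
Description=SQLite3 ODBC driver
Driver=/usr/lib/x86_64-linux-gnu/odbc/libsqlite3odbc.so]]></m:pre>

		<p>
			And this would be a template data source without any credentials in <code>/etc/odbc.ini</code>.
			The DSN <code>shared-db</code> and the host <code>db.example.com</code> are hypothetical,
			and we assume that a driver named <code>PostgreSQL Unicode</code> is registered in the same way as the SQLite one:
		</p>

		<m:pre jazyk="ini"><![CDATA[[shared-db]
Driver=PostgreSQL Unicode
Servername=db.example.com
Port=5432
Database=postgres]]></m:pre>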

		<p>
			The <code>~/.odbc.ini</code> file contains the personal configuration of a given user.
			It usually contains data sources including their passwords,
			so this file must be readable only by that user (<code>chmod 600 ~/.odbc.ini</code>).
			Providing passwords in connection strings passed as CLI arguments is not a good practice for security reasons:
			by default they are stored in the shell history and they are also visible to other users of the same machine in the list of running processes.
		</p>

		<p>
			In these files, the section name – in the <code>[]</code> brackets – is the DSN.
			It is followed by the parameters in the form of <code>key=value</code>, one per line.
		</p>


		<h2>CLI options</h2>

		<p>
			The <code>relpipe-tr-sql</code> and <code>relpipe-in-sql</code> commands support these relevant CLI options:
		</p>

		<ul>
			<li>
				<code>--list-data-sources</code>:
				lists available (configured) data sources in relational format (so we pipe the output to some output filter e.g. to <code>relpipe-out-tabular</code>)
			</li>
			<li>
				<code>--data-source-name</code>:
				specifies the DSN of a configured data source
			</li>
			<li>
				<code>--data-source-string</code>:
				specifies the connection string for an ad-hoc connection without the need for any configuration
			</li>
		</ul>

		<pre><![CDATA[$ relpipe-tr-sql --list-data-sources | relpipe-out-tabular
data_source:
╭───────────────┬──────────────────────╮
│ name (string) │ description (string) │
├───────────────┼──────────────────────┤
│ sqlite-memory │ SQLite3              │
│ relpipe       │ PostgreSQL Unicode   │
╰───────────────┴──────────────────────╯
Record count: 2]]></pre>

		<p>
			Because the output of this command is relational, we can process it further in our relational pipelines.
			This output is also used by the Bash-completion for suggesting the DSN.
		</p>

		<p>
			If neither the <code>--data-source-name</code> nor the <code>--data-source-string</code> option is provided,
			a temporary in-memory SQLite database is used by default.
		</p>

		<h2>SQLite</h2>

		<p>In Debian GNU/Linux and similar distributions we can install the <a href="https://sqlite.org/">SQLite</a> ODBC driver with this command:</p>

		<pre>apt install libsqliteodbc</pre>

		<p>This also installs the SQLite library, which is all we need (because SQLite is a <i>serverless and self-contained</i> database).</p>

		<p>
			Then we can use the default in-memory temporary database or specify the connection string ad-hoc,
			<m:a href="examples-in-sql-selecting-existing-database">access existing SQLite databases</m:a>
			or <m:a href="examples-in-filesystem-tr-sql-indexing">create new ones</m:a> – e.g. this command:
		</p>

		<pre>… | relpipe-tr-sql --data-source-string 'Driver=SQLite3;Database=file:MyDatabase.sqlite'</pre>

		<p>will create the <code>MyDatabase.sqlite</code> file and fill it with relations that came from STDIN.</p>

		<p>For frequently used databases it is convenient to configure a data source in <code>~/.odbc.ini</code>:</p>

		<m:pre jazyk="ini"><![CDATA[[MyDatabase]
Driver=SQLite3
Database=file:/home/hacker/MyDatabase.sqlite]]></m:pre>

		<p>
			and then connect to it simply using <code>--data-source-name MyDatabase</code>
			(both the option and the name will be suggested by Bash-completion).
		</p>

		<p>
			The <a href="http://www.ch-werner.de/sqliteodbc/html/index.html">SQLite ODBC driver</a> supports several parameters that are described in its documentation.
			One of them is <code>LoadExt</code>, which loads SQLite extensions:
		</p>

		<m:pre jazyk="ini"><![CDATA[LoadExt=/home/hacker/libdemo.so]]></m:pre>

		<p>
			So we can write our own SQLite extension with custom functions or other features
			(<a href="https://blog.frantovo.cz/c/383/Komplexita%20softwaru%3A%20%C5%98e%C5%A1en%C3%AD%20a%C2%A0prevence#toc_sqlite">example</a>)
			or choose an existing one and load it into the SQLite connected through ODBC.
		</p>
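
		<p>
			Put together with the <code>MyDatabase</code> data source configured above, the complete section in <code>~/.odbc.ini</code> might look like the following sketch.
			It only combines the lines already shown; <code>libdemo.so</code> stands for whatever extension we have actually built.
		</p>

		<m:pre jazyk="ini"><![CDATA[[MyDatabase]
Driver=SQLite3
Database=file:/home/hacker/MyDatabase.sqlite
LoadExt=/home/hacker/libdemo.so]]></m:pre>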


		<h2>PostgreSQL</h2>

		<p>In Debian GNU/Linux and similar distributions we can install the <a href="https://www.postgresql.org/">PostgreSQL</a> ODBC driver with this command:</p>

		<pre>apt install odbc-postgresql</pre>

		<p>
			PostgreSQL is a very powerful DBMS (probably the most advanced free software relational database system)
			and utilizes the client-server architecture.
			This means that we also need a server (it can be installed through <code>apt</code>, like the driver).
		</p>

		<p>
			Once we have a server – remote or local – we need to create a user (role).
			For SQL transformations we configure a dedicated role that has no persistent schema and uses the temporary one by default,
			which means that all relations we create are lost at the end of the session (when the <code>relpipe-tr-sql</code> command finishes),
			so it behaves very similarly to the SQLite in-memory database.
		</p>

		<m:pre jazyk="sql"><![CDATA[-- a dedicated role used only for SQL transformations
CREATE USER relpipe WITH PASSWORD 'someSecretPassword';
-- new relations are created in the temporary schema of the session
ALTER ROLE relpipe SET search_path TO 'pg_temp';]]></m:pre>

		<p>
			And then we <a href="https://odbc.postgresql.org/docs/config.html">configure</a> the ODBC data source:
		</p>

		<m:pre jazyk="ini"><![CDATA[[postgresql-temp]
Driver=PostgreSQL Unicode
Database=postgres
Servername=localhost
Port=5432
Username=relpipe
Password=someSecretPassword]]></m:pre>

		<p>
			Now we can use advanced PostgreSQL features for transforming data in our pipelines.
			We can also configure a DSN for another database that contains some useful data and other database objects,
			call existing business functions installed in such a database, load data to or from this DB etc.
		</p>


		<h2>MySQL</h2>

		<p>
			If the <code>libmyodbc</code> package is missing in our distribution,
			the ODBC driver for <a href="https://dev.mysql.com/downloads/connector/odbc/">MySQL</a> can be downloaded from the MySQL website.
			We can get a binary package (<code>.deb</code>, <code>.rpm</code> etc.) or the source code.
			If we are compiling from sources, we do something like this:
		</p>

		<m:pre jazyk="bash"><![CDATA[cd mysql-connector-odbc-*-src/
mkdir build
cd build
cmake ../ -DWITH_UNIXODBC=1
make]]></m:pre>

		<p>
			We should use the driver in the same or a similar version as the MySQL client library installed on our system.
			For example, an 8.x driver will not work with a 5.x library.
			A successful compilation results in <code>libmyodbc*.so</code> files.
		</p>

		<p>
			Like PostgreSQL, MySQL is a client-server system,
			so we need a server where we create a database and a user account.
			As root, through the <code>mysql mysql</code> command, we execute:
		</p>

		<m:pre jazyk="sql"><![CDATA[CREATE DATABASE relpipe CHARACTER SET = utf8;
CREATE USER 'relpipe'@'localhost' IDENTIFIED BY 'someSecretPassword';
GRANT ALL PRIVILEGES ON relpipe.* TO 'relpipe'@'localhost';
FLUSH PRIVILEGES;]]></m:pre>

		<p>As a normal user we add a new data source to our <code>~/.odbc.ini</code> file:</p>

		<m:pre jazyk="ini"><![CDATA[[mysql-relpipe-localhost]
Driver=/home/hacker/src/mysql/build/lib/libmyodbc5w.so
Server=localhost
Port=3306
Socket=/var/run/mysqld/mysqld.sock
User=relpipe
Password=someSecretPassword
Database=relpipe
InitStmt=SET SQL_MODE=ANSI_QUOTES;
Charset=utf8]]></m:pre>

		<p>
			Note that we have compiled the ODBC driver in our home directory
			and, even without installing it anywhere or registering it in the <code>/etc/odbcinst.ini</code> file,
			we can simply refer to the <code>.so</code> file from our <code>~/.odbc.ini</code>.
		</p>

		<p>
			If we set <code>Server=localhost</code>, the client-server communication does not go through TCP/IP
			but rather through the unix domain socket specified in the <code>Socket</code> field.
			If we set <code>Server=127.0.0.1</code> or some remote IP address or domain name, the communication goes through TCP/IP on the given port.
		</p>
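
		<p>
			For comparison, a TCP/IP variant of the data source above might look like the following sketch.
			The DSN <code>mysql-relpipe-tcp</code> is a hypothetical name and the <code>Socket</code> line is dropped because it is not used for TCP/IP connections.
			We also assume that the server accepts the <code>'relpipe'@'localhost'</code> account for connections coming from the loopback address, which depends on the server configuration.
		</p>

		<m:pre jazyk="ini"><![CDATA[[mysql-relpipe-tcp]
Driver=/home/hacker/src/mysql/build/lib/libmyodbc5w.so
Server=127.0.0.1
Port=3306
User=relpipe
Password=someSecretPassword
Database=relpipe
InitStmt=SET SQL_MODE=ANSI_QUOTES;
Charset=utf8]]></m:pre>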

		<p>
			The <code>SET SQL_MODE=ANSI_QUOTES;</code> init statement is important,
			because it tells the MySQL server to support standard SQL "quoted" identifiers
			instead of the `weird` MySQL style.
			We use standard SQL when creating the tables.
		</p>

		<p>
			There are many other parameters, quite well
			<a href="https://dev.mysql.com/doc/connector-odbc/en/connector-odbc-configuration-connection-parameters.html">documented</a>.
		</p>

		<p>
			Now we can use MySQL as the <i>SQL engine</i> for transformations in our pipelines
			and we can also access existing MySQL databases,
			load data to and from them
			or call functions and procedures installed on the server.
		</p>


	</text>

</stránka>