relpipe-data/examples-asn1-x509.xml
author František Kučera <franta-hg@frantovo.cz>
Mon, 21 Feb 2022 00:43:11 +0100
branch v_0
changeset 329 5bc2bb8b7946
permissions -rw-r--r--
Release v0.18

<stránka
	xmlns="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/strana"
	xmlns:m="https://trac.frantovo.cz/xml-web-generator/wiki/xmlns/makro">
	
	<nadpis>Exploring content of X.509 certificates</nadpis>
	<perex>open and query common SSL/TLS certificates or other ASN.1 data encoded in BER/DER/CER</perex>
	<m:pořadí-příkladu>05000</m:pořadí-příkladu>

	<text xmlns="http://www.w3.org/1999/xhtml">
		
		<p>
			X.509 certificates and keys used for SSL/TLS (HTTPS, POP3S, IMAPS etc.) are usually distributed as files with either the <code>.pem</code> or the <code>.der</code> extension,
			or bundled together in a PKCS#12 container as a <code>.p12</code> file.
			The „text“ PEM format is often considered more „accessible“ or „friendly“ than the binary DER.
			However, PEM is just the Base64-encoded DER original and is actually less legible to the naked eye than DER,
			because in DER we can spot at least some strings like common and domain names or validity/expiration dates, or recognize certain data structures in a hex editor.
			Base64 just obfuscates everything. PEM can be easily copied through the clipboard, which is probably the only advantage of this format (but it also makes the content more likely to leak).
		</p>
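
		<p>
			To make the legibility claim concrete, here is a minimal sketch that uses a hand-made 15-byte DER value instead of a real certificate (an assumption made for brevity):
			a SEQUENCE that contains a single UTF8String. The <code>tiny_der</code> function is a hypothetical helper, not part of any tool.
			The host name can be found with <code>grep</code> in the binary form, but not after Base64 encoding.
		</p>

		<m:pre jazyk="bash"><![CDATA[# Hypothetical helper: emits SEQUENCE { UTF8String "example.org" } in DER.
# Octal escapes are used because they work in every POSIX printf:
#   060 = 0x30 SEQUENCE tag,   015 = 13 content octets,
#   014 = 0x0C UTF8String tag, 013 = 11 characters
tiny_der() {
	printf '\060\015\014\013example.org'
}

# The string is readable in the binary DER form (-a: treat binary input as text):
tiny_der | grep -a -c 'example.org'                       # prints 1

# ...but it is gone once the same octets are Base64 encoded
# (grep exits with a non-zero status when nothing matches, hence the "|| true"):
tiny_der | base64 | grep -c 'example.org' || true         # prints 0

# Decoding the Base64 text gives the original octets back:
tiny_der | base64 | base64 -d | grep -a -c 'example.org'  # prints 1]]></m:pre>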
		
		<!-- Indeed, Base64 is evil. Use hexadecimal encoding where ASCII representation of binary data is necessary. -->
		
		<p>
			So our first step is to get rid of the annoying Base64 pseudo-plain-text encoding – we use one of these commands:
		</p>

		<m:pre jazyk="text"><![CDATA[cat certificate.pem | grep -v ^--- | base64 -d              > certificate.der
cat certificate.pem | openssl x509 -inform PEM -outform DER > certificate.der]]></m:pre>

		<p>
			Telco veterans could now start reading the DER file with <code>hd</code> or <code>xxd</code>, jumping over the offsets and traversing the sequences and sets…
			However, most people would appreciate some software that helps them parse the ASN.1 BER encoding (the superset of DER and CER).
			Examples of such software are Wireshark and dumpasn1. These programs are good for ad-hoc inspection or a quick check.
		</p>
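
		<p>
			For readers who wonder what reading the file by hand involves: every BER/DER element starts with a tag octet followed by a length.
			Lengths below 128 fit in one octet. Otherwise the first length octet has its top bit set and its low seven bits give the number of length octets that follow.
			The following is a minimal sketch that decodes only the header of the outermost element, using <code>od</code> and shell arithmetic.
			The <code>der_header</code> function is a hypothetical helper. It assumes a single-octet tag and at most four length octets, and it does not handle the BER indefinite length.
		</p>

		<m:pre jazyk="bash"><![CDATA[# Hypothetical helper: prints the tag and the content length
# of the first BER/DER element read from standard input.
der_header() {
	# First 6 octets as decimal numbers: tag, length octet, up to 4 more length octets.
	set -- $(od -An -v -tu1 -N6)
	tag=$1; len=$2; shift 2
	if [ "$len" -ge 128 ]; then
		# Long form: the low 7 bits give the number of length octets that follow.
		n=$((len - 128)); len=0
		while [ "$n" -gt 0 ]; do
			len=$((len * 256 + $1)); shift; n=$((n - 1))
		done
	fi
	printf 'tag=0x%02x length=%d\n' "$tag" "$len"
}

# Short form: a SEQUENCE with 13 content octets
printf '\060\015\014\013example.org' | der_header    # tag=0x30 length=13

# Long form with two length octets: 0x30 0x82 0x01 0x00
printf '\060\202\001\000' | der_header               # tag=0x30 length=256]]></m:pre>

		<p>
			On a real file the call would be <code>der_header &lt; certificate.der</code>. Descending into the nested elements by hand quickly becomes tedious, which is why a parser is welcome.
		</p>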
		
		<p>
			In <m:name/> <m:a href="release-v0.18">v0.18</m:a> we have (early and a bit raw) support for ASN.1 BER encoding and thus we can get the structured data in a machine-readable form
			– which is good for further processing, conversion to other formats or use in scripts.
			Because the ASN.1 data model is not relational – actually it is a tree – this format is supported in the <code>relpipe-in-asn1table</code>
			command that is modelled after the well-known <code>XMLTable()</code> database function, which allows translating arbitrary tree structures to relations using XPath expressions.
			So in <code>relpipe-in-asn1table</code> we can write XPath expressions to query the ASN.1 tree data structures and extract relations, records and attributes
			from X.509 certificates, keys or other cryptographic artifacts, LDAP or SNMP packets or any other ASN.1 BER data.
		</p>
		
		<p>
			But how do we know which XPath expressions we should run?
			It is useful to see the XML representation of the whole source data.
			There is a simple trick to do this – use <code>"/"</code> as the XPath for selecting records (it always selects a single record, a single node – the root)
			and use <code>"."</code> as the XPath to select a single attribute (it always selects the root element)
			and add <code>--mode raw-xml</code>, so we get the raw XML source instead of the text content of the given elements.
			We do not have to write this routine by hand – just create a symlink to the example script:
		</p>
		<m:pre jazyk="bash"><![CDATA[ln -s …/relpipe-in-xmltable.cpp/examples/2xml.sh asn12xml # in ~/bin or somewhere]]></m:pre>
		<p>
			This example is generic and also works for other formats supported by the <code>relpipe-in-*table</code> commands.
		</p>
		
		<p> 
			Then we can analyze X.509 DER certificates stored on our disk or we can fetch some from live servers.
			The <code>openssl</code> command helps us with that:
		</p>
		
		
		<m:pre jazyk="bash"><![CDATA[fetch_x509_certificate() {
	echo \
		| openssl s_client -connect $1:${2:-443} 2>/dev/null \
		| openssl x509 -inform PEM -outform DER;
}]]></m:pre>

		<p>Now put both commands together in a pipeline:</p>
		
		<m:pre jazyk="bash"><![CDATA[fetch_x509_certificate "gnu.org" | asn12xml # HTTPS port (443) is used as default]]></m:pre>
		
		<p>and get this XML representation of the ASN.1 X.509 tree:</p>

		<m:pre jazyk="xml" src="examples/x509-gnu.org.xml"/>
		
		
		<p>Once we know the structure, we can easily hack together a function that extracts parts of the tree as relations:</p>
		
		<m:pre jazyk="bash"><![CDATA[parse_x509_certificate() {
	relpipe-in-asn1table \
		--relation 'validity' \
			--records '//sequence[date-time][1]' \
			--attribute 'from'  string 'date-time[1]' \
			--attribute 'to'    string 'date-time[2]' \
		--relation 'alternative_name' \
			--records '//sequence[oid="2.5.29.17"][1]/encapsulated/sequence/specific' \
			--attribute 'name'  string '.';
}]]></m:pre>

		<p>Everything put together:</p>
		
		<m:pre jazyk="bash"><![CDATA[fetch_x509_certificate "gnu.org" | parse_x509_certificate | relpipe-out-tabular]]></m:pre>
		
		<p>will print:</p>
		
		<m:pre jazyk="text" src="examples/x509-gnu.org.txt"/>
		
		<p>
			The function above is just a „hello world“ example.
			Please note that the XPath expressions need to be carefully crafted with respect to the given format in order to match exactly what we want.
		</p>
		
		<p>
			Instead of printing a table, we can use the <code>relpipe-out-nullbyte</code> tool together with the <code>read_nullbyte</code> function
			to loop in the shell over the records (alternative names) and e.g. <code>ping</code> each domain or fetch its root web page using <code>wget</code> or <code>curl</code>.
			We can also write a simple script that checks the validity of our own certificates and notifies us in advance when some of them are about to expire.
		</p>
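
		<p>
			Both ideas can be sketched in plain shell. This is only a sketch under stated assumptions:
			<code>for_each_value</code> and <code>days_between</code> are hypothetical helpers;
			instead of <code>read_nullbyte</code> the loop uses <code>tr</code> and <code>read</code>, which is simpler but only safe when the values contain no line breaks (true for domain names);
			the input is assumed to carry the values of a single attribute only;
			and the timestamps are assumed to be in an ISO 8601 form that GNU <code>date -d</code> understands.
		</p>

		<m:pre jazyk="bash"><![CDATA[# Hypothetical helper: runs the given command once for each
# null-terminated value read from standard input.
for_each_value() {
	tr '\0' '\n' | while IFS= read -r value; do
		# </dev/null keeps the command from swallowing the remaining values
		"$@" "$value" </dev/null
	done
}

# Hypothetical helper: whole days between two timestamps (requires GNU date).
days_between() {
	echo $(( ( $(date -u -d "$2" +%s) - $(date -u -d "$1" +%s) ) / 86400 ))
}

# Stand-in for the output of relpipe-out-nullbyte with one attribute per record:
printf 'a.example\0b.example\0' | for_each_value echo "would check:"

# Days from a given moment to the end of validity; in a real script
# the first argument would be "$(date -u +%Y-%m-%dT%H:%M:%SZ)".
days_between '2022-02-21T00:00:00Z' '2022-03-23T00:00:00Z'    # prints 30]]></m:pre>

		<p>
			Replacing <code>echo "would check:"</code> with e.g. <code>ping -c 1</code> turns the loop into a simple reachability check,
			and comparing the result of <code>days_between</code> with a threshold gives the expiration warning.
		</p>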
		
		<p>
			Later versions of <code>relpipe-in-asn1table</code> will probably support OID names, so it will not be necessary to use the numeric object identifiers.
		</p>
		
		<p>
			N.B. there is also <code>relpipe-in-asn1</code>. This tool reads data generated by its counterpart, <code>relpipe-out-asn1</code> (or by other ASN.1 BER capable software),
			i.e. it is not as universal as <code>relpipe-in-asn1table</code>: it has a simpler interface, needs no configuration and expects certain ASN.1 structures (relations serialized in the BER format).
		</p>
		
	</text>

</stránka>