# HG changeset patch
# User František Kučera
# Date 1572392098 -3600
# Node ID 0b6b1781a0a5e76d694f67fc67aba86f1b6ab787
# Parent eccf2de78284a1a371b9d540694a3282d93c3bb5
examples: Indexing and searching the filesystem

diff -r eccf2de78284 -r 0b6b1781a0a5 relpipe-data/examples-in-filesystem-tr-sql-indexing.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/relpipe-data/examples-in-filesystem-tr-sql-indexing.xml Wed Oct 30 00:34:58 2019 +0100
@@ -0,0 +1,143 @@
+
+
+ Indexing and searching the filesystem
+ build an index of the filesystem and search it faster or offline using SQL
+ 03500
+
+
+
+

+ Thanks to relpipe-in-filesystem, we can collect metadata (or even the file contents)
+ and store them in an index file for later use.
+ Such an index is useful for faster access and for offline work (e.g. we can index an optical disc or an external or network HDD).
+

+ +

+ We can simply pipe the relational data into a file and use this file as the index. + Or we can use some other format. In this example, we will use an SQLite file as the index. +

+ +

+ The first step is to collect the file metadata. We will index just a subset of our filesystem,
+ the /bin/ and /usr/bin/ directories:
+

+ + + +

+ This index allows us to do fast searches and various analyses.
+ For example, we can find the 20 largest binaries:
+

+ + + +

How very:

+ + + +
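As a rough illustration of this kind of query, here is a minimal Python sketch that uses only the standard sqlite3 module. The table name `filesystem` and the columns `path` and `size` are assumptions made for this sketch (the real schema depends on which attributes were indexed), and the sample rows are made up.

```python
import sqlite3

def largest_files(conn, limit=20):
    """Return (path, size) rows for the largest indexed files, biggest first."""
    # Assumed schema: a table "filesystem" with "path" and "size" columns.
    query = "SELECT path, size FROM filesystem ORDER BY size DESC LIMIT ?"
    return conn.execute(query, (limit,)).fetchall()

# Stand-in for bin.sqlite: an in-memory database with made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE filesystem (path TEXT, size INTEGER)")
conn.executemany(
    "INSERT INTO filesystem VALUES (?, ?)",
    [("/bin/a", 100), ("/usr/bin/b", 3000), ("/usr/bin/c", 20)],
)

for path, size in largest_files(conn, limit=2):
    print(f"{size:>10}  {path}")
```

To run it against a real index, connect to the SQLite file (e.g. bin.sqlite) instead of `:memory:` and skip the table setup.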

+ We can also collect additional metadata and append them to our index file.
+ In this example, we use the ldd tool to list the dynamically linked libraries
+ of each binary and store these lists in our index:
+

+
+ (.*) \(/) { print "$ENV{f},$1\n"; }';
+ done \
+ | relpipe-in-csv \
+ "dependency" \
+ "program" string \
+ "library" string \
+ | relpipe-tr-sql --file bin.sqlite]]>
+
+
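The parsing step can be sketched separately in Python. This is only an illustration: the regular expression mirrors the visible tail of the Perl pattern above, and its leading `=>` is an assumption based on the usual ldd output format (`name => path (address)`). The sample ldd output is made up.

```python
import re

# Matches ldd lines such as "libc.so.6 => /lib/libc.so.6 (0x00007f...)"
# and captures the resolved library path between "=> " and " (".
LDD_LINE = re.compile(r"=> (.*) \(")

def dependencies(program, ldd_output):
    """Yield (program, library) pairs from the text printed by ldd."""
    for line in ldd_output.splitlines():
        match = LDD_LINE.search(line)
        if match:  # lines without "=>" (vdso, dynamic loader) are skipped
            yield program, match.group(1)

# Made-up sample of ldd output for a hypothetical /bin/ls.
sample = (
    "\tlinux-vdso.so.1 (0x00007ffd0a1f2000)\n"
    "\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3a4c000000)\n"
    "\t/lib64/ld-linux-x86-64.so.2 (0x00007f3a4c400000)\n"
)

# Print one CSV line per dependency: program,library
for program, library in dependencies("/bin/ls", sample):
    print(f"{program},{library}")
```

Each printed line has the `program,library` shape that the relpipe-in-csv call above declares for the "dependency" relation.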

And then we can hold a „popularity contest“ and find the 20 most frequently used libraries:

+ + + +

Well, well… here we are:

+ + + + +
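A ranking like this boils down to a GROUP BY over the dependency data. The Python sketch below uses only the standard sqlite3 module. It assumes the relation is stored as a table named `dependency` with `program` and `library` columns (the names given to relpipe-in-csv above), and its sample rows are made up.

```python
import sqlite3

def most_used_libraries(conn, limit=20):
    """Return (library, count) rows, most frequently linked first."""
    # Assumed schema: table "dependency" with "program" and "library" columns.
    query = """
        SELECT library, count(*) AS programs
        FROM dependency
        GROUP BY library
        ORDER BY programs DESC, library
        LIMIT ?
    """
    return conn.execute(query, (limit,)).fetchall()

# Stand-in for bin.sqlite: an in-memory database with made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dependency (program TEXT, library TEXT)")
conn.executemany(
    "INSERT INTO dependency VALUES (?, ?)",
    [
        ("/bin/ls", "/lib/libc.so.6"),
        ("/bin/ls", "/lib/libselinux.so.1"),
        ("/bin/cat", "/lib/libc.so.6"),
    ],
)

for library, count in most_used_libraries(conn, limit=2):
    print(f"{count:>6}  {library}")
```

The secondary sort on `library` only makes the order of ties deterministic; it is a choice of this sketch.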

+ Future versions might offer an option to gather more file metadata such as hashes, Exif etc.
+ But even in the current version, it is possible to gather literally any metadata using a custom script (as we have shown with ldd above).
+ Extended attributes are already supported (the --xattr option).
+

+ +
+ +