Rasppi-search
From openmichigan
This may seem like a simple task in the "Google" world we find ourselves living in, but an initial criterion for this deployment of the Raspberry Pi was the lack of network connectivity, so off-loading the search function, or even the search index, was not an option. Thus a local search of local content using local machine resources was required.
On Linux, a natural candidate for searching file content for a specific text string is the "grep" command. For the OER collection, searching every file (movies, programs, etc.) is unnecessary; skipping those files reduces both the consumption of system resources and the search latency. The "grep" command does not appear to easily support an "ignore these types of files" capability. After exploring other options, I chose "ack-grep" as the search tool.
ack-grep
ack-grep versus grep
ack-grep documentation
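The idea of searching only the text-bearing files can be illustrated with a minimal Python sketch. This is not how ack-grep is implemented or configured; the function name `search_tree` and the extension list are assumptions made up for illustration, and the real deployment would use ack-grep's own file-type handling.

```python
import os

# Hypothetical list of extensions to skip; adjust for the actual collection.
SKIP_EXTENSIONS = {".mp4", ".avi", ".mov", ".exe", ".zip", ".iso"}


def search_tree(root, needle, skip_extensions=SKIP_EXTENSIONS):
    """Return (path, line_number, line) for every line containing needle.

    The match is case-insensitive. Files whose extension is in
    skip_extensions are never opened, which is where the savings come from.
    """
    needle = needle.lower()
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if os.path.splitext(name)[1].lower() in skip_extensions:
                continue  # movies, programs, and similar files are skipped
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", encoding="utf-8", errors="ignore") as handle:
                    for number, line in enumerate(handle, start=1):
                        if needle in line.lower():
                            hits.append((path, number, line.rstrip("\n")))
            except OSError:
                continue  # unreadable file; move on to the next one
    return hits
```

Because skipped files are never opened, a large movie costs only a directory lookup rather than a full read, which matters on a machine with the Raspberry Pi's limited CPU and memory.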
There were 3 file types (thanks to Kathleen for pointing this out!) that were not successfully searched by either ack-grep or grep:
- .pdf
- .docx
- .pptx
These files are stored in a compressed form, so a normal "search file for text string" approach failed. At this point I was motivated to revisit my exploration of the Apache "solr" and "lucene" capability. However, those search functions require a Java environment with a large memory and CPU commitment that is incompatible with the Raspberry Pi environment, so I went back to exploring other options. I discovered several "Search PDF tools" and, after performance comparisons, chose pdfgrep. For reading *.docx and *.pptx files, I found some suggestive code, which I adapted to the Raspberry Pi environment as a PHP script named look.php.
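The reason a plain text search fails on *.docx and *.pptx files is that each one is a ZIP archive of XML parts. The following Python sketch shows the general unzip-then-search idea. It is not the look.php script, whose contents are not reproduced here; the names `extract_text` and `contains_text` are hypothetical, and the tag stripping is deliberately crude.

```python
import html
import os
import re
import zipfile

# Archive members that hold the visible text in each format.
PART_PATTERNS = {
    ".docx": re.compile(r"word/document\.xml$"),
    ".pptx": re.compile(r"ppt/slides/slide\d+\.xml$"),
}


def extract_text(path):
    """Return the plain text of a .docx or .pptx file as a single string."""
    extension = os.path.splitext(path)[1].lower()
    pattern = PART_PATTERNS.get(extension)
    if pattern is None:
        raise ValueError("unsupported file type: " + extension)
    chunks = []
    with zipfile.ZipFile(path) as archive:
        for name in sorted(archive.namelist()):
            if pattern.match(name):
                xml = archive.read(name).decode("utf-8", errors="ignore")
                # Paragraph ends become spaces so paragraphs do not run together.
                xml = re.sub(r"</(?:w|a):p>", " ", xml)
                # All remaining tags are dropped, rejoining text split across runs.
                chunks.append(html.unescape(re.sub(r"<[^>]+>", "", xml)))
    # Collapse runs of whitespace into single spaces.
    return " ".join(" ".join(chunks).split())


def contains_text(path, needle):
    """Case-insensitive check for needle in the document's text."""
    return needle.lower() in extract_text(path).lower()
```

A call such as `contains_text("lecture.docx", "malaria")` (the file name is only an example) returns True or False. Only the Python standard library is needed, so there is no Java-sized memory commitment.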
Here is the Search Web Page:
and the search results page looks similar to this: