Full-Text Indexing PDFs in JavaScript

I once worked for a company that sold access to legal and financial databases (as they call it, “intelligent information”). Most court records are PDFs available through PACER, a website developed specifically to distribute court records. Meaningful database products on this dataset require building a processing pipeline that can extract and index text from the […]

Improving the default Android Keyboard

My Android keyboard makes word suggestions as you type. The algorithm appears to be a frequency-based text look-up, although it occasionally picks up similar-sounding words. While usable, it has enough issues to be worth replacing. Android kindly lets you do this, and there are numerous apps to do so. To build a new keyboard, we […]

Fixing org.apache.solr.common.SolrException: Length Required

I received the following exception after making no code changes: org.apache.solr.common.SolrException: Length Required. The issue is that CommonsHttpSolrServer does not send a Content-Length header in updates. The root cause of my issue was switching the front-end proxy from Apache to Nginx, which is apparently stricter about headers.
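
To make the header problem concrete, here is a minimal Python sketch of what a proxy that insists on a declared length is looking for. It is not the SolrJ client and not the fix from the post. The host, path, and both helper names are hypothetical, chosen only for illustration. It serializes an update request with an explicit Content-Length and checks whether a raw request carries that header.

```python
def build_update_request(host: str, path: str, body: bytes) -> bytes:
    """Serialize an HTTP/1.1 POST that declares its body length up front.

    host and path are placeholders; a real Solr update URL may differ.
    """
    header_lines = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        "Content-Type: text/xml; charset=utf-8",
        f"Content-Length: {len(body)}",  # the header a strict proxy may require
        "Connection: close",
    ]
    return "\r\n".join(header_lines).encode("ascii") + b"\r\n\r\n" + body


def has_content_length(raw_request: bytes) -> bool:
    """Return True if the request's header section includes Content-Length."""
    head, _, _ = raw_request.partition(b"\r\n\r\n")
    # Skip the request line, then compare header names case-insensitively.
    return any(
        line.lower().startswith(b"content-length:")
        for line in head.split(b"\r\n")[1:]
    )


if __name__ == "__main__":
    request = build_update_request("localhost:8983", "/solr/update", b"<commit/>")
    print(request.decode("ascii"))
    print("declares length:", has_content_length(request))
```

A request sent with chunked transfer encoding and no Content-Length fails the second check, and that is the kind of request a stricter front end can answer with Length Required.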