Loading PDFs in PhantomJS using PDF.JS

PhantomJS is a neat WebKit wrapper that lets you write cross-platform command-line JavaScript utilities. JavaScript scripting has been common in the Windows world for as long as I can remember through Windows Scripting Host, but PhantomJS provides access to many new libraries worth exploring. One such library is PDF.JS – a product of Mozilla Labs […]

Finding Matching Images in Python using Corner Detection

I’m working through Programming Computer Vision with Python: Tools and algorithms for analyzing images, which covers various methods for matching corresponding points of interest between two images. In the book, this eventually builds up to instructions on how to reconstruct a panorama. The first technique for finding corresponding points of interest […]

Fixing the error “TypeError: ‘undefined’ is not a function (evaluating ‘globalScope[‘console’][‘log’].bind(globalScope[‘console’])’)”

Some libraries, like PDF.js, initialize their own logging function, which wraps console.log. If this runs in a context where Function.prototype.bind does not exist, you’ll get the following error: TypeError: ‘undefined’ is not a function (evaluating ‘globalScope[‘console’][‘log’].bind(globalScope[‘console’])’) Fixing this is actually quite simple: Mozilla provides a replacement function you can drop in (not surprising, considering how […]
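To illustrate the kind of fallback described above, here is a simplified sketch of a bind replacement. It is not the function Mozilla provides; the helper name simpleBind is hypothetical, and the sketch ignores edge cases such as calling a bound function with new. It assumes the script is loaded before the library that calls bind.

```javascript
// Simplified sketch of a bind fallback; simpleBind is a hypothetical helper name.
// It returns a function that calls fn with a fixed `this` and preset arguments.
function simpleBind(fn, thisArg) {
  if (typeof fn !== 'function') {
    throw new TypeError('simpleBind: target is not callable');
  }
  // Arguments after thisArg are prepended to every later call.
  var preset = Array.prototype.slice.call(arguments, 2);
  return function () {
    var later = Array.prototype.slice.call(arguments);
    return fn.apply(thisArg, preset.concat(later));
  };
}

// Install the fallback only where the native method is missing.
if (!Function.prototype.bind) {
  Function.prototype.bind = function (thisArg) {
    var args = Array.prototype.slice.call(arguments, 1);
    return simpleBind.apply(null, [this, thisArg].concat(args));
  };
}
```

With this in place, a call such as console.log.bind(console) has a bind method to resolve to, so the logging wrapper can at least be created in a runtime that lacks the native method.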

Importing Data from Solr to Postgres with Scala

I suspect most people who set up Solr indexes pull data from a second system into Solr; having written a previous example where I pulled git data into a Solr index, I copied this data into Postgres to allow comparing the behavior of two full-text indexers. This is a fairly simple process if you make […]

A History of Philadelphia Churches through Maps, Part II

In the first part of this series, I discussed how the distribution of churches across the Philadelphia region ties to population density, suggesting that visual patterns in maps can be used to better understand slices of our history. This material isn’t particularly novel and tells stories that are fairly well known; my interest is driven in […]

Scraping Tabular Websites into a CSV file using PhantomJS

While there are many tools for scraping website content, two of my current favorites are PhantomJS (JavaScript) and BeautifulSoup (Python). Many small-scale problems are easily solved by downloading files with wget, then post-processing – this works well, as the post-processing typically requires dozens of iterations to extract clean data. If you’re wondering about the legitimacy […]

Google +1’s and Search Rankings

There’s some debate on SEOmoz / Hacker News. I thought I’d share my experience, since it seems to conflict with the two main arguments. The first concerns the correlation between Google +1’s and search ranking; the second is that Google just wants everyone to write quality content. Here is a chart from the […]

Finding Image Boundaries in Python

I’m working my way through Programming Computer Vision with Python, a compact introduction to computer vision. Computer vision is a fascinating subset of computer science that has recently pushed aggressively forward through a combination of Department of Defense research in self-driving cars, video game development, and rapid improvements in computer hardware. I’m writing a series of […]

Book Review: The Joy of Clojure

I picked up The Joy of Clojure: Thinking the Clojure Way to help me understand Clojure code, after discovering a couple of experimental data manipulation projects that use Clojure. I’ve also seen one of Rich Hickey’s talks on Datomic, but haven’t been able to follow the examples, despite some past Lisp experience. For those who aren’t […]