Entries by Gary

Optimizing WordPress Tag Pages

Normally I don’t like to write about “blogging,” but since website traffic generates some interesting data, it’s worth looking at it from a computer science perspective, to see the issues involved. By default, WordPress has two multi-valued fields associated with an article, “Categories” and “Tags.” Categories are treated as a closed, hierarchical set, and tags […]


NLP Analysis in Python using Modal Verbs

Modal verbs are auxiliary verbs that indicate semantic information about an action, such as likelihood (will, should), permission (could, may), or obligation (shall, must). One interesting concept to explore is whether the presence of these verbs varies over different types of text, and whether that means anything. “Natural Language Processing with Python” (read my review) has an […]
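
To make the idea concrete, here is a minimal sketch of counting modal verbs across different kinds of text. It uses only the Python standard library rather than the toolkit from the book, and the sample sentences, the MODALS list, and the modal_counts helper are illustrative assumptions, not code from the article. It also counts every occurrence of a word such as "may" without checking that it is really used as a verb.

```python
import re
from collections import Counter

# Modal verbs named in the excerpt above.
MODALS = ("will", "should", "could", "may", "shall", "must")


def modal_counts(text):
    """Count how often each modal verb appears in text (case-insensitive)."""
    # Crude tokenizer: runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w in MODALS)
    # Report every modal, including those that never appear.
    return {m: counts[m] for m in MODALS}


# Two made-up sample texts standing in for different genres (hypothetical data).
samples = {
    "legal": "The tenant shall pay rent. The tenant must not sublet. "
             "The landlord may inspect.",
    "forecast": "It will rain tomorrow. It could clear by noon, "
                "and it should be warm.",
}

for name, text in samples.items():
    print(name, modal_counts(text))
```

On a real corpus the counts would normally be divided by the total number of words in each text, so that long and short documents can be compared fairly.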

Making Maps with TileMill

TileMill is a piece of map-making software for rendering beautiful maps. You can export the maps to MapBox for a Google Maps feel, or combine them with a tool like D3.js for interactive infographics. There are a surprising number of data sources: weather, earthquake locations, crime statistics, and ship and plane locations. A lot of this is from federal and municipal agencies […]

Processing Command Line Arguments in Java

Rather than parsing command line arguments yourself, you can use a nice library from Apache called Apache Commons CLI. It has a few different options for various flavors of parsing, although this example demonstrates the two most common use cases (I think): a flag, and setting a value. The nice thing about […]

Proxying HTTP requests with PHP

The following code will proxy requests to an external API. This has several advantages: control over an API key; caching headers that prevent overuse of an API; no issues with cross-domain scripting errors; and a limit on the scope of what APIs can be called through your proxy.

$query = urlencode($_GET['query']); $url = ''; $url = […]