Moving files and folders into hashed subfolders

The following will move a series of files into subfolders. It hashes the file names, building a two-character, two-folder-deep hierarchy to split the files, e.g. a1/b2. The motivation for this is to split 500,000 folders into a manageable hierarchy, to avoid file system limits – in NTFS this is quite slow, and […]
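The excerpt above is cut off before the post's own script, so here is only a minimal Python sketch of the idea it describes. The choice of MD5, the function names `hashed_subfolder` and `move_into_hashed_subfolders`, and the use of a separate target directory are all assumptions made for illustration, not details from the original post.

```python
import hashlib
import shutil
from pathlib import Path


def hashed_subfolder(name: str) -> Path:
    """Map a file or folder name to a two-level path such as 'a1/b2'.

    Assumption: MD5 of the UTF-8 name; the first two hex characters pick
    the outer folder and the next two pick the inner folder.
    """
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    return Path(digest[:2]) / digest[2:4]


def move_into_hashed_subfolders(source: Path, target: Path) -> int:
    """Move every entry directly under `source` into target/xx/yy/.

    Assumes `target` is not located inside `source`. Returns the number
    of entries moved.
    """
    moved = 0
    # Snapshot the listing first so moves do not disturb the iteration.
    for entry in list(source.iterdir()):
        destination = target / hashed_subfolder(entry.name)
        destination.mkdir(parents=True, exist_ok=True)
        shutil.move(str(entry), str(destination / entry.name))
        moved += 1
    return moved
```

With two hex characters per level there are at most 256 × 256 = 65,536 leaf folders, so 500,000 entries would average fewer than eight per leaf if the hash spreads names evenly.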

Cobol v. Fortran

I thought it’d be interesting to compare how many people admit to knowing ancient programming languages on their LinkedIn pages. This is in contrast to my post on the popularity of hip JVM languages Scala and Clojure. True to its reputation for scientific computation power, Fortran is primarily used by scientific organizations – an […]


“Learning ExtJS 4” Review

“Learning ExtJS 4” is a good, practical introduction to ExtJS for beginner ExtJS developers who haven’t used the library, or who have used prior Ext versions. In a couple of spots I found myself wishing for deep technical details, e.g. explanations of the rationale behind architectural decisions the Ext team made, but that is consistent with the […]

Scala vs. Clojure

LinkedIn shows 42% growth (year over year?) in people claiming Clojure as a skill, beating out Scala’s 9%, a surprising feat for a Lisp variant. It turns out LinkedIn’s default view is misleading: Scala shows more new adopters (2.4k) vs. 1.5k for Clojure. LinkedIn refuses to show counts for Java, but check out the number […]

Building a Terabyte-scale Math Platform

Cliff Click, 0xdata

Click represents 0xdata, which is building a system that can handle R-style analysis at high speed and large scale, aimed at companies that do advertising or credit card fraud detection, where transaction volume is large, and where money is lost waiting for models to rebuild. Typically this data comes from a variety of sources, file […]

Philly ETE – Database as a Value

This was the first time I’ve seen Rich Hickey’s talk on Datomic, which lent great clarity to the product. As implemented, Datomic functions as an immutable database for philosophical reasons, although in practice it doesn’t manage its own storage, and may eventually support deletions to satisfy legal and compliance issues around privacy. This database technology […]


Druid – A Column Oriented Database

I attended a talk at Philly ETE by Metamarkets, a company doing real-time analytics for advertising. Having worked on a couple of Oracle-based reporting projects, I entered with interest. Their system is built around dimensional modeling, although with atypically high-volume inserts and low latency for updating reports. They attempted to build this system with a […]

Philly ETE – Meteor.js Talk Summary

I went to a talk at Philly ETE on Meteor, a JavaScript-based webapp development platform. The talk was given by Avital Oliver, one of the core Meteor developers, on “smart packages,” which is what Meteor calls extensions to its core product. While in its infancy, the framework appears to be built around rapid development iteration […]

Philly ETE 2013 – Day 1 Keynote Summary

Summary of talk by Claudia Perlich – Chief Scientist, m6d

The Philly ETE Keynote address was a presentation on modelling advertising data by media 6 degrees, a company which uses data modelling techniques to improve advertisement conversion rates for large brands. The techniques presented showed that predictive models of purchasing behavior can be built in […]