Entries by admin

Building a Terabyte-scale Math Platform

Cliff Click, 0xdata. Click represents 0xdata, which is building a system that can handle R-style analysis at high speed and large scale, aimed at companies that do advertising or credit card fraud detection, where transaction volume is large and money is lost waiting for models to rebuild. Typically this data comes from a variety of sources, file […]

Philly ETE – Database as a Value

This was the first time I had seen Rich Hickey’s talk on Datomic, which lent great clarity to the product. As implemented, Datomic functions as an immutable database for philosophical reasons, although in practice it doesn’t manage its own storage, and may eventually support deletions to satisfy legal and compliance issues around privacy. This database technology […]

Philly ETE – Meteor.js Talk Summary

I went to a talk at Philly ETE on Meteor, a JavaScript-based webapp development platform. The talk was given by Avital Oliver, one of the core Meteor developers, on “smart packages,” which is what Meteor calls extensions to its core product. While in its infancy, the framework appears to be built around rapid development iteration […]

Philly ETE 2013 – Day 1 Keynote Summary

Summary of talk by Claudia Perlich – Chief Scientist, m6d. The Philly ETE keynote address was a presentation on modelling advertising data by Media6Degrees (m6d), a company which uses data modelling techniques to improve advertisement conversion rates for large brands. The techniques presented showed that predictive models of purchasing behavior can be built in […]


Inspired by a client project with thousands of lines of poorly structured, badly written ExtJS code, I wrote a grep implementation to recursively search the contents of JavaScript variables, available on GitHub. This provides a single function “grep”, which can be used directly or added to the global namespace. It recursively searches objects – keys, […]
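
As a rough sketch of the idea (this is not the code from the repository), a recursive search over an object's keys and leaf values might look something like the following. The name grepObject, the path format, and the sample object are all hypothetical, made up for illustration.

```javascript
// Hypothetical sketch of a recursive object search; not the published implementation.
// Returns the dotted paths of every key or leaf value that matches the pattern.
// Pass a regex without the "g" flag, since test() on a global regex keeps state.
function grepObject(root, pattern, path = "root", seen = new Set(), hits = []) {
  // Stop at primitives, and at objects already visited (guards against cycles).
  if (root === null || typeof root !== "object" || seen.has(root)) {
    return hits;
  }
  seen.add(root);
  for (const key of Object.keys(root)) {
    const value = root[key];
    const childPath = `${path}.${key}`;
    // Anything that is not an object or array is treated as a leaf and stringified.
    const isLeaf = value === null || typeof value !== "object";
    if (pattern.test(key) || (isLeaf && pattern.test(String(value)))) {
      hits.push(childPath);
    }
    grepObject(value, pattern, childPath, seen, hits);
  }
  return hits;
}

// Made-up sample data, loosely shaped like an ExtJS-style config object.
const sample = {
  store: { name: "userStore", fields: ["id", "userName"] },
  title: "Users",
};
console.log(grepObject(sample, /user/i));
// -> [ 'root.store.name', 'root.store.fields.1', 'root.title' ]
```

Returning paths rather than the matched values is a design choice in this sketch: in a large, poorly structured object graph, knowing where a match lives is usually what you are after.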

Improving the default Android Keyboard

My Android keyboard makes word suggestions as you type. The algorithm appears to be a frequency-based text look-up, although it occasionally picks up similar-sounding words. While usable, it has enough issues to be worth replacing. Android kindly lets you do this, and there are numerous apps to do so. To build a new keyboard, we […]

Solr CSV DataImportHandler sample

The following will import a two-field CSV file into Solr, assuming two columns, name and count. The name field is always quoted.

<dataConfig>
  <dataSource name="ds1" type="FileDataSource" />
  <document>
    <entity name="ngrams" processor="LineEntityProcessor" url="E:/Projects/Data/words-txt.csv" dataSource="ds1" transformer="RegexTransformer">
      <field column="rawLine" regex="^&quot;(.*)&quot;\t(.*)$" groupNames="name,count" />
    </entity>
  </document>
</dataConfig>
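
To check what the regex in that config captures, the same pattern can be tried outside Solr. The snippet below is only a sketch in JavaScript (Solr itself applies the pattern with Java's regex engine, which should treat this particular expression the same way); parseLine and the sample lines are hypothetical. Note the \t: the pattern expects a tab between the two columns, with the name wrapped in double quotes.

```javascript
// The pattern from the regex attribute, with &quot; written as a plain double quote.
const linePattern = /^"(.*)"\t(.*)$/;

// Hypothetical helper: split one raw line into the two named groups,
// mirroring groupNames="name,count". Returns null when the line does not match.
function parseLine(rawLine) {
  const match = linePattern.exec(rawLine);
  if (match === null) {
    return null;
  }
  return { name: match[1], count: match[2] };
}

// Made-up sample lines, not taken from the real data file.
console.log(parseLine('"new york"\t1024')); // -> { name: 'new york', count: '1024' }
console.log(parseLine("no quotes here"));   // -> null
```

The count comes back as a string here; in Solr, the field type in the schema decides how that value is stored.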