Talk Summary: What is Acunu?

I recently attended a talk by a sales engineer for Acunu, an analytics platform for Cassandra. I came away with a couple of interesting notes: – The product aims to build data cubes for you in a “big data” scenario – Operating on the principle that disk space is cheap, they increment lots of counters […]


Six Join Implementations in Javascript

A join is an operation between two tables of data, combining the results by looking up keys from one table in a second table. While a simple operation in concept, there are many ways to do this, and understanding the variations is important to understanding database behavior (for a discussion of how the algorithms are […]
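As a rough illustration of the key-lookup idea described above, here is a minimal sketch of one common variant, an inner hash join. It is not necessarily one of the six implementations the post covers; the function name `hashJoin` and the sample `customers` and `orders` tables are hypothetical, chosen only for this example.

```javascript
// Hash join sketch: index one table by key, then probe it with rows of the other.
// `hashJoin`, `customers`, and `orders` are hypothetical names for illustration.
function hashJoin(left, right, leftKey, rightKey) {
  // Build phase: group the right-hand rows by their join key.
  const index = new Map();
  for (const row of right) {
    const key = row[rightKey];
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(row);
  }

  // Probe phase: look up each left-hand row's key and merge any matches.
  // Rows with no match are dropped, which makes this an inner join.
  const result = [];
  for (const row of left) {
    const matches = index.get(row[leftKey]) || [];
    for (const match of matches) {
      result.push({ ...row, ...match });
    }
  }
  return result;
}

const customers = [
  { customerId: 1, name: "Ada" },
  { customerId: 2, name: "Grace" },
];
const orders = [
  { orderId: 10, customerId: 1, total: 25 },
  { orderId: 11, customerId: 1, total: 40 },
  { orderId: 12, customerId: 3, total: 15 },
];

const joined = hashJoin(orders, customers, "customerId", "customerId");
console.log(joined);
```

Building the index on the smaller table keeps memory use down; a nested-loop join would avoid the index entirely at the cost of comparing every pair of rows.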


Building a Terabyte-scale Math Platform

Cliff Click, 0xdata. Click represents 0xdata, which is building a system that can handle R-style analysis at high speed and large scale, aimed at companies that do advertising or credit card fraud detection, where transaction volume is large and where money is lost waiting for models to rebuild. Typically this data comes from a variety of sources, file […]


Philly ETE – Database as a Value

This was the first time I’ve seen Rich Hickey’s talk on Datomic, which lent great clarity to the product. As implemented, Datomic functions as an immutable database for philosophical reasons, although in practice it doesn’t manage its own storage, and may eventually support deletions to satisfy legal and compliance issues around privacy. This database technology […]


Druid – A Column Oriented Database

I attended a talk at Philly ETE by Metamarkets, a company doing real-time analytics for advertising. Having worked on a couple of Oracle-based reporting projects, I entered with interest. Their system is built around dimensional modeling, although with atypically high-volume inserts and low latency for updating reports. They attempted to build this system with a […]


Data Warehousing, NoSQL, and the Cloud

With the advent of NoSQL, cloud computing and slick new databases, we seem to have forgotten whence we came. I went to a conference recently on the open source search product Solr/Lucene. One of the keynote speakers, Chief Data Scientist of Hortonworks, discussed what turned him to NoSQL databases, in this case, a […]


Building a Naive Bayes Classifier in the Browser using Map-Reduce

The last decade of Javascript performance improvements in the browser provides exciting possibilities for distributed computing. Like SETI and Folding@Home, client-side Javascript could be used to build a distributed super-computer, although at the risk of compromising data security and consistency. New HTML5 APIs extend the vast range of Javascript libraries available; for instance, the audio […]

First Philadelphia Node.JS Meetup

I went to the first Philadelphia Node.JS meetup last night, hosted by Zivtech. It was a mix of local developers at various stages in their careers. There were several informative impromptu presentations, including a demonstration of file transfer using web sockets and Node.JS, and of sending Apple iOS push notifications from Node.JS. Useful libraries: – […]