Posts

Query Stackoverflow by tag

Stackoverflow allows you to query their data with SQL (see https://data.stackexchange.com/stackoverflow/query/new). For example, you can get a list of all posts tagged with ‘scala’ like so: SELECT * FROM Posts p JOIN PostTags pt ON p.Id = pt.PostId JOIN Tags t ON pt.TagId = t.Id AND TagName = ‘scala’

RethinkDB: FULL OUTER JOIN example

If you want to do a full outer join in RethinkDB you can, although it will be noticeably slower than other queries. If this seems strange, this is a somewhat common use case in reporting or scraping, where you may have a subset of all available fields at any point. The way to execute this […]

RethinkDB: RIGHT JOIN Example

RethinkDB has a function called “outerJoin”, which actually does a left join. For example, this finds users and adds their audit trail history (if any): r.db(’test’) .table(’user_actions’) .outerJoin(r.table(’users’), (action, user) => action(’user_id’).eq(user(’id’))) .zip() If instead you want to start with audit history and add users (if they exist) you have to swap everything manually, since […]

RethinkDB: LEFT JOIN Example

In RethinkDB, a LEFT JOIN looks exactly like INNER JOIN: r.db(’test’) .table(’user_actions’) .outerJoin(r.table(’users’), (action, user) => action(’user_id’).eq(user(’id’))) .zip() A couple notes here: to achieve an expected result you need to add “.zip” at the end, otherwise you get a data structure with “left” and “right”. It’s also important to note that despite the name this […]

RethinkDB: INNER JOIN Example

One way to to do an inner join in RethinkDB with the eqJoin function: r.db(’test’) .table(’user_actions’) .filter({file_name: ‘test2.pdf’}) .eqJoin(’user_id’, r.table(’users’)) The biggest disadvantage to this API is that you can’t easily reverse the order of the tables. In this case, I’m taking users and adding audit information to them, and in SQL I would be […]

RethinkDB: coalesce example

In SQL, the coalesce or NVL function takes null values and replaces them with a defined value. To replicate this in RethinkDB, you can use the ‘branch’ function to test for null and remove it: r.db(’test’) .table(’users’) .map( (doc) => { return { first_name: null } } ).map( (doc) => { return { first_name: r.branch( […]

RethinkDB: CASE WHEN example

There are a couple ways to mimic the SQL “CASE WHEN” in RethinkDB. If the rows don’t have values in some columns, in SQL you’d default these with “CASE WHEN attr IS NULL … ELSE END”. To fix these, you can use map / merge: r.db(’test’).table(’users’) .map( (doc) => { return r.expr( { first_name: ”, […]