, , , , ,

Efficient storage of large IDs in Postgres

Imagine a system that uses large-ish hex string identifiers, e.g. similar to UUIDs. To make the math easy, lets say we make a table with a million have 20 character IDs: CREATE TABLE ids1 (id CHARACTER VARYING(20));   — takes 2091 ms INSERT INTO ids1 SELECT ‘0123456789ABCDEF0123’ FROM generate_series(1, 1000000) num We can get the […]

, , ,

Validating Application Performance in a cloud environment, using C#, Hangfire, and RethinkDB

The rise of “platform” sites (e.g. Heroku) enables developers to build and deploy web applications cheaply, without understanding operational problems. Typically these products let you purchase a combination of three things: web servers, databases, and background job execution. This typically works well, as long as you stay within their platform, and as long as their […]

, , ,

Comparing two SQL Server databases with Liquibase

Liquibase is a tool for managing database schemas (e.g. diffing schemas and writing migration scripts), and supports most major commercial and open-source relational databases. I’ll show how to use it to compare a migrated database to the original database, to find and fix discrepancies. To use it with SQL Server, you’ll first need to download […]

, ,

Auditing Data Modifications in Postgres

Implementing Auditing Storing every change to an application’s database allows for sophisticated forensic analysis- usage trends over time, as a long-range debugger or for implementing data correction features more typically found in version control software, like ‘cherry-pick’ or ‘revert’. Many products require this in the form of an audit trail, which in the simplest case […]

Postgres High Availability (Talk Notes, PGConf 2014)

I saw a talk by some operations engineers who work for ARIN at PGConf 2014. They presented their architecture for high availability, which uses CMAN, Corosync, and Pacemaker, which handle message queueing, a quorum system and Postgres communication, respectively (Corosync is functioning similar to Zookeeper, as I understand it). Their architecture was interesting for a […]

, ,

Testing the output of tuned Postgres queries

Tuning SQL queries is a useful skill, and while many people struggle to manage complex SQL, the work is actually a series of simple tricks. For instance, refactoring a query often brings about algorithmic improvements, and if you tune enough queries, finding mathematically equivalent forms becomes muscle memory. This includes operations like inlining common table […]

Tuning Postgres Queries

I have an application ported from using Oracle to Postgres – tuning queries is quite different between the two. While I know many of the Postgres devs have detailed blogs, I haven’t found many practical articles on the subject, so I’m writing up what I’ve found. I appreciate any further comments or resources people may […]

Querying Multiple Postgres Full-Text Indexes

Setup I’ve set up several scenarios to see how Postgres handles querying multiple full-text indexes. To demonstrate these scenarios I set up the following table and indexes, which has 1.1 million records: CREATE TABLE data2 ( author CHARACTER VARYING, id CHARACTER VARYING, email CHARACTER VARYING, company CHARACTER VARYING, DATE CHARACTER VARYING, message CHARACTER VARYING, github […]

GIN vs GiST For Faceted Search with Postgres Full Text Indexes

For this test, I set up a 1.1 million row table using data from github. If you’re not familiar with full-text in Postgres, they use the @@ operator to apply a query to a document, which looks like this: SELECT to_tsvector(’fat cats ate fat rats’) @@ to_tsquery(’fat & rat’); The order of the conditions doesn’t […]